├── .gitignore
├── Makefile
├── README.md
├── data
│   └── raw
│       └── nietzsche.txt
├── notebooks
│   ├── .gitignore
│   ├── 0X_pytorch_in_googles_colab.ipynb
│   ├── 0X_pytorch_in_googles_colab.py
│   ├── 0X_teacher_forcing.ipynb
│   ├── 0X_teacher_forcing.py
│   ├── debugging.ipynb
│   ├── debugging.py
│   ├── foreword.ipynb
│   ├── foreword.py
│   ├── hooks.ipynb
│   ├── hooks.py
│   ├── img
│   │   ├── common_mistakes.png
│   │   ├── dynamic_graph.gif
│   │   ├── ml_debt.jpg
│   │   ├── pytorch-logo.png
│   │   ├── pytorch_logo.png
│   │   ├── pytorch_logo_flame.png
│   │   ├── software_vs_ml.png
│   │   ├── tensorboardx_demo.gif
│   │   ├── tensorboardx_demo2.gif
│   │   ├── the_real_reason.png
│   │   └── visdom.png
│   ├── lin_reg.ipynb
│   ├── lin_reg.py
│   ├── machine_learning_101.ipynb
│   ├── machine_learning_101.py
│   ├── mean_shift_clustering.ipynb
│   ├── mean_shift_clustering.py
│   ├── pytorch_basics.ipynb
│   ├── pytorch_basics.py
│   ├── rnn_from_scratch.ipynb
│   ├── rnn_from_scratch.py
│   ├── storing_and_loading_models.ipynb
│   ├── storing_and_loading_models.py
│   ├── the_end.ipynb
│   ├── the_end.py
│   ├── torch_jit.ipynb
│   ├── torch_jit.py
│   ├── transfer_learning.ipynb
│   ├── transfer_learning.py
│   ├── visualize_model_loss_optimizer.ipynb
│   ├── visualize_model_loss_optimizer.py
│   ├── working_with_data.ipynb
│   └── working_with_data.py
└── requirements.txt
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 |
28 | # PyInstaller
29 | # Usually these files are written by a python script from a template
30 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 |
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 |
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .coverage
42 | .coverage.*
43 | .cache
44 | nosetests.xml
45 | coverage.xml
46 | *.cover
47 | .hypothesis/
48 |
49 | # Translations
50 | *.mo
51 | *.pot
52 |
53 | # Django stuff:
54 | *.log
55 | local_settings.py
56 |
57 | # Flask stuff:
58 | instance/
59 | .webassets-cache
60 |
61 | # Scrapy stuff:
62 | .scrapy
63 |
64 | # Sphinx documentation
65 | docs/_build/
66 |
67 | # PyBuilder
68 | target/
69 |
70 | # Jupyter Notebook
71 | .ipynb_checkpoints
72 |
73 | # pyenv
74 | .python-version
75 |
76 | # celery beat schedule file
77 | celerybeat-schedule
78 |
79 | # SageMath parsed files
80 | *.sage.py
81 |
82 | # dotenv
83 | .env
84 |
85 | # virtualenv
86 | .venv
87 | venv/
88 | ENV/
89 |
90 | # Spyder project settings
91 | .spyderproject
92 | .spyproject
93 |
94 | # Rope project settings
95 | .ropeproject
96 |
97 | # mkdocs documentation
98 | /site
99 |
100 | # mypy
101 | .mypy_cache/
102 |
103 | data/raw
104 | *.zip
105 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | PY = $(wildcard notebooks/*.py)
2 | IPYNB := $(patsubst notebooks/%.py,notebooks/%.ipynb,$(PY))
3 |
4 | run_notebooks: $(IPYNB)
5 |
6 | notebooks/%.ipynb: notebooks/%.py
7 | @echo $@
8 | jupytext --to notebook $^
9 | jupyter nbconvert --execute --to notebook --inplace $@
10 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | PyTorch Tutorial
2 | ================================================================================
3 |
4 | This repository contains material to get started with
5 | [PyTorch](https://pytorch.org/) v1.7.
6 | It was the base for this
7 | [[pytorch tutorial]](https://nodata.science/pydata-pytorch-tutorial.html)
8 | from PyData Berlin 2018.
9 |
10 |
11 |
12 | Table of Contents
13 | --------------------------------------------------------------------------------
14 |
15 | ### PART 0 - Foreword
16 | - [Foreword](notebooks/foreword.ipynb) - Why PyTorch and why not? Why this talk?
17 |
18 | ### PART 1 - Basics
19 | - [PyTorch basics](notebooks/pytorch_basics.ipynb) - tensors, GPU, autograd -
20 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/pytorch_basics.ipynb)
21 | - [Debugging](notebooks/debugging.ipynb) -
22 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/debugging.ipynb)
23 | - [Example: linear regression](notebooks/lin_reg.ipynb) -
24 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/lin_reg.ipynb)
25 | - [Storing and loading models](notebooks/storing_and_loading_models.ipynb) -
26 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/storing_and_loading_models.ipynb)
27 | - [Working with data](notebooks/working_with_data.ipynb) - `Dataset`, `DataLoader`, `Sampler`, `transforms` -
28 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/working_with_data.ipynb)
29 |
30 | ### PART 2 - Computer Vision
31 | - [Transfer Learning](notebooks/transfer_learning.ipynb) -
32 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/transfer_learning.ipynb)
33 |
34 | ### PART 3 - Misc, Cool Applications, Tips, Advanced
35 | - [Torch JIT](notebooks/torch_jit.ipynb) -
36 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/torch_jit.ipynb)
37 | - [Hooks](notebooks/hooks.ipynb) -
38 | register functions to be called during the forward and backward pass -
39 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/hooks.ipynb)
40 | - [Machine Learning 101 with numpy and PyTorch](notebooks/machine_learning_101.ipynb) -
41 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/machine_learning_101.ipynb)
42 | - [PyTorch + GPU in Google's Colab](notebooks/0X_pytorch_in_googles_colab.ipynb)
43 | - [Teacher Forcing](notebooks/0X_teacher_forcing.ipynb)
44 | - [RNNs from Scratch](notebooks/rnn_from_scratch.ipynb) -
45 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/rnn_from_scratch.ipynb)
46 | - [Mean Shift Clustering](notebooks/mean_shift_clustering.ipynb) -
47 | [open in colab](https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/mean_shift_clustering.ipynb)
48 |
49 | ### PART -2 - WIP and TODO
50 | - TODO `nn` and `nn.Module`
51 | - TODO Deployment
52 | - TODO Deployment with TF Serving
53 | - TODO `nn.init`
54 | - TODO PyTorch C++ frontend
55 |
56 | ### PART -1 - The End
57 | - [The End](notebooks/the_end.ipynb)
58 |
59 |
60 |
61 |
62 | Setup
63 | --------------------------------------------------------------------------------
64 |
65 | ### Requirements
66 |
67 | - Python 3.8
68 |
69 |
70 | ### Install Dependencies
71 |
72 | ```bash
73 | python3.8 -m venv .venv
74 | source .venv/bin/activate  # fish shell: source .venv/bin/activate.fish
75 | pip install -r requirements.txt
76 | ```
77 |
78 | #### Optional
79 | Run the following to enable the [jupyter table of contents plugin](https://github.com/jupyterlab/jupyterlab-toc):
80 | ```bash
81 | jupyter labextension install @jupyterlab/toc
82 | jupyter nbextension enable --py widgetsnbextension
83 | ```
84 |
85 | ### Download data and models
86 |
87 | Download data and models for the tutorial:
88 |
89 | ```bash
90 | python download_data.py
91 | ```
92 |
93 | Then you should be ready to go.
94 | Start jupyter lab:
95 |
96 | ```bash
97 | jupyter lab
98 | ```
99 |
100 |
101 | Prior Versions
102 | --------------------------------------------------------------------------------
103 |
104 | - Version of this tutorial for the PyData 2018 conference:
105 | [[material]](https://github.com/sotte/pytorch_tutorial/tree/pydata2018)
106 | [[video]](https://nodata.science/pydata-pytorch-tutorial.html)
107 |
--------------------------------------------------------------------------------
/notebooks/.gitignore:
--------------------------------------------------------------------------------
1 | *.pt
2 | data/
3 | tf_log/
4 |
--------------------------------------------------------------------------------
/notebooks/0X_pytorch_in_googles_colab.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Using PyTorch + GPU/TPU in Google's Colab\n",
8 | "\n",
9 | "> Colaboratory is a Google research project created to help disseminate machine learning education and research. It's a Jupyter notebook environment that requires no setup to use and runs entirely in the cloud.\n",
10 | "> Colaboratory notebooks are stored in Google Drive and can be shared just as you would with Google Docs or Sheets. Colaboratory is free to use.\n",
11 | "> -- https://colab.research.google.com/notebooks/welcome.ipynb\n",
12 | "\n",
13 | "**Setup**\n",
14 | "- Go to https://colab.research.google.com\n",
15 | "- Create a new python 3 notebook\n",
16 | "- Enable the GPU: \"Edit -> Notebook settings -> Hardware accelerator: GPU -> Save\"\n",
17 | "- Then try the following:\n",
18 | "\n",
19 | "```python\n",
20 | "import torch\n",
21 | "\n",
22 | "print(torch.__version__)\n",
23 | "\n",
24 | "DEVICE = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
25 | "print(DEVICE)\n",
26 | "```"
27 | ]
28 | },
29 | {
30 | "cell_type": "markdown",
31 | "metadata": {},
32 | "source": [
33 | "You should get something like this:\n",
34 | "> 1.0.1.post2\n",
35 | ">\n",
36 | "> cuda"
37 | ]
38 | },
39 | {
40 | "cell_type": "markdown",
41 | "metadata": {},
42 | "source": [
43 | "# Using this Repo in Colab\n",
44 | "You can use this repo with google colab,\n",
45 | "but not all notebooks run without changes.\n",
46 | "Some notebooks import from `utils.py` which is not available on colab.\n",
47 | "You have to remove that line and copy and paste the required function/class into the notebook.\n",
48 | "\n",
49 | "It's easy to use colab. Simply append the notebook's GitHub path (everything after `https://github.com/`) to `https://colab.research.google.com/github/`. E.g. `notebooks/pytorch_basics.ipynb` is available under:\n",
50 | "https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/pytorch_basics.ipynb"
51 | ]
52 | }
53 | ],
54 | "metadata": {
55 | "kernelspec": {
56 | "display_name": "Python 3",
57 | "language": "python",
58 | "name": "python3"
59 | },
60 | "language_info": {
61 | "codemirror_mode": {
62 | "name": "ipython",
63 | "version": 3
64 | },
65 | "file_extension": ".py",
66 | "mimetype": "text/x-python",
67 | "name": "python",
68 | "nbconvert_exporter": "python",
69 | "pygments_lexer": "ipython3",
70 | "version": "3.8.5"
71 | }
72 | },
73 | "nbformat": 4,
74 | "nbformat_minor": 2
75 | }
76 |
--------------------------------------------------------------------------------
/notebooks/0X_pytorch_in_googles_colab.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # Using PyTorch + GPU/TPU in Google's Colab
17 | #
18 | # > Colaboratory is a Google research project created to help disseminate machine learning education and research. It's a Jupyter notebook environment that requires no setup to use and runs entirely in the cloud.
19 | # > Colaboratory notebooks are stored in Google Drive and can be shared just as you would with Google Docs or Sheets. Colaboratory is free to use.
20 | # > -- https://colab.research.google.com/notebooks/welcome.ipynb
21 | #
22 | # **Setup**
23 | # - Go to https://colab.research.google.com
24 | # - Create a new python 3 notebook
25 | # - Enable the GPU: "Edit -> Notebook settings -> Hardware accelerator: GPU -> Save"
26 | # - Then try the following:
27 | #
28 | # ```python
29 | # import torch
30 | #
31 | # print(torch.__version__)
32 | #
33 | # DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
34 | # print(DEVICE)
35 | # ```
36 |
37 | # %% [markdown]
38 | # You should get something like this:
39 | # > 1.0.1.post2
40 | # >
41 | # > cuda
42 |
43 | # %% [markdown]
44 | # # Using this Repo in Colab
45 | # You can use this repo with google colab,
46 | # but not all notebooks run without changes.
47 | # Some notebooks import from `utils.py` which is not available on colab.
48 | # You have to remove that line and copy and paste the required function/class into the notebook.
49 | #
50 | # It's easy to use colab. Simply append the notebook's GitHub path (everything after `https://github.com/`) to `https://colab.research.google.com/github/`. E.g. `notebooks/pytorch_basics.ipynb` is available under:
51 | # https://colab.research.google.com/github/sotte/pytorch_tutorial/blob/master/notebooks/pytorch_basics.ipynb
52 |
--------------------------------------------------------------------------------
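The URL rule described in the Colab notebook above can be sketched as a small helper. This is an illustration only: the function name `colab_url` and its parameters are hypothetical and not part of the repository, and the sketch assumes the notebook lives in a public GitHub repository.

```python
COLAB_PREFIX = "https://colab.research.google.com/github/"


def colab_url(user: str, repo: str, branch: str, notebook_path: str) -> str:
    """Build the Colab URL for a notebook hosted on GitHub (hypothetical helper)."""
    # GitHub serves files under <user>/<repo>/blob/<branch>/<path>;
    # the rule above appends exactly that path to Colab's /github/ prefix.
    github_path = f"{user}/{repo}/blob/{branch}/{notebook_path.lstrip('/')}"
    return COLAB_PREFIX + github_path


print(colab_url("sotte", "pytorch_tutorial", "master", "notebooks/pytorch_basics.ipynb"))
```

For the arguments shown, the printed URL is the same one listed at the end of the notebook.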
/notebooks/0X_teacher_forcing.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# \"Teacher Forcing\"\n",
8 | "\n",
9 | "\"Teacher forcing\" is a method used in sequence-to-sequence models.\n",
10 | "During training, it feeds the correct word instead of the model's (possibly wrong) prediction into the next decoding step.\n",
11 | "\n",
12 | "Think of a teacher that corrects your translation as soon as you say a wrong word to prevent you going off on a tangent."
13 | ]
14 | },
15 | {
16 | "cell_type": "markdown",
17 | "metadata": {},
18 | "source": [
19 | "Here is the pseudo code for teacher forcing:\n",
20 | "\n",
21 | "```python\n",
22 | "class Seq2SeqModel(nn.Module):\n",
23 | " def __init__(self, p_teacher_forcing: float):\n",
24 | " self.p_teacher_forcing = p_teacher_forcing\n",
25 | " # ...\n",
26 | " \n",
27 | " def forward(self, X, y):\n",
28 | " # ... some calculation\n",
29 | " current_word = torch.zeros(...)\n",
30 | " result = []\n",
31 | " for i in range(self.sentence_length):\n",
32 | " # ... some calculation with current_word\n",
33 | " result.append(output)\n",
34 | " current_word = torch.argmax(output)\n",
35 | " \n",
36 | " # teacher forcing\n",
37 | " if self.p_teacher_forcing > random.random():\n",
38 | " current_word = y[i]\n",
39 | " \n",
40 | " return torch.stack(result)\n",
41 | "```\n",
42 | "\n",
43 | "Reduce `p_teacher_forcing` during training and let it converge to 0."
44 | ]
45 | }
46 | ],
47 | "metadata": {
48 | "kernelspec": {
49 | "display_name": "Python 3",
50 | "language": "python",
51 | "name": "python3"
52 | },
53 | "language_info": {
54 | "codemirror_mode": {
55 | "name": "ipython",
56 | "version": 3
57 | },
58 | "file_extension": ".py",
59 | "mimetype": "text/x-python",
60 | "name": "python",
61 | "nbconvert_exporter": "python",
62 | "pygments_lexer": "ipython3",
63 | "version": "3.8.5"
64 | }
65 | },
66 | "nbformat": 4,
67 | "nbformat_minor": 2
68 | }
69 |
--------------------------------------------------------------------------------
/notebooks/0X_teacher_forcing.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # "Teacher Forcing"
17 | #
18 | # "Teacher forcing" is a method used in sequence-to-sequence models.
19 | # During training, it feeds the correct word instead of the model's (possibly wrong) prediction into the next decoding step.
20 | #
21 | # Think of a teacher that corrects your translation as soon as you say a wrong word to prevent you going off on a tangent.
22 |
23 | # %% [markdown]
24 | # Here is the pseudo code for teacher forcing:
25 | #
26 | # ```python
27 | # class Seq2SeqModel(nn.Module):
28 | # def __init__(self, p_teacher_forcing: float):
29 | # self.p_teacher_forcing = p_teacher_forcing
30 | # # ...
31 | #
32 | # def forward(self, X, y):
33 | # # ... some calculation
34 | # current_word = torch.zeros(...)
35 | # result = []
36 | # for i in range(self.sentence_length):
37 | # # ... some calculation with current_word
38 | # result.append(output)
39 | # current_word = torch.argmax(output)
40 | #
41 | # # teacher forcing
42 | # if self.p_teacher_forcing > random.random():
43 | # current_word = y[i]
44 | #
45 | # return torch.stack(result)
46 | # ```
47 | #
48 | # Reduce `p_teacher_forcing` during training and let it converge to 0.
49 |
--------------------------------------------------------------------------------
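The teacher-forcing pseudocode above can be turned into a small runnable sketch. To keep it free of dependencies, it replaces the network with a plain Python callable; `decode`, `teacher_forcing_prob`, the toy `step` function and the linear decay schedule are all assumptions made for illustration, not code from the tutorial. The loop body mirrors the pseudocode: append the output, feed the prediction back in, and with probability `p_teacher_forcing` feed the correct word instead.

```python
import random


def teacher_forcing_prob(epoch: int, n_epochs: int, p_start: float = 1.0) -> float:
    """Linearly decay the teacher-forcing probability from p_start to 0 (assumed schedule)."""
    if n_epochs <= 1:
        return 0.0
    return p_start * max(0.0, 1.0 - epoch / (n_epochs - 1))


def decode(step, target, p_teacher_forcing, start_token=0, rng=random):
    """Decoder loop with teacher forcing.

    step(current_word) -> predicted next word; a stand-in for the real model.
    """
    current_word = start_token
    result = []
    for i in range(len(target)):
        output = step(current_word)
        result.append(output)
        # By default, feed the model's own prediction back in.
        current_word = output
        # Teacher forcing: with probability p, feed the correct word instead.
        if p_teacher_forcing > rng.random():
            current_word = target[i]
    return result


def bad_step(word):
    """Toy 'model' that is always off by one: the correct next word is word + 1."""
    return word + 2


target = [1, 2, 3, 4]
print(decode(bad_step, target, p_teacher_forcing=1.0))  # [2, 3, 4, 5]
print(decode(bad_step, target, p_teacher_forcing=0.0))  # [2, 4, 6, 8]
print([teacher_forcing_prob(e, 5) for e in range(5)])
```

With teacher forcing always on, every prediction is off by one; with it turned off, the errors compound along the sequence. That is the "going off on a tangent" effect the teacher is meant to prevent.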
/notebooks/debugging.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Debugging"
8 | ]
9 | },
10 | {
11 | "cell_type": "code",
12 | "execution_count": 1,
13 | "metadata": {
14 | "execution": {
15 | "iopub.execute_input": "2020-12-06T08:34:45.891872Z",
16 | "iopub.status.busy": "2020-12-06T08:34:45.891378Z",
17 | "iopub.status.idle": "2020-12-06T08:34:47.209690Z",
18 | "shell.execute_reply": "2020-12-06T08:34:47.210008Z"
19 | }
20 | },
21 | "outputs": [
22 | {
23 | "name": "stderr",
24 | "output_type": "stream",
25 | "text": [
26 | "/home/stefan/projects/pytorch_tutorial/.venv/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)\n",
27 | " return torch._C._cuda_getDeviceCount() > 0\n"
28 | ]
29 | }
30 | ],
31 | "source": [
32 | "import torch\n",
33 | "import torch.nn as nn\n",
34 | "import torch.nn.functional as F\n",
35 | "import torch.optim as optim\n",
36 | "import torchvision\n",
37 | "\n",
38 | "DEVICE = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")"
39 | ]
40 | },
41 | {
42 | "cell_type": "markdown",
43 | "metadata": {},
44 | "source": [
45 | "**Q: \"No debugger for your code. What do you think?\"**\n",
46 | "\n",
47 | "**A: \"I would NOT be able to code!\"**\n",
48 | "\n",
49 | "- Who does \"print-line-debugging\"?\n",
50 | "- Who likes debugging in tensorflow?\n",
51 | "- What is the intersection of those two groups?\n",
52 | "\n",
53 | "\n",
54 | "## IPDB cheatsheet\n",
55 | "IPython Debugger\n",
56 | "\n",
57 | "Taken from http://wangchuan.github.io/coding/2017/07/12/ipdb-cheat-sheet.html\n",
58 | "\n",
59 | "- h(elp): Print help\n",
60 | "\n",
61 | "- n(ext): Continue execution until the next line in the current function is reached or it returns.\n",
62 | "- s(tep): Execute the current line, stop at the first possible occasion (either in a function that is called or in the current function).\n",
63 | "- r(eturn): Continue execution until the current function returns.\n",
64 | "- c(ont(inue)): Continue execution, only stop when a breakpoint is encountered.\n",
65 | "\n",
67 | "- a(rgs): Print the argument list of the current function."
68 | ]
69 | },
70 | {
71 | "cell_type": "markdown",
72 | "metadata": {},
73 | "source": [
74 | "Note: Python 3.7+ has `breakpoint()` built in! [[PEP 553]](https://www.python.org/dev/peps/pep-0553/)"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "execution_count": 2,
80 | "metadata": {
81 | "execution": {
82 | "iopub.execute_input": "2020-12-06T08:34:47.212530Z",
83 | "iopub.status.busy": "2020-12-06T08:34:47.212225Z",
84 | "iopub.status.idle": "2020-12-06T08:34:47.214085Z",
85 | "shell.execute_reply": "2020-12-06T08:34:47.213730Z"
86 | }
87 | },
88 | "outputs": [],
89 | "source": [
90 | "from IPython.core.debugger import set_trace"
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": 3,
96 | "metadata": {
97 | "execution": {
98 | "iopub.execute_input": "2020-12-06T08:34:47.219577Z",
99 | "iopub.status.busy": "2020-12-06T08:34:47.219206Z",
100 | "iopub.status.idle": "2020-12-06T08:34:47.221480Z",
101 | "shell.execute_reply": "2020-12-06T08:34:47.221160Z"
102 | }
103 | },
104 | "outputs": [
105 | {
106 | "data": {
107 | "text/plain": [
108 | "54"
109 | ]
110 | },
111 | "execution_count": 1,
112 | "metadata": {},
113 | "output_type": "execute_result"
114 | }
115 | ],
116 | "source": [
117 | "def my_function(x):\n",
118 | " answer = 42\n",
119 | " # set_trace() # <-- uncomment!\n",
120 | " answer += x\n",
121 | " return answer\n",
122 | "\n",
123 | "my_function(12)"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "metadata": {},
129 | "source": [
130 | "## Example: debugging a NN"
131 | ]
132 | },
133 | {
134 | "cell_type": "code",
135 | "execution_count": 4,
136 | "metadata": {
137 | "execution": {
138 | "iopub.execute_input": "2020-12-06T08:34:47.223812Z",
139 | "iopub.status.busy": "2020-12-06T08:34:47.223510Z",
140 | "iopub.status.idle": "2020-12-06T08:34:47.253488Z",
141 | "shell.execute_reply": "2020-12-06T08:34:47.253136Z"
142 | }
143 | },
144 | "outputs": [
145 | {
146 | "data": {
147 | "text/plain": [
148 | "tensor([[0.4467, 0.0772, 0.7921],\n",
149 | " [0.8916, 0.5952, 0.2477],\n",
150 | " [0.9442, 0.3809, 0.3629],\n",
151 | " [0.3867, 0.1336, 0.0256],\n",
152 | " [0.1272, 0.6342, 0.3937]])"
153 | ]
154 | },
155 | "execution_count": 1,
156 | "metadata": {},
157 | "output_type": "execute_result"
158 | }
159 | ],
160 | "source": [
161 | "X = torch.rand((5, 3))\n",
162 | "X"
163 | ]
164 | },
165 | {
166 | "cell_type": "code",
167 | "execution_count": 5,
168 | "metadata": {
169 | "execution": {
170 | "iopub.execute_input": "2020-12-06T08:34:47.256914Z",
171 | "iopub.status.busy": "2020-12-06T08:34:47.256587Z",
172 | "iopub.status.idle": "2020-12-06T08:34:47.262137Z",
173 | "shell.execute_reply": "2020-12-06T08:34:47.261822Z"
174 | }
175 | },
176 | "outputs": [],
177 | "source": [
178 | "class MyModule(nn.Module):\n",
179 | " def __init__(self):\n",
180 | " super().__init__()\n",
181 | " self.lin = nn.Linear(3, 1)\n",
182 | " \n",
183 | " def forward(self, X):\n",
184 | " # set_trace()\n",
185 | " x = self.lin(X)\n",
186 | " return X\n",
187 | "\n",
188 | " \n",
189 | "model = MyModule()\n",
190 | "y_ = model(X)\n",
191 | "\n",
192 | "# assert y_.shape == (5, 1), y_.shape"
193 | ]
194 | },
195 | {
196 | "cell_type": "markdown",
197 | "metadata": {},
198 | "source": [
199 | "## Debug Layer"
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": 6,
205 | "metadata": {
206 | "execution": {
207 | "iopub.execute_input": "2020-12-06T08:34:47.264564Z",
208 | "iopub.status.busy": "2020-12-06T08:34:47.264237Z",
209 | "iopub.status.idle": "2020-12-06T08:34:47.265995Z",
210 | "shell.execute_reply": "2020-12-06T08:34:47.265649Z"
211 | }
212 | },
213 | "outputs": [],
214 | "source": [
215 | "class DebugModule(nn.Module):\n",
216 | " def forward(self, x):\n",
217 | " set_trace()\n",
218 | " return x"
219 | ]
220 | },
221 | {
222 | "cell_type": "code",
223 | "execution_count": 7,
224 | "metadata": {
225 | "execution": {
226 | "iopub.execute_input": "2020-12-06T08:34:47.268382Z",
227 | "iopub.status.busy": "2020-12-06T08:34:47.268055Z",
228 | "iopub.status.idle": "2020-12-06T08:34:47.269773Z",
229 | "shell.execute_reply": "2020-12-06T08:34:47.269424Z"
230 | }
231 | },
232 | "outputs": [],
233 | "source": [
234 | "model = nn.Sequential(\n",
235 | " nn.Linear(1, 5),\n",
236 | " DebugModule(),\n",
237 | " nn.Linear(5, 1),\n",
238 | ")"
239 | ]
240 | },
241 | {
242 | "cell_type": "code",
243 | "execution_count": 8,
244 | "metadata": {
245 | "execution": {
246 | "iopub.execute_input": "2020-12-06T08:34:47.271824Z",
247 | "iopub.status.busy": "2020-12-06T08:34:47.271492Z",
248 | "iopub.status.idle": "2020-12-06T08:34:47.273920Z",
249 | "shell.execute_reply": "2020-12-06T08:34:47.273570Z"
250 | }
251 | },
252 | "outputs": [],
253 | "source": [
254 | "X = torch.unsqueeze(torch.tensor([1.]), dim=0)\n",
255 | "# model(X)"
256 | ]
257 | },
258 | {
259 | "cell_type": "markdown",
260 | "metadata": {},
261 | "source": [
262 | "## Tensorboard and `tensorboardX`\n",
263 | "Tensorboard and `tensorboardX` are also great for debugging a model, e.g. for looking at the gradients."
264 | ]
265 | }
266 | ],
267 | "metadata": {
268 | "kernelspec": {
269 | "display_name": "Python 3",
270 | "language": "python",
271 | "name": "python3"
272 | },
273 | "language_info": {
274 | "codemirror_mode": {
275 | "name": "ipython",
276 | "version": 3
277 | },
278 | "file_extension": ".py",
279 | "mimetype": "text/x-python",
280 | "name": "python",
281 | "nbconvert_exporter": "python",
282 | "pygments_lexer": "ipython3",
283 | "version": "3.8.5"
284 | }
285 | },
286 | "nbformat": 4,
287 | "nbformat_minor": 2
288 | }
289 |
--------------------------------------------------------------------------------
/notebooks/debugging.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # Debugging
17 |
18 | # %%
19 | import torch
20 | import torch.nn as nn
21 | import torch.nn.functional as F
22 | import torch.optim as optim
23 | import torchvision
24 |
25 | DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
26 |
27 | # %% [markdown]
28 | # **Q: "No debugger for your code. What do you think?"**
29 | #
30 | # **A: "I would NOT be able to code!"**
31 | #
32 | # - Who does "print-line-debugging"?
33 | # - Who likes debugging in tensorflow?
34 | # - What is the intersection of those two groups?
35 | #
36 | #
37 | # ## IPDB cheatsheet
38 | # IPython Debugger
39 | #
40 | # Taken from http://wangchuan.github.io/coding/2017/07/12/ipdb-cheat-sheet.html
41 | #
42 | # - h(elp): Print help
43 | #
44 | # - n(ext): Continue execution until the next line in the current function is reached or it returns.
45 | # - s(tep): Execute the current line, stop at the first possible occasion (either in a function that is called or in the current function).
46 | # - r(eturn): Continue execution until the current function returns.
47 | # - c(ont(inue)): Continue execution, only stop when a breakpoint is encountered.
48 | #
50 | # - a(rgs): Print the argument list of the current function.
51 |
52 | # %% [markdown]
53 | # Note: Python 3.7+ has `breakpoint()` built in! [[PEP 553]](https://www.python.org/dev/peps/pep-0553/)
54 |
55 | # %%
56 | from IPython.core.debugger import set_trace
57 |
58 |
59 | # %%
60 | def my_function(x):
61 | answer = 42
62 | # set_trace() # <-- uncomment!
63 | answer += x
64 | return answer
65 |
66 | my_function(12)
67 |
68 | # %% [markdown]
69 | # ## Example: debugging a NN
70 |
71 | # %%
72 | X = torch.rand((5, 3))
73 | X
74 |
75 |
76 | # %%
77 | class MyModule(nn.Module):
78 | def __init__(self):
79 | super().__init__()
80 | self.lin = nn.Linear(3, 1)
81 |
82 | def forward(self, X):
83 | # set_trace()
84 | x = self.lin(X)
85 | return X
86 |
87 |
88 | model = MyModule()
89 | y_ = model(X)
90 |
91 | # assert y_.shape == (5, 1), y_.shape
92 |
93 | # %% [markdown]
94 | # ## Debug Layer
95 |
96 | # %%
97 | class DebugModule(nn.Module):
98 | def forward(self, x):
99 | set_trace()
100 | return x
101 |
102 |
103 | # %%
104 | model = nn.Sequential(
105 | nn.Linear(1, 5),
106 | DebugModule(),
107 | nn.Linear(5, 1),
108 | )
109 |
110 | # %%
111 | X = torch.unsqueeze(torch.tensor([1.]), dim=0)
112 | # model(X)
113 |
114 | # %% [markdown]
115 | # ## Tensorboard and `tensorboardX`
116 | # Tensorboard and `tensorboardX` are also great for debugging a model, e.g. for looking at the gradients.
117 |
--------------------------------------------------------------------------------
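The note on `breakpoint()` in the debugging notebook above can be illustrated with a minimal sketch. It reuses `my_function` from the notebook but swaps `set_trace()` for the built-in `breakpoint()`. Setting `PYTHONBREAKPOINT=0` is the switch PEP 553 describes for disabling breakpoints; it is set inside the script here only so that the example runs unattended, which is an assumption made for the demo.

```python
import os


def my_function(x):
    answer = 42
    # Calls sys.breakpointhook(), which starts pdb by default (PEP 553).
    breakpoint()
    answer += x
    return answer


# PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op.
# Remove this line to get an interactive debugger prompt instead.
os.environ["PYTHONBREAKPOINT"] = "0"

print(my_function(12))  # 54
```

The same variable can also name another debugger entry point, for example `PYTHONBREAKPOINT=ipdb.set_trace`, assuming `ipdb` is installed.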
/notebooks/foreword.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | ""
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "# Foreword\n",
15 | "\n",
16 | "Material for this tutorial is here: https://github.com/sotte/pytorch_tutorial\n",
17 | "\n",
18 | "**Prerequisites:**\n",
19 | "- you have implemented machine learning models yourself\n",
20 | "- you know what deep learning is\n",
21 | "- you have used numpy\n",
22 | "- maybe you have used tensorflow or similar libs\n",
23 | "\n",
24 | "- if you use PyTorch on a daily basis, this tutorial is probably not for you\n",
25 | "\n",
26 | "**Goals:**\n",
27 | "- understand PyTorch concepts\n",
28 | "- be able to use transfer learning in PyTorch\n",
29 | "- be aware of some handy tools/libs"
30 | ]
31 | },
32 | {
33 | "cell_type": "markdown",
34 | "metadata": {},
35 | "source": [
36 | "Note:\n",
37 | "You don't need a GPU to work on this tutorial, but everything is much faster if you have one.\n",
38 | "If you don't have one, you can use Google's Colab with a GPU to work on this tutorial:\n",
39 | "[PyTorch + GPU in Google's Colab](0X_pytorch_in_googles_colab.ipynb)"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "# Agenda\n",
47 | "\n",
48 | "See README.md"
49 | ]
50 | },
51 | {
52 | "cell_type": "markdown",
53 | "metadata": {},
54 | "source": [
55 | "# PyTorch Overview\n",
56 | "\n",
57 | "\n",
58 | "> \"PyTorch - Tensors and Dynamic neural networks in Python\n",
59 | "with strong GPU acceleration.\n",
60 | "PyTorch is a deep learning framework for fast, flexible experimentation.\"\n",
61 | ">\n",
62 | "> -- https://pytorch.org/*\n",
63 | "\n",
64 | "This was the tagline prior to PyTorch 1.0.\n",
65 | "Now it's:\n",
66 | "\n",
67 | "> \"PyTorch - From Research To Production\n",
68 | "> \n",
69 | "> An open source deep learning platform that provides a seamless path from research prototyping to production deployment.\""
70 | ]
71 | },
72 | {
73 | "cell_type": "markdown",
74 | "metadata": {},
75 | "source": [
76 | "## \"Build by run\" - what is that and why do I care?"
77 | ]
78 | },
79 | {
80 | "cell_type": "markdown",
81 | "metadata": {},
82 | "source": [
83 | ""
84 | ]
85 | },
86 | {
87 | "cell_type": "markdown",
88 | "metadata": {},
89 | "source": [
90 | "This is a much better explanation of PyTorch (I think)"
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": 1,
96 | "metadata": {
97 | "execution": {
98 | "iopub.execute_input": "2020-12-06T08:34:48.900466Z",
99 | "iopub.status.busy": "2020-12-06T08:34:48.899857Z",
100 | "iopub.status.idle": "2020-12-06T08:34:49.335919Z",
101 | "shell.execute_reply": "2020-12-06T08:34:49.333857Z"
102 | }
103 | },
104 | "outputs": [
105 | {
106 | "data": {
107 | "text/plain": [
108 | "tensor([[ 1.7654, 1.0075, 2.6082, 0.9247, 1.4656, -1.8325, 1.0004, 0.0030,\n",
109 | " 1.6098, 1.7323]])"
110 | ]
111 | },
112 | "execution_count": 1,
113 | "metadata": {},
114 | "output_type": "execute_result"
115 | }
116 | ],
117 | "source": [
118 | "import torch\n",
119 | "from IPython.core.debugger import set_trace\n",
120 | "\n",
121 | "def f(x):\n",
122 | " res = x + x\n",
123 | " # set_trace() # <-- OMG! =D\n",
124 | " return res\n",
125 | "\n",
126 | "x = torch.randn(1, 10)\n",
127 | "f(x)"
128 | ]
129 | },
130 | {
131 | "cell_type": "markdown",
132 | "metadata": {},
133 | "source": [
134 | "I like PyTorch because\n",
135 | "- \"it's just stupid python\"\n",
136 | "- easy to debug\n",
137 | "- nice and extensible interface\n",
138 | "- research-y feel\n",
139 | "- research is often published as a PyTorch project"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 | "## A word about TF\n",
147 | "TF 2 is about to be released.\n",
148 | "- eager by default\n",
149 | "- API cleanup\n",
150 | "- No more `session.run()`, `tf.control_dependencies()`, `tf.while_loop()`, `tf.cond()`, `tf.global_variables_initializer()`, etc.\n",
151 | "\n",
152 | "## TF and PyTorch\n",
153 | "- static vs dynamic\n",
154 | "- production vs prototyping "
155 | ]
156 | },
157 | {
158 | "cell_type": "markdown",
159 | "metadata": {},
160 | "source": [
161 | "## *\"The tyranny of choice\"*\n",
162 | "- TensorFlow\n",
163 | "- MXNet\n",
164 | "- Keras\n",
165 | "- CNTK\n",
166 | "- Chainer\n",
167 | "- caffe\n",
168 | "- caffe2\n",
169 | "- many many more\n",
170 | "\n",
171 | "All of them are good!\n"
172 | ]
173 | },
174 | {
175 | "cell_type": "markdown",
176 | "metadata": {},
177 | "source": [
178 | "# References\n",
179 | "- Twitter: https://twitter.com/PyTorch\n",
180 | "- Forum: https://discuss.pytorch.org/\n",
181 | "- Tutorials: https://pytorch.org/tutorials/\n",
182 | "- Examples: https://github.com/pytorch/examples\n",
183 | "- API Reference: https://pytorch.org/docs/stable/index.html\n",
184 | "- Torchvision: https://pytorch.org/docs/stable/torchvision/index.html\n",
185 | "- PyTorch Text: https://github.com/pytorch/text\n",
186 | "- PyTorch Audio: https://github.com/pytorch/audio\n",
187 | "- AllenNLP: https://allennlp.org/\n",
188 | "- Object detection/segmentation: https://github.com/facebookresearch/maskrcnn-benchmark\n",
189 | "- Facebook AI Research Sequence-to-Sequence Toolkit written in PyTorch: https://github.com/pytorch/fairseq\n",
190 | "- FastAI http://www.fast.ai/\n",
191 | "- Stanford CS230 Deep Learning notes https://cs230-stanford.github.io"
192 | ]
193 | },
194 | {
195 | "cell_type": "markdown",
196 | "metadata": {},
197 | "source": [
198 | "# Example Network\n",
199 | "Just to get an idea of what PyTorch feels like, here are some examples of networks."
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": 2,
205 | "metadata": {
206 | "execution": {
207 | "iopub.execute_input": "2020-12-06T08:34:49.349448Z",
208 | "iopub.status.busy": "2020-12-06T08:34:49.347611Z",
209 | "iopub.status.idle": "2020-12-06T08:34:49.354574Z",
210 | "shell.execute_reply": "2020-12-06T08:34:49.352797Z"
211 | }
212 | },
213 | "outputs": [],
214 | "source": [
215 | "from collections import OrderedDict\n",
216 | "\n",
217 | "import torch # basic tensor functions\n",
218 | "import torch.nn as nn # everything neural network\n",
219 | "import torch.nn.functional as F # functional/stateless version of nn\n",
220 | "import torch.optim as optim # optimizers :)"
221 | ]
222 | },
223 | {
224 | "cell_type": "code",
225 | "execution_count": 3,
226 | "metadata": {
227 | "execution": {
228 | "iopub.execute_input": "2020-12-06T08:34:49.367216Z",
229 | "iopub.status.busy": "2020-12-06T08:34:49.365788Z",
230 | "iopub.status.idle": "2020-12-06T08:34:49.372334Z",
231 | "shell.execute_reply": "2020-12-06T08:34:49.371472Z"
232 | }
233 | },
234 | "outputs": [],
235 | "source": [
236 | "# Simple sequential model\n",
237 | "model = nn.Sequential(\n",
238 | "    nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5),\n",
239 | "    nn.ReLU(),\n",
240 | "    nn.Conv2d(20, 64, 5),\n",
241 | "    nn.ReLU(),\n",
242 | "    nn.AdaptiveAvgPool2d(1),\n",
243 | ")"
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "execution_count": 4,
249 | "metadata": {
250 | "execution": {
251 | "iopub.execute_input": "2020-12-06T08:34:49.377843Z",
252 | "iopub.status.busy": "2020-12-06T08:34:49.377031Z",
253 | "iopub.status.idle": "2020-12-06T08:34:49.381617Z",
254 | "shell.execute_reply": "2020-12-06T08:34:49.380797Z"
255 | }
256 | },
257 | "outputs": [
258 | {
259 | "data": {
260 | "text/plain": [
261 | "Sequential(\n",
262 | " (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))\n",
263 | " (1): ReLU()\n",
264 | " (2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))\n",
265 | " (3): ReLU()\n",
266 | " (4): AdaptiveAvgPool2d(output_size=1)\n",
267 | ")"
268 | ]
269 | },
270 | "execution_count": 1,
271 | "metadata": {},
272 | "output_type": "execute_result"
273 | }
274 | ],
275 | "source": [
276 | "model"
277 | ]
278 | },
279 | {
280 | "cell_type": "code",
281 | "execution_count": 5,
282 | "metadata": {
283 | "execution": {
284 | "iopub.execute_input": "2020-12-06T08:34:49.386586Z",
285 | "iopub.status.busy": "2020-12-06T08:34:49.385730Z",
286 | "iopub.status.idle": "2020-12-06T08:34:49.409117Z",
287 | "shell.execute_reply": "2020-12-06T08:34:49.409722Z"
288 | }
289 | },
290 | "outputs": [
291 | {
292 | "data": {
293 | "text/plain": [
294 | "torch.Size([16, 64, 1, 1])"
295 | ]
296 | },
297 | "execution_count": 1,
298 | "metadata": {},
299 | "output_type": "execute_result"
300 | }
301 | ],
302 | "source": [
303 | "# forward pass\n",
304 | "model(torch.rand(16, 1, 32, 32)).shape"
305 | ]
306 | },
307 | {
308 | "cell_type": "code",
309 | "execution_count": 6,
310 | "metadata": {
311 | "execution": {
312 | "iopub.execute_input": "2020-12-06T08:34:49.414480Z",
313 | "iopub.status.busy": "2020-12-06T08:34:49.413860Z",
314 | "iopub.status.idle": "2020-12-06T08:34:49.417378Z",
315 | "shell.execute_reply": "2020-12-06T08:34:49.417808Z"
316 | }
317 | },
318 | "outputs": [
319 | {
320 | "data": {
321 | "text/plain": [
322 | "Sequential(\n",
323 | " (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))\n",
324 | " (relu1): ReLU()\n",
325 | " (conv2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))\n",
326 | " (relu2): ReLU()\n",
327 | " (aavgp): AdaptiveAvgPool2d(output_size=1)\n",
328 | ")"
329 | ]
330 | },
331 | "execution_count": 1,
332 | "metadata": {},
333 | "output_type": "execute_result"
334 | }
335 | ],
336 | "source": [
337 | "# Simple sequential model with named layers\n",
338 | "layers = OrderedDict([\n",
339 | "    (\"conv1\", nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)),\n",
340 | "    (\"relu1\", nn.ReLU()),\n",
341 | "    (\"conv2\", nn.Conv2d(20, 64, 5)),\n",
342 | "    (\"relu2\", nn.ReLU()),\n",
343 | "    (\"aavgp\", nn.AdaptiveAvgPool2d(1)),\n",
344 | "])\n",
345 | "model = nn.Sequential(layers)\n",
346 | "model"
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": 7,
352 | "metadata": {
353 | "execution": {
354 | "iopub.execute_input": "2020-12-06T08:34:49.423476Z",
355 | "iopub.status.busy": "2020-12-06T08:34:49.423016Z",
356 | "iopub.status.idle": "2020-12-06T08:34:49.426564Z",
357 | "shell.execute_reply": "2020-12-06T08:34:49.426224Z"
358 | }
359 | },
360 | "outputs": [
361 | {
362 | "data": {
363 | "text/plain": [
364 | "Net(\n",
365 | " (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))\n",
366 | " (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n",
367 | " (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))\n",
368 | " (fc1): Linear(in_features=400, out_features=120, bias=True)\n",
369 | " (fc2): Linear(in_features=120, out_features=84, bias=True)\n",
370 | " (fc3): Linear(in_features=84, out_features=10, bias=True)\n",
371 | ")"
372 | ]
373 | },
374 | "execution_count": 1,
375 | "metadata": {},
376 | "output_type": "execute_result"
377 | }
378 | ],
379 | "source": [
380 | "class Net(nn.Module):\n",
381 | "    def __init__(self):\n",
382 | "        super(Net, self).__init__()\n",
383 | "        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)\n",
384 | "        self.pool = nn.MaxPool2d(2, 2)\n",
385 | "        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)\n",
386 | "        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)\n",
387 | "        self.fc2 = nn.Linear(120, 84)\n",
388 | "        self.fc3 = nn.Linear(84, 10)\n",
389 | "\n",
390 | "    def forward(self, x):\n",
391 | "        x = self.pool(F.relu(self.conv1(x)))\n",
392 | "        x = self.pool(F.relu(self.conv2(x)))\n",
393 | "        x = x.view(-1, 16 * 5 * 5)  # flatten to (N, 400)\n",
394 | "        x = F.relu(self.fc1(x))\n",
395 | "        x = F.relu(self.fc2(x))\n",
396 | "        x = self.fc3(x)  # (N, 10) raw class scores\n",
398 | "        return x\n",
399 | "\n",
400 | "\n",
401 | "model = Net()\n",
402 | "model"
403 | ]
404 | },
405 | {
406 | "cell_type": "markdown",
407 | "metadata": {},
408 | "source": [
409 | "# Versions"
410 | ]
411 | },
412 | {
413 | "cell_type": "code",
414 | "execution_count": 8,
415 | "metadata": {
416 | "execution": {
417 | "iopub.execute_input": "2020-12-06T08:34:49.428867Z",
418 | "iopub.status.busy": "2020-12-06T08:34:49.428514Z",
419 | "iopub.status.idle": "2020-12-06T08:34:49.430642Z",
420 | "shell.execute_reply": "2020-12-06T08:34:49.430276Z"
421 | }
422 | },
423 | "outputs": [
424 | {
425 | "data": {
426 | "text/plain": [
427 | "'1.7.0'"
428 | ]
429 | },
430 | "execution_count": 1,
431 | "metadata": {},
432 | "output_type": "execute_result"
433 | }
434 | ],
435 | "source": [
436 | "import torch\n",
437 | "torch.__version__"
438 | ]
439 | },
440 | {
441 | "cell_type": "code",
442 | "execution_count": 9,
443 | "metadata": {
444 | "execution": {
445 | "iopub.execute_input": "2020-12-06T08:34:49.432667Z",
446 | "iopub.status.busy": "2020-12-06T08:34:49.432346Z",
447 | "iopub.status.idle": "2020-12-06T08:34:49.464593Z",
448 | "shell.execute_reply": "2020-12-06T08:34:49.464259Z"
449 | }
450 | },
451 | "outputs": [
452 | {
453 | "data": {
454 | "text/plain": [
455 | "'0.8.1'"
456 | ]
457 | },
458 | "execution_count": 1,
459 | "metadata": {},
460 | "output_type": "execute_result"
461 | }
462 | ],
463 | "source": [
464 | "import torchvision\n",
465 | "torchvision.__version__"
466 | ]
467 | },
468 | {
469 | "cell_type": "code",
470 | "execution_count": 10,
471 | "metadata": {
472 | "execution": {
473 | "iopub.execute_input": "2020-12-06T08:34:49.466917Z",
474 | "iopub.status.busy": "2020-12-06T08:34:49.466467Z",
475 | "iopub.status.idle": "2020-12-06T08:34:49.468544Z",
476 | "shell.execute_reply": "2020-12-06T08:34:49.468230Z"
477 | }
478 | },
479 | "outputs": [
480 | {
481 | "data": {
482 | "text/plain": [
483 | "'1.19.4'"
484 | ]
485 | },
486 | "execution_count": 1,
487 | "metadata": {},
488 | "output_type": "execute_result"
489 | }
490 | ],
491 | "source": [
492 | "import numpy as np\n",
493 | "np.__version__"
494 | ]
495 | }
496 | ],
497 | "metadata": {
498 | "kernelspec": {
499 | "display_name": "Python 3",
500 | "language": "python",
501 | "name": "python3"
502 | },
503 | "language_info": {
504 | "codemirror_mode": {
505 | "name": "ipython",
506 | "version": 3
507 | },
508 | "file_extension": ".py",
509 | "mimetype": "text/x-python",
510 | "name": "python",
511 | "nbconvert_exporter": "python",
512 | "pygments_lexer": "ipython3",
513 | "version": "3.8.5"
514 | },
515 | "toc-autonumbering": true,
516 | "toc-showcode": false,
517 | "toc-showmarkdowntxt": false
518 | },
519 | "nbformat": 4,
520 | "nbformat_minor": 4
521 | }
522 |
--------------------------------------------------------------------------------
/notebooks/foreword.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # 
17 |
18 | # %% [markdown]
19 | # # Foreword
20 | #
21 | # Material for this tutorial is here: https://github.com/sotte/pytorch_tutorial
22 | #
23 | # **Prerequisites:**
24 | # - you have implemented machine learning models yourself
25 | # - you know what deep learning is
26 | # - you have used numpy
27 | # - maybe you have used tensorflow or similar libs
28 | #
29 | # - if you use PyTorch on a daily basis, this tutorial is probably not for you
30 | #
31 | # **Goals:**
32 | # - understand PyTorch concepts
33 | # - be able to use transfer learning in PyTorch
34 | # - be aware of some handy tools/libs
35 |
36 | # %% [markdown]
37 | # Note:
38 | # You don't need a GPU to work on this tutorial, but everything is much faster if you have one.
39 | # However, you can use Google's Colab with a GPU and work on this tutorial:
40 | # [PyTorch + GPU in Google's Colab](0X_pytorch_in_googles_colab.ipynb)
41 |
42 | # %% [markdown]
43 | # # Agenda
44 | #
45 | # See README.md
46 |
47 | # %% [markdown]
48 | # # PyTorch Overview
49 | #
50 | #
51 | # > "PyTorch - Tensors and Dynamic neural networks in Python
52 | # with strong GPU acceleration.
53 | # PyTorch is a deep learning framework for fast, flexible experimentation."
54 | # >
55 | # > -- https://pytorch.org/*
56 | #
57 | # This was the tagline prior to PyTorch 1.0.
58 | # Now it's:
59 | #
60 | # > "PyTorch - From Research To Production
61 | # >
62 | # > An open source deep learning platform that provides a seamless path from research prototyping to production deployment."
63 |
64 | # %% [markdown]
65 | # ## "Build by run" - what is that and why do I care?
66 |
67 | # %% [markdown]
68 | # 
69 |
70 | # %% [markdown]
71 | # This is a much better explanation of PyTorch (I think)
72 |
73 | # %%
74 | import torch
75 | from IPython.core.debugger import set_trace
76 |
77 | def f(x):
78 |     res = x + x
79 |     # set_trace()  # <-- OMG! =D
80 |     return res
81 |
82 | x = torch.randn(1, 10)
83 | f(x)
84 |
85 | # %% [markdown]
86 | # I like PyTorch because
87 | # - "it's just stupid python"
88 | # - easy to debug
89 | # - nice and extensible interface
90 | # - research-y feel
91 | # - research is often published as a PyTorch project
92 |
93 | # %% [markdown]
94 | # ## A word about TF
95 | # TF 2 is about to be released.
96 | # - eager by default
97 | # - API cleanup
98 | # - No more `session.run()`, `tf.control_dependencies()`, `tf.while_loop()`, `tf.cond()`, `tf.global_variables_initializer()`, etc.
99 | #
100 | # ## TF and PyTorch
101 | # - static vs dynamic
102 | # - production vs prototyping
103 |
104 | # %% [markdown]
105 | # ## *"The tyranny of choice"*
106 | # - TensorFlow
107 | # - MXNet
108 | # - Keras
109 | # - CNTK
110 | # - Chainer
111 | # - caffe
112 | # - caffe2
113 | # - many many more
114 | #
115 | # All of them are good!
116 | #
117 |
118 | # %% [markdown]
119 | # # References
120 | # - Twitter: https://twitter.com/PyTorch
121 | # - Forum: https://discuss.pytorch.org/
122 | # - Tutorials: https://pytorch.org/tutorials/
123 | # - Examples: https://github.com/pytorch/examples
124 | # - API Reference: https://pytorch.org/docs/stable/index.html
125 | # - Torchvision: https://pytorch.org/docs/stable/torchvision/index.html
126 | # - PyTorch Text: https://github.com/pytorch/text
127 | # - PyTorch Audio: https://github.com/pytorch/audio
128 | # - AllenNLP: https://allennlp.org/
129 | # - Object detection/segmentation: https://github.com/facebookresearch/maskrcnn-benchmark
130 | # - Facebook AI Research Sequence-to-Sequence Toolkit written in PyTorch: https://github.com/pytorch/fairseq
131 | # - FastAI http://www.fast.ai/
132 | # - Stanford CS230 Deep Learning notes https://cs230-stanford.github.io
133 |
134 | # %% [markdown]
135 | # # Example Network
136 | # Just to get an idea of what PyTorch feels like, here are some examples of networks.
137 |
138 | # %%
139 | from collections import OrderedDict
140 |
141 | import torch # basic tensor functions
142 | import torch.nn as nn # everything neural network
143 | import torch.nn.functional as F # functional/stateless version of nn
144 | import torch.optim as optim # optimizers :)
145 |
146 | # %%
147 | # Simple sequential model
148 | model = nn.Sequential(
149 |     nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5),
150 |     nn.ReLU(),
151 |     nn.Conv2d(20, 64, 5),
152 |     nn.ReLU(),
153 |     nn.AdaptiveAvgPool2d(1),
154 | )
155 |
156 | # %%
157 | model
158 |
159 | # %%
160 | # forward pass
161 | model(torch.rand(16, 1, 32, 32)).shape
162 |
163 | # %%
164 | # Simple sequential model with named layers
165 | layers = OrderedDict([
166 |     ("conv1", nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)),
167 |     ("relu1", nn.ReLU()),
168 |     ("conv2", nn.Conv2d(20, 64, 5)),
169 |     ("relu2", nn.ReLU()),
170 |     ("aavgp", nn.AdaptiveAvgPool2d(1)),
171 | ])
172 | model = nn.Sequential(layers)
173 | model
174 |
175 |
176 | # %%
177 | class Net(nn.Module):
178 |     def __init__(self):
179 |         super(Net, self).__init__()
180 |         self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
181 |         self.pool = nn.MaxPool2d(2, 2)
182 |         self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
183 |         self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
184 |         self.fc2 = nn.Linear(120, 84)
185 |         self.fc3 = nn.Linear(84, 10)
186 |
187 |     def forward(self, x):
188 |         x = self.pool(F.relu(self.conv1(x)))
189 |         x = self.pool(F.relu(self.conv2(x)))
190 |         x = x.view(-1, 16 * 5 * 5)  # flatten to (N, 400)
191 |         x = F.relu(self.fc1(x))
192 |         x = F.relu(self.fc2(x))
193 |         x = self.fc3(x)  # (N, 10) raw class scores
195 |         return x
196 |
197 |
198 | model = Net()
199 | model
200 |
201 | # %% [markdown]
202 | # # Versions
203 |
204 | # %%
205 | import torch
206 | torch.__version__
207 |
208 | # %%
209 | import torchvision
210 | torchvision.__version__
211 |
212 | # %%
213 | import numpy as np
214 | np.__version__
215 |
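The `in_features=16 * 5 * 5` of `fc1` in `Net` above is the flattened size of the feature map that reaches the first linear layer. Below is a minimal sketch of that arithmetic in plain Python. It assumes 32x32 inputs (an assumption; the notebook never states the input size) and the layer settings shown in `Net`: 5x5 convolutions without padding and 2x2 max pooling with stride 2. The helper names `conv_out` and `pool_out` are made up for this illustration and are not part of PyTorch.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool_out(size, kernel_size, stride):
    """Spatial output size of an unpadded pooling layer (ceil_mode=False)."""
    return (size - kernel_size) // stride + 1


size = 32  # assumed input height/width, e.g. a 3x32x32 image
size = pool_out(conv_out(size, kernel_size=5), kernel_size=2, stride=2)  # conv1 + pool
size = pool_out(conv_out(size, kernel_size=5), kernel_size=2, stride=2)  # conv2 + pool

channels = 16  # out_channels of conv2
flat_features = channels * size * size
print(size, flat_features)  # prints: 5 400
```

Under these assumptions the spatial size shrinks 32 -> 28 -> 14 -> 10 -> 5, which matches the `in_features=400` shown in the printed `Net` repr. A different input size would need a different `fc1`.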
--------------------------------------------------------------------------------
/notebooks/hooks.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Hooks\n",
8 | "Hooks are simple functions that can be registered to be called during the forward or backward pass of an `nn.Module`.\n",
9 | "These functions can be used to print out information or modify the module.\n",
10 | "\n",
11 | "Here is a simple forward hook example that prints some information about the input and output of a module.\n",
12 | "\n",
13 | "Tip: Don't forget to remove the hook afterwards!\n",
14 | "\n",
15 | "Ref:\n",
16 | "- https://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html#forward-and-backward-function-hooks\n",
17 | "- https://pytorch.org/docs/master/nn.html#torch.nn.Module.register_forward_hook\n",
18 | "- https://pytorch.org/docs/master/nn.html#torch.nn.Module.register_forward_pre_hook\n",
19 | "- https://pytorch.org/docs/master/nn.html#torch.nn.Module.register_backward_hook"
20 | ]
21 | },
22 | {
23 | "cell_type": "code",
24 | "execution_count": 1,
25 | "metadata": {
26 | "execution": {
27 | "iopub.execute_input": "2020-12-06T08:34:50.972704Z",
28 | "iopub.status.busy": "2020-12-06T08:34:50.971990Z",
29 | "iopub.status.idle": "2020-12-06T08:34:50.975122Z",
30 | "shell.execute_reply": "2020-12-06T08:34:50.974327Z"
31 | }
32 | },
33 | "outputs": [],
34 | "source": [
35 | "def tensorinfo_hook(module, input_, output):\n",
36 | "    \"\"\"\n",
37 | "    Register this forward hook to print some information about the tensor/module.\n",
38 | "\n",
39 | "    Example:\n",
40 | "\n",
41 | "    >>> from torchvision.models import resnet18\n",
42 | "    >>> model = resnet18(pretrained=False)\n",
43 | "    >>> hook_fc = model.fc.register_forward_hook(tensorinfo_hook)\n",
44 | "    >>> # model(torch.ones(1, 3, 244, 244))\n",
45 | "    >>> hook_fc.remove()\n",
46 | "\n",
47 | "    \"\"\"\n",
48 | "    print(f\"Inside '{module.__class__.__name__}' forward\")\n",
49 | "    print(f\" input: {str(type(input_)):<25}\")\n",
50 | "    print(f\" input[0]: {str(type(input_[0])):<25} {input_[0].size()}\")\n",
51 | "    print(f\" output: {str(type(output)):<25} {output.data.size()}\")\n",
52 | "    print()"
53 | ]
54 | },
55 | {
56 | "cell_type": "code",
57 | "execution_count": 2,
58 | "metadata": {
59 | "execution": {
60 | "iopub.execute_input": "2020-12-06T08:34:50.979331Z",
61 | "iopub.status.busy": "2020-12-06T08:34:50.978607Z",
62 | "iopub.status.idle": "2020-12-06T08:34:51.322078Z",
63 | "shell.execute_reply": "2020-12-06T08:34:51.322341Z"
64 | }
65 | },
66 | "outputs": [],
67 | "source": [
68 | "import torch\n",
69 | "import torch.nn as nn"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": 3,
75 | "metadata": {
76 | "execution": {
77 | "iopub.execute_input": "2020-12-06T08:34:51.324816Z",
78 | "iopub.status.busy": "2020-12-06T08:34:51.324506Z",
79 | "iopub.status.idle": "2020-12-06T08:34:51.325876Z",
80 | "shell.execute_reply": "2020-12-06T08:34:51.326156Z"
81 | }
82 | },
83 | "outputs": [],
84 | "source": [
85 | "m = nn.Linear(1, 3)"
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": 4,
91 | "metadata": {
92 | "execution": {
93 | "iopub.execute_input": "2020-12-06T08:34:51.328097Z",
94 | "iopub.status.busy": "2020-12-06T08:34:51.327781Z",
95 | "iopub.status.idle": "2020-12-06T08:34:51.329533Z",
96 | "shell.execute_reply": "2020-12-06T08:34:51.329223Z"
97 | }
98 | },
99 | "outputs": [],
100 | "source": [
101 | "hook = m.register_forward_hook(tensorinfo_hook)"
102 | ]
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": 5,
107 | "metadata": {
108 | "execution": {
109 | "iopub.execute_input": "2020-12-06T08:34:51.331641Z",
110 | "iopub.status.busy": "2020-12-06T08:34:51.331334Z",
111 | "iopub.status.idle": "2020-12-06T08:34:51.337940Z",
112 | "shell.execute_reply": "2020-12-06T08:34:51.337595Z"
113 | }
114 | },
115 | "outputs": [
116 | {
117 | "name": "stdout",
118 | "output_type": "stream",
119 | "text": [
120 | "Inside 'Linear' forward\n",
121 | " input: <class 'tuple'>\n",
122 | " input[0]: <class 'torch.Tensor'> torch.Size([1])\n",
123 | " output: <class 'torch.Tensor'> torch.Size([3])\n",
124 | "\n"
125 | ]
126 | },
127 | {
128 | "data": {
129 | "text/plain": [
130 | "tensor([ 0.3400, -1.6105, 0.1007], grad_fn=)"
131 | ]
132 | },
133 | "execution_count": 1,
134 | "metadata": {},
135 | "output_type": "execute_result"
136 | }
137 | ],
138 | "source": [
139 | "m(torch.rand(1));"
140 | ]
141 | },
142 | {
143 | "cell_type": "code",
144 | "execution_count": 6,
145 | "metadata": {
146 | "execution": {
147 | "iopub.execute_input": "2020-12-06T08:34:51.339936Z",
148 | "iopub.status.busy": "2020-12-06T08:34:51.339628Z",
149 | "iopub.status.idle": "2020-12-06T08:34:51.341377Z",
150 | "shell.execute_reply": "2020-12-06T08:34:51.341055Z"
151 | }
152 | },
153 | "outputs": [],
154 | "source": [
155 | "hook.remove()"
156 | ]
157 | },
158 | {
159 | "cell_type": "markdown",
160 | "metadata": {},
161 | "source": [
162 | "## Exercise\n",
163 | "- Write a context manager hook that removes the hook when leaving the with block."
164 | ]
165 | }
166 | ],
167 | "metadata": {
168 | "kernelspec": {
169 | "display_name": "Python 3",
170 | "language": "python",
171 | "name": "python3"
172 | },
173 | "language_info": {
174 | "codemirror_mode": {
175 | "name": "ipython",
176 | "version": 3
177 | },
178 | "file_extension": ".py",
179 | "mimetype": "text/x-python",
180 | "name": "python",
181 | "nbconvert_exporter": "python",
182 | "pygments_lexer": "ipython3",
183 | "version": "3.8.5"
184 | }
185 | },
186 | "nbformat": 4,
187 | "nbformat_minor": 2
188 | }
189 |
--------------------------------------------------------------------------------
/notebooks/hooks.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # Hooks
17 | # Hooks are simple functions that can be registered to be called during the forward or backward pass of an `nn.Module`.
18 | # These functions can be used to print out information or modify the module.
19 | #
20 | # Here is a simple forward hook example that prints some information about the input and output of a module.
21 | #
22 | # Tip: Don't forget to remove the hook afterwards!
23 | #
24 | # Ref:
25 | # - https://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html#forward-and-backward-function-hooks
26 | # - https://pytorch.org/docs/master/nn.html#torch.nn.Module.register_forward_hook
27 | # - https://pytorch.org/docs/master/nn.html#torch.nn.Module.register_forward_pre_hook
28 | # - https://pytorch.org/docs/master/nn.html#torch.nn.Module.register_backward_hook
29 |
30 | # %%
31 | def tensorinfo_hook(module, input_, output):
32 |     """
33 |     Register this forward hook to print some information about the tensor/module.
34 |
35 |     Example:
36 |
37 |     >>> from torchvision.models import resnet18
38 |     >>> model = resnet18(pretrained=False)
39 |     >>> hook_fc = model.fc.register_forward_hook(tensorinfo_hook)
40 |     >>> # model(torch.ones(1, 3, 244, 244))
41 |     >>> hook_fc.remove()
42 |
43 |     """
44 |     print(f"Inside '{module.__class__.__name__}' forward")
45 |     print(f" input: {str(type(input_)):<25}")
46 |     print(f" input[0]: {str(type(input_[0])):<25} {input_[0].size()}")
47 |     print(f" output: {str(type(output)):<25} {output.data.size()}")
48 |     print()
49 |
50 |
51 | # %%
52 | import torch
53 | import torch.nn as nn
54 |
55 | # %%
56 | m = nn.Linear(1, 3)
57 |
58 | # %%
59 | hook = m.register_forward_hook(tensorinfo_hook)
60 |
61 | # %%
62 | m(torch.rand(1));
63 |
64 | # %%
65 | hook.remove()
66 |
67 | # %% [markdown]
68 | # ## Exercise
69 | # - Write a context manager hook that removes the hook when leaving the with block.
70 |
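One possible approach to the exercise above, as a minimal sketch rather than a reference solution: wrap `register_forward_hook` in a generator-based context manager so the handle is removed when the `with` block exits. The name `forward_hook` is made up for this illustration. The only PyTorch behaviour it relies on is that `register_forward_hook` returns a handle with a `remove()` method, as used in the cells above.

```python
from contextlib import contextmanager


@contextmanager
def forward_hook(module, hook_fn):
    """Register `hook_fn` as a forward hook on `module` for the duration of a with block."""
    handle = module.register_forward_hook(hook_fn)
    try:
        yield handle
    finally:
        # Runs even if the forward pass raises, so the hook never leaks.
        handle.remove()


# Usage with the objects defined earlier in this notebook (requires torch):
#
#   m = nn.Linear(1, 3)
#   with forward_hook(m, tensorinfo_hook):
#       m(torch.rand(1))  # the hook fires and prints tensor information
#   m(torch.rand(1))      # the hook has been removed, nothing is printed
```

Putting `handle.remove()` in a `finally` clause is the main design choice here: it covers the "don't forget to remove the hook" tip even when the code inside the block fails.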
--------------------------------------------------------------------------------
/notebooks/img/common_mistakes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/common_mistakes.png
--------------------------------------------------------------------------------
/notebooks/img/dynamic_graph.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/dynamic_graph.gif
--------------------------------------------------------------------------------
/notebooks/img/ml_debt.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/ml_debt.jpg
--------------------------------------------------------------------------------
/notebooks/img/pytorch-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/pytorch-logo.png
--------------------------------------------------------------------------------
/notebooks/img/pytorch_logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/pytorch_logo.png
--------------------------------------------------------------------------------
/notebooks/img/pytorch_logo_flame.png:
--------------------------------------------------------------------------------
1 | --2019-03-18 14:43:17-- https://pytorch.org/assets/images/pytorch-logo.png
2 | Resolving pytorch.org (pytorch.org)... 185.199.108.153
3 | Connecting to pytorch.org (pytorch.org)|185.199.108.153|:443... connected.
4 | HTTP request sent, awaiting response... 200 OK
5 | Length: 22916 (22K) [image/png]
6 | Saving to: ‘pytorch-logo.png’
7 |
8 | 0K .......... .......... .. 100% 664K=0,03s
9 |
10 | 2019-03-18 14:43:18 (664 KB/s) - ‘pytorch-logo.png’ saved [22916/22916]
11 |
12 |
--------------------------------------------------------------------------------
/notebooks/img/software_vs_ml.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/software_vs_ml.png
--------------------------------------------------------------------------------
/notebooks/img/tensorboardx_demo.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/tensorboardx_demo.gif
--------------------------------------------------------------------------------
/notebooks/img/tensorboardx_demo2.gif:
--------------------------------------------------------------------------------
1 | --2018-07-05 11:05:43-- https://raw.githubusercontent.com/lanpa/tensorboard-pytorch/master/screenshots/Demo.gif
2 | Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.36.133
3 | Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.36.133|:443... connected.
4 | HTTP request sent, awaiting response... 200 OK
5 | Length: 490837 (479K) [image/gif]
6 | Saving to: ‘Demo.gif.1’
7 |
8 | 0K .......... .......... .......... .......... .......... 10% 959K 0s
9 | 50K .......... .......... .......... .......... .......... 20% 2.48M 0s
10 | 100K .......... .......... .......... .......... .......... 31% 5.91M 0s
11 | 150K .......... .......... .......... .......... .......... 41% 3.05M 0s
12 | 200K .......... .......... .......... .......... .......... 52% 4.57M 0s
13 | 250K .......... .......... .......... .......... .......... 62% 4.07M 0s
14 | 300K .......... .......... .......... .......... .......... 73% 4.67M 0s
15 | 350K .......... .......... .......... .......... .......... 83% 3.12M 0s
16 | 400K .......... .......... .......... .......... .......... 93% 5.47M 0s
17 | 450K .......... .......... ......... 100% 6.33M=0.2s
18 |
19 | 2018-07-05 11:05:43 (2.96 MB/s) - ‘Demo.gif.1’ saved [490837/490837]
20 |
21 |
--------------------------------------------------------------------------------
/notebooks/img/the_real_reason.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/the_real_reason.png
--------------------------------------------------------------------------------
/notebooks/img/visdom.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sotte/pytorch_tutorial/2b0e44df0d7f19e68237481213a89479f980969d/notebooks/img/visdom.png
--------------------------------------------------------------------------------
/notebooks/lin_reg.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# LinReg with PyTorch, Gradient Descent, and GPU"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "## Init, helpers, utils ..."
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {
21 | "execution": {
22 | "iopub.execute_input": "2020-12-06T08:34:52.845595Z",
23 | "iopub.status.busy": "2020-12-06T08:34:52.843684Z",
24 | "iopub.status.idle": "2020-12-06T08:34:53.191367Z",
25 | "shell.execute_reply": "2020-12-06T08:34:53.191620Z"
26 | }
27 | },
28 | "outputs": [],
29 | "source": [
30 | "%matplotlib inline"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 2,
36 | "metadata": {
37 | "execution": {
38 | "iopub.execute_input": "2020-12-06T08:34:53.194066Z",
39 | "iopub.status.busy": "2020-12-06T08:34:53.193706Z",
40 | "iopub.status.idle": "2020-12-06T08:34:53.465407Z",
41 | "shell.execute_reply": "2020-12-06T08:34:53.465662Z"
42 | }
43 | },
44 | "outputs": [
45 | {
46 | "name": "stderr",
47 | "output_type": "stream",
48 | "text": [
49 | "/home/stefan/projects/pytorch_tutorial/.venv/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)\n",
50 | " return torch._C._cuda_getDeviceCount() > 0\n"
51 | ]
52 | },
53 | {
54 | "data": {
55 | "text/plain": [
56 | "device(type='cpu')"
57 | ]
58 | },
59 | "execution_count": 1,
60 | "metadata": {},
61 | "output_type": "execute_result"
62 | }
63 | ],
64 | "source": [
65 | "import torch\n",
66 | "import torch.nn as nn\n",
67 | "import torch.nn.functional as F\n",
68 | "import torch.optim as optim\n",
69 | "import torchvision\n",
70 | "\n",
71 | "DEVICE = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
72 | "DEVICE"
73 | ]
74 | },
75 | {
76 | "cell_type": "code",
77 | "execution_count": 3,
78 | "metadata": {
79 | "execution": {
80 | "iopub.execute_input": "2020-12-06T08:34:53.467950Z",
81 | "iopub.status.busy": "2020-12-06T08:34:53.467641Z",
82 | "iopub.status.idle": "2020-12-06T08:34:53.469044Z",
83 | "shell.execute_reply": "2020-12-06T08:34:53.469282Z"
84 | }
85 | },
86 | "outputs": [],
87 | "source": [
88 | "from pprint import pprint\n",
89 | "\n",
90 | "import matplotlib.pyplot as plt\n",
91 | "import numpy as np\n",
92 | "from IPython.core.debugger import set_trace"
93 | ]
94 | },
95 | {
96 | "cell_type": "markdown",
97 | "metadata": {},
98 | "source": [
99 | "# The Problem"
100 | ]
101 | },
102 | {
103 | "cell_type": "code",
104 | "execution_count": 4,
105 | "metadata": {
106 | "execution": {
107 | "iopub.execute_input": "2020-12-06T08:34:53.471764Z",
108 | "iopub.status.busy": "2020-12-06T08:34:53.471456Z",
109 | "iopub.status.idle": "2020-12-06T08:34:54.489102Z",
110 | "shell.execute_reply": "2020-12-06T08:34:54.489343Z"
111 | }
112 | },
113 | "outputs": [
114 | {
115 | "data": {
116 | "text/plain": [
117 | "[]"
118 | ]
119 | },
120 | "execution_count": 1,
121 | "metadata": {},
122 | "output_type": "execute_result"
123 | },
124 | {
125 | "data": {
126 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX8AAAD4CAYAAAAEhuazAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAAsTAAALEwEAmpwYAAAWl0lEQVR4nO3df4wc5X3H8c9373woaZG4GAeIzz+wYpAw6g97ZS5KKyUNTUyE6gbnhwlKaQNx/wC1aSu1pEhuRRQpSps2/eGmchyUVDKmNDgySkj5EblCqnKEWyuKbAjp6cLiMySYY0kjgbjb22//uF1379jd29ndmXlm5/2SwLczs7cPK/OZZ77PM8+YuwsAkC+FtBsAAEge4Q8AOUT4A0AOEf4AkEOEPwDk0GjaDejGpZde6lu3bk27GQCQKaVS6WV339BqXybCf+vWrZqenk67GQCQKWZWbrePsg8A5BDhDwA5RPgDQA4R/gCQQ4Q/AOQQ4Q8AOUT4A0APSuWKDp2cUalcSbspPcnEPH8ACEmpXNEtR6a0UK1pbLSgo7dPateW8bSbFQk9fwCIaGp2XgvVmmouLVZrmpqdT7tJkRH+ABDR5Lb1GhstaMSkdaMFTW5bn3aTIqPsAwAR7doyrqO3T2pqdl6T29ZnruQjEf4A0JNdW8YzGfoNlH0AIIcIfwDIIcIfAHKI8AeAHCL8ASCHCH8AyKGBhL+Z3WtmL5nZ6aZtbzOzx8zsf+p/jte3m5n9o5nNmNkPzWznINoAAOjeoHr+X5O0Z9W2uyR91923S/pu/bUk3SBpe/2fA5K+PKA2AAC6NJDwd/cnJL2yavNeSV+v//x1Sb/btP3ffNmUpEvM7IpBtAMA0J04a/6XufuL9Z9/Kumy+s8bJZ1tOm6uvg0AkJBEBnzd3SV5lPeY2QEzmzaz6fPnz8fUMgDIpzjD/2eNck79z5fq289J2tR03ER92wruftjdi+5e3LBhQ4zNBID8iTP8H5J0a/3nWyWdaNr+e/VZP5OSft5UHgIAJGAgq3qa2TFJ75F0qZnNSforSZ+X9ICZ3SapLOmj9cMflvRBSTOSXpP0B4NoAwCgewMJf3e/uc2u97U41iXdMYjPBQD0hjt8ASCHCH8AGIBSuaJDJ2dUKlfSbkpXeJIXAPSpVK7oliNTWqjWNDZa0NHbJ4N/yhc9fwDo09TsvBaqNdVcWqzWNDU7n3aT1kT4A0CfJret19hoQSMmrRstaHLb+rSbtCbKPgDQp11bxnX09kk9eGpOlnZjukTPHwBW6XXw9vipOR37/vO65chU8AO/9PwBoEmvg7et6v4hD/rS8weAJr0O3mat7k/PHwCaNEJ8sVqLFOKNuv/U7Lwmt60PutcvSba82kLYisWiT09Pp90MADlRKleCCPF+22FmJXcvttpHzx8AVtm1ZTz1nnvcN45R8weAAMV94xjhDwABinsAmbIPAAQo7gFkwh8AAhXn2ANlHwDIIcIfAHKI8AeQuqw9CGUYUPMHkKosPggllJvA+kH4A0hV1hZEy+LJqhXKPgBSFWU+ewjloSw+tasVev4AUtXtfPZQety9LvwWGsIfQOo6zWdv1NdfePX1IMpDWVu9sx3CH0Cwmnv7owXT6EhBS0vp97hDWPitX4Q/gMR1O1umub6+VHN9bPcmbbzkLcH1uLM4+4fwB5CoKLX71fX1fTsnggvXUMYioiL8ASQqytTOLNTXszZVtYHwB5CoqLNlQq+vZ3X2D49xBJC45hq5pK579u1q62nX3NP+/HZ4jCOAoDR681Hq5e2ODaHmHvrVSSvc4QsgNVHulm137LDccZu02Hv+ZvacpF9IWpJUdfeimb1N0r9L2irpOUkfdXeW8wNyJkq9vN2xWa25py32mn89/Ivu/nLTti9IesXdP29md0kad/e/aPc7qPkDwytKvTzUmn+oOtX80wr/ZyW9x91fNLMrJP2Xu1/d7ncQ/gAQXafwT6Lm
75IeNbOSmR2ob7vM3V+s//xTSZetfpOZHTCzaTObPn/+fALNBID8SGK2z2+4+zkze7ukx8zsR8073d3N7E2XH+5+WNJhabnnn0A7AaBvWSlBxR7+7n6u/udLZvZNSbsl/czMrmgq+7wUdzsAIG4hTDvtVqxlHzP7JTO7uPGzpPdLOi3pIUm31g+7VdKJONsBAEnI0rTTuHv+l0n6ppk1Pus+d/9PM3tK0gNmdpuksqSPxtwOAF3ISskiVFmadhpr+Lv7rKRfbbF9XtL74vxsANEkUbIY9pNLFhaia2B5ByDHmsM47tUps1QP70dWlnog/IGcWh3GB2/cEWvJIqtLHw8rwh/IqdVhXHltIdaSRZbq4XlA+AM51SqM4yxZdKqHD/tYQIgIfyCnGmF8/NScBn0XZbswb3VyyctYQGgIf6APw9BjffDUnBaqNR0/NTeQ4I0a5owFpIP1/IEeNULui48+q1uOTKlUzt6q5MdPzemNxcHelBT1RqdG+WnExFhAguj5Az3Keo+1VK7oP6bPXij5jIwMJnh7eUZvVubGDxPCH+hR1mevTM3Oq1pbjn6T9J6rNlzopfcTwL2EeVbmxg8THuAO9CHLNf9G2WqxWtPISEFyV7XmDLoOER7gDsQkyz3W5h76uVdf1/3ffz6zJSxER/gDOdY4eZXKFR0/NZfZEhaiI/yBnGsE/29u36C3X3yRbto5Qa8/Bwh/IMdK5Ypu/srynHxJGhsx3bRzIuVWIQnM8wdybGp2Xov14JekxSUP+gEkDaVyRYdOzmTy3opQ0PMHMqzf2UaT29Zr3WjhQs9/3Yj1Ve9PYvYTy0EMBuEPZNQgQnDXlnEd+9T/r++zr496f1KhnPWb60JB+AMZ1RyCC4s1fenxH+vT11/V0wlgEOEZRyi3upLI+s11oSD8gYwaf+uYCmZyd9Uk/ffMy3rquVdiL4O0K+0MOpTbXUmwHMRgEP5ABpXKFd3zrTOquctMkiuRMkin0s6gQ7nTlUSWb64LBeEPxCDugc/mYCyYNFJYvgKIuwyyVmlnkKFMeSdehD8wYO16x/2cEFa/d3UwHrxxhyqvLcReBkkykLu5ksjy2kppI/yBAWu3nn2vM2HanUzSqnvv2znR98ygbnW6kmDKZ38If2DAWvWOo8yEWd2bbffeKCWWQfSQV4ftvpTvBGbKZ38If2DA2vXKuymXtOrN9ltqGVQPObSwZUygP4Q/EIPVvfLmE8L4W8faPjSlVcDe8d539lXiGVRohxa2TPnsD+EPJKQRTp164e0CttdZNKVyRedefV2jIwUtLa38nVFLQSGGLVM+e0f4Awlq1wtvDuLmgJWkQydnep4h1DjRjBZM+3dvvrBcc6+lIMJ2eBD+QIJa9exbBfEd731n37X65hPNUs31jkvecuH9odXvkTyWdAZ61Muywo3SyZ++/+oLYd5uami77d1qnGhGTG+q0Xfah3yg54/c62UaZD+98tWlk3Z1/n4HWDvV6EOs3yNZhD9yrdcQ76Z2H+Vk0CqIBxHQnWr01O/zLbXwN7M9kv5B0oikI+7++bTagvzqtfbdbe2+16uBtbY3i3LCYTkENKQS/mY2IumQpN+WNCfpKTN7yN2fTqM9yK9eSyuteuWHTs4kPoga5YSz1rGcGPIlrZ7/bkkz7j4rSWZ2v6S9kgh/JKqf0kq3tfs4Rbly6XQs6+TkT1rhv1HS2abXc5Kuaz7AzA5IOiBJmzdvTq5lyJ1B1b7TGESNcsLpdCxTP/Mn2AFfdz8s6bAkFYtFT7k5QFeSHkSNcsLpdGxoSzcgfmmF/zlJm5peT9S3AUMpznp6lBNOp4Flpn7mS1rh/5Sk7WZ2pZZDf7+kj6fUFiBWIdbTW52MmPqZL6mEv7tXzexOSY9oearnve5+Jo22AHGbmp3XG4s1uaSFxfTr6SGejJC81Gr+7v6wpIfT+nwMj9CnKI6/dUyNQata/XWaGNyFFPCAL9CNLPRiK68tqGBSzSWTdPqF
n0d6/6BPbgzuQiL8kXFZ6MVObluv0YJpYcnlkr5Rmlvx/NtO4d7rya3T7xzU4G7oV1zojPBHpmWhF7try7g+Utyk+558Xi5paWnlWkCdwr2Xk1s3J4x+B3ezcMWFzljSGZnWaonkEN20c0IXrXvzEsprLdvcy9LL/S4F3Y0kPgPxouePzIt7iuIgyhvtSi1rXbn0UqJJ4mooC1dc6Mzcw795tlgs+vT0dNrNQA71Ut6IerKIo3ZeKld0/NScXFoxvjBI1PzDZ2Yldy+22kfPH+ggas29l5NFXFcuD56a00K1puOn5mIpiXFTWLZR8wc6iFpzD6UWHko7EC56/kAbjbLGwRt36PQLP5d18Z5QauGhtAPhouYPtHDfk8/r4InTWqq5CiZZwVSreVelnFBq4aG0A+mh5g9EUCpXdPDEaVVryx2jJW/8q7u6fyi18FDagTBR8wealMoVfenxH18I/mam7ufaJ6VUrujQyRmVypW0m4KMoecP1DXP1GnWCP0P75qIbdpkL7jLFv0g/BGEEOrTzTNkCia9+52X6oZrr1DltYUVd+RKCiJks7CuEcJF+CN1ofRgV8+Q+fT1VwX9gHNm9KAfhD9SF0oPttNSCr3c7BX3lQyPXkQ/CH+kLqQebLsZMlHa2DxN9KJ18V4lMKMHvSL8kbos9GC7bePqaaIhPLYRaIXwRxCae7AhDP620k0ve2p2XrWmGycLBaMWjyAR/ohNLyEe4sBqFI3y0EK1poKZ7tl7babaj/wg/BGLXkM87sHfuK8qslDCAiTCHzHpNcTjHPxN6qqCQVhkAeGPWPQa4nH2nEOZUgqEgPBHLPoJ8Th6zqVyRS+8+rpGC6almnc8IYU64AwMEuGP2IRS/mgu94yOFPSx3ZvartHTrjTECQHDhvDH0Gsu9ywt1bTxkre0DfB2T8C6+StTF0pYxz6VrRlIQCss6Yy+ZGFJ4SiPYmx17PH6s3BduvBMXCDr6PmjZ/3OnkmqlLLW+MPqdqw+9sFVYR/+s++AtRH+6Fk/s2eSvpmr3fhDu3Y0H7tv54S+MX1Wi0uudSOmfTsnYmsnkBTCHz3rZ05+KNMuu2nHri3jOnbgXQz4YqgQ/uhZP9M5Q1nJs9t2hDJzCRgUc4+ngmlmfy3pU5LO1zf9pbs/XN/3GUm3SVqS9Efu/kin31UsFn16ejqWdiI9SdX81/ocpnFiWJlZyd2LrfbF3fP/e3f/21WNuUbSfkk7JL1D0uNmdpW7L8XcFgSml9501KDudt4+oY+8SaPss1fS/e7+hqSfmNmMpN2SvpdCW5AhvQwSt5u3n/TKoVxdIDRxz/O/08x+aGb3mlnjb/xGSWebjpmrbwM6ahfknbSat9/L7+lH46T1xUef1S1HpoK+JwL50VfP38wel3R5i113S/qypM9qeVr0ZyV9UdInI/zuA5IOSNLmzZv7aSaGRC+DxO0GpZMcbA5lZhPQrK/wd/fruznOzL4i6Vv1l+ckbWraPVHftvp3H5Z0WFoe8O2nnRgOvc4uWl3TT3rN/VBmNgHN4pztc4W7v1j/+U8kXefu+81sh6T7tFznf4ek70ra3mnAl9k+yDpq/khDWrN9vmBmv6blss9zkv5Qktz9jJk9IOlpSVVJdzDTB8OOGUUITWzh7+6f6LDvc5I+F9dnAwA6Y1VPAMghwh8Acojwx9DIwrMFgFCwsNuQyPtskqSXiAayjvAfAlkKvrhOUtxIBURD+A+BrARfnCcpbqQCoiH8h0BWgi/Ok1TSd+0CWUf4D4E4gm91eWYQ5Zq4T1LcSAV0L7blHQaJ5R2Sdd+Tz+vgidNaqrkuWlfQwRt36J5vnRlIuSbvA9NAktJ8mAsyplSu6OCJ06rWljsFC4s1fef0iwMr19A7B8LAPH+sMDU7r1rT1WChYLrh2ivetCY+gGyj548VGnX5hWpNBTPds/daffy6zbr68osp1wBDhJo/3mTQdXnq/EA6qPkjkkHW5bN0AxqQJ9T8Eauk
n5cLoDuEP2LV6gHqANJH2Qex4s5bIEyEP2LH3H4gPJR9ACCHCH+siYekAMOHsg86YqomMJzo+Q+hQfbUmaoJDCd6/kNm0D31rDwrAEA0hH9ABrEMwqAfmMJUTWA4Ef6BaO6xj44U9OFdE9q3c0KSIgVvHD11pmoCw4fwD0Rzj32hWtOxJ5/XN6bPSmaqLnVfwqGnDqAbhH8gGj32NxZrckkuaXFp+SdXtBIOPXUAa2G2TyAaPfaPX7dZYyO2vBbOiGkd6+IAiAE9/4A0euw37Zy4ULaRotX8AaAbhH+AVpdtCH0Ag0bZBwByiPAHgBzqK/zN7CNmdsbMamZWXLXvM2Y2Y2bPmtkHmrbvqW+bMbO7+vl8AEBv+u35n5Z0k6Qnmjea2TWS9kvaIWmPpH8xsxEzG5F0SNINkq6RdHP9WABAgvoa8HX3ZyTJzFbv2ivpfnd/Q9JPzGxG0u76vhl3n62/7/76sU/30w4AQDRx1fw3Sjrb9Hquvq3d9jcxswNmNm1m0+fPn4+pmQCQT2v2/M3scUmXt9h1t7ufGHyTlrn7YUmHJalYLHpcnwMAebRm+Lv79T383nOSNjW9nqhvU4ftAICExFX2eUjSfjO7yMyulLRd0vclPSVpu5ldaWZjWh4UfiimNgAA2uhrwNfMPiTpnyRtkPRtM/uBu3/A3c+Y2QNaHsitSrrD3Zfq77lT0iOSRiTd6+5n+vovAABEZu7hl9OLxaJPT0+n3QwAyBQzK7l7sdU+7vAFgBwi/AEghwh/AMghwh8AcojwB4AcIvwBIIcIfwDIIcIfAHKI8AeAHCL8ASCHCH8AyCHCHwByiPAHgBwi/AEgh4Y+/Evlig6dnFGpXEm7KQAQjL4e5hK6UrmiW45MaaFa09hoQUdvn9SuLeNpNwsAUjfUPf+p2XktVGuqubRYrWlqdj7tJgFAEIY6/Ce3rdfYaEEjJq0bLWhy2/q0mwQAQRjqss+uLeM6evukpmbnNbltPSUfAKgb6vCXlk8AhD4ArDTUZR8AQGuEPwDkEOEPADlE+ANADhH+AJBDhD8A5JC5e9ptWJOZnZdUTrsdCbhU0stpNyIgfB8r8X2sxPexUqvvY4u7b2h1cCbCPy/MbNrdi2m3IxR8HyvxfazE97FS1O+Dsg8A5BDhDwA5RPiH5XDaDQgM38dKfB8r8X2sFOn7oOYPADlEzx8AcojwB4AcIvwDY2Z/Y2Y/MrMfmtk3zeyStNuUJjP7iJmdMbOameVyWp+Z7TGzZ81sxszuSrs9aTOze83sJTM7nXZbQmBmm8zspJk9Xf9/5Y+7eR/hH57HJF3r7r8i6ceSPpNye9J2WtJNkp5IuyFpMLMRSYck3SDpGkk3m9k16bYqdV+TtCftRgSkKunP3P0aSZOS7ujm7wjhHxh3f9Tdq/WXU5Im0mxP2tz9GXd/Nu12pGi3pBl3n3X3BUn3S9qbcptS5e5PSHol7XaEwt1fdPdT9Z9/IekZSRvXeh/hH7ZPSvpO2o1AqjZKOtv0ek5d/I+NfDKzrZJ+XdKTax079I9xDJGZPS7p8ha77nb3E/Vj7tby5dzRJNuWhm6+DwCdmdkvS3pQ0qfd/X/XOp7wT4G7X99pv5n9vqQbJb3Pc3AjxlrfR86dk7Sp6fVEfRtwgZmt03LwH3X34928h7JPYMxsj6Q/l/Q77v5a2u1B6p6StN3MrjSzMUn7JT2UcpsQEDMzSV+V9Iy7/1237yP8w/PPki6W9JiZ/cDM/jXtBqXJzD5kZnOS3iXp22b2SNptSlJ98P9OSY9oeSDvAXc/k26r0mVmxyR9T9LVZjZnZrel3aaUvVvSJyT9Vj0zfmBmH1zrTSzvAAA5RM8fAHKI8AeAHCL8ASCHCH8AyCHCHwByiPAHgBwi/AEgh/4PeZyx60XvwHkAAAAASUVORK5CYII=\n",
127 | "text/plain": [
128 | ""
129 | ]
130 | },
131 | "metadata": {
132 | "needs_background": "light"
133 | },
134 | "output_type": "display_data"
135 | }
136 | ],
137 | "source": [
138 | "from sklearn.datasets import make_regression\n",
139 | "\n",
140 | "\n",
141 | "n_features = 1\n",
142 | "n_samples = 100\n",
143 | "\n",
144 | "X, y = make_regression(\n",
145 | " n_samples=n_samples,\n",
146 | " n_features=n_features,\n",
147 | " noise=20,\n",
148 | " random_state=42,\n",
149 | ")\n",
150 | "\n",
151 | "fig, ax = plt.subplots()\n",
152 | "ax.plot(X, y, \".\")"
153 | ]
154 | },
155 | {
156 | "cell_type": "markdown",
157 | "metadata": {},
158 | "source": [
159 | "# The Solution"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": 5,
165 | "metadata": {
166 | "execution": {
167 | "iopub.execute_input": "2020-12-06T08:34:54.491719Z",
168 | "iopub.status.busy": "2020-12-06T08:34:54.491412Z",
169 | "iopub.status.idle": "2020-12-06T08:34:54.492986Z",
170 | "shell.execute_reply": "2020-12-06T08:34:54.493220Z"
171 | }
172 | },
173 | "outputs": [],
174 | "source": [
175 | "X = torch.from_numpy(X).float()\n",
176 | "y = torch.from_numpy(y.reshape((n_samples, n_features))).float()"
177 | ]
178 | },
179 | {
180 | "cell_type": "code",
181 | "execution_count": 6,
182 | "metadata": {
183 | "execution": {
184 | "iopub.execute_input": "2020-12-06T08:34:54.495698Z",
185 | "iopub.status.busy": "2020-12-06T08:34:54.495388Z",
186 | "iopub.status.idle": "2020-12-06T08:34:54.497014Z",
187 | "shell.execute_reply": "2020-12-06T08:34:54.496762Z"
188 | }
189 | },
190 | "outputs": [],
191 | "source": [
192 | "class LinReg(nn.Module):\n",
193 | "    def __init__(self, input_dim):\n",
194 | "        super().__init__()\n",
195 | "        self.beta = nn.Linear(input_dim, 1)\n",
196 | "\n",
197 | "    def forward(self, X):\n",
198 | "        return self.beta(X)\n",
199 | "\n",
200 | "# or just\n",
201 | "# model = nn.Linear(input_dim, 1)"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": 7,
207 | "metadata": {
208 | "execution": {
209 | "iopub.execute_input": "2020-12-06T08:34:54.499394Z",
210 | "iopub.status.busy": "2020-12-06T08:34:54.499084Z",
211 | "iopub.status.idle": "2020-12-06T08:34:54.501329Z",
212 | "shell.execute_reply": "2020-12-06T08:34:54.501565Z"
213 | }
214 | },
215 | "outputs": [],
216 | "source": [
217 | "model = LinReg(n_features).to(DEVICE) # <-- here\n",
218 | "loss_fn = nn.MSELoss()\n",
219 | "optimizer = optim.SGD(model.parameters(), lr=0.1)\n",
220 | "\n",
221 | "\n",
222 | "X, y = X.to(DEVICE), y.to(DEVICE) # <-- here"
223 | ]
224 | },
225 | {
226 | "cell_type": "code",
227 | "execution_count": 8,
228 | "metadata": {
229 | "execution": {
230 | "iopub.execute_input": "2020-12-06T08:34:54.504699Z",
231 | "iopub.status.busy": "2020-12-06T08:34:54.504378Z",
232 | "iopub.status.idle": "2020-12-06T08:34:54.594137Z",
233 | "shell.execute_reply": "2020-12-06T08:34:54.594373Z"
234 | }
235 | },
236 | "outputs": [
237 | {
238 | "data": {
239 | "text/plain": [
240 | ""
241 | ]
242 | },
243 | "execution_count": 1,
244 | "metadata": {},
245 | "output_type": "execute_result"
246 | },
247 | {
248 | "data": {
249 | "image/png": "\n",
250 | "text/plain": [
251 | ""
252 | ]
253 | },
254 | "metadata": {
255 | "needs_background": "light"
256 | },
257 | "output_type": "display_data"
258 | }
259 | ],
260 | "source": [
261 | "# Train step\n",
262 | "model.train() # <-- here\n",
263 | "optimizer.zero_grad()\n",
264 | "\n",
265 | "y_ = model(X)\n",
266 | "loss = loss_fn(y_, y)\n",
267 | "\n",
268 | "loss.backward()\n",
269 | "optimizer.step()\n",
270 | "\n",
271 | "# Eval\n",
272 | "model.eval() # <-- here\n",
273 | "with torch.no_grad():\n",
274 | "    y_ = model(X)\n",
275 | "\n",
276 | "# Vis\n",
277 | "fig, ax = plt.subplots()\n",
278 | "ax.plot(X.cpu().numpy(), y_.cpu().numpy(), \".\", label=\"pred\")\n",
279 | "ax.plot(X.cpu().numpy(), y.cpu().numpy(), \".\", label=\"data\")\n",
280 | "ax.set_title(f\"MSE: {loss.item():0.1f}\")\n",
281 | "ax.legend();"
282 | ]
283 | },
284 | {
285 | "cell_type": "markdown",
286 | "metadata": {},
287 | "source": [
288 | "Note: I ran gradient descent on all of the data. I did not split it into `train` and `valid` sets, which should be done in practice!"
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": 9,
294 | "metadata": {
295 | "execution": {
296 | "iopub.execute_input": "2020-12-06T08:34:54.597411Z",
297 | "iopub.status.busy": "2020-12-06T08:34:54.597041Z",
298 | "iopub.status.idle": "2020-12-06T08:34:54.598974Z",
299 | "shell.execute_reply": "2020-12-06T08:34:54.599203Z"
300 | }
301 | },
302 | "outputs": [
303 | {
304 | "data": {
305 | "text/plain": [
306 | "[0,\n",
307 | " 1,\n",
308 | " 2,\n",
309 | " 3,\n",
310 | " 4,\n",
311 | " 5,\n",
312 | " 6,\n",
313 | " 7,\n",
314 | " 8,\n",
315 | " 9,\n",
316 | " 10,\n",
317 | " 11,\n",
318 | " 12,\n",
319 | " 13,\n",
320 | " 14,\n",
321 | " 15,\n",
322 | " 16,\n",
323 | " 17,\n",
324 | " 18,\n",
325 | " 19,\n",
326 | " 20,\n",
327 | " 21,\n",
328 | " 22,\n",
329 | " 23,\n",
330 | " 24,\n",
331 | " 25,\n",
332 | " 26,\n",
333 | " 27,\n",
334 | " 28,\n",
335 | " 29,\n",
336 | " 30,\n",
337 | " 31,\n",
338 | " 32,\n",
339 | " 33,\n",
340 | " 34,\n",
341 | " 35,\n",
342 | " 36,\n",
343 | " 37,\n",
344 | " 38,\n",
345 | " 39,\n",
346 | " 40,\n",
347 | " 41,\n",
348 | " 42,\n",
349 | " 43,\n",
350 | " 44,\n",
351 | " 45,\n",
352 | " 46,\n",
353 | " 47,\n",
354 | " 48,\n",
355 | " 49,\n",
356 | " 50,\n",
357 | " 51,\n",
358 | " 52,\n",
359 | " 53,\n",
360 | " 54,\n",
361 | " 55,\n",
362 | " 56,\n",
363 | " 57,\n",
364 | " 58,\n",
365 | " 59,\n",
366 | " 60,\n",
367 | " 61,\n",
368 | " 62,\n",
369 | " 63,\n",
370 | " 64,\n",
371 | " 65,\n",
372 | " 66,\n",
373 | " 67,\n",
374 | " 68,\n",
375 | " 69,\n",
376 | " 70,\n",
377 | " 71,\n",
378 | " 72,\n",
379 | " 73,\n",
380 | " 74,\n",
381 | " 75,\n",
382 | " 76,\n",
383 | " 77,\n",
384 | " 78,\n",
385 | " 79,\n",
386 | " 80,\n",
387 | " 81,\n",
388 | " 82,\n",
389 | " 83,\n",
390 | " 84,\n",
391 | " 85,\n",
392 | " 86,\n",
393 | " 87,\n",
394 | " 88,\n",
395 | " 89,\n",
396 | " 90,\n",
397 | " 91,\n",
398 | " 92,\n",
399 | " 93,\n",
400 | " 94,\n",
401 | " 95,\n",
402 | " 96,\n",
403 | " 97,\n",
404 | " 98,\n",
405 | " 99]"
406 | ]
407 | },
408 | "execution_count": 1,
409 | "metadata": {},
410 | "output_type": "execute_result"
411 | }
412 | ],
413 | "source": [
414 | "list(range(100))"
415 | ]
416 | },
417 | {
418 | "cell_type": "markdown",
419 | "metadata": {},
420 | "source": [
421 | "# Exercise:\n",
422 | "- Write a proper training loop for this linear regression example.\n",
423 | "- Split data into train and valid.\n",
424 | "- Use the Dataset and DataLoader abstraction.\n",
425 | "- Create a logistic regression module.\n",
426 | "- Create a Multi Layer Perceptron (MLP)."
427 | ]
428 | }
429 | ],
430 | "metadata": {
431 | "kernelspec": {
432 | "display_name": "Python 3",
433 | "language": "python",
434 | "name": "python3"
435 | },
436 | "language_info": {
437 | "codemirror_mode": {
438 | "name": "ipython",
439 | "version": 3
440 | },
441 | "file_extension": ".py",
442 | "mimetype": "text/x-python",
443 | "name": "python",
444 | "nbconvert_exporter": "python",
445 | "pygments_lexer": "ipython3",
446 | "version": "3.8.5"
447 | }
448 | },
449 | "nbformat": 4,
450 | "nbformat_minor": 2
451 | }
452 |
--------------------------------------------------------------------------------
/notebooks/lin_reg.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # LinReg with PyTorch, Gradient Descent, and GPU
17 |
18 | # %% [markdown]
19 | # ## Init, helpers, utils ...
20 |
21 | # %%
22 | # %matplotlib inline
23 |
24 | # %%
25 | import torch
26 | import torch.nn as nn
27 | import torch.nn.functional as F
28 | import torch.optim as optim
29 | import torchvision
30 |
31 | DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
32 | DEVICE
33 |
34 | # %%
35 | from pprint import pprint
36 |
37 | import matplotlib.pyplot as plt
38 | import numpy as np
39 | from IPython.core.debugger import set_trace
40 |
41 | # %% [markdown]
42 | # # The Problem
43 |
44 | # %%
45 | from sklearn.datasets import make_regression
46 |
47 |
48 | n_features = 1
49 | n_samples = 100
50 |
51 | X, y = make_regression(
52 | n_samples=n_samples,
53 | n_features=n_features,
54 | noise=20,
55 | random_state=42,
56 | )
57 |
58 | fig, ax = plt.subplots()
59 | ax.plot(X, y, ".")
60 |
61 | # %% [markdown]
62 | # # The Solution
63 |
64 | # %%
65 | X = torch.from_numpy(X).float()
66 | y = torch.from_numpy(y.reshape((n_samples, n_features))).float()
67 |
68 |
69 | # %%
70 | class LinReg(nn.Module):
71 |     def __init__(self, input_dim):
72 |         super().__init__()
73 |         self.beta = nn.Linear(input_dim, 1)
74 |
75 |     def forward(self, X):
76 |         return self.beta(X)
77 |
78 | # or just
79 | # model = nn.Linear(input_dim, 1)
80 |
81 |
82 | # %%
83 | model = LinReg(n_features).to(DEVICE) # <-- here
84 | loss_fn = nn.MSELoss()
85 | optimizer = optim.SGD(model.parameters(), lr=0.1)
86 |
87 |
88 | X, y = X.to(DEVICE), y.to(DEVICE) # <-- here
89 |
90 | # %%
91 | # Train step
92 | model.train() # <-- here
93 | optimizer.zero_grad()
94 |
95 | y_ = model(X)
96 | loss = loss_fn(y_, y)
97 |
98 | loss.backward()
99 | optimizer.step()
100 |
101 | # Eval
102 | model.eval() # <-- here
103 | with torch.no_grad():
104 |     y_ = model(X)
105 |
106 | # Vis
107 | fig, ax = plt.subplots()
108 | ax.plot(X.cpu().numpy(), y_.cpu().numpy(), ".", label="pred")
109 | ax.plot(X.cpu().numpy(), y.cpu().numpy(), ".", label="data")
110 | ax.set_title(f"MSE: {loss.item():0.1f}")
111 | ax.legend();
112 |
113 | # %% [markdown]
114 | # Note: I ran gradient descent on all of the data. I did not split it into `train` and `valid` sets, which should be done in practice!
115 |
116 | # %%
117 | list(range(100))
118 |
119 | # %% [markdown]
120 | # # Exercise:
121 | # - Write a proper training loop for this linear regression example.
122 | # - Split data into train and valid.
123 | # - Use the Dataset and DataLoader abstraction.
124 | # - Create a logistic regression module.
125 | # - Create a Multi Layer Perceptron (MLP).
126 |
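# %% [markdown]
# One possible starting point for the first three exercise items, as a rough sketch rather than a reference solution.
# Assumptions of mine: synthetic data generated with torch stands in for `make_regression`, the 80/20 split,
# the batch size of 16, the 20 epochs, and the helper name `run_epoch` are all arbitrary choices for illustration.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic 1-D regression data (assumption: stands in for make_regression).
n_samples, n_features = 100, 1
X = torch.randn(n_samples, n_features)
y = 3.0 * X + 0.5 + 0.1 * torch.randn(n_samples, 1)

# Dataset / DataLoader abstraction with a train/valid split.
dataset = TensorDataset(X, y)
train_ds, valid_ds = random_split(dataset, [80, 20])
train_dl = DataLoader(train_ds, batch_size=16, shuffle=True)
valid_dl = DataLoader(valid_ds, batch_size=16)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = nn.Linear(n_features, 1).to(device)
loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)


def run_epoch(dataloader, train):
    """Run one pass over `dataloader` and return the mean per-sample loss."""
    model.train(train)  # train mode if `train` is True, eval mode otherwise
    total = 0.0
    with torch.set_grad_enabled(train):
        for xb, yb in dataloader:
            xb, yb = xb.to(device), yb.to(device)
            loss = loss_fn(model(xb), yb)
            if train:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total += loss.item() * len(xb)
    return total / len(dataloader.dataset)


for epoch in range(20):
    train_loss = run_epoch(train_dl, train=True)
    valid_loss = run_epoch(valid_dl, train=False)

print(f"train MSE: {train_loss:0.3f}, valid MSE: {valid_loss:0.3f}")
```

# The validation pass reuses the same function with gradients disabled, so the two losses are computed the same way and can be compared directly.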
--------------------------------------------------------------------------------
/notebooks/machine_learning_101.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # ML 101 Recap
17 | #
18 | # **ML = model + loss + optimizer**
19 | #
20 | #
21 | # ## Linear regression example
22 | #
23 | # 0. Data
24 | #
25 | # 1. Model:
26 | # - $f(X) = X \beta = \hat y$
27 | #
28 | # 2. Loss / criterion:
29 | # - $ err_i = y_i - f(X_i)$
30 | # - $MSE = \frac{1}{N} \sum_{i=1}^{N} err_i^2$
31 | #
32 | # 3. Optimize:
33 | # - minimizing the MSE yields the optimal $\hat\beta$ (after doing some math)
34 | # - $\hat\beta = (X^TX)^{-1}X^Ty$
35 | # - (or, more generally, use gradient descent to optimize the parameters)
36 |
37 | # %%
38 | import numpy as np
39 | from numpy.linalg import inv
40 | from numpy.linalg import multi_dot as mdot
41 |
42 | import matplotlib.pyplot as plt
43 |
44 | # %matplotlib inline
45 |
46 | # %% [markdown]
47 | # ## LinReg with numpy
48 |
49 | # %%
50 | X = np.random.random((5, 3))
51 | y = np.random.random(5)
52 | X.shape, y.shape
53 |
54 | # %% [markdown]
55 | # Calculate the optimal parameter:
56 | # $\hat\beta = (X^T X)^{-1} X^T y$
57 |
58 | # %%
59 | XT = X.T # transpose
60 |
61 | beta_ = mdot([inv(XT @ X), XT, y])
62 | beta_
63 |
64 | # %%
65 | XT = X.T # transpose
66 |
67 | beta_ = inv(XT @ X) @ XT @ y
68 | beta_
69 |
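# %% [markdown]
# A side note, as a sketch: forming the explicit inverse works for this small example,
# but the same $\hat\beta$ can also be obtained with `np.linalg.solve` or `np.linalg.lstsq`,
# which are usually preferred numerically. The seeded random data below is my own stand-in for the `X` and `y` above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((5, 3))
y = rng.random(5)

# Normal equations with an explicit inverse (as in the cells above).
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

# Solve the linear system (X^T X) beta = X^T y without forming the inverse.
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Least-squares solver working on X directly.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_inv, beta_solve, beta_lstsq)
```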
70 |
71 | # %% [markdown]
72 | # The model $f$:
73 |
74 | # %%
75 | def f(X, beta):
76 |     return X @ beta
77 |
78 | f(X, beta_)
79 |
80 | # %% [markdown]
81 | # ## LinReg with PyTorch
82 |
83 | # %%
84 | import torch
85 |
86 | # %%
87 | # X = torch.rand((5, 3))
88 | # y = torch.rand(5)
89 | X = torch.from_numpy(X)
90 | y = torch.from_numpy(y)
91 | X.shape, y.shape
92 |
93 | # %% [markdown]
94 | # $\hat\beta = (X^T X)^{-1} X^T y$
95 |
96 | # %%
97 | XT = X.t()
98 |
99 | beta__ = (XT @ X).inverse() @ XT @ y
100 | beta__
101 |
102 | # %%
103 | beta__.numpy() - beta_
104 |
105 | # %% [markdown]
106 | # ## LinReg with PyTorch and Gradient Descent
107 | #
108 | # Previously, we had to do some math to calculate the optimal $\hat\beta$.
109 | # PyTorch calculates the gradients for us automatically (more on that later)
110 | # and we can use some version of gradient descent to find our $\hat\beta$.
111 |
112 | # %%
113 | from sklearn.datasets import make_regression
114 |
115 | n_features = 1
116 | n_samples = 100
117 |
118 | X, y = make_regression(
119 | n_samples=n_samples,
120 | n_features=n_features,
121 | noise=10,
122 | )
123 |
124 | dom_np = np.linspace(X.min(), X.max(), 20)
125 | dom = torch.from_numpy(dom_np).unsqueeze(-1).float()
126 |
127 | fig, ax = plt.subplots()
128 | ax.plot(X, y, ".")
129 |
130 | # %%
131 | X = torch.from_numpy(X).float()
132 | y = torch.from_numpy(y).float().unsqueeze(-1)
133 | X.shape, y.shape
134 |
135 | # %%
136 | from torch import nn
137 |
138 | class LinReg(nn.Module):
139 |     def __init__(self, input_dim):
140 |         super().__init__()
141 |         self.beta = nn.Linear(input_dim, 1)
142 |
143 |     def forward(self, X):
144 |         return self.beta(X)
145 |
146 |
147 | model = LinReg(n_features)
148 |
149 | # %%
150 | loss_fn = nn.MSELoss()
151 |
152 | # %%
153 | from torch import optim
154 |
155 | optimizer = optim.SGD(model.parameters(), lr=0.01)
156 |
157 | # %%
158 | # Train step
159 | model.train()
160 | optimizer.zero_grad()
161 |
162 | y_ = model(X)
163 |
164 | loss = loss_fn(y_, y)
165 | loss.backward()
166 | optimizer.step()
167 |
168 | # Eval
169 | model.eval()
170 | with torch.no_grad():
171 |     y_ = model(dom)
172 |
173 |
174 | # Vis
175 | fig, ax = plt.subplots()
176 | ax.plot(X.numpy(), y.numpy(), ".", label="data")
177 | ax.plot(dom_np, y_.numpy(), "-", label="pred")
178 | ax.set_title(f"MSE: {loss.item():0.1f}")
179 | ax.legend();
180 |
181 | # %%
182 | model.beta
183 |
184 | # %%
185 | model.beta.weight
186 |
187 | # %%
188 | model.beta.weight.data
189 |
190 | # %%
191 | model.beta.bias
192 |
193 | # %% [markdown]
194 | # ## LinReg with GPU
195 | #
196 | # Simply move the data and the model to the GPU.
197 |
198 | # %%
199 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
200 |
201 | model = LinReg(n_features).to(device) # <-- here
202 | optimizer = optim.SGD(model.parameters(), lr=0.0001)
203 | loss_fn = nn.MSELoss()
204 |
205 | X, y = X.to(device), y.to(device) # <-- here
206 | dom = dom.to(device)
207 |
208 | # %% [markdown]
209 | # The rest stays the same.
210 |
211 | # %%
212 | # Train step
213 | model.train()
214 | optimizer.zero_grad()
215 |
216 | y_ = model(X)
217 |
218 | loss = loss_fn(y_, y)
219 | loss.backward()
220 | optimizer.step()
221 |
222 | # Eval
223 | model.eval()
224 | with torch.no_grad():
225 |     y_ = model(dom)
226 |
227 |
228 | # Vis
229 | fig, ax = plt.subplots()
230 | ax.plot(X.cpu().numpy(), y.cpu().numpy(), ".", label="data")
231 | ax.plot(dom_np, y_.cpu().numpy(), "-", label="pred")
232 | ax.set_title(f"MSE: {loss.cpu().item():0.1f}")
233 | ax.legend();
234 |
--------------------------------------------------------------------------------
/notebooks/mean_shift_clustering.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # Clustering with PyTorch
17 |
18 | # %% [markdown]
19 | # "PyTorch is a python package that provides [...]
20 | # Tensor computation (like numpy) with strong GPU acceleration [...]"
21 | #
22 | # So, let's use it for some Mean-shift clustering.
23 |
24 | # %%
25 | import math
26 | import operator
27 |
28 | import numpy as np
29 | import matplotlib.pyplot as plt
30 |
31 | import torch
32 |
33 | # %matplotlib inline
34 |
35 | # %% [markdown]
36 | # # Mean shift clustering with numpy
37 |
38 | # %% [markdown]
39 | # ## Create data
40 |
41 | # %%
42 | n_clusters = 6
43 | n_samples = 1000
44 |
45 | # %% [markdown]
46 | # To generate our data, we're going to pick `n_clusters` random points, which we'll call centroids, and for each point we're going to generate `n_samples` random points about it.
47 |
48 | # %%
49 | centroids = np.random.uniform(-35, 35, (n_clusters, 2))
50 | slices = [np.random.multivariate_normal(centroids[i], np.diag([5., 5.]), n_samples)
51 | for i in range(n_clusters)]
52 | data = np.concatenate(slices).astype(np.float32)
53 |
54 |
55 | # %% [markdown]
56 | # Plot the data and the centroids:
57 |
58 | # %%
59 | def plot_data(centroids, data, n_samples):
60 |     colour = plt.cm.rainbow(np.linspace(0,1,len(centroids)))
61 |
62 |     fig, ax = plt.subplots(figsize=(4, 4))
63 |     for i, centroid in enumerate(centroids):
64 |         samples = data[i * n_samples : (i + 1) * n_samples]
65 |         ax.scatter(samples[:, 0], samples[:, 1], c=colour[i], s=1)
66 |         ax.plot(centroid[0], centroid[1], markersize=10, marker="x", color='k', mew=5)
67 |         ax.plot(centroid[0], centroid[1], markersize=5, marker="x", color='m', mew=2)
68 |     plt.axis('equal')
69 |
70 | plot_data(centroids, data, n_samples)
71 |
72 | # %% [markdown]
73 | # ## The mean shift algorithm
74 | #
75 | # "Mean shift is a **non-parametric** feature-space analysis technique for locating the maxima of a density function, a so-called **mode-seeking algorithm**. Application domains include cluster analysis in computer vision and image processing." -- https://en.wikipedia.org/wiki/Mean_shift
76 | #
77 | # Think of mean-shift clustering as k-means but you don't have to specify the number of clusters.
78 | # (You have to specify the **bandwidth** but that can be automated.)
79 |
80 | # %% [markdown]
81 | # Algo:
82 | # ```python
83 | # # PSEUDO CODE
84 | # while not_converged():
85 | #     for i, point in enumerate(points):
86 | #         # distance from the given point to all other points
87 | #         distances = calc_distances(point, points)
88 | #
89 | #         # turn distances into weights using a gaussian
90 | #         weights = gaussian(distances, bandwidth=2.5)
91 | #
92 | #         # update the point with the weighted mean of all points
93 | #         points[i] = (weights * points).sum(0) / weights.sum()
94 | #
95 | # return points
96 | # ```
97 |
98 | # %% [markdown]
99 | # ## The implementation
100 | #
101 | # Let's implement this with numpy:
102 |
103 | # %%
104 | from numpy import exp, sqrt, array
105 |
106 |
107 | # %%
108 | def distance(x, X):
109 |     # return np.linalg.norm(x - X, axis=1)
110 |     return sqrt(((x - X)**2).sum(1))
111 |
112 |
113 | # %% [markdown]
114 | # Let's try it out. (More on how this function works shortly)
115 |
116 | # %%
117 | a = array([1, 2])
118 | b = array([[1, 2],
119 | [2, 3],
120 | [-1, -3]])
121 |
122 | dist = distance(a, b)
123 | dist
124 |
125 |
126 | # %%
127 | def gaussian(dist, bandwidth):
128 |     return exp(-0.5 * ((dist / bandwidth))**2) / (bandwidth * math.sqrt(2 * math.pi))
129 |
130 |
131 | # %%
132 | gaussian(dist, 2.5)
133 |
134 |
135 | # %% [markdown]
136 | # Now we can do a single mean shift step:
137 |
138 | # %%
139 | def meanshift_step(X, bandwidth=2.5):
140 |     for i, x in enumerate(X):
141 |         dist = distance(x, X)
142 |         weight = gaussian(dist, bandwidth)
143 |         X[i] = (weight[:, None] * X).sum(0) / weight.sum()
144 |     return X
145 |
146 |
147 | # %% [markdown]
148 | # Data before:
149 |
150 | # %%
151 | plot_data(centroids, data, n_samples)
152 |
153 | # %% [markdown]
154 | # Data after:
155 |
156 | # %%
157 | _X = meanshift_step(np.copy(data))
158 | plot_data(centroids, _X, n_samples)
159 |
160 |
161 | # %% [markdown]
162 | # Just repeat this a few times and we have the complete mean shift algorithm:
163 |
164 | # %%
165 | def meanshift(X):
166 |     X = np.copy(X)
167 |     for _ in range(5):
168 |         X = meanshift_step(X)
169 |     return X
170 |
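# %% [markdown]
# The pseudo code loops `while not_converged()`, whereas `meanshift` simply runs five iterations.
# One way to fill that gap, as a sketch only: stop once no point moves more than a tolerance.
# The name `meanshift_until_converged`, the `tol` and `max_iter` parameters, and the two-blob toy data
# are my own assumptions; `gaussian` and `meanshift_step` are repeated here so the cell runs on its own.

```python
import numpy as np


def gaussian(dist, bandwidth):
    return np.exp(-0.5 * (dist / bandwidth) ** 2) / (bandwidth * np.sqrt(2 * np.pi))


def meanshift_step(X, bandwidth=2.5):
    # Same in-place update as above: each point moves to the weighted mean of all points.
    for i, x in enumerate(X):
        dist = np.sqrt(((x - X) ** 2).sum(1))
        weight = gaussian(dist, bandwidth)
        X[i] = (weight[:, None] * X).sum(0) / weight.sum()
    return X


def meanshift_until_converged(X, bandwidth=2.5, tol=1e-3, max_iter=50):
    """Iterate mean shift steps until the largest coordinate change drops below `tol`."""
    X = np.copy(X)
    for n_iter in range(1, max_iter + 1):
        X_old = np.copy(X)
        X = meanshift_step(X, bandwidth)
        if np.abs(X - X_old).max() < tol:
            break
    return X, n_iter


# Toy data (assumption): two well-separated blobs of 50 points each.
rng = np.random.default_rng(0)
data = np.concatenate([
    rng.normal(loc=(-10.0, -10.0), scale=1.0, size=(50, 2)),
    rng.normal(loc=(10.0, 10.0), scale=1.0, size=(50, 2)),
])

shifted, n_iter = meanshift_until_converged(data)
print(f"stopped after {n_iter} iterations")
```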
171 |
172 | # %%
173 | # %%time
174 | X = meanshift(data)
175 |
176 | # %%
177 | plot_data(centroids, X, n_samples)
178 |
179 | # %% [markdown]
180 | # # Mean shift in PyTorch (with GPU)
181 | #
182 | # PyTorch is like numpy and the interface is very similar.
183 | #
184 | # We actually don't have to adjust anything really to use torch instead of numpy.
185 |
186 | # %%
187 | import torch
188 | from torch import exp, sqrt
189 |
190 |
191 | # %% [markdown]
192 | # We only have to copy the data into a PyTorch GPU tensor.
193 |
194 | # %%
195 | def meanshift_torch(X):
196 |     X = torch.from_numpy(np.copy(X)).cuda()
197 |     for it in range(5):
198 |         X = meanshift_step(X)
199 |     return X
200 |
201 |
202 | # %%
203 | # %time X = meanshift_torch(data).cpu().numpy()
204 | plot_data(centroids+2, X, n_samples)
205 |
206 |
207 | # %% [markdown]
208 | # Same results, but the implementation is about the same speed.
209 | #
210 | # CUDA kernels have to be started for each calculation and the kernels don't have enough to do.
211 | # Let's not process individual points, but batches of points.
212 |
213 | # %% [markdown]
214 | # ## Batch processing
215 |
216 | # %%
217 | def distance_batch(a, b):
218 |     return sqrt(((a[None,:] - b[:,None]) ** 2).sum(2))
219 |
220 |
221 | # %%
222 | a = torch.rand(2, 2)
223 | b = torch.rand(3, 2)
224 | distance_batch(b, a)
225 |
226 |
227 | # %% [markdown]
228 | # `distance_batch` contains some broadcast magic that allows us to compute the distance from each point in a batch to all points in the data.
229 |
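# %% [markdown]
# To make the broadcasting less magical, here is a small sketch that spells out the intermediate shapes
# and checks the result against `torch.cdist`. The tiny random tensors are my own toy input,
# and `distance_batch` is repeated so the cell runs on its own.

```python
import torch


def distance_batch(a, b):
    return torch.sqrt(((a[None, :] - b[:, None]) ** 2).sum(2))


torch.manual_seed(0)
data = torch.rand(6, 2)   # all points: (n, d) = (6, 2)
batch = data[:3]          # a batch of points: (m, d) = (3, 2)

# a[None, :] has shape (1, 6, 2) and b[:, None] has shape (3, 1, 2).
# Broadcasting the difference gives (3, 6, 2); summing over the last axis gives (3, 6).
dists = distance_batch(data, batch)

# Row i holds the distances from batch[i] to every point in data.
reference = torch.cdist(batch, data)

print(dists.shape)
print(torch.allclose(dists, reference, atol=1e-5))
```

# Note the argument order: the full data set goes first and the batch second, which is how `meanshift_torch2` calls it.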
230 | # %%
231 | def meanshift_torch2(data, batch_size=500):
232 |     n = len(data)
233 |     X = torch.from_numpy(np.copy(data)).cuda()
234 |     for _ in range(5):
235 |         for i in range(0, n, batch_size):
236 |             s = slice(i, min(n, i + batch_size))
237 |             weight = gaussian(distance_batch(X, X[s]), 2.5)
238 |             num = (weight[:, :, None] * X).sum(dim=1)
239 |             X[s] = num / weight.sum(1)[:, None]
240 |     return X
241 |
242 |
243 | # %%
244 | # %time X = meanshift_torch2(data, batch_size=1).cpu().numpy()
245 |
246 | # %%
247 | # %time X = meanshift_torch2(data, batch_size=10).cpu().numpy()
248 |
249 | # %%
250 | # %time X = meanshift_torch2(data, batch_size=100).cpu().numpy()
251 |
252 | # %%
253 | # %time X = meanshift_torch2(data, batch_size=1000).cpu().numpy()
254 |
255 | # %%
256 | # %time X = meanshift_torch2(data, batch_size=6000).cpu().numpy()
257 |
258 | # %%
259 | plot_data(centroids+2, X, n_samples)
260 |
261 | # %% [markdown]
262 | # # Mean shift in scikit-learn
263 | #
264 | # Of course, sklearn also offers `MeanShift`.
265 | # Let's see how it performs.
266 |
267 | # %%
268 | from sklearn.cluster import MeanShift
269 |
270 | # %%
271 | # %%time
272 | model = MeanShift()
273 | model.fit(data)
274 |
275 | # %% [markdown]
276 | # This is faster than our naive implementation, but much slower than the GPU version.
277 | #
278 |
279 | # %% [markdown]
280 | # # Note
281 | # Keep in mind that this demo is not saying that A is faster than B.
282 | # It rather shows that you can use PyTorch in fun ways!
283 | #
284 | # Ref:
285 | # - https://pytorch.org/docs/stable/notes/broadcasting.html
286 | # - https://pytorch.org/docs/stable/notes/cuda.html
287 | # - https://github.com/fastai/fastai/blob/master/tutorials/meanshift.ipynb
288 |
--------------------------------------------------------------------------------
/notebooks/pytorch_basics.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown] toc-hr-collapsed=true toc-nb-collapsed=true
16 | # # PyTorch Basics
17 | # - tensors like numpy
18 | # - tensors on the gpu
19 | # - tensors and automatic derivatives
20 | # - tensors as neural network abstractions: `torch.nn`
21 | # - optimizers: `torch.optim`
22 |
23 | # %% [markdown]
24 | # ## Init, helpers, utils, ...
25 |
26 | # %%
27 | import torch
28 | import torch.nn as nn
29 | import torch.nn.functional as F
30 | import torch.optim as optim
31 | import torchvision
32 |
33 | # %%
34 | from pprint import pprint
35 |
36 | import matplotlib.pyplot as plt
37 | import numpy as np
38 | from IPython.core.debugger import set_trace
39 |
40 | # %% [markdown] toc-hr-collapsed=true toc-nb-collapsed=true
41 | # # Tensors
42 | # tensors - the atoms of machine learning
43 |
44 | # %% [markdown]
45 | # ## Tensors in numpy and pytorch
46 |
47 | # %%
48 | import numpy as np
49 | from numpy.linalg import inv
50 | from numpy.linalg import multi_dot as mdot
51 |
52 | # %%
53 | # numpy
54 | np.eye(3)
55 |
56 | # %%
57 | # torch
58 | torch.eye(3)
59 |
60 | # %%
61 | # numpy
62 | X = np.random.random((5, 3))
63 | X
64 |
65 | # %%
66 | # pytorch
67 | Y = torch.rand((5, 3))
68 | Y
69 |
70 | # %%
71 | X.shape
72 |
73 | # %%
74 | Y.shape
75 |
76 | # %%
77 | # numpy
78 | X.T @ X
79 |
80 | # %%
81 | # torch
82 | Y.t() @ Y
83 |
84 | # %%
85 | # numpy
86 | inv(X.T @ X)
87 |
88 | # %%
89 | # torch
90 | torch.inverse(Y.t() @ Y)
91 |
92 | # %% [markdown]
93 | # ## More on PyTorch Tensors
94 |
95 | # %% [markdown]
96 | # Operations are also available as methods.
97 |
98 | # %%
99 | A = torch.eye(3)
100 | A.add(1)
101 |
102 | # %%
103 | A
104 |
105 | # %% [markdown]
106 | # Any operation that mutates a tensor in-place has a `_` suffix.
107 |
108 | # %%
109 | A.add_(1)
110 | A
111 |
112 | # %% [markdown]
113 | # ## Indexing and broadcasting
114 | # It works as expected/like numpy:
115 |
116 | # %%
117 | A[0, 0]
118 |
119 | # %%
120 | A[0]
121 |
122 | # %%
123 | A[0:2]
124 |
125 | # %%
126 | A[:, 1:3]
127 |
128 | # %% [markdown]
129 | # ## Converting
130 |
131 | # %%
132 | A = torch.eye(3)
133 | A
134 |
135 | # %%
136 | # torch --> numpy
137 | B = A.numpy()
138 | B
139 |
140 | # %% [markdown]
141 | # Note: the torch tensor and the numpy array share the same memory (zero-copy), so an in-place change to one shows up in the other.
142 |
143 | # %%
144 | A.add_(.5)
145 | A
146 |
147 | # %%
148 | B
149 |
150 | # %%
151 | # numpy --> torch
152 | torch.from_numpy(np.eye(3))
153 |
154 | # %% [markdown]
155 | # ## Much more
156 |
157 | # %%
158 | [o for o in dir(torch) if not o.startswith("_")]
159 |
160 | # %%
161 | [o for o in dir(A) if not o.startswith("_")]
162 |
163 | # %% [markdown]
164 | # # But what about the GPU?
165 | # How do I use the GPU?
166 | #
167 | # If you have a GPU make sure that the right pytorch is installed
168 | # (check https://pytorch.org/ for details).
169 |
170 | # %%
171 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
172 | device
173 |
174 | # %% [markdown]
175 | # If you have a GPU you should get something like:
176 | # `device(type='cuda', index=0)`
177 | #
178 | # You can move data to the GPU by doing `.to(device)`.
179 |
180 | # %%
181 | data = torch.eye(3)
182 | data = data.to(device)
183 | data
184 |
185 | # %% [markdown]
186 | # Now the computation happens on the GPU.
187 |
188 | # %%
189 | res = data + data
190 | res
191 |
192 | # %%
193 | res.device
194 |
195 | # %% [markdown]
196 | # Note: before `v0.4` one had to use `.cuda()` and `.cpu()` to move stuff to and from the GPU.
197 | # This littered the code with many:
198 | # ```python
199 | # if CUDA:
200 | #     model = model.cuda()
201 | # ```
202 |
203 | # %% [markdown]
204 | # # Automatic differentiation with `autograd`
205 | # Prior to `v0.4` PyTorch used the class `Variable` to record gradients. You had to wrap `Tensor`s in `Variable`s.
206 | # `Variable`s behaved exactly like `Tensors`.
207 | #
208 | # With `v0.4` `Tensor` can record gradients directly if you tell it to do so, e.g. `torch.ones(3, requires_grad=True)`.
209 | # There is no need for `Variable` anymore.
210 | # Many tutorials still use `Variable`, be aware!
211 | #
212 | # Ref:
213 | # - https://pytorch.org/docs/stable/autograd.html
214 | # - https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
215 |
216 | # %% [markdown]
217 | # You rarely use `torch.autograd` directly.
218 | # Pretty much everything is part of `torch.Tensor` now.
219 | # Simply add `requires_grad=True` to the tensors you want to calculate the gradients for.
220 | # `nn.Module`s track gradients automatically.
221 |
222 | # %%
223 | from torch import autograd
224 |
225 | # %%
226 | x = torch.tensor(2.)
227 | x
228 |
229 | # %%
230 | x = torch.tensor(2., requires_grad=True)
231 | x
232 |
233 | # %%
234 | print(x.requires_grad)
235 |
236 | # %%
237 | print(x.grad)
238 |
239 | # %%
240 | y = x ** 2
241 |
242 | print("Grad of x:", x.grad)
243 |
244 | # %%
245 | y = x ** 2
246 | y.backward()
247 |
248 | print("Grad of x:", x.grad)
249 |
250 | # %%
251 | # What is going to happen here?
252 | # x = torch.tensor(2.)
253 | # x.backward()
254 |
255 | # %%
256 | # Don't record the gradient
257 | # Useful for inference
258 |
259 | x = torch.tensor(2., requires_grad=True)
260 |
261 | with torch.no_grad():
262 |     y = x * x
263 | print(y.requires_grad, y.grad_fn)
264 |
265 | # %% [markdown]
266 | # `nn.Module` and `nn.Parameter` keep track of gradients for you.
267 |
268 | # %%
269 | lin = nn.Linear(2, 1, bias=True)
270 | lin.weight
271 |
272 | # %%
273 | type(lin.weight)
274 |
275 | # %%
276 | isinstance(lin.weight, torch.FloatTensor)
277 |
278 | # %% [markdown]
279 | # ## `torch.nn`
280 | # The neural network module contains many different layers.
281 |
282 | # %%
283 | from torch import nn
284 |
285 | # %%
286 | lin_reg = nn.Linear(1, 1, bias=True)
287 | lin_reg
288 |
289 | # %%
290 | nn.Conv2d
291 |
292 | # %%
293 | nn.Conv3d
294 |
295 | # %%
296 | nn.BatchNorm2d
297 |
298 | # %% [markdown]
299 | # ### Activations
300 |
301 | # %%
302 | nn.ReLU
303 |
304 | # %%
305 | nn.Sigmoid
306 |
307 | # %% [markdown]
308 | # ### Losses
309 |
310 | # %%
311 | nn.Softmax
312 |
313 | # %%
314 | nn.CrossEntropyLoss
315 |
316 | # %%
317 | nn.BCELoss
318 |
319 | # %%
320 | nn.MSELoss
321 |
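The entries listed under Losses differ in what input they expect, which is easy to trip over. `nn.Softmax` only normalizes scores into probabilities and is not a loss by itself, while `nn.CrossEntropyLoss` takes raw logits and gives the same value as `F.nll_loss` applied to `F.log_softmax` output. A minimal sketch, assuming made-up `logits` and `targets` tensors for a 3-class problem:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # raw scores: 4 samples, 3 classes (illustrative data)
targets = torch.tensor([0, 2, 1, 2])  # class indices

# Softmax turns each row of scores into probabilities that sum to 1.
probs = nn.Softmax(dim=-1)(logits)

# CrossEntropyLoss works directly on the raw logits ...
ce = nn.CrossEntropyLoss()(logits, targets)

# ... and matches log_softmax followed by the negative log-likelihood loss.
nll = F.nll_loss(F.log_softmax(logits, dim=-1), targets)

print(probs.sum(dim=-1))
print(ce.item(), nll.item())
```

This is why a model that ends in `F.log_softmax` is paired with `F.nll_loss`, while a model that returns raw logits is paired with `nn.CrossEntropyLoss`.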
322 | # %% [markdown]
323 | # ### Functional (stateless) alternatives
324 |
325 | # %%
326 | from torch.nn import functional as F
327 |
328 | # %%
329 | F.mse_loss
330 |
331 | # %%
332 | F.relu
333 |
334 | # %%
335 | F.relu6
336 |
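The functional variants compute the same result as their `nn` module counterparts but are plain functions without any state. A minimal sketch with a made-up input tensor `x`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([-1.0, 0.0, 2.0, 7.0])  # illustrative input

# Module style: create the layer object once, then call it like a function.
relu_layer = nn.ReLU()
out_module = relu_layer(x)

# Functional style: call the function directly, no object needed.
out_functional = F.relu(x)

# relu6 additionally clamps the output at 6.
out_relu6 = F.relu6(x)

print(out_module)
print(out_functional)
print(out_relu6)
```

Modules are convenient inside `nn.Sequential` or when a layer has parameters; the functional form is handy inside a custom `forward` method.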
337 | # %% [markdown]
338 | # ## `torch.optim`
339 |
340 | # %%
341 | from torch import optim
342 |
343 | # %%
344 | optim.SGD
345 |
346 | # %%
347 | optim.Adam
348 |
349 | # %%
350 | optim.AdamW
351 |
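All optimizers follow the same pattern: reset the gradients, compute a loss, call `backward()`, then call `step()`. A minimal sketch, assuming a toy objective `(w - 3)^2` with a single scalar parameter `w` (both chosen only for illustration):

```python
import torch
from torch import optim

# A single trainable scalar, starting at 0.
w = torch.tensor(0.0, requires_grad=True)

# The optimizer receives an iterable of the tensors it should update.
optimizer = optim.SGD([w], lr=0.1)

for _ in range(100):
    optimizer.zero_grad()      # clear gradients from the previous step
    loss = (w - 3.0) ** 2      # toy objective with its minimum at w = 3
    loss.backward()            # populate w.grad
    optimizer.step()           # update w in-place using w.grad

print(w.item())  # should end up close to 3.0
```

Swapping `optim.SGD` for `optim.Adam` or `optim.AdamW` leaves the loop itself unchanged; only the constructor arguments differ.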
352 | # %% [markdown]
353 | # # Exercise
354 | # - Do you remember the analytical solution to solve for the parameters of linear regression? Implement it.
355 |
--------------------------------------------------------------------------------
/notebooks/rnn_from_scratch.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# RNN from scratch with PyTorch\n",
8 | "An RNN is just a normal NN.\n",
9 | "It's very easy to implement in PyTorch due to its dynamic nature.\n",
10 | "\n",
11 | "We'll build a very simple character-based language model.\n",
12 | "\n",
13 | "Taken from http://www.fast.ai/"
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "## Init and helpers"
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {
27 | "execution": {
28 | "iopub.execute_input": "2020-12-06T08:47:20.187029Z",
29 | "iopub.status.busy": "2020-12-06T08:47:20.186373Z",
30 | "iopub.status.idle": "2020-12-06T08:47:20.309644Z",
31 | "shell.execute_reply": "2020-12-06T08:47:20.310735Z"
32 | }
33 | },
34 | "outputs": [],
35 | "source": [
36 | "from pathlib import Path\n",
37 | "import numpy as np"
38 | ]
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "metadata": {},
43 | "source": [
44 | "## Data"
45 | ]
46 | },
47 | {
48 | "cell_type": "code",
49 | "execution_count": 2,
50 | "metadata": {
51 | "execution": {
52 | "iopub.execute_input": "2020-12-06T08:47:20.315957Z",
53 | "iopub.status.busy": "2020-12-06T08:47:20.314473Z",
54 | "iopub.status.idle": "2020-12-06T08:47:20.387279Z",
55 | "shell.execute_reply": "2020-12-06T08:47:20.386890Z"
56 | }
57 | },
58 | "outputs": [
59 | {
60 | "name": "stdout",
61 | "output_type": "stream",
62 | "text": [
63 | "I already have the data.\n"
64 | ]
65 | }
66 | ],
67 | "source": [
68 | "NIETSCHE_PATH = Path(\"../data/raw/nietzsche.txt\")\n",
69 | "if NIETSCHE_PATH.is_file():\n",
70 | "    print(\"I already have the data.\")\n",
71 | "else:\n",
72 | "    !wget -O ../data/raw/nietzsche.txt https://s3.amazonaws.com/text-datasets/nietzsche.txt\n",
73 | "    \n",
74 | "with NIETSCHE_PATH.open() as f:\n",
75 | "    data = f.read()"
76 | ]
77 | },
78 | {
79 | "cell_type": "markdown",
80 | "metadata": {},
81 | "source": [
82 | "A tweet of Nietzsche:"
83 | ]
84 | },
85 | {
86 | "cell_type": "code",
87 | "execution_count": 3,
88 | "metadata": {
89 | "execution": {
90 | "iopub.execute_input": "2020-12-06T08:47:20.390460Z",
91 | "iopub.status.busy": "2020-12-06T08:47:20.390010Z",
92 | "iopub.status.idle": "2020-12-06T08:47:20.392264Z",
93 | "shell.execute_reply": "2020-12-06T08:47:20.392662Z"
94 | }
95 | },
96 | "outputs": [
97 | {
98 | "name": "stdout",
99 | "output_type": "stream",
100 | "text": [
101 | "PREFACE\n",
102 | "\n",
103 | "\n",
104 | "SUPPOSING that Truth is a woman--what then? Is there not ground\n",
105 | "for suspecting that all philosophers, in so far as they have been\n",
106 | "\n"
107 | ]
108 | }
109 | ],
110 | "source": [
111 | "print(data[:140])"
112 | ]
113 | },
114 | {
115 | "cell_type": "markdown",
116 | "metadata": {},
117 | "source": [
118 | "We need to know the alphabet and we add a padding value \"\\0\" to the alphabet."
119 | ]
120 | },
121 | {
122 | "cell_type": "code",
123 | "execution_count": 4,
124 | "metadata": {
125 | "execution": {
126 | "iopub.execute_input": "2020-12-06T08:47:20.410193Z",
127 | "iopub.status.busy": "2020-12-06T08:47:20.409477Z",
128 | "iopub.status.idle": "2020-12-06T08:47:20.412698Z",
129 | "shell.execute_reply": "2020-12-06T08:47:20.412171Z"
130 | }
131 | },
132 | "outputs": [
133 | {
134 | "data": {
135 | "text/plain": [
136 | "85"
137 | ]
138 | },
139 | "execution_count": 1,
140 | "metadata": {},
141 | "output_type": "execute_result"
142 | }
143 | ],
144 | "source": [
145 | "alphabet = [\"\\0\", *sorted(list(set(data)))]\n",
146 | "n_alphabet = len(alphabet)\n",
147 | "n_alphabet"
148 | ]
149 | },
150 | {
151 | "cell_type": "code",
152 | "execution_count": 5,
153 | "metadata": {
154 | "execution": {
155 | "iopub.execute_input": "2020-12-06T08:47:20.416803Z",
156 | "iopub.status.busy": "2020-12-06T08:47:20.416181Z",
157 | "iopub.status.idle": "2020-12-06T08:47:20.418426Z",
158 | "shell.execute_reply": "2020-12-06T08:47:20.418912Z"
159 | }
160 | },
161 | "outputs": [],
162 | "source": [
163 | "char2index = {c: i for i, c in enumerate(alphabet)}\n",
164 | "index2char = {i: c for i, c in enumerate(alphabet)}"
165 | ]
166 | },
167 | {
168 | "cell_type": "markdown",
169 | "metadata": {},
170 | "source": [
171 | "Convert the data into a list of integers"
172 | ]
173 | },
174 | {
175 | "cell_type": "code",
176 | "execution_count": 6,
177 | "metadata": {
178 | "execution": {
179 | "iopub.execute_input": "2020-12-06T08:47:20.459645Z",
180 | "iopub.status.busy": "2020-12-06T08:47:20.456579Z",
181 | "iopub.status.idle": "2020-12-06T08:47:20.461267Z",
182 | "shell.execute_reply": "2020-12-06T08:47:20.461520Z"
183 | }
184 | },
185 | "outputs": [],
186 | "source": [
187 | "index = [char2index[c] for c in data]"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": 7,
193 | "metadata": {
194 | "execution": {
195 | "iopub.execute_input": "2020-12-06T08:47:20.463805Z",
196 | "iopub.status.busy": "2020-12-06T08:47:20.463476Z",
197 | "iopub.status.idle": "2020-12-06T08:47:20.465132Z",
198 | "shell.execute_reply": "2020-12-06T08:47:20.465383Z"
199 | }
200 | },
201 | "outputs": [
202 | {
203 | "name": "stdout",
204 | "output_type": "stream",
205 | "text": [
206 | "[40, 42, 29, 30, 25, 27, 29, 1, 1, 1, 43, 45, 40, 40, 39, 43, 33, 38, 31, 2, 73, 61, 54, 73, 2]\n",
207 | "PREFACE\n",
208 | "\n",
209 | "\n",
210 | "SUPPOSING that \n"
211 | ]
212 | }
213 | ],
214 | "source": [
215 | "print(index[:25])\n",
216 | "print(\"\".join(index2char[i] for i in index[:25]))"
217 | ]
218 | },
219 | {
220 | "cell_type": "code",
221 | "execution_count": 8,
222 | "metadata": {
223 | "execution": {
224 | "iopub.execute_input": "2020-12-06T08:47:20.467519Z",
225 | "iopub.status.busy": "2020-12-06T08:47:20.467164Z",
226 | "iopub.status.idle": "2020-12-06T08:47:20.468936Z",
227 | "shell.execute_reply": "2020-12-06T08:47:20.469190Z"
228 | }
229 | },
230 | "outputs": [
231 | {
232 | "data": {
233 | "text/plain": [
234 | "[40, 42, 29]"
235 | ]
236 | },
237 | "execution_count": 1,
238 | "metadata": {},
239 | "output_type": "execute_result"
240 | }
241 | ],
242 | "source": [
243 | "index[0: 3]"
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "execution_count": 9,
249 | "metadata": {
250 | "execution": {
251 | "iopub.execute_input": "2020-12-06T08:47:20.531092Z",
252 | "iopub.status.busy": "2020-12-06T08:47:20.521077Z",
253 | "iopub.status.idle": "2020-12-06T08:47:22.563153Z",
254 | "shell.execute_reply": "2020-12-06T08:47:22.563421Z"
255 | }
256 | },
257 | "outputs": [],
258 | "source": [
259 | "X, y = [], []\n",
260 | "for i in range(len(index) - 4):\n",
261 | "    X.append(index[i : i + 3])\n",
262 | "    y.append(index[i + 3])\n",
263 | "    \n",
264 | "X = np.stack(X)\n",
265 | "y = np.stack(y)"
266 | ]
267 | },
268 | {
269 | "cell_type": "code",
270 | "execution_count": 10,
271 | "metadata": {
272 | "execution": {
273 | "iopub.execute_input": "2020-12-06T08:47:22.565676Z",
274 | "iopub.status.busy": "2020-12-06T08:47:22.565356Z",
275 | "iopub.status.idle": "2020-12-06T08:47:22.567072Z",
276 | "shell.execute_reply": "2020-12-06T08:47:22.567331Z"
277 | }
278 | },
279 | "outputs": [
280 | {
281 | "data": {
282 | "text/plain": [
283 | "((600889, 3), (600889,))"
284 | ]
285 | },
286 | "execution_count": 1,
287 | "metadata": {},
288 | "output_type": "execute_result"
289 | }
290 | ],
291 | "source": [
292 | "X.shape, y.shape"
293 | ]
294 | },
295 | {
296 | "cell_type": "code",
297 | "execution_count": 11,
298 | "metadata": {
299 | "execution": {
300 | "iopub.execute_input": "2020-12-06T08:47:22.569823Z",
301 | "iopub.status.busy": "2020-12-06T08:47:22.569358Z",
302 | "iopub.status.idle": "2020-12-06T08:47:22.571051Z",
303 | "shell.execute_reply": "2020-12-06T08:47:22.571308Z"
304 | }
305 | },
306 | "outputs": [
307 | {
308 | "data": {
309 | "text/plain": [
310 | "(array([40, 42, 29]), 30)"
311 | ]
312 | },
313 | "execution_count": 1,
314 | "metadata": {},
315 | "output_type": "execute_result"
316 | }
317 | ],
318 | "source": [
319 | "X[0], y[0]"
320 | ]
321 | },
322 | {
323 | "cell_type": "code",
324 | "execution_count": 12,
325 | "metadata": {
326 | "execution": {
327 | "iopub.execute_input": "2020-12-06T08:47:22.573504Z",
328 | "iopub.status.busy": "2020-12-06T08:47:22.573159Z",
329 | "iopub.status.idle": "2020-12-06T08:47:22.574945Z",
330 | "shell.execute_reply": "2020-12-06T08:47:22.575186Z"
331 | }
332 | },
333 | "outputs": [
334 | {
335 | "data": {
336 | "text/plain": [
337 | "numpy.ndarray"
338 | ]
339 | },
340 | "execution_count": 1,
341 | "metadata": {},
342 | "output_type": "execute_result"
343 | }
344 | ],
345 | "source": [
346 | "type(y)"
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": 13,
352 | "metadata": {
353 | "execution": {
354 | "iopub.execute_input": "2020-12-06T08:47:22.577419Z",
355 | "iopub.status.busy": "2020-12-06T08:47:22.577102Z",
356 | "iopub.status.idle": "2020-12-06T08:47:22.791865Z",
357 | "shell.execute_reply": "2020-12-06T08:47:22.792148Z"
358 | }
359 | },
360 | "outputs": [],
361 | "source": [
362 | "import torch\n",
363 | "from torch.utils.data import DataLoader, Dataset, TensorDataset\n",
364 | "\n",
365 | "\n",
366 | "train_ds = TensorDataset(torch.from_numpy(X), torch.from_numpy(y))\n",
367 | "train_dl = DataLoader(train_ds, batch_size=500)"
368 | ]
369 | },
370 | {
371 | "cell_type": "markdown",
372 | "metadata": {},
373 | "source": [
374 | "# The model"
375 | ]
376 | },
377 | {
378 | "cell_type": "code",
379 | "execution_count": 14,
380 | "metadata": {
381 | "execution": {
382 | "iopub.execute_input": "2020-12-06T08:47:22.794452Z",
383 | "iopub.status.busy": "2020-12-06T08:47:22.794136Z",
384 | "iopub.status.idle": "2020-12-06T08:47:22.795818Z",
385 | "shell.execute_reply": "2020-12-06T08:47:22.795504Z"
386 | }
387 | },
388 | "outputs": [],
389 | "source": [
390 | "import torch\n",
391 | "import torch.nn as nn\n",
392 | "import torch.nn.functional as F\n",
393 | "import torch.optim as optim"
394 | ]
395 | },
396 | {
397 | "cell_type": "code",
398 | "execution_count": 15,
399 | "metadata": {
400 | "execution": {
401 | "iopub.execute_input": "2020-12-06T08:47:22.799526Z",
402 | "iopub.status.busy": "2020-12-06T08:47:22.799090Z",
403 | "iopub.status.idle": "2020-12-06T08:47:22.801673Z",
404 | "shell.execute_reply": "2020-12-06T08:47:22.801383Z"
405 | }
406 | },
407 | "outputs": [
408 | {
409 | "name": "stderr",
410 | "output_type": "stream",
411 | "text": [
412 | "/home/stefan/projects/pytorch_tutorial/.venv/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)\n",
413 | " return torch._C._cuda_getDeviceCount() > 0\n"
414 | ]
415 | },
416 | {
417 | "data": {
418 | "text/plain": [
419 | "device(type='cpu')"
420 | ]
421 | },
422 | "execution_count": 1,
423 | "metadata": {},
424 | "output_type": "execute_result"
425 | }
426 | ],
427 | "source": [
428 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
429 | "device"
430 | ]
431 | },
432 | {
433 | "cell_type": "code",
434 | "execution_count": 16,
435 | "metadata": {
436 | "execution": {
437 | "iopub.execute_input": "2020-12-06T08:47:22.805642Z",
438 | "iopub.status.busy": "2020-12-06T08:47:22.805321Z",
439 | "iopub.status.idle": "2020-12-06T08:47:22.807151Z",
440 | "shell.execute_reply": "2020-12-06T08:47:22.806837Z"
441 | }
442 | },
443 | "outputs": [],
444 | "source": [
445 | "class CharModel(nn.Module):\n",
446 | "    def __init__(self, n_vocab, n_embedding, n_hidden):\n",
447 | "        super().__init__()\n",
448 | "        self.emb = nn.Embedding(n_vocab, n_embedding)\n",
449 | "        self.lin_in = nn.Linear(n_embedding, n_hidden)\n",
450 | "        \n",
451 | "        self.lin_hidden = nn.Linear(n_hidden, n_hidden)\n",
452 | "        self.lin_out = nn.Linear(n_hidden, n_vocab)\n",
453 | "        \n",
454 | "    def forward(self, X):\n",
455 | "        c1, c2, c3 = X[:, 0], X[:, 1], X[:, 2]\n",
456 | "        \n",
457 | "        in1 = F.relu(self.lin_in(self.emb(c1)))\n",
458 | "        h = F.tanh(self.lin_hidden(in1))\n",
459 | "        \n",
460 | "        in2 = F.relu(self.lin_in(self.emb(c2)))\n",
461 | "        h = F.tanh(self.lin_hidden(h + in2))\n",
462 | "        \n",
463 | "        in3 = F.relu(self.lin_in(self.emb(c3)))\n",
464 | "        h = F.tanh(self.lin_hidden(h + in3))\n",
465 | "        \n",
466 | "        return F.log_softmax(self.lin_out(h), dim=-1)"
467 | ]
468 | },
469 | {
470 | "cell_type": "code",
471 | "execution_count": 17,
472 | "metadata": {
473 | "execution": {
474 | "iopub.execute_input": "2020-12-06T08:47:22.809256Z",
475 | "iopub.status.busy": "2020-12-06T08:47:22.808949Z",
476 | "iopub.status.idle": "2020-12-06T08:47:22.811552Z",
477 | "shell.execute_reply": "2020-12-06T08:47:22.811233Z"
478 | }
479 | },
480 | "outputs": [],
481 | "source": [
482 | "n_embedding = 40\n",
483 | "n_hidden = 256\n",
484 | "\n",
485 | "model = CharModel(n_alphabet, n_embedding=40, n_hidden=128)\n",
486 | "model = model.to(device)"
487 | ]
488 | },
489 | {
490 | "cell_type": "code",
491 | "execution_count": 18,
492 | "metadata": {
493 | "execution": {
494 | "iopub.execute_input": "2020-12-06T08:47:22.813741Z",
495 | "iopub.status.busy": "2020-12-06T08:47:22.813436Z",
496 | "iopub.status.idle": "2020-12-06T08:47:22.815254Z",
497 | "shell.execute_reply": "2020-12-06T08:47:22.814935Z"
498 | }
499 | },
500 | "outputs": [],
501 | "source": [
502 | "optimizer = optim.Adam(model.parameters(), 0.001)\n",
503 | "#criterion = nn.CrossEntropyLoss()\n",
504 | "criterion = F.nll_loss"
505 | ]
506 | },
507 | {
508 | "cell_type": "code",
509 | "execution_count": 19,
510 | "metadata": {
511 | "execution": {
512 | "iopub.execute_input": "2020-12-06T08:47:22.818691Z",
513 | "iopub.status.busy": "2020-12-06T08:47:22.818372Z",
514 | "iopub.status.idle": "2020-12-06T08:47:22.819647Z",
515 | "shell.execute_reply": "2020-12-06T08:47:22.819884Z"
516 | }
517 | },
518 | "outputs": [],
519 | "source": [
520 | "def fit(model, n_epoch=2):\n",
521 | "    optimizer = optim.Adam(model.parameters(), 0.001)\n",
522 | "    \n",
523 | "    for epoch in range(n_epoch):\n",
524 | "        print(f\"Epoch {epoch}:\")\n",
525 | "        running_loss, correct = 0.0, 0\n",
526 | "\n",
527 | "        model.train()\n",
528 | "        for X, y in train_dl:\n",
529 | "            X, y = X.to(device), y.to(device)\n",
530 | "            optimizer.zero_grad()\n",
531 | "\n",
532 | "            y_ = model(X)\n",
533 | "            loss = criterion(y_, y)\n",
534 | "\n",
535 | "            loss.backward()\n",
536 | "            optimizer.step()\n",
537 | "\n",
538 | "            _, y_label_ = torch.max(y_, 1)\n",
539 | "            correct += (y_label_ == y).sum().item()\n",
540 | "            running_loss += loss.item() * X.shape[0]\n",
541 | "\n",
542 | "        print(f\" Train Loss: {running_loss / len(train_dl.dataset):0.4f}\")\n",
543 | "        print(f\" Train Acc: {correct / len(train_dl.dataset):0.2f}\")"
544 | ]
545 | },
546 | {
547 | "cell_type": "code",
548 | "execution_count": 20,
549 | "metadata": {
550 | "execution": {
551 | "iopub.execute_input": "2020-12-06T08:47:22.891629Z",
552 | "iopub.status.busy": "2020-12-06T08:47:22.885064Z",
553 | "iopub.status.idle": "2020-12-06T08:47:44.559163Z",
554 | "shell.execute_reply": "2020-12-06T08:47:44.558554Z"
555 | }
556 | },
557 | "outputs": [
558 | {
559 | "name": "stderr",
560 | "output_type": "stream",
561 | "text": [
562 | "/home/stefan/projects/pytorch_tutorial/.venv/lib/python3.8/site-packages/torch/nn/functional.py:1628: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.\n",
563 | " warnings.warn(\"nn.functional.tanh is deprecated. Use torch.tanh instead.\")\n"
564 | ]
565 | },
566 | {
567 | "name": "stdout",
568 | "output_type": "stream",
569 | "text": [
570 | "Epoch 0:\n",
571 | " Train Loss: 2.2354\n",
572 | " Train Acc: 0.37\n",
573 | "Epoch 1:\n",
574 | " Train Loss: 1.9220\n",
575 | " Train Acc: 0.44\n"
576 | ]
577 | }
578 | ],
579 | "source": [
580 | "fit(model, 2)"
581 | ]
582 | },
583 | {
584 | "cell_type": "code",
585 | "execution_count": 21,
586 | "metadata": {
587 | "execution": {
588 | "iopub.execute_input": "2020-12-06T08:47:44.564413Z",
589 | "iopub.status.busy": "2020-12-06T08:47:44.563821Z",
590 | "iopub.status.idle": "2020-12-06T08:47:44.565659Z",
591 | "shell.execute_reply": "2020-12-06T08:47:44.566157Z"
592 | }
593 | },
594 | "outputs": [],
595 | "source": [
596 | "def predict(word):\n",
597 | "    word_idx = [char2index[c] for c in word]\n",
598 | "    word_idx\n",
599 | "    with torch.no_grad():\n",
600 | "        X = torch.tensor(word_idx).unsqueeze(0).to(device)\n",
601 | "        model.eval()\n",
602 | "        y_ = model(X).cpu()\n",
603 | "    pred = index2char[torch.argmax(y_).item()]\n",
604 | "    print(f\"{word} --> '{pred}'\")"
605 | ]
606 | },
607 | {
608 | "cell_type": "code",
609 | "execution_count": 22,
610 | "metadata": {
611 | "execution": {
612 | "iopub.execute_input": "2020-12-06T08:47:44.569376Z",
613 | "iopub.status.busy": "2020-12-06T08:47:44.568799Z",
614 | "iopub.status.idle": "2020-12-06T08:47:44.573254Z",
615 | "shell.execute_reply": "2020-12-06T08:47:44.572657Z"
616 | }
617 | },
618 | "outputs": [
619 | {
620 | "name": "stdout",
621 | "output_type": "stream",
622 | "text": [
623 | "the --> ' '\n"
624 | ]
625 | }
626 | ],
627 | "source": [
628 | "predict(\"the\")"
629 | ]
630 | },
631 | {
632 | "cell_type": "code",
633 | "execution_count": 23,
634 | "metadata": {
635 | "execution": {
636 | "iopub.execute_input": "2020-12-06T08:47:44.578272Z",
637 | "iopub.status.busy": "2020-12-06T08:47:44.577584Z",
638 | "iopub.status.idle": "2020-12-06T08:47:44.586743Z",
639 | "shell.execute_reply": "2020-12-06T08:47:44.587254Z"
640 | }
641 | },
642 | "outputs": [
643 | {
644 | "name": "stdout",
645 | "output_type": "stream",
646 | "text": [
647 | "wom --> 'e'\n"
648 | ]
649 | }
650 | ],
651 | "source": [
652 | "predict(\"wom\")"
653 | ]
654 | },
655 | {
656 | "cell_type": "code",
657 | "execution_count": 24,
658 | "metadata": {
659 | "execution": {
660 | "iopub.execute_input": "2020-12-06T08:47:44.591272Z",
661 | "iopub.status.busy": "2020-12-06T08:47:44.590478Z",
662 | "iopub.status.idle": "2020-12-06T08:47:44.594301Z",
663 | "shell.execute_reply": "2020-12-06T08:47:44.594786Z"
664 | }
665 | },
666 | "outputs": [
667 | {
668 | "name": "stdout",
669 | "output_type": "stream",
670 | "text": [
671 | "man --> ' '\n"
672 | ]
673 | }
674 | ],
675 | "source": [
676 | "predict(\"man\")"
677 | ]
678 | },
679 | {
680 | "cell_type": "code",
681 | "execution_count": 25,
682 | "metadata": {
683 | "execution": {
684 | "iopub.execute_input": "2020-12-06T08:47:44.599994Z",
685 | "iopub.status.busy": "2020-12-06T08:47:44.599344Z",
686 | "iopub.status.idle": "2020-12-06T08:47:44.602202Z",
687 | "shell.execute_reply": "2020-12-06T08:47:44.602759Z"
688 | }
689 | },
690 | "outputs": [
691 | {
692 | "name": "stdout",
693 | "output_type": "stream",
694 | "text": [
695 | "hum --> 'a'\n"
696 | ]
697 | }
698 | ],
699 | "source": [
700 | "predict(\"hum\")"
701 | ]
702 | },
703 | {
704 | "cell_type": "code",
705 | "execution_count": 26,
706 | "metadata": {
707 | "execution": {
708 | "iopub.execute_input": "2020-12-06T08:47:44.613497Z",
709 | "iopub.status.busy": "2020-12-06T08:47:44.612587Z",
710 | "iopub.status.idle": "2020-12-06T08:47:44.614830Z",
711 | "shell.execute_reply": "2020-12-06T08:47:44.615544Z"
712 | }
713 | },
714 | "outputs": [],
715 | "source": [
716 | "class CharModel(nn.Module):\n",
717 | "    def __init__(self, n_vocab, n_embedding, n_hidden):\n",
718 | "        super().__init__()\n",
719 | "        self.emb = nn.Embedding(n_vocab, n_embedding)\n",
720 | "        self.lin_in = nn.Linear(n_embedding, n_hidden)\n",
721 | "        self.lin_hidden = nn.Linear(n_hidden, n_hidden)\n",
722 | "        self.lin_out = nn.Linear(n_hidden, n_vocab)\n",
723 | "        \n",
724 | "    def forward(self, X):\n",
725 | "        c1, c2, c3 = X[:, 0], X[:, 1], X[:, 2]\n",
726 | "        \n",
727 | "        in1 = F.relu(self.lin_in(self.emb(c1))) \n",
728 | "        in2 = F.relu(self.lin_in(self.emb(c2)))\n",
729 | "        in3 = F.relu(self.lin_in(self.emb(c3)))\n",
730 | "\n",
731 | "        h = F.tanh(self.lin_hidden(in1))\n",
732 | "        h = F.tanh(self.lin_hidden(h + in2))\n",
733 | "        h = F.tanh(self.lin_hidden(h + in3))\n",
734 | "        \n",
735 | "        return F.log_softmax(self.lin_out(h), dim=-1)"
736 | ]
737 | },
738 | {
739 | "cell_type": "code",
740 | "execution_count": 27,
741 | "metadata": {
742 | "execution": {
743 | "iopub.execute_input": "2020-12-06T08:47:44.620975Z",
744 | "iopub.status.busy": "2020-12-06T08:47:44.620431Z",
745 | "iopub.status.idle": "2020-12-06T08:48:12.736384Z",
746 | "shell.execute_reply": "2020-12-06T08:48:12.736856Z"
747 | }
748 | },
749 | "outputs": [
750 | {
751 | "name": "stdout",
752 | "output_type": "stream",
753 | "text": [
754 | "Epoch 0:\n",
755 | " Train Loss: 2.2230\n",
756 | " Train Acc: 0.37\n",
757 | "Epoch 1:\n",
758 | " Train Loss: 1.9139\n",
759 | " Train Acc: 0.44\n",
760 | "\n",
761 | "the --> ' '\n",
762 | "wom --> 'e'\n",
763 | "man --> ' '\n",
764 | "hum --> 'a'\n"
765 | ]
766 | }
767 | ],
768 | "source": [
769 | "model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=128).to(device)\n",
770 | "fit(model)\n",
771 | "\n",
772 | "print()\n",
773 | "predict(\"the\")\n",
774 | "predict(\"wom\")\n",
775 | "predict(\"man\")\n",
776 | "predict(\"hum\")"
777 | ]
778 | },
779 | {
780 | "cell_type": "code",
781 | "execution_count": 28,
782 | "metadata": {
783 | "execution": {
784 | "iopub.execute_input": "2020-12-06T08:48:12.743238Z",
785 | "iopub.status.busy": "2020-12-06T08:48:12.742679Z",
786 | "iopub.status.idle": "2020-12-06T08:48:12.744508Z",
787 | "shell.execute_reply": "2020-12-06T08:48:12.744957Z"
788 | }
789 | },
790 | "outputs": [],
791 | "source": [
792 | "class CharModel(nn.Module):\n",
793 | "    def __init__(self, n_vocab, n_embedding, n_hidden):\n",
794 | "        super().__init__()\n",
795 | "        self.emb = nn.Embedding(n_vocab, n_embedding)\n",
796 | "        self.lin_in = nn.Linear(n_embedding, n_hidden)\n",
797 | "        self.lin_hidden = nn.Linear(n_hidden, n_hidden)\n",
798 | "        self.lin_out = nn.Linear(n_hidden, n_vocab)\n",
799 | "        \n",
800 | "        self.n_hidden = n_hidden\n",
801 | "        \n",
802 | "    def forward(self, X):\n",
803 | "        c1, c2, c3 = X[:, 0], X[:, 1], X[:, 2]\n",
804 | "        \n",
805 | "        in1 = F.relu(self.lin_in(self.emb(c1))) \n",
806 | "        in2 = F.relu(self.lin_in(self.emb(c2)))\n",
807 | "        in3 = F.relu(self.lin_in(self.emb(c3)))\n",
808 | "        \n",
809 | "        h = torch.zeros(X.shape[0], self.n_hidden, requires_grad=True).to(device)\n",
810 | "        h = F.tanh(self.lin_hidden(h + in1))\n",
811 | "        h = F.tanh(self.lin_hidden(h + in2))\n",
812 | "        h = F.tanh(self.lin_hidden(h + in3))\n",
813 | "        \n",
814 | "        return F.log_softmax(self.lin_out(h), dim=-1)"
815 | ]
816 | },
817 | {
818 | "cell_type": "code",
819 | "execution_count": 29,
820 | "metadata": {
821 | "execution": {
822 | "iopub.execute_input": "2020-12-06T08:48:12.748353Z",
823 | "iopub.status.busy": "2020-12-06T08:48:12.747800Z",
824 | "iopub.status.idle": "2020-12-06T08:48:50.701859Z",
825 | "shell.execute_reply": "2020-12-06T08:48:50.701338Z"
826 | }
827 | },
828 | "outputs": [
829 | {
830 | "name": "stdout",
831 | "output_type": "stream",
832 | "text": [
833 | "Epoch 0:\n",
834 | " Train Loss: 2.0938\n",
835 | " Train Acc: 0.40\n",
836 | "Epoch 1:\n",
837 | " Train Loss: 1.8023\n",
838 | " Train Acc: 0.47\n",
839 | "\n",
840 | "the --> ' '\n",
841 | "wom --> 'e'\n",
842 | "man --> ' '\n",
843 | "hum --> 'a'\n"
844 | ]
845 | }
846 | ],
847 | "source": [
848 | "model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=n_hidden).to(device)\n",
849 | "fit(model)\n",
850 | "\n",
851 | "print()\n",
852 | "predict(\"the\")\n",
853 | "predict(\"wom\")\n",
854 | "predict(\"man\")\n",
855 | "predict(\"hum\")"
856 | ]
857 | },
858 | {
859 | "cell_type": "code",
860 | "execution_count": 30,
861 | "metadata": {
862 | "execution": {
863 | "iopub.execute_input": "2020-12-06T08:48:50.707961Z",
864 | "iopub.status.busy": "2020-12-06T08:48:50.707177Z",
865 | "iopub.status.idle": "2020-12-06T08:48:50.709589Z",
866 | "shell.execute_reply": "2020-12-06T08:48:50.709092Z"
867 | }
868 | },
869 | "outputs": [],
870 | "source": [
871 | "class CharModel(nn.Module):\n",
872 | " def __init__(self, n_vocab, n_embedding, n_hidden):\n",
873 | " super().__init__()\n",
874 | " self.emb = nn.Embedding(n_vocab, n_embedding)\n",
875 | " self.lin_in = nn.Linear(n_embedding, n_hidden)\n",
876 | " self.lin_hidden = nn.Linear(n_hidden, n_hidden)\n",
877 | " self.lin_out = nn.Linear(n_hidden, n_vocab)\n",
878 | " \n",
879 | " self.n_hidden = n_hidden\n",
880 | " \n",
881 | " def forward(self, X):\n",
882 | " h = torch.zeros(X.shape[0], self.n_hidden, requires_grad=True).to(device)\n",
883 | " for i in range(X.shape[1]):\n",
884 | " c = X[:, i]\n",
885 | " in_ = F.relu(self.lin_in(self.emb(c)))\n",
886 | " h = F.tanh(self.lin_hidden(h + in_))\n",
887 | "\n",
888 | " return F.log_softmax(self.lin_out(h), dim=-1)"
889 | ]
890 | },
891 | {
892 | "cell_type": "code",
893 | "execution_count": 31,
894 | "metadata": {
895 | "execution": {
896 | "iopub.execute_input": "2020-12-06T08:48:50.713113Z",
897 | "iopub.status.busy": "2020-12-06T08:48:50.712537Z",
898 | "iopub.status.idle": "2020-12-06T08:49:28.649837Z",
899 | "shell.execute_reply": "2020-12-06T08:49:28.650337Z"
900 | }
901 | },
902 | "outputs": [
903 | {
904 | "name": "stdout",
905 | "output_type": "stream",
906 | "text": [
907 | "Epoch 0:\n",
908 | " Train Loss: 2.0920\n",
909 | " Train Acc: 0.40\n",
910 | "Epoch 1:\n",
911 | " Train Loss: 1.7984\n",
912 | " Train Acc: 0.47\n",
913 | "\n",
914 | "the --> ' '\n",
915 | "wom --> 'a'\n",
916 | "man --> ' '\n",
917 | "hum --> 'a'\n"
918 | ]
919 | }
920 | ],
921 | "source": [
922 | "model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=n_hidden).to(device)\n",
923 | "fit(model)\n",
924 | "\n",
925 | "print()\n",
926 | "predict(\"the\")\n",
927 | "predict(\"wom\")\n",
928 | "predict(\"man\")\n",
929 | "predict(\"hum\")"
930 | ]
931 | },
932 | {
933 | "cell_type": "code",
934 | "execution_count": 32,
935 | "metadata": {
936 | "execution": {
937 | "iopub.execute_input": "2020-12-06T08:49:28.653466Z",
938 | "iopub.status.busy": "2020-12-06T08:49:28.652870Z",
939 | "iopub.status.idle": "2020-12-06T08:49:28.656600Z",
940 | "shell.execute_reply": "2020-12-06T08:49:28.657108Z"
941 | }
942 | },
943 | "outputs": [
944 | {
945 | "name": "stdout",
946 | "output_type": "stream",
947 | "text": [
948 | "the huma --> 'n'\n"
949 | ]
950 | }
951 | ],
952 | "source": [
953 | "predict(\"the huma\")"
954 | ]
955 | },
956 | {
957 | "cell_type": "code",
958 | "execution_count": 33,
959 | "metadata": {
960 | "execution": {
961 | "iopub.execute_input": "2020-12-06T08:49:28.660105Z",
962 | "iopub.status.busy": "2020-12-06T08:49:28.659516Z",
963 | "iopub.status.idle": "2020-12-06T08:49:28.664202Z",
964 | "shell.execute_reply": "2020-12-06T08:49:28.663749Z"
965 | }
966 | },
967 | "outputs": [
968 | {
969 | "name": "stdout",
970 | "output_type": "stream",
971 | "text": [
972 | "those --> 'o'\n"
973 | ]
974 | }
975 | ],
976 | "source": [
977 | "predict(\"those \")"
978 | ]
979 | },
980 | {
981 | "cell_type": "code",
982 | "execution_count": 34,
983 | "metadata": {
984 | "execution": {
985 | "iopub.execute_input": "2020-12-06T08:49:28.667736Z",
986 | "iopub.status.busy": "2020-12-06T08:49:28.666915Z",
987 | "iopub.status.idle": "2020-12-06T08:49:28.671387Z",
988 | "shell.execute_reply": "2020-12-06T08:49:28.670940Z"
989 | }
990 | },
991 | "outputs": [
992 | {
993 | "name": "stdout",
994 | "output_type": "stream",
995 | "text": [
996 | "those o --> 'f'\n"
997 | ]
998 | }
999 | ],
1000 | "source": [
1001 | "predict(\"those o\")"
1002 | ]
1003 | },
1004 | {
1005 | "cell_type": "code",
1006 | "execution_count": 35,
1007 | "metadata": {
1008 | "execution": {
1009 | "iopub.execute_input": "2020-12-06T08:49:28.674900Z",
1010 | "iopub.status.busy": "2020-12-06T08:49:28.674088Z",
1011 | "iopub.status.idle": "2020-12-06T08:49:28.678752Z",
1012 | "shell.execute_reply": "2020-12-06T08:49:28.678304Z"
1013 | }
1014 | },
1015 | "outputs": [
1016 | {
1017 | "name": "stdout",
1018 | "output_type": "stream",
1019 | "text": [
1020 | "those of --> 's'\n"
1021 | ]
1022 | }
1023 | ],
1024 | "source": [
1025 | "predict(\"those of \")"
1026 | ]
1027 | },
1028 | {
1029 | "cell_type": "code",
1030 | "execution_count": 36,
1031 | "metadata": {
1032 | "execution": {
1033 | "iopub.execute_input": "2020-12-06T08:49:28.682279Z",
1034 | "iopub.status.busy": "2020-12-06T08:49:28.681453Z",
1035 | "iopub.status.idle": "2020-12-06T08:49:28.685718Z",
1036 | "shell.execute_reply": "2020-12-06T08:49:28.686154Z"
1037 | }
1038 | },
1039 | "outputs": [
1040 | {
1041 | "name": "stdout",
1042 | "output_type": "stream",
1043 | "text": [
1044 | "those of u --> 'p'\n"
1045 | ]
1046 | }
1047 | ],
1048 | "source": [
1049 | "predict(\"those of u\")"
1050 | ]
1051 | },
1052 | {
1053 | "cell_type": "markdown",
1054 | "metadata": {},
1055 | "source": [
1056 | "You can use `nn.Sequential` to make it a bit more readable."
1057 | ]
1058 | },
1059 | {
1060 | "cell_type": "code",
1061 | "execution_count": 37,
1062 | "metadata": {
1063 | "execution": {
1064 | "iopub.execute_input": "2020-12-06T08:49:28.692174Z",
1065 | "iopub.status.busy": "2020-12-06T08:49:28.691515Z",
1066 | "iopub.status.idle": "2020-12-06T08:49:28.693952Z",
1067 | "shell.execute_reply": "2020-12-06T08:49:28.693423Z"
1068 | }
1069 | },
1070 | "outputs": [],
1071 | "source": [
1072 | "class CharModel(nn.Module):\n",
1073 | " def __init__(self, n_vocab, n_embedding, n_hidden):\n",
1074 | " super().__init__()\n",
1075 | " self.i2e = nn.Sequential(\n",
1076 | " nn.Embedding(n_vocab, n_embedding),\n",
1077 | " nn.Linear(n_embedding, n_hidden),\n",
1078 | " nn.ReLU(),\n",
1079 | " )\n",
1080 | " self.h2h = nn.Sequential(\n",
1081 | " nn.Linear(n_hidden, n_hidden),\n",
1082 | " nn.Tanh(),\n",
1083 | " )\n",
1084 | " self.h2out = nn.Linear(n_hidden, n_vocab)\n",
1085 | " \n",
1086 | " self.n_hidden = n_hidden\n",
1087 | " \n",
1088 | " def forward(self, X):\n",
1089 | " h = torch.zeros(X.shape[0], self.n_hidden, requires_grad=True).to(device)\n",
1090 | " for i in range(X.shape[1]):\n",
1091 | " c = X[:, i]\n",
1092 | " h = self.h2h(h + self.i2e(c))\n",
1093 | "\n",
1094 | " return F.log_softmax(self.h2out(h), dim=-1)"
1095 | ]
1096 | },
1097 | {
1098 | "cell_type": "code",
1099 | "execution_count": 38,
1100 | "metadata": {
1101 | "execution": {
1102 | "iopub.execute_input": "2020-12-06T08:49:28.697759Z",
1103 | "iopub.status.busy": "2020-12-06T08:49:28.697090Z",
1104 | "iopub.status.idle": "2020-12-06T08:50:06.780817Z",
1105 | "shell.execute_reply": "2020-12-06T08:50:06.781311Z"
1106 | }
1107 | },
1108 | "outputs": [
1109 | {
1110 | "name": "stdout",
1111 | "output_type": "stream",
1112 | "text": [
1113 | "Epoch 0:\n",
1114 | " Train Loss: 2.0896\n",
1115 | " Train Acc: 0.40\n",
1116 | "Epoch 1:\n",
1117 | " Train Loss: 1.7967\n",
1118 | " Train Acc: 0.47\n",
1119 | "\n",
1120 | "the --> ' '\n",
1121 | "wom --> 'a'\n",
1122 | "man --> ' '\n",
1123 | "hum --> 'a'\n"
1124 | ]
1125 | }
1126 | ],
1127 | "source": [
1128 | "model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=n_hidden).to(device)\n",
1129 | "fit(model)\n",
1130 | "\n",
1131 | "print()\n",
1132 | "predict(\"the\")\n",
1133 | "predict(\"wom\")\n",
1134 | "predict(\"man\")\n",
1135 | "predict(\"hum\")"
1136 | ]
1137 | }
1138 | ],
1139 | "metadata": {
1140 | "kernelspec": {
1141 | "display_name": "Python 3",
1142 | "language": "python",
1143 | "name": "python3"
1144 | },
1145 | "language_info": {
1146 | "codemirror_mode": {
1147 | "name": "ipython",
1148 | "version": 3
1149 | },
1150 | "file_extension": ".py",
1151 | "mimetype": "text/x-python",
1152 | "name": "python",
1153 | "nbconvert_exporter": "python",
1154 | "pygments_lexer": "ipython3",
1155 | "version": "3.8.5"
1156 | }
1157 | },
1158 | "nbformat": 4,
1159 | "nbformat_minor": 2
1160 | }
1161 |
--------------------------------------------------------------------------------
/notebooks/rnn_from_scratch.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # RNN from scratch with PyTorch
17 | # An RNN is just a normal NN.
18 | # It's very easy to implement in PyTorch due to its dynamic nature.
19 | #
20 | # We'll build a very simple character-based language model.
21 | #
22 | # Taken from http://www.fast.ai/
23 |
24 | # %% [markdown]
25 | # ## Init and helpers
26 |
27 | # %%
28 | from pathlib import Path
29 | import numpy as np
30 |
31 | # %% [markdown]
32 | # ## Data
33 |
34 | # %%
35 | NIETSCHE_PATH = Path("../data/raw/nietzsche.txt")
36 | if NIETSCHE_PATH.is_file():
37 | print("I already have the data.")
38 | else:
39 | # !wget -O ../data/raw/nietzsche.txt https://s3.amazonaws.com/text-datasets/nietzsche.txt
40 |
41 | with NIETSCHE_PATH.open() as f:
42 | data = f.read()
43 |
44 | # %% [markdown]
45 | # A tweet of Nietzsche:
46 |
47 | # %%
48 | print(data[:140])
49 |
50 | # %% [markdown]
51 | # We need the alphabet of the text, and we prepend a padding character "\0" to it.
52 |
53 | # %%
54 | alphabet = ["\0", *sorted(list(set(data)))]
55 | n_alphabet = len(alphabet)
56 | n_alphabet
57 |
58 | # %%
59 | char2index = {c: i for i, c in enumerate(alphabet)}
60 | index2char = {i: c for i, c in enumerate(alphabet)}
61 |
62 | # %% [markdown]
63 | # Convert the data into a list of integers
64 |
65 | # %%
66 | index = [char2index[c] for c in data]
67 |
68 | # %%
69 | print(index[:25])
70 | print("".join(index2char[i] for i in index[:25]))
71 |
72 | # %%
73 | index[0: 3]
74 |
75 | # %%
76 | X, y = [], []
77 | for i in range(len(index) - 4):
78 | X.append(index[i : i + 3])
79 | y.append(index[i + 3])
80 |
81 | X = np.stack(X)
82 | y = np.stack(y)
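The windowing above pairs every three-character context with the character that follows it. A minimal pure-Python sketch of the same idea on a toy string; the helper name `make_windows` and the toy text are assumptions for illustration only. Unlike the loop above, this sketch iterates up to `len(indices) - context`, so it also keeps the final window.

```python
def make_windows(indices, context=3):
    """Pair each run of `context` indices with the index that follows it."""
    X, y = [], []
    for i in range(len(indices) - context):
        X.append(indices[i : i + context])
        y.append(indices[i + context])
    return X, y


# Toy corpus standing in for the Nietzsche text (hypothetical data).
text = "hello world"
alphabet = ["\0", *sorted(set(text))]
char2index = {c: i for i, c in enumerate(alphabet)}
index2char = {i: c for i, c in enumerate(alphabet)}

indices = [char2index[c] for c in text]
X, y = make_windows(indices)

# Decode the first pair back to characters to check the alignment.
first_context = "".join(index2char[i] for i in X[0])
first_target = index2char[y[0]]
```

Decoding a few pairs like this is a cheap way to confirm that contexts and targets line up before training on them.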
83 |
84 | # %%
85 | X.shape, y.shape
86 |
87 | # %%
88 | X[0], y[0]
89 |
90 | # %%
91 | type(y)
92 |
93 | # %%
94 | import torch
95 | from torch.utils.data import DataLoader, Dataset, TensorDataset
96 |
97 |
98 | train_ds = TensorDataset(torch.from_numpy(X), torch.from_numpy(y))
99 | train_dl = DataLoader(train_ds, batch_size=500)
100 |
101 | # %% [markdown]
102 | # # The model
103 |
104 | # %%
105 | import torch
106 | import torch.nn as nn
107 | import torch.nn.functional as F
108 | import torch.optim as optim
109 |
110 | # %%
111 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
112 | device
113 |
114 |
115 | # %%
116 | class CharModel(nn.Module):
117 | def __init__(self, n_vocab, n_embedding, n_hidden):
118 | super().__init__()
119 | self.emb = nn.Embedding(n_vocab, n_embedding)
120 | self.lin_in = nn.Linear(n_embedding, n_hidden)
121 |
122 | self.lin_hidden = nn.Linear(n_hidden, n_hidden)
123 | self.lin_out = nn.Linear(n_hidden, n_vocab)
124 |
125 | def forward(self, X):
126 | c1, c2, c3 = X[:, 0], X[:, 1], X[:, 2]
127 |
128 | in1 = F.relu(self.lin_in(self.emb(c1)))
129 | h = F.tanh(self.lin_hidden(in1))
130 |
131 | in2 = F.relu(self.lin_in(self.emb(c2)))
132 | h = F.tanh(self.lin_hidden(h + in2))
133 |
134 | in3 = F.relu(self.lin_in(self.emb(c3)))
135 | h = F.tanh(self.lin_hidden(h + in3))
136 |
137 | return F.log_softmax(self.lin_out(h), dim=-1)
138 |
139 |
140 | # %%
141 | n_embedding = 40
142 | n_hidden = 256
143 |
144 | model = CharModel(n_alphabet, n_embedding=40, n_hidden=128)
145 | model = model.to(device)
146 |
147 | # %%
148 | optimizer = optim.Adam(model.parameters(), 0.001)
149 | #criterion = nn.CrossEntropyLoss()
150 | criterion = F.nll_loss
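The model returns `log_softmax` outputs, which is why it is paired with `F.nll_loss`; feeding raw logits to `nn.CrossEntropyLoss` (commented out above) computes the same quantity in one step. A small worked sketch in plain Python, with hypothetical helper names and made-up logits, shows the equivalence without any PyTorch machinery:

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def nll(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]


def cross_entropy(logits, target):
    """Cross entropy computed directly from the raw logits."""
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[target] / sum(exps))


logits = [2.0, 0.5, -1.0]  # made-up scores for three classes
target = 0

loss_a = nll(log_softmax(logits), target)  # log_softmax followed by NLL
loss_b = cross_entropy(logits, target)     # cross entropy on raw logits
```

Both paths give the same loss up to floating-point error, so the choice between them only determines whether the model's last step is `log_softmax` or raw logits.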
151 |
152 |
153 | # %%
154 | def fit(model, n_epoch=2):
155 | optimizer = optim.Adam(model.parameters(), 0.001)
156 |
157 | for epoch in range(n_epoch):
158 | print(f"Epoch {epoch}:")
159 | running_loss, correct = 0.0, 0
160 |
161 | model.train()
162 | for X, y in train_dl:
163 | X, y = X.to(device), y.to(device)
164 | optimizer.zero_grad()
165 |
166 | y_ = model(X)
167 | loss = criterion(y_, y)
168 |
169 | loss.backward()
170 | optimizer.step()
171 |
172 | _, y_label_ = torch.max(y_, 1)
173 | correct += (y_label_ == y).sum().item()
174 | running_loss += loss.item() * X.shape[0]
175 |
176 | print(f" Train Loss: {running_loss / len(train_dl.dataset):0.4f}")
177 | print(f" Train Acc: {correct / len(train_dl.dataset):0.2f}")
178 |
179 |
180 | # %%
181 | fit(model, 2)
182 |
183 |
184 | # %%
185 | def predict(word):
186 | word_idx = [char2index[c] for c in word]
187 | word_idx
188 | with torch.no_grad():
189 | X = torch.tensor(word_idx).unsqueeze(0).to(device)
190 | model.eval()
191 | y_ = model(X).cpu()
192 | pred = index2char[torch.argmax(y_).item()]
193 | print(f"{word} --> '{pred}'")
194 |
195 |
196 | # %%
197 | predict("the")
198 |
199 | # %%
200 | predict("wom")
201 |
202 | # %%
203 | predict("man")
204 |
205 | # %%
206 | predict("hum")
207 |
208 |
209 | # %%
210 | class CharModel(nn.Module):
211 | def __init__(self, n_vocab, n_embedding, n_hidden):
212 | super().__init__()
213 | self.emb = nn.Embedding(n_vocab, n_embedding)
214 | self.lin_in = nn.Linear(n_embedding, n_hidden)
215 | self.lin_hidden = nn.Linear(n_hidden, n_hidden)
216 | self.lin_out = nn.Linear(n_hidden, n_vocab)
217 |
218 | def forward(self, X):
219 | c1, c2, c3 = X[:, 0], X[:, 1], X[:, 2]
220 |
221 | in1 = F.relu(self.lin_in(self.emb(c1)))
222 | in2 = F.relu(self.lin_in(self.emb(c2)))
223 | in3 = F.relu(self.lin_in(self.emb(c3)))
224 |
225 | h = F.tanh(self.lin_hidden(in1))
226 | h = F.tanh(self.lin_hidden(h + in2))
227 | h = F.tanh(self.lin_hidden(h + in3))
228 |
229 | return F.log_softmax(self.lin_out(h), dim=-1)
230 |
231 |
232 | # %%
233 | model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=128).to(device)
234 | fit(model)
235 |
236 | print()
237 | predict("the")
238 | predict("wom")
239 | predict("man")
240 | predict("hum")
241 |
242 |
243 | # %%
244 | class CharModel(nn.Module):
245 | def __init__(self, n_vocab, n_embedding, n_hidden):
246 | super().__init__()
247 | self.emb = nn.Embedding(n_vocab, n_embedding)
248 | self.lin_in = nn.Linear(n_embedding, n_hidden)
249 | self.lin_hidden = nn.Linear(n_hidden, n_hidden)
250 | self.lin_out = nn.Linear(n_hidden, n_vocab)
251 |
252 | self.n_hidden = n_hidden
253 |
254 | def forward(self, X):
255 | c1, c2, c3 = X[:, 0], X[:, 1], X[:, 2]
256 |
257 | in1 = F.relu(self.lin_in(self.emb(c1)))
258 | in2 = F.relu(self.lin_in(self.emb(c2)))
259 | in3 = F.relu(self.lin_in(self.emb(c3)))
260 |
261 | h = torch.zeros(X.shape[0], self.n_hidden, requires_grad=True).to(device)
262 | h = F.tanh(self.lin_hidden(h + in1))
263 | h = F.tanh(self.lin_hidden(h + in2))
264 | h = F.tanh(self.lin_hidden(h + in3))
265 |
266 | return F.log_softmax(self.lin_out(h), dim=-1)
267 |
268 |
269 | # %%
270 | model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=n_hidden).to(device)
271 | fit(model)
272 |
273 | print()
274 | predict("the")
275 | predict("wom")
276 | predict("man")
277 | predict("hum")
278 |
279 |
280 | # %%
281 | class CharModel(nn.Module):
282 | def __init__(self, n_vocab, n_embedding, n_hidden):
283 | super().__init__()
284 | self.emb = nn.Embedding(n_vocab, n_embedding)
285 | self.lin_in = nn.Linear(n_embedding, n_hidden)
286 | self.lin_hidden = nn.Linear(n_hidden, n_hidden)
287 | self.lin_out = nn.Linear(n_hidden, n_vocab)
288 |
289 | self.n_hidden = n_hidden
290 |
291 | def forward(self, X):
292 | h = torch.zeros(X.shape[0], self.n_hidden, requires_grad=True).to(device)
293 | for i in range(X.shape[1]):
294 | c = X[:, i]
295 | in_ = F.relu(self.lin_in(self.emb(c)))
296 | h = F.tanh(self.lin_hidden(h + in_))
297 |
298 | return F.log_softmax(self.lin_out(h), dim=-1)
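Because the recurrence is now a loop over `X.shape[1]`, the same weights can process sequences of any length. The following is a minimal sketch of that property, assuming a hypothetical class name `LoopRNN` and toy sizes; it reads the hidden size from `self.n_hidden` and uses `torch.tanh` and `torch.relu` in place of the `F.` variants.

```python
import torch
import torch.nn as nn


class LoopRNN(nn.Module):
    """Same recurrence as above, with the hidden size read from the module."""

    def __init__(self, n_vocab, n_embedding, n_hidden):
        super().__init__()
        self.emb = nn.Embedding(n_vocab, n_embedding)
        self.lin_in = nn.Linear(n_embedding, n_hidden)
        self.lin_hidden = nn.Linear(n_hidden, n_hidden)
        self.lin_out = nn.Linear(n_hidden, n_vocab)
        self.n_hidden = n_hidden

    def forward(self, X):
        # Start from an all-zero hidden state on the same device as the input.
        h = torch.zeros(X.shape[0], self.n_hidden, device=X.device)
        for i in range(X.shape[1]):
            in_ = torch.relu(self.lin_in(self.emb(X[:, i])))
            h = torch.tanh(self.lin_hidden(h + in_))
        return torch.log_softmax(self.lin_out(h), dim=-1)


toy = LoopRNN(n_vocab=10, n_embedding=4, n_hidden=8)
short = torch.randint(0, 10, (2, 3))  # batch of 2, 3 characters each
long = torch.randint(0, 10, (2, 7))   # batch of 2, 7 characters each
out_short, out_long = toy(short), toy(long)
```

Both calls return one row of log-probabilities per batch element, regardless of how many characters were fed in.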
299 |
300 |
301 | # %%
302 | model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=n_hidden).to(device)
303 | fit(model)
304 |
305 | print()
306 | predict("the")
307 | predict("wom")
308 | predict("man")
309 | predict("hum")
310 |
311 | # %%
312 | predict("the huma")
313 |
314 | # %%
315 | predict("those ")
316 |
317 | # %%
318 | predict("those o")
319 |
320 | # %%
321 | predict("those of ")
322 |
323 | # %%
324 | predict("those of u")
325 |
326 |
327 | # %% [markdown]
328 | # You can use `nn.Sequential` to make it a bit more readable.
329 |
330 | # %%
331 | class CharModel(nn.Module):
332 | def __init__(self, n_vocab, n_embedding, n_hidden):
333 | super().__init__()
334 | self.i2e = nn.Sequential(
335 | nn.Embedding(n_vocab, n_embedding),
336 | nn.Linear(n_embedding, n_hidden),
337 | nn.ReLU(),
338 | )
339 | self.h2h = nn.Sequential(
340 | nn.Linear(n_hidden, n_hidden),
341 | nn.Tanh(),
342 | )
343 | self.h2out = nn.Linear(n_hidden, n_vocab)
344 |
345 | self.n_hidden = n_hidden
346 |
347 | def forward(self, X):
348 | h = torch.zeros(X.shape[0], self.n_hidden, requires_grad=True).to(device)
349 | for i in range(X.shape[1]):
350 | c = X[:, i]
351 | h = self.h2h(h + self.i2e(c))
352 |
353 | return F.log_softmax(self.h2out(h), dim=-1)
354 |
355 |
356 | # %%
357 | model = CharModel(n_alphabet, n_embedding=n_embedding, n_hidden=n_hidden).to(device)
358 | fit(model)
359 |
360 | print()
361 | predict("the")
362 | predict("wom")
363 | predict("man")
364 | predict("hum")
365 |
--------------------------------------------------------------------------------
/notebooks/storing_and_loading_models.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # Storing and Loading Models
17 | #
18 | # https://pytorch.org/tutorials/beginner/saving_loading_models.html
19 |
20 | # %% [markdown]
21 | # ## Init, helpers, utils, ...
22 |
23 | # %%
24 | # %matplotlib inline
25 |
26 | # %%
27 | import torch
28 | import torch.nn as nn
29 | import torch.nn.functional as F
30 | import torch.optim as optim
31 | import torchvision
32 |
33 | DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
34 |
35 | # %%
36 | from pprint import pprint
37 | import matplotlib.pyplot as plt
38 | import numpy as np
39 | from IPython.core.debugger import set_trace
40 |
41 |
42 | # %% [markdown]
43 | # # `state_dict()`
44 |
45 | # %% [markdown]
46 | # ## `nn.Module.state_dict()`
47 | # Every `nn.Module` has a state dict that maps each parameter name to its learnable tensor (and each buffer name to its buffer).
48 |
49 | # %%
50 | class Net(nn.Module):
51 | def __init__(self):
52 | super().__init__()
53 | self.conv1 = nn.Conv2d(3, 6, 5)
54 | self.pool = nn.MaxPool2d(2, 2)
55 | self.conv2 = nn.Conv2d(6, 16, 5)
56 | self.fc1 = nn.Linear(16 * 5 * 5, 120)
57 | self.fc2 = nn.Linear(120, 84)
58 | self.fc3 = nn.Linear(84, 10)
59 |
60 | def forward(self, x):
61 | x = self.pool(F.relu(self.conv1(x)))
62 | x = self.pool(F.relu(self.conv2(x)))
63 | x = x.view(-1, 16 * 5 * 5)
64 | x = F.relu(self.fc1(x))
65 | x = F.relu(self.fc2(x))
66 | x = self.fc3(x)
67 | return x
68 |
69 |
70 | # %%
71 | model = Net()
72 |
73 | # %%
74 | model.state_dict()
75 |
76 |
77 | # %%
78 | def state_dict_info(obj):
79 | print(f"{'layer':25} shape")
80 | print("===================================================")
81 | for k,v in obj.state_dict().items():
82 | try:
83 | print(f"{k:25} {v.shape}")
84 | except AttributeError:
85 | print(f"{k:25} {v}")
86 |
87 |
88 | # %%
89 | state_dict_info(model)
90 |
91 | # %% [markdown]
92 | # ## `torch.optim.Optimizer`
93 | #
94 | # Optimizers also have a `state_dict`.
95 |
96 | # %%
97 | optimizer = optim.Adadelta(model.parameters())
98 |
99 | # %%
100 | state_dict_info(optimizer)
101 |
102 | # %%
103 | optimizer.state_dict()["state"]
104 |
105 | # %%
106 | optimizer.state_dict()["param_groups"]
107 |
108 | # %% [markdown]
109 | # ## Storing and loading `state_dict`
110 |
111 | # %%
112 | model_file = "model_state_dict.pt"
113 | torch.save(model.state_dict(), model_file)
114 |
115 | # %%
116 | model = Net()
117 | model.load_state_dict(torch.load(model_file))
118 |
119 | # %% [markdown]
120 | # ## Storing and loading the full model
121 |
122 | # %%
123 | model_file = "model_123.pt"
124 | torch.save(model, model_file)
125 |
126 | # %%
127 | # Only works if the class definition of `Net` is available when loading
128 | model = torch.load(model_file)
129 |
130 | # %% [markdown]
131 | # # Example Checkpointing
132 | # You can store the model state, the optimizer state, and arbitrary extra information in one file and reload it later.
133 | #
134 | # Example:
135 | # ```python
136 | # torch.save(
137 | # {
138 | # 'model_state_dict': model.state_dict(),
139 | # 'optimizer_state_dict': optimizer.state_dict(),
140 | # 'epoch': epoch,
141 | # 'loss': loss,
142 | # },
143 | # PATH,
144 | # )
145 | # ```
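Reloading such a checkpoint mirrors the save call. The sketch below is one possible shape for it, not a definitive implementation: the helper names `save_checkpoint` and `load_checkpoint`, the stand-in `nn.Linear` model, and the temporary file path are all assumptions made for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim


def save_checkpoint(path, model, optimizer, epoch, loss):
    """Store everything needed to resume training in one file."""
    torch.save(
        {
            "model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict(),
            "epoch": epoch,
            "loss": loss,
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place; return the stored epoch and loss."""
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["model_state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
    return checkpoint["epoch"], checkpoint["loss"]


# Tiny stand-in model (hypothetical, not the `Net` defined above).
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)

path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
save_checkpoint(path, model, optimizer, epoch=3, loss=0.25)

# A fresh model/optimizer pair picks up where the old one stopped.
restored = nn.Linear(4, 2)
restored_optimizer = optim.SGD(restored.parameters(), lr=0.1)
epoch, loss = load_checkpoint(path, restored, restored_optimizer)
```

After loading, call `model.train()` to continue training or `model.eval()` for inference, since the mode is not part of the checkpoint.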
146 |
147 | # %% [markdown]
148 | # # Exercise
149 | # - Find out what ends up under the `state` key of an optimizer's `state_dict`.
150 | # - Write your own checkpoint functionality.
151 |
--------------------------------------------------------------------------------
/notebooks/the_end.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# The End\n",
8 | "\n",
9 | "\n",
10 | "https://twitter.com/karpathy/status/1013244313327681536"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {},
16 | "source": [
17 | "**Goals:**\n",
18 | "- understand PyTorch concepts\n",
19 | "- be able to use transfer learning in PyTorch\n",
20 | "- be aware of some handy tools/libs"
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {},
26 | "source": [
27 | "# Interesting reads\n",
28 | "\n",
29 | "- [The road to 1.0: production ready PyTorch](https://pytorch.org/2018/05/02/road-to-1.0.html)\n",
30 | " - `torch.jit`\n",
31 | " - optimize for mobile\n",
32 | " - quantized inference (such as 8-bit inference)\n",
33 | " - [caffe2 already merged into pytorch repo](https://github.com/pytorch/pytorch/tree/master/caffe2)\n",
34 | " - ONNX - Open Neural Network Exchange\n",
35 | "\n",
36 | "- [PyTorch under the hood](https://speakerdeck.com/perone/pytorch-under-the-hood)\n",
37 | "- PyTorch 1.0 videos"
38 | ]
39 | }
40 | ],
41 | "metadata": {
42 | "kernelspec": {
43 | "display_name": "Python 3",
44 | "language": "python",
45 | "name": "python3"
46 | },
47 | "language_info": {
48 | "codemirror_mode": {
49 | "name": "ipython",
50 | "version": 3
51 | },
52 | "file_extension": ".py",
53 | "mimetype": "text/x-python",
54 | "name": "python",
55 | "nbconvert_exporter": "python",
56 | "pygments_lexer": "ipython3",
57 | "version": "3.8.5"
58 | }
59 | },
60 | "nbformat": 4,
61 | "nbformat_minor": 2
62 | }
63 |
--------------------------------------------------------------------------------
/notebooks/the_end.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # The End
17 | #
18 | # 
19 | # https://twitter.com/karpathy/status/1013244313327681536
20 |
21 | # %% [markdown]
22 | # **Goals:**
23 | # - understand PyTorch concepts
24 | # - be able to use transfer learning in PyTorch
25 | # - be aware of some handy tools/libs
26 |
27 | # %% [markdown]
28 | # # Interesting reads
29 | #
30 | # - [The road to 1.0: production ready PyTorch](https://pytorch.org/2018/05/02/road-to-1.0.html)
31 | # - `torch.jit`
32 | # - optimize for mobile
33 | # - quantized inference (such as 8-bit inference)
34 | # - [caffe2 already merged into pytorch repo](https://github.com/pytorch/pytorch/tree/master/caffe2)
35 | # - ONNX - Open Neural Network Exchange
36 | #
37 | # - [PyTorch under the hood](https://speakerdeck.com/perone/pytorch-under-the-hood)
38 | # - PyTorch 1.0 videos
39 |
--------------------------------------------------------------------------------
/notebooks/torch_jit.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# `torch.jit`\n",
8 | "\n",
9 | "Eager execution is great for development and debugging, but it can be hard to (automatically) optimize the code and deploy it.\n",
10 | "\n",
11 | "Now there is `torch.jit` with two flavours:\n",
12 | "\n",
13 | "- `torch.jit.trace` does not record control flow.\n",
14 | "- `torch.jit.script` records control flow and creates an intermediate representation that can be optimized; only supports a subset of Python.\n",
15 | "\n",
16 | "Note: don't forget `model.eval()` and `model.train()`.\n",
17 | "\n",
18 | "\n",
19 | "## Ref and More:\n",
20 | "- https://pytorch.org/docs/stable/jit.html\n",
21 | "- https://speakerdeck.com/perone/pytorch-under-the-hood\n",
22 | "- https://lernapparat.de/fast-lstm-pytorch/"
23 | ]
24 | },
25 | {
26 | "cell_type": "markdown",
27 | "metadata": {},
28 | "source": [
29 | "## Init, helpers, utils, ..."
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": 1,
35 | "metadata": {
36 | "execution": {
37 | "iopub.execute_input": "2020-11-27T14:31:24.971789Z",
38 | "iopub.status.busy": "2020-11-27T14:31:24.969854Z",
39 | "iopub.status.idle": "2020-11-27T14:31:25.203359Z",
40 | "shell.execute_reply": "2020-11-27T14:31:25.202626Z"
41 | }
42 | },
43 | "outputs": [],
44 | "source": [
45 | "%load_ext autoreload\n",
46 | "%autoreload 2\n",
47 | "\n",
48 | "%matplotlib inline"
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": 2,
54 | "metadata": {
55 | "execution": {
56 | "iopub.execute_input": "2020-11-27T14:31:25.206235Z",
57 | "iopub.status.busy": "2020-11-27T14:31:25.205811Z",
58 | "iopub.status.idle": "2020-11-27T14:31:25.478268Z",
59 | "shell.execute_reply": "2020-11-27T14:31:25.477948Z"
60 | }
61 | },
62 | "outputs": [],
63 | "source": [
64 | "import torch\n",
65 | "import torch.nn as nn\n",
66 | "import torch.nn.functional as F\n",
67 | "import torch.optim as optim\n",
68 | "import torchvision"
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": 3,
74 | "metadata": {
75 | "execution": {
76 | "iopub.execute_input": "2020-11-27T14:31:25.481570Z",
77 | "iopub.status.busy": "2020-11-27T14:31:25.481113Z",
78 | "iopub.status.idle": "2020-11-27T14:31:25.494485Z",
79 | "shell.execute_reply": "2020-11-27T14:31:25.494069Z"
80 | }
81 | },
82 | "outputs": [],
83 | "source": [
84 | "from pprint import pprint\n",
85 | "\n",
86 | "import matplotlib.pyplot as plt\n",
87 | "import numpy as np\n",
88 | "from IPython.core.debugger import set_trace\n",
89 | "\n",
90 | "import utils # little helpers\n",
91 | "from utils import attr"
92 | ]
93 | },
94 | {
95 | "cell_type": "markdown",
96 | "metadata": {},
97 | "source": [
98 | "# `torch.jit.trace`"
99 | ]
100 | },
101 | {
102 | "cell_type": "code",
103 | "execution_count": 4,
104 | "metadata": {
105 | "execution": {
106 | "iopub.execute_input": "2020-11-27T14:31:25.497670Z",
107 | "iopub.status.busy": "2020-11-27T14:31:25.497372Z",
108 | "iopub.status.idle": "2020-11-27T14:31:25.504775Z",
109 | "shell.execute_reply": "2020-11-27T14:31:25.504504Z"
110 | }
111 | },
112 | "outputs": [],
113 | "source": [
114 | "def f(x):\n",
115 | " if x.item() < 0:\n",
116 | " return torch.tensor(0)\n",
117 | " else:\n",
118 | " return x"
119 | ]
120 | },
121 | {
122 | "cell_type": "code",
123 | "execution_count": 5,
124 | "metadata": {
125 | "execution": {
126 | "iopub.execute_input": "2020-11-27T14:31:25.507274Z",
127 | "iopub.status.busy": "2020-11-27T14:31:25.506848Z",
128 | "iopub.status.idle": "2020-11-27T14:31:25.515959Z",
129 | "shell.execute_reply": "2020-11-27T14:31:25.515596Z"
130 | }
131 | },
132 | "outputs": [
133 | {
134 | "data": {
135 | "text/plain": [
136 | "tensor(0)"
137 | ]
138 | },
139 | "execution_count": 1,
140 | "metadata": {},
141 | "output_type": "execute_result"
142 | }
143 | ],
144 | "source": [
145 | "f(torch.tensor(-1))"
146 | ]
147 | },
148 | {
149 | "cell_type": "code",
150 | "execution_count": 6,
151 | "metadata": {
152 | "execution": {
153 | "iopub.execute_input": "2020-11-27T14:31:25.518286Z",
154 | "iopub.status.busy": "2020-11-27T14:31:25.517863Z",
155 | "iopub.status.idle": "2020-11-27T14:31:25.525322Z",
156 | "shell.execute_reply": "2020-11-27T14:31:25.525004Z"
157 | }
158 | },
159 | "outputs": [
160 | {
161 | "data": {
162 | "text/plain": [
163 | "tensor(3)"
164 | ]
165 | },
166 | "execution_count": 1,
167 | "metadata": {},
168 | "output_type": "execute_result"
169 | }
170 | ],
171 | "source": [
172 | "f(torch.tensor(3))"
173 | ]
174 | },
175 | {
176 | "cell_type": "code",
177 | "execution_count": 7,
178 | "metadata": {
179 | "execution": {
180 | "iopub.execute_input": "2020-11-27T14:31:25.527966Z",
181 | "iopub.status.busy": "2020-11-27T14:31:25.527466Z",
182 | "iopub.status.idle": "2020-11-27T14:31:25.541534Z",
183 | "shell.execute_reply": "2020-11-27T14:31:25.540887Z"
184 | }
185 | },
186 | "outputs": [
187 | {
188 | "name": "stderr",
189 | "output_type": "stream",
190 | "text": [
191 | ":2: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
192 | " if x.item() < 0:\n"
193 | ]
194 | }
195 | ],
196 | "source": [
197 | "X = torch.tensor(1)\n",
198 | "traced = torch.jit.trace(f, X)"
199 | ]
200 | },
201 | {
202 | "cell_type": "code",
203 | "execution_count": 8,
204 | "metadata": {
205 | "execution": {
206 | "iopub.execute_input": "2020-11-27T14:31:25.543997Z",
207 | "iopub.status.busy": "2020-11-27T14:31:25.543681Z",
208 | "iopub.status.idle": "2020-11-27T14:31:25.551831Z",
209 | "shell.execute_reply": "2020-11-27T14:31:25.551525Z"
210 | }
211 | },
212 | "outputs": [
213 | {
214 | "data": {
215 | "text/plain": [
216 | "torch.jit.ScriptFunction"
217 | ]
218 | },
219 | "execution_count": 1,
220 | "metadata": {},
221 | "output_type": "execute_result"
222 | }
223 | ],
224 | "source": [
225 | "type(traced)"
226 | ]
227 | },
228 | {
229 | "cell_type": "code",
230 | "execution_count": 9,
231 | "metadata": {
232 | "execution": {
233 | "iopub.execute_input": "2020-11-27T14:31:25.554129Z",
234 | "iopub.status.busy": "2020-11-27T14:31:25.553805Z",
235 | "iopub.status.idle": "2020-11-27T14:31:25.562407Z",
236 | "shell.execute_reply": "2020-11-27T14:31:25.562078Z"
237 | }
238 | },
239 | "outputs": [
240 | {
241 | "data": {
242 | "text/plain": [
243 | "tensor(1)"
244 | ]
245 | },
246 | "execution_count": 1,
247 | "metadata": {},
248 | "output_type": "execute_result"
249 | }
250 | ],
251 | "source": [
252 | "traced(torch.tensor(1))"
253 | ]
254 | },
255 | {
256 | "cell_type": "code",
257 | "execution_count": 10,
258 | "metadata": {
259 | "execution": {
260 | "iopub.execute_input": "2020-11-27T14:31:25.564943Z",
261 | "iopub.status.busy": "2020-11-27T14:31:25.564560Z",
262 | "iopub.status.idle": "2020-11-27T14:31:25.572611Z",
263 | "shell.execute_reply": "2020-11-27T14:31:25.572278Z"
264 | }
265 | },
266 | "outputs": [
267 | {
268 | "data": {
269 | "text/plain": [
270 | "graph(%0 : Long(requires_grad=0, device=cpu)):\n",
271 | " return (%0)"
272 | ]
273 | },
274 | "execution_count": 1,
275 | "metadata": {},
276 | "output_type": "execute_result"
277 | }
278 | ],
279 | "source": [
280 | "traced.graph"
281 | ]
282 | },
283 | {
284 | "cell_type": "code",
285 | "execution_count": 11,
286 | "metadata": {
287 | "execution": {
288 | "iopub.execute_input": "2020-11-27T14:31:25.575102Z",
289 | "iopub.status.busy": "2020-11-27T14:31:25.574627Z",
290 | "iopub.status.idle": "2020-11-27T14:31:25.582783Z",
291 | "shell.execute_reply": "2020-11-27T14:31:25.582382Z"
292 | }
293 | },
294 | "outputs": [
295 | {
296 | "data": {
297 | "text/plain": [
298 | "tensor(-1)"
299 | ]
300 | },
301 | "execution_count": 1,
302 | "metadata": {},
303 | "output_type": "execute_result"
304 | }
305 | ],
306 | "source": [
307 | "traced(torch.tensor(-1))"
308 | ]
309 | },
310 | {
311 | "cell_type": "markdown",
312 | "metadata": {},
313 | "source": [
314 | "## Storing and restoring"
315 | ]
316 | },
317 | {
318 | "cell_type": "code",
319 | "execution_count": 12,
320 | "metadata": {
321 | "execution": {
322 | "iopub.execute_input": "2020-11-27T14:31:25.585530Z",
323 | "iopub.status.busy": "2020-11-27T14:31:25.585143Z",
324 | "iopub.status.idle": "2020-11-27T14:31:25.606206Z",
325 | "shell.execute_reply": "2020-11-27T14:31:25.605544Z"
326 | }
327 | },
328 | "outputs": [],
329 | "source": [
330 | "traced.save(\"traced.pt\")"
331 | ]
332 | },
333 | {
334 | "cell_type": "code",
335 | "execution_count": 13,
336 | "metadata": {
337 | "execution": {
338 | "iopub.execute_input": "2020-11-27T14:31:25.608723Z",
339 | "iopub.status.busy": "2020-11-27T14:31:25.608368Z",
340 | "iopub.status.idle": "2020-11-27T14:31:25.733329Z",
341 | "shell.execute_reply": "2020-11-27T14:31:25.731998Z"
342 | }
343 | },
344 | "outputs": [
345 | {
346 | "name": "stdout",
347 | "output_type": "stream",
348 | "text": [
349 | "traced.pt: Zip archive data, at least v?[0] to extract\r\n"
350 | ]
351 | }
352 | ],
353 | "source": [
354 | "!file traced.pt"
355 | ]
356 | },
357 | {
358 | "cell_type": "code",
359 | "execution_count": 14,
360 | "metadata": {
361 | "execution": {
362 | "iopub.execute_input": "2020-11-27T14:31:25.737533Z",
363 | "iopub.status.busy": "2020-11-27T14:31:25.737240Z",
364 | "iopub.status.idle": "2020-11-27T14:31:25.745978Z",
365 | "shell.execute_reply": "2020-11-27T14:31:25.745553Z"
366 | }
367 | },
368 | "outputs": [],
369 | "source": [
370 | "g = torch.jit.load(\"traced.pt\")"
371 | ]
372 | },
373 | {
374 | "cell_type": "code",
375 | "execution_count": 15,
376 | "metadata": {
377 | "execution": {
378 | "iopub.execute_input": "2020-11-27T14:31:25.749203Z",
379 | "iopub.status.busy": "2020-11-27T14:31:25.748303Z",
380 | "iopub.status.idle": "2020-11-27T14:31:25.757664Z",
381 | "shell.execute_reply": "2020-11-27T14:31:25.757985Z"
382 | }
383 | },
384 | "outputs": [
385 | {
386 | "data": {
387 | "text/plain": [
388 | "tensor(1)"
389 | ]
390 | },
391 | "execution_count": 1,
392 | "metadata": {},
393 | "output_type": "execute_result"
394 | }
395 | ],
396 | "source": [
397 | "g(torch.tensor(1))"
398 | ]
399 | },
400 | {
401 | "cell_type": "code",
402 | "execution_count": 16,
403 | "metadata": {
404 | "execution": {
405 | "iopub.execute_input": "2020-11-27T14:31:25.761208Z",
406 | "iopub.status.busy": "2020-11-27T14:31:25.760655Z",
407 | "iopub.status.idle": "2020-11-27T14:31:25.769043Z",
408 | "shell.execute_reply": "2020-11-27T14:31:25.769360Z"
409 | }
410 | },
411 | "outputs": [
412 | {
413 | "data": {
414 | "text/plain": [
415 | "tensor(-1)"
416 | ]
417 | },
418 | "execution_count": 1,
419 | "metadata": {},
420 | "output_type": "execute_result"
421 | }
422 | ],
423 | "source": [
424 | "g(torch.tensor(-1))"
425 | ]
426 | },
427 | {
428 | "cell_type": "markdown",
429 | "metadata": {},
430 | "source": [
431 | "# `torch.jit.script`"
432 | ]
433 | },
434 | {
435 | "cell_type": "code",
436 | "execution_count": 17,
437 | "metadata": {
438 | "execution": {
439 | "iopub.execute_input": "2020-11-27T14:31:25.772118Z",
440 | "iopub.status.busy": "2020-11-27T14:31:25.771699Z",
441 | "iopub.status.idle": "2020-11-27T14:31:25.780525Z",
442 | "shell.execute_reply": "2020-11-27T14:31:25.780141Z"
443 | }
444 | },
445 | "outputs": [
446 | {
447 | "data": {
448 | "text/plain": [
449 | "True"
450 | ]
451 | },
452 | "execution_count": 1,
453 | "metadata": {},
454 | "output_type": "execute_result"
455 | }
456 | ],
457 | "source": [
458 | "bool(torch.tensor(1) < 2)"
459 | ]
460 | },
461 | {
462 | "cell_type": "code",
463 | "execution_count": 18,
464 | "metadata": {
465 | "execution": {
466 | "iopub.execute_input": "2020-11-27T14:31:25.784867Z",
467 | "iopub.status.busy": "2020-11-27T14:31:25.783207Z",
468 | "iopub.status.idle": "2020-11-27T14:31:25.792360Z",
469 | "shell.execute_reply": "2020-11-27T14:31:25.792027Z"
470 | }
471 | },
472 | "outputs": [],
473 | "source": [
474 | "@torch.jit.script\n",
475 | "def f(x):\n",
476 | "    if bool(x < 0):\n",
477 | "        result = torch.zeros(1)\n",
478 | "    else:\n",
479 | "        result = x\n",
480 | "    return result"
481 | ]
482 | },
483 | {
484 | "cell_type": "markdown",
485 | "metadata": {},
486 | "source": [
487 | "This is TorchScript, which supports only a subset of Python."
488 | ]
489 | },
490 | {
491 | "cell_type": "code",
492 | "execution_count": 19,
493 | "metadata": {
494 | "execution": {
495 | "iopub.execute_input": "2020-11-27T14:31:25.794784Z",
496 | "iopub.status.busy": "2020-11-27T14:31:25.794427Z",
497 | "iopub.status.idle": "2020-11-27T14:31:25.803364Z",
498 | "shell.execute_reply": "2020-11-27T14:31:25.803087Z"
499 | }
500 | },
501 | "outputs": [
502 | {
503 | "data": {
504 | "text/plain": [
505 | "tensor([0.])"
506 | ]
507 | },
508 | "execution_count": 1,
509 | "metadata": {},
510 | "output_type": "execute_result"
511 | }
512 | ],
513 | "source": [
514 | "f(torch.tensor(-1))"
515 | ]
516 | },
517 | {
518 | "cell_type": "code",
519 | "execution_count": 20,
520 | "metadata": {
521 | "execution": {
522 | "iopub.execute_input": "2020-11-27T14:31:25.806024Z",
523 | "iopub.status.busy": "2020-11-27T14:31:25.805284Z",
524 | "iopub.status.idle": "2020-11-27T14:31:25.813152Z",
525 | "shell.execute_reply": "2020-11-27T14:31:25.813381Z"
526 | }
527 | },
528 | "outputs": [
529 | {
530 | "data": {
531 | "text/plain": [
532 | "tensor(1)"
533 | ]
534 | },
535 | "execution_count": 1,
536 | "metadata": {},
537 | "output_type": "execute_result"
538 | }
539 | ],
540 | "source": [
541 | "f(torch.tensor(1))"
542 | ]
543 | },
544 | {
545 | "cell_type": "code",
546 | "execution_count": 21,
547 | "metadata": {
548 | "execution": {
549 | "iopub.execute_input": "2020-11-27T14:31:25.815516Z",
550 | "iopub.status.busy": "2020-11-27T14:31:25.815182Z",
551 | "iopub.status.idle": "2020-11-27T14:31:25.823160Z",
552 | "shell.execute_reply": "2020-11-27T14:31:25.822776Z"
553 | }
554 | },
555 | "outputs": [
556 | {
557 | "data": {
558 | "text/plain": [
559 | "torch.jit.ScriptFunction"
560 | ]
561 | },
562 | "execution_count": 1,
563 | "metadata": {},
564 | "output_type": "execute_result"
565 | }
566 | ],
567 | "source": [
568 | "type(f)"
569 | ]
570 | },
571 | {
572 | "cell_type": "code",
573 | "execution_count": 22,
574 | "metadata": {
575 | "execution": {
576 | "iopub.execute_input": "2020-11-27T14:31:25.825962Z",
577 | "iopub.status.busy": "2020-11-27T14:31:25.825098Z",
578 | "iopub.status.idle": "2020-11-27T14:31:25.833882Z",
579 | "shell.execute_reply": "2020-11-27T14:31:25.833453Z"
580 | }
581 | },
582 | "outputs": [
583 | {
584 | "data": {
585 | "text/plain": [
586 | "graph(%x.1 : Tensor):\n",
587 | " %8 : None = prim::Constant()\n",
588 | " %2 : int = prim::Constant[value=0]() # :3:16\n",
589 | " %5 : int = prim::Constant[value=1]() # :4:29\n",
590 | " %3 : Tensor = aten::lt(%x.1, %2) # :3:12\n",
591 | " %4 : bool = aten::Bool(%3) # :3:7\n",
592 | " %result : Tensor = prim::If(%4) # :3:4\n",
593 | " block0():\n",
594 | " %7 : int[] = prim::ListConstruct(%5)\n",
595 | " %result.1 : Tensor = aten::zeros(%7, %8, %8, %8, %8) # :4:17\n",
596 | " -> (%result.1)\n",
597 | " block1():\n",
598 | " -> (%x.1)\n",
599 | " return (%result)"
600 | ]
601 | },
602 | "execution_count": 1,
603 | "metadata": {},
604 | "output_type": "execute_result"
605 | }
606 | ],
607 | "source": [
608 | "f.graph"
609 | ]
610 | },
611 | {
612 | "cell_type": "markdown",
613 | "metadata": {},
614 | "source": [
615 | "## Storing and restoring"
616 | ]
617 | },
618 | {
619 | "cell_type": "code",
620 | "execution_count": 23,
621 | "metadata": {
622 | "execution": {
623 | "iopub.execute_input": "2020-11-27T14:31:25.836435Z",
624 | "iopub.status.busy": "2020-11-27T14:31:25.836068Z",
625 | "iopub.status.idle": "2020-11-27T14:31:25.843708Z",
626 | "shell.execute_reply": "2020-11-27T14:31:25.843266Z"
627 | }
628 | },
629 | "outputs": [],
630 | "source": [
631 | "torch.jit.save(f, \"scripted.pt\")"
632 | ]
633 | },
634 | {
635 | "cell_type": "code",
636 | "execution_count": 24,
637 | "metadata": {
638 | "execution": {
639 | "iopub.execute_input": "2020-11-27T14:31:25.847133Z",
640 | "iopub.status.busy": "2020-11-27T14:31:25.846686Z",
641 | "iopub.status.idle": "2020-11-27T14:31:25.963596Z",
642 | "shell.execute_reply": "2020-11-27T14:31:25.962539Z"
643 | }
644 | },
645 | "outputs": [
646 | {
647 | "name": "stdout",
648 | "output_type": "stream",
649 | "text": [
650 | "scripted.pt: Zip archive data, at least v?[0] to extract\r\n"
651 | ]
652 | }
653 | ],
654 | "source": [
655 | "!file scripted.pt"
656 | ]
657 | },
658 | {
659 | "cell_type": "code",
660 | "execution_count": 25,
661 | "metadata": {
662 | "execution": {
663 | "iopub.execute_input": "2020-11-27T14:31:25.968061Z",
664 | "iopub.status.busy": "2020-11-27T14:31:25.967602Z",
665 | "iopub.status.idle": "2020-11-27T14:31:25.979424Z",
666 | "shell.execute_reply": "2020-11-27T14:31:25.978497Z"
667 | }
668 | },
669 | "outputs": [],
670 | "source": [
671 | "g = torch.jit.load(\"scripted.pt\")"
672 | ]
673 | },
674 | {
675 | "cell_type": "code",
676 | "execution_count": 26,
677 | "metadata": {
678 | "execution": {
679 | "iopub.execute_input": "2020-11-27T14:31:25.982405Z",
680 | "iopub.status.busy": "2020-11-27T14:31:25.982021Z",
681 | "iopub.status.idle": "2020-11-27T14:31:25.992205Z",
682 | "shell.execute_reply": "2020-11-27T14:31:25.991823Z"
683 | }
684 | },
685 | "outputs": [
686 | {
687 | "data": {
688 | "text/plain": [
689 | "tensor([0.])"
690 | ]
691 | },
692 | "execution_count": 1,
693 | "metadata": {},
694 | "output_type": "execute_result"
695 | }
696 | ],
697 | "source": [
698 | "g(torch.tensor(-1))"
699 | ]
700 | },
701 | {
702 | "cell_type": "code",
703 | "execution_count": 27,
704 | "metadata": {
705 | "execution": {
706 | "iopub.execute_input": "2020-11-27T14:31:25.995114Z",
707 | "iopub.status.busy": "2020-11-27T14:31:25.994647Z",
708 | "iopub.status.idle": "2020-11-27T14:31:26.005153Z",
709 | "shell.execute_reply": "2020-11-27T14:31:26.004702Z"
710 | }
711 | },
712 | "outputs": [
713 | {
714 | "data": {
715 | "text/plain": [
716 | "tensor(1)"
717 | ]
718 | },
719 | "execution_count": 1,
720 | "metadata": {},
721 | "output_type": "execute_result"
722 | }
723 | ],
724 | "source": [
725 | "g(torch.tensor(1))"
726 | ]
727 | },
728 | {
729 | "cell_type": "markdown",
730 | "metadata": {},
731 | "source": [
732 | "## Subclassing `torch.jit.ScriptModule`\n",
733 | "If you work with `nn.Module`, replace it with `torch.jit.ScriptModule` (see [[tutorial]](https://pytorch.org/tutorials/beginner/deploy_seq2seq_hybrid_frontend_tutorial.html) for more).\n",
734 | "\n",
735 | "```python\n",
736 | "class MyModule(torch.jit.ScriptModule):\n",
737 | "    def __init__(self):\n",
738 | "        super().__init__()\n",
739 | "\n",
740 | "    def forward(self, x):\n",
741 | "        # ...\n",
742 | "        return x\n",
743 | "```"
744 | ]
745 | },
746 | {
747 | "cell_type": "markdown",
748 | "metadata": {},
749 | "source": [
750 | "# PyTorch and C++\n",
751 | "\n",
752 | "PyTorch offers a very nice(!) C++ interface which is very close to Python."
753 | ]
754 | },
755 | {
756 | "cell_type": "markdown",
757 | "metadata": {},
758 | "source": [
759 | "## Loading traced models from C++"
760 | ]
761 | },
762 | {
763 | "cell_type": "markdown",
764 | "metadata": {},
765 | "source": [
766 | "```c++\n",
767 | "#include <torch/script.h>\n",
768 | "\n",
769 | "int main(int argc, const char* argv[]) {\n",
770 | "    auto module = torch::jit::load(\"scripted.pt\");\n",
771 | "    // data: std::vector<torch::jit::IValue> holding the input tensors\n",
772 | "    module.forward(data);\n",
773 | "}\n",
774 | "```"
775 | ]
776 | }
777 | ],
778 | "metadata": {
779 | "kernelspec": {
780 | "display_name": "Python 3",
781 | "language": "python",
782 | "name": "python3"
783 | },
784 | "language_info": {
785 | "codemirror_mode": {
786 | "name": "ipython",
787 | "version": 3
788 | },
789 | "file_extension": ".py",
790 | "mimetype": "text/x-python",
791 | "name": "python",
792 | "nbconvert_exporter": "python",
793 | "pygments_lexer": "ipython3",
794 | "version": "3.8.5"
795 | }
796 | },
797 | "nbformat": 4,
798 | "nbformat_minor": 2
799 | }
800 |
--------------------------------------------------------------------------------
/notebooks/torch_jit.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # `torch.jit`
17 | #
18 | # Eager execution is great for development and debugging, but it can be hard to (automatically) optimize the code and deploy it.
19 | #
20 | # Now there is `torch.jit` with two flavours:
21 | #
22 | # - `torch.jit.trace` records the operations executed for one example input; it does not record control flow.
23 | # - `torch.jit.script` records control flow and creates an intermediate representation that can be optimized; only supports a subset of Python.
24 | #
25 | # Note: don't forget `model.eval()` and `model.train()`.
26 | #
27 | #
28 | # ## Ref and More:
29 | # - https://pytorch.org/docs/stable/jit.html
30 | # - https://speakerdeck.com/perone/pytorch-under-the-hood
31 | # - https://lernapparat.de/fast-lstm-pytorch/
32 |
33 | # %% [markdown]
34 | # ## Init, helpers, utils, ...
35 |
36 | # %%
37 | # %load_ext autoreload
38 | # %autoreload 2
39 |
40 | # %matplotlib inline
41 |
42 | # %%
43 | import torch
44 | import torch.nn as nn
45 | import torch.nn.functional as F
46 | import torch.optim as optim
47 | import torchvision
48 |
49 | # %%
50 | from pprint import pprint
51 |
52 | import matplotlib.pyplot as plt
53 | import numpy as np
54 | from IPython.core.debugger import set_trace
55 |
56 | import utils # little helpers
57 | from utils import attr
58 |
59 |
60 | # %% [markdown]
61 | # # `torch.jit.trace`
62 |
63 | # %%
64 | def f(x):
65 |     if x.item() < 0:
66 |         return torch.tensor(0)
67 |     else:
68 |         return x
69 |
70 |
71 | # %%
72 | f(torch.tensor(-1))
73 |
74 | # %%
75 | f(torch.tensor(3))
76 |
77 | # %%
78 | X = torch.tensor(1)
79 | traced = torch.jit.trace(f, X)
80 |
81 | # %%
82 | type(traced)
83 |
84 | # %%
85 | traced(torch.tensor(1))
86 |
87 | # %%
88 | traced.graph
89 |
90 | # %%
91 | traced(torch.tensor(-1))
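The traced graph above simply returns its input, which is why `traced(torch.tensor(-1))` yields `tensor(-1)`: the branch was decided once, for the example input. The following is a minimal sketch of the contrast with scripting. The helper name `clamp_negative` is hypothetical and not part of the notebook, and the warning filter is only there to keep the output quiet.

```python
import warnings

import torch


def clamp_negative(x):
    # Data-dependent Python branch: tracing only follows the path that
    # the example input happens to take.
    if bool(x < 0):
        result = torch.zeros_like(x)
    else:
        result = x
    return result


with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the TracerWarning for this sketch
    traced = torch.jit.trace(clamp_negative, torch.tensor(1))

scripted = torch.jit.script(clamp_negative)

neg = torch.tensor(-1)
print(traced(neg))    # the "else" branch was baked in at trace time
print(scripted(neg))  # the branch is evaluated at run time
```

If the model contains data-dependent control flow, scripting is probably the safer choice; tracing tends to be sufficient for straight-line tensor code.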
92 |
93 | # %% [markdown]
94 | # ## Storing and restoring
95 |
96 | # %%
97 | traced.save("traced.pt")
98 |
99 | # %%
100 | # !file traced.pt
101 |
102 | # %%
103 | g = torch.jit.load("traced.pt")
104 |
105 | # %%
106 | g(torch.tensor(1))
107 |
108 | # %%
109 | g(torch.tensor(-1))
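Storing does not have to go through a file on disk. As a small sketch (the function `double` and the in-memory buffer are illustrative assumptions, not part of the notebook), `torch.jit.save` and `torch.jit.load` also accept file-like objects:

```python
import io

import torch


def double(x):
    # Hypothetical example function; any traceable callable would do.
    return x * 2


traced_double = torch.jit.trace(double, torch.ones(3))

# Serialize into an in-memory buffer instead of a path such as "traced.pt".
buffer = io.BytesIO()
torch.jit.save(traced_double, buffer)

# Rewind before loading, otherwise the reader starts at the end of the buffer.
buffer.seek(0)
restored = torch.jit.load(buffer)

inputs = torch.tensor([1.0, 2.0, 3.0])
print(torch.equal(restored(inputs), traced_double(inputs)))
```

This can be handy in tests, where writing temporary files is undesirable.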
110 |
111 | # %% [markdown]
112 | # # `torch.jit.script`
113 |
114 | # %%
115 | bool(torch.tensor(1) < 2)
116 |
117 |
118 | # %%
119 | @torch.jit.script
120 | def f(x):
121 |     if bool(x < 0):
122 |         result = torch.zeros(1)
123 |     else:
124 |         result = x
125 |     return result
126 |
127 |
128 | # %% [markdown]
129 | # This is TorchScript, which supports only a subset of Python.
130 |
131 | # %%
132 | f(torch.tensor(-1))
133 |
134 | # %%
135 | f(torch.tensor(1))
136 |
137 | # %%
138 | type(f)
139 |
140 | # %%
141 | f.graph
142 |
143 | # %% [markdown]
144 | # ## Storing and restoring
145 |
146 | # %%
147 | torch.jit.save(f, "scripted.pt")
148 |
149 | # %%
150 | # !file scripted.pt
151 |
152 | # %%
153 | g = torch.jit.load("scripted.pt")
154 |
155 | # %%
156 | g(torch.tensor(-1))
157 |
158 | # %%
159 | g(torch.tensor(1))
160 |
161 | # %% [markdown]
162 | # ## Subclassing `torch.jit.ScriptModule`
163 | # If you work with `nn.Module`, replace it with `torch.jit.ScriptModule` (see [[tutorial]](https://pytorch.org/tutorials/beginner/deploy_seq2seq_hybrid_frontend_tutorial.html) for more).
164 | #
165 | # ```python
166 | # class MyModule(torch.jit.ScriptModule):
167 | #     def __init__(self):
168 | #         super().__init__()
169 | #
170 | #     def forward(self, x):
171 | #         # ...
172 | #         return x
173 | # ```
174 |
175 | # %% [markdown]
176 | # # PyTorch and C++
177 | #
178 | # PyTorch offers a very nice(!) C++ interface which is very close to Python.
179 |
180 | # %% [markdown]
181 | # ## Loading traced models from C++
182 |
183 | # %% [markdown]
184 | # ```c++
185 | # #include <torch/script.h>
186 | #
187 | # int main(int argc, const char* argv[]) {
188 | #     auto module = torch::jit::load("scripted.pt");
189 | #     // data: std::vector<torch::jit::IValue> holding the input tensors
190 | #     module.forward(data);
191 | # }
192 | # ```
193 |
--------------------------------------------------------------------------------
/notebooks/transfer_learning.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # Transfer learning with PyTorch
17 | # We're going to train a neural network to classify dogs and cats.
18 |
19 | # %% [markdown]
20 | # ## Init, helpers, utils, ...
21 |
22 | # %%
23 | # %matplotlib inline
24 |
25 | # %%
26 | import torch
27 | import torch.nn as nn
28 | import torch.nn.functional as F
29 | import torch.optim as optim
30 | import torchvision
31 |
32 | DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
33 | DEVICE
34 |
35 | # %%
36 | from pprint import pprint
37 |
38 | import matplotlib.pyplot as plt
39 | import numpy as np
40 | from IPython.core.debugger import set_trace
41 |
42 |
43 | # %%
44 | # # %load my_train_helper.py
45 | def get_trainable(model_params):
46 |     return (p for p in model_params if p.requires_grad)
47 |
48 |
49 | def get_frozen(model_params):
50 |     return (p for p in model_params if not p.requires_grad)
51 |
52 |
53 | def all_trainable(model_params):
54 |     return all(p.requires_grad for p in model_params)
55 |
56 |
57 | def all_frozen(model_params):
58 |     return all(not p.requires_grad for p in model_params)
59 |
60 |
61 | def freeze_all(model_params):
62 |     for param in model_params:
63 |         param.requires_grad = False
64 |
65 |
66 |
67 | # %% [markdown] toc-hr-collapsed=true
68 | # # The Data - DogsCatsDataset
69 |
70 | # %% [markdown]
71 | # ## Transforms
72 |
73 | # %%
74 | from torchvision import transforms
75 |
76 | _image_size = 224
77 | _mean = [0.485, 0.456, 0.406]
78 | _std = [0.229, 0.224, 0.225]
79 |
80 |
81 | train_trans = transforms.Compose([
82 |     transforms.Resize(256),  # some images are pretty small
83 |     transforms.RandomCrop(_image_size),
84 |     transforms.RandomHorizontalFlip(),
85 |     transforms.ColorJitter(.3, .3, .3),
86 |     transforms.ToTensor(),
87 |     transforms.Normalize(_mean, _std),
88 | ])
89 | val_trans = transforms.Compose([
90 |     transforms.Resize(256),
91 |     transforms.CenterCrop(_image_size),
92 |     transforms.ToTensor(),
93 |     transforms.Normalize(_mean, _std),
94 | ])
95 |
96 | # %% [markdown]
97 | # ## Dataset
98 | #
99 | # The implementation of the dataset does not really matter.
100 |
101 | # %%
102 | from torchvision.datasets.folder import ImageFolder
103 |
104 | # %%
105 | train_ds = ImageFolder("dogscats/training_set/", transform=train_trans)
106 | val_ds = ImageFolder("dogscats/test_set/", transform=val_trans)
107 |
108 | batch_size = 32
109 | n_classes = 2
110 |
111 | # %% [markdown]
112 | # Use the following if you want to use the full dataset:
113 |
114 | # %%
115 | # train_ds = DogsCatsDataset("../data/raw", "train", transform=train_trans)
116 | # val_ds = DogsCatsDataset("../data/raw", "valid", transform=val_trans)
117 |
118 | # batch_size = 128
119 | # n_classes = 2
120 |
121 | # %%
122 | len(train_ds), len(val_ds)
123 |
124 | # %% [markdown]
125 | # ## DataLoader
126 | # Batch loading for datasets with multi-processing and different sampling strategies.
127 |
128 | # %%
129 | from torch.utils.data import DataLoader
130 |
131 |
132 | train_dl = DataLoader(
133 |     train_ds,
134 |     batch_size=batch_size,
135 |     shuffle=True,
136 |     num_workers=16,
137 | )
138 |
139 | val_dl = DataLoader(
140 |     val_ds,
141 |     batch_size=batch_size,
142 |     shuffle=False,
143 |     num_workers=16,
144 | )
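To see what a `DataLoader` actually yields without downloading any images, here is a small sketch on synthetic tensors. The names `toy_ds` and `toy_dl` and all values are illustrative assumptions; `num_workers=0` keeps loading in the main process.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten toy samples with three features each (illustrative values only).
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.arange(10) % 2
toy_ds = TensorDataset(features, labels)

# shuffle=False keeps the sample order fixed, as for the validation loader.
toy_dl = DataLoader(toy_ds, batch_size=4, shuffle=False, num_workers=0)

# Each iteration yields one (inputs, targets) batch; the last one may be smaller.
batch_sizes = [X.shape[0] for X, y in toy_dl]
print(batch_sizes)  # [4, 4, 2]
```

The incomplete last batch is the reason the training loop weights each batch loss by `X.shape[0]` before averaging.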
145 |
146 | # %% [markdown]
147 | # # The Model
148 | # PyTorch offers quite a few [pre-trained networks](https://pytorch.org/docs/stable/torchvision/models.html) such as:
149 | # - AlexNet
150 | # - VGG
151 | # - ResNet
152 | # - SqueezeNet
153 | # - DenseNet
154 | # - Inception v3
155 | #
156 | # And there are more available via [pretrained-models.pytorch](https://github.com/Cadene/pretrained-models.pytorch):
157 | # - NASNet,
158 | # - ResNeXt,
159 | # - InceptionV4,
160 | # - InceptionResnetV2,
161 | # - Xception,
162 | # - DPN,
163 | # - ...
164 | #
165 | # We'll use a simple resnet18 model:
166 |
167 | # %%
168 | from torchvision import models
169 |
170 | model = models.resnet18(pretrained=True)
171 |
172 | # %%
173 | model
174 |
175 | # %%
176 | import torchsummary
177 |
178 | torchsummary.summary(model, (3, 224, 224), device="cpu")
179 |
180 | # %%
181 | nn.Linear(2, 1, bias=True)
182 |
183 | # %%
184 | # Freeze all parameters manually
185 | for param in model.parameters():
186 |     param.requires_grad = False
187 |
188 | # %%
189 | # Or use our convenient functions from before
190 | freeze_all(model.parameters())
191 | assert all_frozen(model.parameters())
192 |
193 | # %% [markdown]
194 | # Replace the last layer with a linear layer. New layers have `requires_grad = True`.
195 |
196 | # %%
197 | model.fc = nn.Linear(512, n_classes)
198 |
199 | # %%
200 | assert not all_frozen(model.parameters())
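The freeze-then-replace pattern can be checked on a tiny stand-in network, which avoids downloading pretrained weights. This is only a sketch: the `nn.Sequential` model below is a hypothetical substitute for the ResNet backbone, and the layer sizes are arbitrary.

```python
import torch.nn as nn

# Tiny stand-in for a pretrained backbone (hypothetical, no weight download).
toy_model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 3))

# Freeze everything that exists so far.
for param in toy_model.parameters():
    param.requires_grad = False

# Replace the last layer; freshly created layers have requires_grad=True.
toy_model[2] = nn.Linear(4, 2)

n_trainable = sum(p.numel() for p in toy_model.parameters() if p.requires_grad)
n_frozen = sum(p.numel() for p in toy_model.parameters() if not p.requires_grad)
print(n_trainable, n_frozen)  # 10 36
```

Only the new head (4 * 2 weights plus 2 biases) remains trainable, which is exactly what the optimizer should receive.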
201 |
202 |
203 | # %%
204 | def get_model(n_classes=2):
205 |     model = models.resnet18(pretrained=True)
206 |     freeze_all(model.parameters())
207 |     model.fc = nn.Linear(512, n_classes)
208 |     model = model.to(DEVICE)
209 |     return model
210 |
211 |
212 | model = get_model()
213 |
214 | # %% [markdown]
215 | # # The Loss
216 |
217 | # %%
218 | criterion = nn.CrossEntropyLoss()
219 |
220 | # %% [markdown]
221 | # # The Optimizer
222 |
223 | # %%
224 | optimizer = torch.optim.Adam(
225 |     get_trainable(model.parameters()),
226 |     lr=0.001,
227 |     # momentum=0.9,
228 | )
229 |
230 | # %% [markdown]
231 | # # The Train Loop
232 |
233 | # %%
234 | N_EPOCHS = 1
235 |
236 | for epoch in range(N_EPOCHS):
237 |
238 |     # Train
239 |     model.train()  # IMPORTANT
240 |
241 |     total_loss, n_correct, n_samples = 0.0, 0, 0
242 |     for batch_i, (X, y) in enumerate(train_dl):
243 |         X, y = X.to(DEVICE), y.to(DEVICE)
244 |
245 |         optimizer.zero_grad()
246 |         y_ = model(X)
247 |         loss = criterion(y_, y)
248 |         loss.backward()
249 |         optimizer.step()
250 |
251 |         # Statistics
252 |         print(
253 |             f"Epoch {epoch+1}/{N_EPOCHS} |"
254 |             f" batch: {batch_i} |"
255 |             f" batch loss: {loss.item():0.3f}"
256 |         )
257 |         _, y_label_ = torch.max(y_, 1)
258 |         n_correct += (y_label_ == y).sum().item()
259 |         total_loss += loss.item() * X.shape[0]
260 |         n_samples += X.shape[0]
261 |
262 |     print(
263 |         f"Epoch {epoch+1}/{N_EPOCHS} |"
264 |         f" train loss: {total_loss / n_samples:9.3f} |"
265 |         f" train acc: {n_correct / n_samples * 100:9.3f}%"
266 |     )
267 |
268 |
269 |     # Eval
270 |     model.eval()  # IMPORTANT
271 |
272 |     total_loss, n_correct, n_samples = 0.0, 0, 0
273 |     with torch.no_grad():  # IMPORTANT
274 |         for X, y in val_dl:
275 |             X, y = X.to(DEVICE), y.to(DEVICE)
276 |
277 |             y_ = model(X)
278 |
279 |             # Statistics
280 |             _, y_label_ = torch.max(y_, 1)
281 |             n_correct += (y_label_ == y).sum().item()
282 |             loss = criterion(y_, y)
283 |             total_loss += loss.item() * X.shape[0]
284 |             n_samples += X.shape[0]
285 |
286 |
287 |     print(
288 |         f"Epoch {epoch+1}/{N_EPOCHS} |"
289 |         f" valid loss: {total_loss / n_samples:9.3f} |"
290 |         f" valid acc: {n_correct / n_samples * 100:9.3f}%"
291 |     )
292 |
293 |
294 | # %% [markdown]
295 | # # Exercise
296 | # - Create your own module which takes any of the existing pre-trained models as a backbone and adds a problem-specific head.
297 |
298 | # %%
299 | class Net(nn.Module):
300 |     def __init__(self, backbone: nn.Module, n_classes: int):
301 |         super().__init__()
302 |         # self.backbone
303 |         # self.head = init_head(n_classes)
304 |
305 |     def forward(self, x):
306 |         # TODO
307 |         return x
308 |
--------------------------------------------------------------------------------
/notebooks/visualize_model_loss_optimizer.py:
--------------------------------------------------------------------------------
1 | # ---
2 | # jupyter:
3 | # jupytext:
4 | # text_representation:
5 | # extension: .py
6 | # format_name: percent
7 | # format_version: '1.3'
8 | # jupytext_version: 1.7.1
9 | # kernelspec:
10 | # display_name: Python 3
11 | # language: python
12 | # name: python3
13 | # ---
14 |
15 | # %% [markdown]
16 | # # Software vs Machine Learning
17 | #
18 | # 
19 | #
20 | # 
21 |
22 | # %% [markdown]
23 | # # Widget to visualize linear regression, error, and loss
24 |
25 | # %%
26 | import numpy as np
27 | import altair as alt
28 | import pandas as pd
29 | import ipywidgets
30 |
31 |
32 | # %%
33 | def f(x, slope: float, bias: float):
34 |     """A simple linear model."""
35 |     return x * slope + bias
36 |
37 |
38 | # %%
39 | def err2(pred, true):
40 |     return (true - pred) ** 2
41 |
42 | def mse(pred, true):
43 |     return np.mean(err2(pred, true))
44 |
45 |
46 | # %%
47 | n = 20
48 | std = 4
49 |
50 | x = np.linspace(-10, 10, n)
51 | noise = np.random.normal(0, 2, size=n)
52 |
53 | y = f(x, slope=1.3, bias=5) + noise
54 |
55 | data = pd.DataFrame({"x": x, "y": y})
56 |
57 | # %%
58 | slope_dom = np.linspace(-2, 4.5, 66)
59 | slope_losses = {
60 |     _slope: mse(f(x, _slope, bias=5), y)
61 |     for _slope in slope_dom
62 | }
63 | df_slope_losses = pd.DataFrame({
64 |     "slope": slope_losses.keys(),
65 |     "loss": slope_losses.values(),
66 | })
67 |
68 | # %%
69 | alt.renderers.enable('altair_viewer')
70 |
71 |
72 | # %%
73 | def show_lin_reg(
74 |     slope: float,
75 |     bias: float,
76 |     show_pred=True,
77 |     show_err=True,
78 |     show_err2=False,
79 |     show_loss_landscape=False,
80 | ):
81 |
82 |     pred = x * slope + bias
83 |
84 |     data["pred"] = pred
85 |     data["err"] = y - pred
86 |     data["err2"] = (y - pred) ** 2
87 |     data["x2"] = x - data["err"]
88 |
89 |     mse = np.mean(data['err2'])
90 |     mae = np.mean(np.abs(data['err']))
91 |
92 |     chart = (
93 |         alt.Chart(data)
94 |         .mark_point()
95 |         .encode(x="x", y="y")
96 |         .properties(title=f"Lin Reg | MSE: {mse:5.01f} | MAE: {mae:5.02f}")
97 |     )
98 |     if show_pred:
99 |         chart += (
100 |             alt.Chart(data)
101 |             .mark_line()
102 |             .encode(x="x", y="pred")
103 |         )
104 |     if show_err:
105 |         chart += (
106 |             alt.Chart(data)
107 |             .mark_line()
108 |             .encode(x="x", y="y", y2="pred")
109 |
110 |         )
111 |     if show_err2:
112 |         chart += (
113 |             alt.Chart(data)
114 |             .mark_rect(fill="none", stroke="red")
115 |             .encode(x="x", y="y", x2="x2", y2="pred")
116 |
117 |         )
118 |
119 |
120 |     if not show_loss_landscape:
121 |         return chart
122 |
123 |     _chart_loss = (
124 |         alt.Chart(df_slope_losses)
125 |         .mark_line()
126 |         .encode(x="slope", y="loss")
127 |         .properties(title="Loss Landscape (slope)")
128 |     )
129 |     _chart_loss_hl = (
130 |         alt.Chart(pd.DataFrame({"x": [slope], "y": [0], "y2": [400]}))
131 |         .mark_line()
132 |         .encode(x="x", y="y", y2="y2")
133 |     )
134 |     return chart | (_chart_loss + _chart_loss_hl)
135 |
136 | # %%
137 | # show_lin_reg(
138 | # slope=.3,
139 | # bias=8,
140 | # show_pred=True,
141 | # show_err=True,
142 | # show_err2=False,
143 | # )
144 |
145 | # %%
146 | ipywidgets.interact(
147 |     show_lin_reg,
148 |     slope=(-2.0, 2.0),
149 |     bias=(-8.0, 8.0),
150 |     show_pred=True,
151 |     show_err=False,
152 | )
153 |
154 | # %% [markdown]
155 | # ## Linear regression - more formally
156 | #
157 | # 0. Data
158 | #
159 | #
160 | # 1. Model:
161 | # - $f(X) = X \beta = \hat y$
162 | #
163 | #
164 | # 2. Loss / criterion:
165 | # - $ err_i = y_i - f(X_i)$
166 | # - $MSE = \frac{1}{n} \sum_{i=1}^{n} err_i^2$
167 | #
168 | #
169 | # 3. Optimize:
170 | # - minimizing the MSE yields the optimal $\hat\beta$ (after doing some math)
171 | # - $\hat\beta = (X^TX)^{-1}X^Ty$
172 | # - (or, more generally, use gradient descent to optimize the parameters)
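The two optimization routes listed above can be compared numerically. The sketch below assumes noise-free toy data from $y = 1.3x + 5$ and a hand-picked learning rate; neither is taken from the notebook. It solves the normal equation $\hat\beta = (X^TX)^{-1}X^Ty$ directly and then runs plain gradient descent on the MSE.

```python
import numpy as np

# Noise-free toy data generated from y = 1.3 * x + 5 (illustrative values).
x = np.linspace(-10, 10, 20)
y = 1.3 * x + 5

# Design matrix with a column of ones, so that beta = (slope, bias).
X = np.column_stack([x, np.ones_like(x)])

# Closed form: solve (X^T X) beta = X^T y rather than inverting explicitly.
beta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the MSE; its gradient is (2 / n) * X^T (X beta - y).
beta_gd = np.zeros(2)
learning_rate = 0.01  # assumed step size, small enough for this data
for _ in range(5000):
    grad = 2 / len(y) * X.T @ (X @ beta_gd - y)
    beta_gd -= learning_rate * grad

print(beta_closed)
print(beta_gd)
```

Both routes should land on roughly the same slope and bias here; gradient descent becomes the practical option once the model is no longer linear in its parameters.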
173 |
--------------------------------------------------------------------------------
/notebooks/working_with_data.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | # ---
3 | # jupyter:
4 | # jupytext:
5 | # text_representation:
6 | # extension: .py
7 | # format_name: percent
8 | # format_version: '1.3'
9 | # jupytext_version: 1.7.1
10 | # kernelspec:
11 | # display_name: Python 3
12 | # language: python
13 | # name: python3
14 | # ---
15 |
16 | # %% [markdown]
17 | # # Working with Data: `Dataset`, `DataLoader`, `Sampler`, and `Transforms`
18 | #
19 | # These basic concepts make it easy to work with large data.
20 |
21 | # %% [markdown]
22 | # ## Init, helpers, utils, ...
23 |
24 | # %%
25 | # %matplotlib inline
26 |
27 | # %%
28 | import torch
29 | import torch.nn as nn
30 | import torch.nn.functional as F
31 | import torch.optim as optim
32 | import torchvision
33 |
34 | DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
35 |
36 | # %%
37 | from pprint import pprint
38 |
39 | import matplotlib.pyplot as plt
40 | import numpy as np
41 | from IPython.core.debugger import set_trace
42 |
43 | # %% [markdown]
44 | # # Dataset
45 | # It's easy to create your own `Dataset`,
46 | # but PyTorch comes with some
47 | # [built-in datasets](https://pytorch.org/docs/stable/torchvision/datasets.html):
48 | #
49 | # - MNIST
50 | # - Fashion-MNIST
51 | # - KMNIST
52 | # - EMNIST
53 | # - FakeData
54 | # - COCO
55 | # - Captions
56 | # - Detection
57 | # - LSUN
58 | # - ImageFolder
59 | # - DatasetFolder
60 | # - Imagenet-12
61 | # - CIFAR
62 | # - STL10
63 | # - SVHN
64 | # - PhotoTour
65 | # - SBU
66 | # - Flickr
67 | # - VOC
68 | # - Cityscapes
69 | #
70 | # `Dataset` gives you information about the number of samples (implement `__len__`) and gives you the sample at a given index (implement `__getitem__`).
71 | # It's a nice and simple abstraction to work with data.
72 |
73 | # %%
74 | from torch.utils.data import Dataset
75 |
76 | # %% [markdown]
77 | # ```python
78 | # class Dataset(object):
79 | #     def __getitem__(self, index):
80 | #         raise NotImplementedError
81 | #
82 | #     def __len__(self):
83 | #         raise NotImplementedError
84 | #
85 | #     def __add__(self, other):
86 | #         return ConcatDataset([self, other])
87 | # ```
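Implementing the interface above takes only a few lines. The following is a minimal sketch with a hypothetical `SquaresDataset` (not part of the notebook) whose sample `i` is the pair `(i, i**2)`.

```python
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i**2)."""

    def __init__(self, n_samples: int):
        self.n_samples = n_samples

    def __len__(self):
        return self.n_samples

    def __getitem__(self, index):
        # Raising IndexError lets plain iteration over the dataset terminate.
        if not 0 <= index < self.n_samples:
            raise IndexError(index)
        return index, index ** 2


ds = SquaresDataset(5)
print(len(ds), ds[2])  # 5 (2, 4)

# Dataset.__add__ returns a ConcatDataset that chains both datasets.
combined = ds + ds
print(len(combined))   # 10
```

A real dataset would typically load and transform a file inside `__getitem__` instead of computing a value.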
88 |
89 | # %% [markdown]
90 | # The `ImageFolder` dataset is quite useful and follows the usual conventions for folder layouts:
91 | #
92 | # ```
93 | # root/dog/xxx.png
94 | # root/dog/xxy.png
95 | # root/dog/xxz.png
96 | #
97 | # root/cat/123.png
98 | # root/cat/nsdf3.png
99 | # root/cat/asd932_.png
100 | # ```
101 |
102 | # %% [markdown]
103 | # ## Example: dogs and cats dataset
104 | # Please download the dataset from
105 | # https://www.kaggle.com/chetankv/dogs-cats-images
106 | # and place it in the `notebooks/` folder.
107 |
108 | # %%
109 | # !tree -d dogscats/
110 |
111 | # %%
112 | from torchvision.datasets.folder import ImageFolder
113 |
114 | train_ds = ImageFolder("dogscats/training_set/")
115 |
116 | # %%
117 | train_ds
118 |
119 | # %%
120 | # the __len__ method
121 | len(train_ds)
122 |
123 | # %%
124 | # the __getitem__ method
125 | train_ds[0]
126 |
127 | # %%
128 | train_ds[0][0]
129 |
130 | # %%
131 | train_ds[0][1]
132 |
133 | # %% [markdown]
134 | # Optionally, some datasets offer convenience functions and attributes.
135 | # This is not enforced by the interface! Don't rely on it!
136 |
137 | # %%
138 | train_ds.classes
139 |
140 | # %%
141 | train_ds.class_to_idx
142 |
143 | # %%
144 | train_ds.imgs
145 |
146 | # %%
147 |
148 | # %%
149 | import random
150 |
151 | rand_idx = np.random.randint(0, len(train_ds), 4)
152 | for i in rand_idx:
153 |     img, label_id = train_ds[i]
154 |     print(label_id, train_ds.classes[label_id], i)
155 |     display(img)
156 |
157 | # %% [markdown]
158 | # # `torchvision.transforms`
159 | #
160 | # Common image transformations that can be composed/chained [[docs]](https://pytorch.org/docs/stable/torchvision/transforms.html).
161 |
162 | # %%
163 | from torchvision import transforms
164 |
165 | # %%
166 | _image_size = 224
167 | _mean = [0.485, 0.456, 0.406]
168 | _std = [0.229, 0.224, 0.225]
169 |
170 |
171 | trans = transforms.Compose([
172 |     transforms.Resize(256),
173 |     transforms.RandomCrop(_image_size),
174 |     transforms.RandomHorizontalFlip(),
175 |     transforms.ColorJitter(.3, .3, .3),
176 |     transforms.ToTensor(),
177 |     transforms.Normalize(_mean, _std),
178 | ])
179 |
180 | trans(train_ds[7074][0])
181 |
182 | # %% [markdown]
183 | # ## `torchvision.transforms.functional`
184 | #
185 | # >Functional transforms give you fine-grained control of the transformation pipeline. As opposed to the transformations above, functional transforms don’t contain a random number generator for their parameters. That means you have to specify/generate all parameters, but you can reuse the functional transform. For example, you can apply a functional transform to multiple images like this:
186 | # >
187 | # > https://pytorch.org/docs/stable/torchvision/transforms.html#functional-transforms
188 | #
189 | # ```python
190 | # import torchvision.transforms.functional as TF
191 | # import random
192 | #
193 | # def my_segmentation_transforms(image, segmentation):
194 | #     if random.random() > 0.5:
195 | #         angle = random.randint(-30, 30)
196 | #         image = TF.rotate(image, angle)
197 | #         segmentation = TF.rotate(segmentation, angle)
198 | #     # more transforms ...
199 | #     return image, segmentation
200 | # ```
201 |
202 | # %% [markdown]
203 | # Ref:
204 | # - https://pytorch.org/docs/stable/torchvision/transforms.html
205 | # - https://pytorch.org/docs/stable/torchvision/transforms.html#functional-transforms
206 | # - https://pytorch.org/tutorials/beginner/data_loading_tutorial.html
207 | # - https://github.com/mdbloice/Augmentor
208 | # - https://github.com/aleju/imgaug
209 | #
210 | # Shout-out:
211 | # - High-performance image augmentation with pillow-simd [[github]](https://github.com/uploadcare/pillow-simd) [[benchmark]](http://python-pillow.org/pillow-perf/)
212 | # - Improving Deep Learning Performance with AutoAugment [[blog]](https://ai.googleblog.com/2018/06/improving-deep-learning-performance.html) [[paper]](https://arxiv.org/abs/1805.09501) [[pytorch implementation]](https://github.com/DeepVoltaire/AutoAugment)
213 |
214 | # %% [markdown]
215 | # # DataLoader
216 | # The `DataLoader` class offers batch loading of datasets with multiprocessing and different sampling strategies [[docs]](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader).
217 | #
218 | # The signature looks something like this:
219 | # ```python
220 | # DataLoader(
221 | #     dataset,
222 | #     batch_size=1,
223 | #     shuffle=False,
224 | #     sampler=None,
225 | #     batch_sampler=None,
226 | #     num_workers=0,
227 | #     collate_fn=default_collate,
228 | #     pin_memory=False,
229 | #     drop_last=False,
230 | #     timeout=0,
231 | #     worker_init_fn=None
232 | # )
233 | # ```
234 |
235 | # %%
236 | from torch.utils.data import DataLoader
237 |
238 | # %%
239 | train_ds = ImageFolder("dogscats/training_set/", transform=trans)
240 | train_dl = DataLoader(
241 |     train_ds,
242 |     batch_size=2,
243 |     shuffle=True,
244 |     num_workers=4,
245 | )
246 |
247 | # %%
248 | train_iter = iter(train_dl)
249 | X, y = next(train_iter)
250 |
251 | # %%
252 | print("X:", X.shape)
253 | print("y:", y.shape)
254 |
255 | # %% [markdown]
256 | # Note that I passed `trans`, which returns a `torch.Tensor` instead of a Pillow image.
257 | # The default `collate_fn` of `DataLoader` expects tensors, numbers, dicts or lists.
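To illustrate what default batching does with such samples, here is a small sketch with a dataset that returns dicts. `DictDataset` and its keys `"x"` and `"y"` are hypothetical names made up for this example.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class DictDataset(Dataset):
    """Hypothetical dataset whose samples are dicts."""

    def __len__(self):
        return 6

    def __getitem__(self, idx):
        # One tensor feature and one plain Python int label per sample.
        return {"x": torch.full((3,), float(idx)), "y": idx % 2}


dict_dl = DataLoader(DictDataset(), batch_size=4, shuffle=False)
batch = next(iter(dict_dl))

# The batch is again a dict: tensors are stacked along a new first
# dimension and the Python ints are turned into one integer tensor.
print(batch["x"].shape, batch["y"])
```

The dict structure of a single sample is preserved; only the values are batched.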
258 |
259 | # %%
260 | _train_ds = ImageFolder("dogscats/test_set/", transform=trans)
261 | _train_dl = DataLoader(_train_ds, batch_size=2, shuffle=True)
262 |
263 |
264 | # %% [markdown]
265 | # ## `collate_fn`
266 | # The `collate_fn` argument of `DataLoader` allows you to customize how single datapoints are put together into a batch.
267 | # `collate_fn` is a simple callable that gets a list of datapoints (i.e. what `dataset.__getitem__` returns).
268 |
269 | # %% [markdown]
270 | # Example of a custom `collate_fn`
271 | # (taken from [here](https://discuss.pytorch.org/t/how-to-create-a-dataloader-with-variable-size-input/8278/3)):
272 |
273 | # %%
274 | def my_collate_fn(list_of_x_y):
275 |     data = [item[0] for item in list_of_x_y]
276 |     target = [item[1] for item in list_of_x_y]
277 |     target = torch.LongTensor(target)
278 |     return [data, target]
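A common variation, sketched here under the assumption that the inputs are 1-D sequences of different lengths, pads every batch to its longest sequence instead of returning a list. `pad_collate_fn` and the toy `samples` are hypothetical names for this example; `pad_sequence` is the helper from `torch.nn.utils.rnn`.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Hypothetical toy data: sequences of different lengths with an integer label.
samples = [
    (torch.ones(2), 0),
    (torch.ones(5), 1),
    (torch.ones(3), 0),
    (torch.ones(4), 1),
]


def pad_collate_fn(list_of_x_y):
    xs = [x for x, _ in list_of_x_y]
    ys = torch.tensor([y for _, y in list_of_x_y], dtype=torch.long)
    lengths = torch.tensor([len(x) for x in xs])
    # Pad with zeros up to the longest sequence in *this* batch.
    padded = pad_sequence(xs, batch_first=True)
    return padded, lengths, ys


# A plain list works as a map-style dataset: it has __len__ and __getitem__.
seq_dl = DataLoader(samples, batch_size=2, collate_fn=pad_collate_fn)
shapes = [tuple(padded.shape) for padded, _, _ in seq_dl]
```

Returning the original `lengths` alongside the padded tensor keeps enough information to mask out the padding later.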
279 |
280 |
281 | # %% [markdown]
282 | # # Sampler
283 | # A `Sampler` defines **how** to sample from the dataset [[docs]](https://pytorch.org/docs/stable/data.html#torch.utils.data.sampler.Sampler).
284 | #
285 | # Examples:
286 | # - `SequentialSampler`
287 | # - `RandomSampler`
288 | # - `SubsetRandomSampler`
289 | # - `WeightedRandomSampler`
290 | #
291 | # Write your own by simply implementing `__iter__` to iterate over the indices of the dataset.
292 | #
293 | # ```python
294 | # class Sampler(object):
295 | #     def __init__(self, data_source):
296 | #         pass
297 | #
298 | #     def __iter__(self):
299 | #         raise NotImplementedError
300 | #
301 | #     def __len__(self):
302 | #         raise NotImplementedError
303 | # ```
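A minimal sketch of a custom sampler, assuming nothing more than the interface above: the hypothetical `ReverseSampler` below walks the dataset back to front and is handed to `DataLoader` through its `sampler` argument.

```python
import torch
from torch.utils.data import DataLoader, Sampler, TensorDataset


class ReverseSampler(Sampler):
    """Hypothetical sampler that visits the dataset back to front."""

    def __init__(self, data_source):
        self.data_source = data_source

    def __iter__(self):
        # Yield dataset indices, last one first.
        return iter(range(len(self.data_source) - 1, -1, -1))

    def __len__(self):
        return len(self.data_source)


numbers_ds = TensorDataset(torch.arange(6))
numbers_dl = DataLoader(numbers_ds, batch_size=2, sampler=ReverseSampler(numbers_ds))

# TensorDataset samples are 1-tuples, so each batch is a list with one tensor.
batches = [batch[0].tolist() for batch in numbers_dl]
```

Note that `sampler` and `shuffle=True` are mutually exclusive in `DataLoader`; the sampler alone decides the order.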
304 |
305 | # %% [markdown]
306 | # # Recap
307 | # - `Dataset`: get one datapoint
308 | # - `transforms`: composable transformations
309 | # - `DataLoader`: combine single datapoints into batches (plus multiprocessing and more)
310 | # - `Sampler`: **how** to sample from a dataset
311 | #
312 | # **Simple but extensible interfaces**
313 |
314 | # %% [markdown]
315 | # # Exercise
316 | # Go out and play:
317 | #
318 | # - Maybe extend the `DogsCatsDataset` such that you can specify the size of the dataset, i.e. the number of samples.
319 | # - Maybe try the `Subset` [[docs]](https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset) to create smaller datasets.
320 | # - Maybe create `SubsetFraction` where you can specify the size of the dataset as a fraction (between 0. and 1.).
321 | # - Maybe write a custom collate function for the `DogsCatsDataset` that turns it into a dataset appropriate for use in an autoencoder setting.
322 |
323 | # %%
324 | def autoencoder_collate_fn(list_of_x_y):
325 |     # TODO implement me
326 |     pass
327 |
328 |
329 | # %%
330 | class MyDataSet(Dataset):
331 |     def __init__(self):
332 |         super().__init__()
333 |         # TODO implement me
334 |
335 |     def __len__(self):
336 |         # TODO implement me
337 |         pass
338 |
339 |     def __getitem__(self, idx):
340 |         # TODO implement me
341 |         pass
342 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | torch==1.7
2 | torchvision
3 | jupyterlab
4 | jupytext
5 | notebook
6 | ipywidgets
7 | matplotlib
8 | scikit-learn
9 | altair
10 | altair_viewer
11 | vega_datasets
12 |
--------------------------------------------------------------------------------