├── .gitignore
├── LICENSE
├── README.md
└── VisualizingActivations.ipynb

/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2019 Josh Varty

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Visualizing Activations

Full Notebook:

In [Lecture 10](https://forums.fast.ai/t/lesson-10-discussion-wiki-2019/42781/306) we looked at a few approaches to using hooks and plotting the means and standard deviations of our network's activations.

This seems like it might be useful as a debugging strategy or sanity check on real-world models, so I wanted to try instrumenting my own network. For simplicity's sake I chose to train ResNet-18 on the MNIST dataset.
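
The hook-based approach mentioned above can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not the notebook's code: the small `model`, the `ActivationStats` helper, and the random MNIST-shaped batches are hypothetical stand-ins for ResNet-18 and the real data. The only library feature it relies on is `nn.Module.register_forward_hook`.

```python
import torch
import torch.nn as nn


class ActivationStats:
    """Records the mean and std of a module's output on every forward pass."""

    def __init__(self, module: nn.Module):
        self.means, self.stds = [], []
        # register_forward_hook returns a handle that can detach the hook later.
        self.handle = module.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        # Detach so that storing statistics does not keep the autograd graph alive.
        out = output.detach()
        self.means.append(out.mean().item())
        self.stds.append(out.std().item())

    def remove(self):
        self.handle.remove()


torch.manual_seed(0)

# Hypothetical stand-in for ResNet-18: two small conv stages and a linear head.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

# Instrument only the convolutional layers, keyed by their name in the model.
stats = {
    name: ActivationStats(m)
    for name, m in model.named_children()
    if isinstance(m, nn.Conv2d)
}

# Random batches shaped like MNIST (batch, channel, 28, 28) stand in for real data.
with torch.no_grad():
    for _ in range(3):
        model(torch.randn(32, 1, 28, 28))

for name, s in stats.items():
    print(name, s.means, s.stds)

# Remove the hooks once the statistics have been collected.
for s in stats.values():
    s.remove()
```

Forward hooks also fire during ordinary training, so attaching the same helper before a training loop would yield one mean and one standard deviation per batch for each instrumented layer, which can then be plotted over time.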

The structure of ResNet-18 looks like:

![image|443x400](https://i.imgur.com/1g29NJ9.png)

I've chosen to instrument `conv1`, `conv2_x`, `conv3_x`, `conv4_x`, and `conv5_x`.

I ran `.fit()` for `3` epochs using a learning rate of `1e-2`. I used a validation split of 50% because the graphs get too wide when there are too many items in the training set.

### Pretrained ResNet-18 Activations

Trains to an error rate of `0.033000`

![image|690x452](https://i.imgur.com/gOgkdMX.png)

### Untrained ResNet-18 Activations

Trains to an error rate of `0.034486`

![image|690x449](https://i.imgur.com/U9Jr8wQ.png)

### Some thoughts

- Both models train to comparable error rates, despite having what appear to be wildly different activations.
- All of the layers of the untrained model change considerably from where they started.

### Let's Break the Pretrained ResNet-18 Model

Out of curiosity, what happens if we use a learning rate that is too large?

Trained with a learning rate of `1` to an error rate of `0.808114`

![image|690x457](https://i.imgur.com/G5AAWFO.png)

### Let's Break the Untrained ResNet-18 Model

Out of curiosity, what happens if we use a learning rate that is too large?

Trained with a learning rate of `1` to an error rate of `0.898943`

![image|690x457](https://i.imgur.com/O69SBKE.png)

The untrained model's weights descend into some kind of pattern. In general, it looks like most of the activations are collapsing to values closer to zero.

--------------------------------------------------------------------------------