├── .gitignore ├── README.md ├── images ├── ct.jpeg ├── gender_race.png ├── gender_workclass.png └── workflow.png └── lessons ├── week_01 └── outline.pdf ├── week_02 ├── ml_fundamentals_and_decision_trees.pdf └── sources │ ├── README.md │ ├── data_check.ipynb │ ├── data_segregation.ipynb │ ├── eda.ipynb │ ├── fetch_data.ipynb │ ├── preprocessing.ipynb │ ├── test.ipynb │ └── train.ipynb ├── week_05 └── deploy_ml.pdf ├── week_10 ├── Datasets │ ├── data.mat │ ├── test_catvnoncat.h5 │ └── train_catvnoncat.h5 ├── Figures │ ├── Figures.ipynb │ └── Perceptron │ │ ├── fig10_dimension_1output.tex │ │ ├── fig_1_perceptron_N.tex │ │ ├── fig_2_perceptron_bias.tex │ │ ├── fig_3_perceptron_example.tex │ │ ├── fig_4_perceptron_z.tex │ │ ├── fig_5_perceptron_z2.tex │ │ ├── fig_6_one_hidden.tex │ │ ├── fig_6_one_hidden_B.tex │ │ ├── fig_7_two_hidden.tex │ │ ├── fig_8_onehidden.tex │ │ ├── fig_9_dimension.tex │ │ └── latexmkrc ├── Notebooks │ ├── Week 10 Task 01 - TensorFlow 2.x + Keras Crash Course.ipynb │ └── Week 10 Task 02 - Better Learning part I.ipynb └── Week 10 Introduction to Deep Learning and TensorFlow.pdf ├── week_12 ├── Better Generalization I.ipynb ├── Better Generalization II.ipynb ├── Better Generalizaton vs Better Learning.pdf ├── Better Learning II.ipynb └── Better Learning III.ipynb └── week_14 ├── Hyperparameter Tuning and Batch Normalization.pdf ├── Task #01 Hyperparameter Tuning using Keras Tuner.ipynb ├── Task #02 Hyperparameter Tuning using Weights and Biases.ipynb └── Task #03 Batch Normalization.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | /census.csv 131 | /EDA.ipynb 132 | /raw_data.csv 133 | 134 | # MacOS 135 | .DS_Store 136 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 |
3 | 4 | # Federal University of Rio Grande do Norte 5 | ## Technology Center 6 | ### Graduate Program in Electrical and Computer Engineering 7 | #### Department of Computer Engineering and Automation 8 | ##### EEC1509 Machine Learning 9 | 10 | #### References 11 | 12 | - :books: Aurélien Géron. Hands on Machine Learning with Scikit-Learn, Keras and TensorFlow. [[Link]](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/) 13 | - :books: François Chollet. Deep Learning with Python. [[Link]](https://www.manning.com/books/deep-learning-with-python-second-edition) 14 | - :books: Hannes Hapke, Catherine Nelson. Building Machine Learning Pipelines. [[Link]](https://www.oreilly.com/library/view/building-machine-learning/9781492053187/) 15 | - :books: Noah Gift, Alfredo Deza. Practical MLOps: Operationalizing Machine Learning Models [[Link]](https://www.oreilly.com/library/view/practical-mlops/9781098103002/) 16 | - :fist_right: Dataquest Academic Program [[Link]](https://www.dataquest.io/academic-program/) 17 | 18 | **Week 01**: Course Outline [![Open in PDF](https://img.shields.io/badge/-PDF-EC1C24?style=flat-square&logo=adobeacrobatreader)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_01/outline.pdf) 19 | - Motivation, Syllabus, Calender, other issues. 20 | 21 | **Weeks 02, 03, 04** Machine Learning Fundamentals and Decision Trees [![Open in PDF](https://img.shields.io/badge/-PDF-EC1C24?style=flat-square&logo=adobeacrobatreader)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_02/ml_fundamentals_and_decision_trees.pdf) 22 | - Outline [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/4979782637e34d37a0bb8551835a5a00) 23 | - What is Machine Learning (ML)? [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/098676fae4c2464788dd67ac1b419340) 24 | - ML types [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/4005e7ef95d4431db1bd266979a6789c) 25 | - Main challenges of ML 26 | - Variables, pipeline, and controlling chaos [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/f5456342c6b643799c1824362020fc5e) 27 | - Train, dev and test sets [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/954298d6f4c1433488239956b5d7007e) 28 | - Bias vs Variance [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/c496098013c84911a9ac353fec7e3131) 29 | - Decision Trees 30 | - Introduction [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/4f10b2436c1943f2aaa84d0f56c9e8c3) 31 | - Mathematical foundations [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/a215906eceda4b9cb655b226261bfb21) 32 | - Evaluation metrics 33 | - How to choose an evaluation metric? 
[![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/3dd9bd6dcb844704ba9cd1e1b34932c3) 34 | - Threshold metrics [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/efc3248b6f8747a3ab86cd22cadde993) 35 | - Ranking metrics [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/1394db7fc27e4592af6f538c06cebbd1) 36 | - :rocket: Case Study [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/tree/main/lessons/week_02/sources) 37 | - Google Colaboratory [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/8a4f0d34b3cb4d9ea04b6dcf0b3d1aca) [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/d96cb0af7d9c4416bfe8145c93248a11) 38 | - Setup of the environment [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/fea2d097fc7d4de89e53da259ece6d25) 39 | - Extract, Transform and Load (ETL) 40 | - Exploratory Data Analysis (EDA) [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/799b9712c6274f2fa547a3eb4cd230df) 41 | - Fetch Data [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/9861e9013ba940aba2c6dd1db5a00ebf) 42 | - EDA using Pandas-Profiling [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/cf19e023208946938d3f70e6e52018b4) 43 | - Manual EDA [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/9cec1f4d529a41dc90af19f23ef2082a) 44 | - Preprocessing [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/51a2972c8ffc4949891e9e249f9f48a3) 45 | - Data Check [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/f359ca8430b149309f6ac0b1d9c6e233) 46 | - Data Segregation [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://loom.com/share/25a491791e104c1694b2bf5615fe2c26) 47 | - Train 48 | - Train and validation component [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/3b708c0820b64ef199178b63fc4ef395) 49 | - Data preparation and outlier removal [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/140068a18a5e4c8d83b807868ebdd011) 50 | - Encoding the target variable [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/b0edb4ccb28a4e1884a2f37637b58deb) 51 | - Encoding the independent variables manually [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/4adce083a32b4d3787fd50b59da4fdb5) 52 | - Using a full-pipeline to prepare categorical features [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/12de69ebeb744ebdbf2524b07773c7c2) 53 | - Using a full-pipeline to prepare numerical features [![Open in 
Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/3b92e3fd78df42ebbbdce36dbce1707a) 54 | - Creating a full-preprocessing pipeline [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/6796f0129b1d4865aeb277e68461da80) 55 | - Holdout training [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/188a610fb09542b883b89cc962d6a823) 56 | - Evaluation metrics [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/4b2a9dd0ae44465b914974cf886390f9) 57 | - Hyperparameter tuning using Wandb [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/7e3a9d52709843bbb6026f816fa49d90) 58 | - Configure, train and export the best model [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/1c7a30cd4e90400daeb3916ee4006534) 59 | - Test [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/7725679b69a7426c927c317cb634dec3) 60 | - Dataquest Courses 61 | - Elements of the Command Line [![Open in Dataquest](https://img.shields.io/badge/link-dataquest-green)](https://www.dataquest.io/course/command-line-elements/) 62 | - You'll learn how to: a) employ the command line for Data Science, b) modify the behavior of commands with options, c) employ glob patterns and wildcards, d) define important command line concepts, e) navigate the filesystem, f) manage users and permissions. 63 | - Functions: Advanced - Best practices for writing functions [![Open in Dataquest](https://img.shields.io/badge/link-dataquest-green)](https://www.dataquest.io/course/python-advanced-functions/) 64 | - Command Line Intermediate [![Open in Dataquest](https://img.shields.io/badge/link-dataquest-green)](https://www.dataquest.io/course/command-line-intermediate/) 65 | - Learn more about the command line and how to use it in your data analysis workflow. You'll learn how to: a) employ Jupyter console and b) process data from the command line. 66 | - Git and Version Control [![Open in Dataquest](https://img.shields.io/badge/link-dataquest-green)](https://www.dataquest.io/course/git-and-vcs/) 67 | - You'll learn how to: a) organize your code using version control, b) resolve conflicts in version control, c) employ Git and Github to collaborate with others.
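The Weeks 02, 03 and 04 case study above walks through a full preprocessing pipeline, holdout training and metric evaluation for a decision tree on the census data. A minimal, self-contained sketch of that pattern follows (illustrative only: the actual notebooks live in `lessons/week_02/sources` and version their data with Weights & Biases; the file name, column handling and split below are simplified assumptions, while `criterion='entropy'` and `random_state=41` mirror the model card).

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumed: a local copy of the Adult census data used in the case study.
df = pd.read_csv("raw_data.csv")
X, y = df.drop(columns="high_income"), df["high_income"]

categorical = X.select_dtypes(include="object").columns
numerical = X.select_dtypes(exclude="object").columns

# Full preprocessing pipeline: one branch for categorical, one for numerical features.
preprocess = ColumnTransformer([
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numerical),
])

model = Pipeline([("preprocess", preprocess),
                  ("classifier", DecisionTreeClassifier(criterion="entropy", random_state=41))])

# Holdout training: fit on the training split, score on the held-out split.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  stratify=y, random_state=41)
model.fit(X_train, y_train)
preds = model.predict(X_val)

# Raw labels in this dataset carry a leading space (" <=50K" / " >50K").
print(accuracy_score(y_val, preds), f1_score(y_val, preds, pos_label=" >50K"))
```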
68 | 69 | **Weeks 05 and 06** Deploy a ML Pipeline in Production [![Open in PDF](https://img.shields.io/badge/-PDF-EC1C24?style=flat-square&logo=adobeacrobatreader)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_05/deploy_ml.pdf) 70 | - :neckbeard: Hands on [![Repository](https://img.shields.io/badge/-Repo-191A1B?style=flat-square&logo=github)](https://github.com/ivanovitchm/colab2mlops) 71 | - Outline [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/8bc6b17050e14db1b5a644b614b9863b) 72 | - Previously on the last lesson and next steps [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/2497e73815354083a0299c376c6b1bb7) 73 | - Install essential tools to configure the dev environment [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/5147cf6180e146689fe976e1212dfd60) 74 | - Environment management system using conda [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/b03a14eddae543319071f483e1f73728) 75 | - Using FastAPI to Build Web APIs [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/7c4ccaa0de28422db02522dbad03bba7) 76 | - Hello world using FastAPI [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/d54ee20891d74c70bd2c866c68fbe4f6) 77 | - Implementing a post method [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/8514c74f1f3443d3b7a82b8160f9d271) 78 | - Path and query parameters [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/7b6f2e4a2fc345019b5f5e0081aec490) 79 | - Local API testing [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/db74b4cc2294486480e2c31f05cbe3d5) 80 | - API deployment with FastAPI [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/bad07405f31e4625ba0a45a632b4f9d7) 81 | - Running and consuming our RESTful API [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/bd04680bd2ba4c41bf5e33bd18e6e9c7) 82 | - Using pytest and FastAPI to test our RESTful API [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/90ec0a6c964a4e669c05d7c3d3f54347) 83 | - Fundamentals of CI/CD [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/9759c65ddb9b486fb9068ff603dda38c) 84 | - Configure a GitHub action [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/6551d576e4b340f2a1d7849edd910109) 85 | - Workflow file configuration (Continuous Integration step) [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/b7d932f842f64ea4805feeb5c11d82ed) 86 | - Deliver your API with Heroku 87 | - Sign up for Heroku and create a new app [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/f2eeba8220cc45b2984813786df0c7f4) 88 | - Install the Heroku CLI,
configure credentials, remote repository, buildpacks, dyno, procfile and push CD [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/3f4f6a31147148418fa5052a545740d4) 89 | - Debugging and querying the live API [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/60782b538d25411780a8c9c1c14249f6) 90 | 91 | **Weeks 07, 08 and 09** Project 01 92 | - Create an end-to-end machine learning pipeline 93 | - From fetching data to deployment 94 | - Using: sklearn, wandb, fastapi, github actions, heroku, notebooks 95 | 96 | **Weeks 10 and 11** Fundamentals of Deep Learning [![Open in PDF](https://img.shields.io/badge/-PDF-EC1C24?style=flat-square&logo=adobeacrobatreader)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_10/Week%2010%20Introduction%20to%20Deep%20Learning%20and%20TensorFlow.pdf) 97 | - Outline [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/694778c1c318458589ad1990c1bb9614) 98 | - The perceptron [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/8d0ed35c632a4f0e805c103376974ec6) 99 | - Building Neural Networks [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/4ed93e63f36f468dad163bde0ed4102c) 100 | - Matrix Dimension [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/ac46a8425264456ea91f9644df3d992a) 101 | - Applying Neural Networks [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/24f247d0c8a74a48b3e481985fd843bd) 102 | - Training a Neural Network [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/c96bebdd16d9444e9c4adf23a4a93398) 103 | - Backpropagation with Pencil & Paper [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/3098f18d6fdc4a039d5e382357bebc82) 104 | - Learning rate & Batch Size [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/4271c1e07f294ec181a0b40b93f604b7) 105 | - Exponentially Weighted Average [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/eb2e3905d23742d98572d14120fb3f57) 106 | - Adam, Momentum, RMSProp, Learning Rate Decay [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/6b4b3a14b3044dfdbe76a5606bc8e513) 107 | - Hands on :fire: 108 | - TensorFlow Crash Course [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_10/Notebooks/Week%2010%20Task%2001%20-%20TensorFlow%202.x%20%2B%20Keras%20Crash%20Course.ipynb) 109 | - Better Learning - Part I [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_10/Notebooks/Week%2010%20Task%2002%20-%20Better%20Learning%20part%20I.ipynb) 110 | 111 | **Weeks 12 and 13** Better Generalization vs Better Learning [![Open in
PDF](https://img.shields.io/badge/-PDF-EC1C24?style=flat-square&logo=adobeacrobatreader)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_12/Better%20Generalizaton%20vs%20Better%20Learning.pdf) 112 | - Outline [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/f2069bde275c439abf9b8f8b0c774aa3) 113 | - Better Generalization 114 | - Spliting Data [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/8c127749600f4061ac1e4b233ab459a7) 115 | - Bias vs Variance [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/890fe680846f4c748668325ee4760b57) 116 | - Weight Regularization [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/159cdf86ae7140c489c35ea3561cc571) 117 | - Weight Constraint [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/6b4f15f754b8466a8eb4d46136747390) 118 | - Dropout [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/bc58906ee39c4269b0d8ec6f67091a47) 119 | - Promote Robustness with Noise [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/099bed79389b43d4976d35e6de32dcfb) 120 | - Early Stopping [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/8995488b79434b5d88e887440aa8c953) 121 | - Hands on :eyes: 122 | - Better Generalization - Part I [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_12/Better%20Generalization%20I.ipynb) 123 | - Better Generalization - Part II [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_12/Better%20Generalization%20II.ipynb) 124 | - Better Learning II 125 | - Data scaling [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/09f19beaffd946b897d1c61f3bf27f02) 126 | - Vanishing/Exploding Gradient [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/68ed7dd6c5284652a8a81396bb8465ec) 127 | - Fix Vanishing Gradient with Relu [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/d7fadbbf54c040d69facea3ce035d8f2) 128 | - Fix Exploding Gradient with Gradient Clipping [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/9dc3598a295a4bcca2f9717ba8f041f5) 129 | - Hands on :mortar_board: 130 | - Better Learning - Part II [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_12/Better%20Learning%20II.ipynb) 131 | - Better Learning - Part III [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_12/Better%20Learning%20III.ipynb) 132 | 133 | **Week 14** - Hyperparameter Tuning & Batch Normalization [![Open in 
PDF](https://img.shields.io/badge/-PDF-EC1C24?style=flat-square&logo=adobeacrobatreader)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_14/Hyperparameter%20Tuning%20and%20Batch%20Normalization.pdf) 134 | 135 | - Outline [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/a2c5d4d632b54dbfb27af28efa2a3ad0) 136 | - Hyperparameter Tuning Fundamentals [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/1993b8e11c764c23a87ac79f2a377275) 137 | - Keras Tuner, and Weight and Biases [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/162156aa0c1f4e6c95279320cc7be0d7) 138 | - Wandb - Part 01 [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/0db14b6c327c46d292c07e9da85274ae) 139 | - Wandb - Part 02 [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/ae3b1ef0b4d542c39c03158ed75dc280) 140 | - Batch Normalization Fundamentals [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/1217ecc68b904391bf8cb4ab4edc0f7f) 141 | - Batch Normalization Math Details [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/c031045602fc4d0b8aa2043042f19b81) 142 | - Batch Normalization Case Study [![Open in Loom](https://img.shields.io/badge/-Video-83DA77?style=flat-square&logo=loom)](https://www.loom.com/share/b05d320e561b43658be90a812537c87c) 143 | - Hands on :bell: 144 | - Hyperparameter tuning using keras tuner [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_14/Task%20%2301%20Hyperparameter%20Tuning%20using%20Keras%20Tuner.ipynb) 145 | - Hyperparameter tuning using weights and biases [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_14/Task%20%2302%20Hyperparameter%20Tuning%20using%20Weights%20and%20Biases.ipynb) 146 | - Batch Normalization [![Jupyter](https://img.shields.io/badge/-Notebook-191A1B?style=flat-square&logo=jupyter)](https://github.com/ivanovitchm/ppgeecmachinelearning/blob/main/lessons/week_14/Task%20%2303%20Batch%20Normalization.ipynb) 147 | -------------------------------------------------------------------------------- /images/ct.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/images/ct.jpeg -------------------------------------------------------------------------------- /images/gender_race.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/images/gender_race.png -------------------------------------------------------------------------------- /images/gender_workclass.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/images/gender_workclass.png 
-------------------------------------------------------------------------------- /images/workflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/images/workflow.png -------------------------------------------------------------------------------- /lessons/week_01/outline.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_01/outline.pdf -------------------------------------------------------------------------------- /lessons/week_02/ml_fundamentals_and_decision_trees.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_02/ml_fundamentals_and_decision_trees.pdf -------------------------------------------------------------------------------- /lessons/week_02/sources/README.md: -------------------------------------------------------------------------------- 1 | # Model Card 2 | 3 | Model cards are a succinct approach for documenting the creation, use, and shortcomings of a model. The idea is to write documentation such that a non-expert can understand the model card's contents. For additional information see the Model Card paper: https://arxiv.org/pdf/1810.03993.pdf 4 | 5 | ## Model Details 6 | Ivanovitch Silva created the model. A complete data pipeline was built using Google Colab, Scikit-Learn and Weights & Biases to train a Decision Tree model. The big picture of the data pipeline is shown below: 7 | 8 | 9 | 10 | For the sake of understanding, a simple hyperparameter tuning was conducted using a random sweep in Wandb, and the hyperparameter values adopted in training were: 11 | 12 | - full_pipeline__num_pipeline__num_transformer__model: 2 13 | - classifier__criterion: 'entropy' 14 | - classifier__splitter: 'best' 15 | - classifier__random_state: 41 16 | 17 | ## Intended Use 18 | This model is used as a proof of concept for the evaluation of an entire data pipeline incorporating Machine Learning fundamentals. The data pipeline is composed of the following stages: a) ``fetch data``, b) ``eda``, c) ``preprocess``, d) ``check data``, e) ``segregate``, f) ``train`` and g) ``test``. 19 | 20 | ## Training Data 21 | 22 | The dataset used in this project is based on individual income in the United States. The *data* is from the *1994 census*, and contains information on an individual's ``marital status, age, type of work, and more``. The target column, or what we want to predict, is whether individuals make *less than or equal to 50k a year*, or *more than 50k a year*. 23 | 24 | You can download the data from the University of California, Irvine's [website](http://archive.ics.uci.edu/ml/datasets/Adult). 25 | 26 | After the EDA stage of the data pipeline, it was noted that the training data is imbalanced when considering the target variable and some features (``sex``, ``race`` and ``workclass``). 27 | 28 | 29 | 30 | 31 | ## Evaluation Data 32 | The dataset under study is split into Train and Test during the ``Segregate`` stage of the data pipeline. 70% of the clean data is used to Train and the remaining 30% to Test. Additionally, 30% of the Train data is used for validation purposes (hyperparameter-tuning).
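As a point of reference, the segregation logic described above can be sketched in a few lines. This is illustrative only: the real ``segregate`` component reads the preprocessed data from a W&B artifact and logs the resulting splits back as artifacts, and the local file name below is an assumption, while the 70/30 ratio, the ``high_income`` stratification and the seed of 41 match the project settings.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed local copy of the preprocessed data artifact.
df = pd.read_csv("preprocessed_data.csv")

# 70/30 train/test split, stratified on the target.
train, test = train_test_split(df, test_size=0.30, random_state=41,
                               stratify=df["high_income"])

# A further 30% of the training portion is held out for validation (hyperparameter tuning).
train, val = train_test_split(train, test_size=0.30, random_state=41,
                              stratify=train["high_income"])

print(len(train), len(val), len(test))
```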
33 | 34 | ## Metrics 35 | In order to follow the performance of machine learning experiments, the project tracked certain stage outputs of the data pipeline as metrics. The metrics adopted are: [accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html), [f1](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score), [precision](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score), [recall](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score). 36 | 37 | To calculate the evaluation metrics, it is only necessary to run the ``train`` and ``test`` stages of the data pipeline. 38 | 39 | The following results will be shown: 40 | 41 | **Stage [Run]** | **Accuracy** | **F1** | **Precision** | **Recall** | 42 | ---------------------------------|--------------|--------|---------------|------------| 43 | Train [distinctive-sweep-7](https://wandb.ai/ivanovitchm/decision_tree/runs/f40ujfaq/overview?workspace=user-ivanovitchm) | 0.8109 | 0.6075 | 0.6075 | 0.6075 | 44 | Test [crips-resonance-11](https://wandb.ai/ivanovitchm/decision_tree/runs/1wg7ibyy/overview?workspace=user-ivanovitchm) | 0.8019 | 0.5899 | 0.5884 | 0.5914 | 45 | 46 | 47 | ## Ethical Considerations 48 | 49 | We may be tempted to claim that this dataset contains the only attributes capable of predicting someone's income. However, we know that is not true, and we will need to deal with the class imbalances somehow. 50 | 51 | ## Caveats and Recommendations 52 | It should be noted that the model trained in this project was used only for validation of a complete data pipeline. It is notable that some important issues related to dataset imbalance exist, and adequate techniques need to be adopted in order to balance it. -------------------------------------------------------------------------------- /lessons/week_02/sources/data_check.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "data_check.ipynb", 7 | "provenance": [], 8 | "collapsed_sections": [] 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "0I4pgzLVtBTP" 23 | }, 24 | "source": [ 25 | "# 1.0 An end-to-end classification problem (Data Check)\n", 26 | "\n" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "id": "Dh34gim6KPtT" 33 | }, 34 | "source": [ 35 | "## 1.1 Dataset description" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": { 41 | "id": "iE8OJoDZ5AFK" 42 | }, 43 | "source": [ 44 | "We'll be looking at individual income in the United States. The **data** is from the **1994 census**, and contains information on an individual's **marital status**, **age**, **type of work**, and more. The **target column**, or what we want to predict, is whether individuals make less than or equal to 50k a year, or more than **50k a year**.\n", 45 | "\n", 46 | "You can download the data from the [University of California, Irvine's website](http://archive.ics.uci.edu/ml/datasets/Adult).\n", 47 | "\n", 48 | "Let's take the following steps:\n", 49 | "\n", 50 | "1. ETL (done!!!)\n", 51 | "2. Data Checks\n", 52 | "\n", 53 | "
" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": { 59 | "id": "7UpxKxU1Ej7f" 60 | }, 61 | "source": [ 62 | "## 1.2 Install, load libraries and setup wandb" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": { 69 | "id": "t82KewAPWCYe" 70 | }, 71 | "outputs": [], 72 | "source": [ 73 | "!pip install wandb" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "source": [ 79 | "!pip install pytest pytest-sugar" 80 | ], 81 | "metadata": { 82 | "id": "IumW4s8Sh9i_" 83 | }, 84 | "execution_count": null, 85 | "outputs": [] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": { 91 | "id": "LASaVZuhRJlL" 92 | }, 93 | "outputs": [], 94 | "source": [ 95 | "import wandb" 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "source": [ 101 | "# Login to Weights & Biases\n", 102 | "!wandb login --relogin" 103 | ], 104 | "metadata": { 105 | "id": "QZXcN54GkP25" 106 | }, 107 | "execution_count": null, 108 | "outputs": [] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "source": [ 113 | "## 1.2 Pytest\n" 114 | ], 115 | "metadata": { 116 | "id": "MPjpyeU7d37l" 117 | } 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "metadata": { 122 | "id": "WB1QY4sbPs4-" 123 | }, 124 | "source": [ 125 | "### 1.2.1 How pytest discovers tests\n" 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "source": [ 131 | "\n", 132 | "pytest uses the following [conventions](https://docs.pytest.org/en/latest/goodpractices.html#conventions-for-python-test-discovery) to automatically discover tests:\n", 133 | " 1. files with tests should be called `test_*.py` or `*_test.py `\n", 134 | " 2. test function names should start with `test_`\n", 135 | "\n", 136 | "\n" 137 | ], 138 | "metadata": { 139 | "id": "gXlqh21wiW6P" 140 | } 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "source": [ 145 | "### 1.2.2 Fixture" 146 | ], 147 | "metadata": { 148 | "id": "JtTD3oxoiYF6" 149 | } 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "source": [ 154 | "\n", 155 | "An important aspect when using ``pytest`` is understanding how the fixture's scope works. \n", 156 | "\n", 157 | "The scope of the fixture can have a few legal values, described [here](https://docs.pytest.org/en/6.2.x/fixture.html#fixture-scopes). We are going to consider only **session** and **function**: with the former, the fixture is executed only once in a pytest session and the value it returns is used for all the tests that need it; with the latter, every test function gets a fresh copy of the data. This is useful if the tests modify the input in a way that makes the other tests fail, for example."
158 | ], 159 | "metadata": { 160 | "id": "mwNz2mgMevJJ" 161 | } 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": { 166 | "id": "uJjWla1qxd3i" 167 | }, 168 | "source": [ 169 | "### 1.2.3 Create and run a test file\n" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "metadata": { 175 | "id": "9acpZigRVANF" 176 | }, 177 | "source": [ 178 | "%%file test_data.py\n", 179 | "import pytest\n", 180 | "import wandb\n", 181 | "import pandas as pd\n", 182 | "\n", 183 | "# This is global so all tests are collected under the same run\n", 184 | "run = wandb.init(project=\"decision_tree\", job_type=\"data_checks\")\n", 185 | "\n", 186 | "@pytest.fixture(scope=\"session\")\n", 187 | "def data():\n", 188 | "\n", 189 | " local_path = run.use_artifact(\"decision_tree/preprocessed_data.csv:latest\").file()\n", 190 | " df = pd.read_csv(local_path)\n", 191 | "\n", 192 | " return df\n", 193 | "\n", 194 | "def test_data_length(data):\n", 195 | " \"\"\"\n", 196 | " We test that we have enough data to continue\n", 197 | " \"\"\"\n", 198 | " assert len(data) > 1000\n", 199 | "\n", 200 | "\n", 201 | "def test_number_of_columns(data):\n", 202 | " \"\"\"\n", 203 | " We test that we have enough data to continue\n", 204 | " \"\"\"\n", 205 | " assert data.shape[1] == 15\n", 206 | "\n", 207 | "def test_column_presence_and_type(data):\n", 208 | "\n", 209 | " required_columns = {\n", 210 | " \"age\": pd.api.types.is_int64_dtype,\n", 211 | " \"workclass\": pd.api.types.is_object_dtype,\n", 212 | " \"fnlwgt\": pd.api.types.is_int64_dtype,\n", 213 | " \"education\": pd.api.types.is_object_dtype,\n", 214 | " \"education_num\": pd.api.types.is_int64_dtype,\n", 215 | " \"marital_status\": pd.api.types.is_object_dtype,\n", 216 | " \"occupation\": pd.api.types.is_object_dtype,\n", 217 | " \"relationship\": pd.api.types.is_object_dtype,\n", 218 | " \"race\": pd.api.types.is_object_dtype,\n", 219 | " \"sex\": pd.api.types.is_object_dtype,\n", 220 | " \"capital_gain\": pd.api.types.is_int64_dtype,\n", 221 | " \"capital_loss\": pd.api.types.is_int64_dtype, \n", 222 | " \"hours_per_week\": pd.api.types.is_int64_dtype,\n", 223 | " \"native_country\": pd.api.types.is_object_dtype,\n", 224 | " \"high_income\": pd.api.types.is_object_dtype\n", 225 | " }\n", 226 | "\n", 227 | " # Check column presence\n", 228 | " assert set(data.columns.values).issuperset(set(required_columns.keys()))\n", 229 | "\n", 230 | " for col_name, format_verification_funct in required_columns.items():\n", 231 | "\n", 232 | " assert format_verification_funct(data[col_name]), f\"Column {col_name} failed test {format_verification_funct}\"\n", 233 | "\n", 234 | "\n", 235 | "def test_class_names(data):\n", 236 | "\n", 237 | " # Check that only the known classes are present\n", 238 | " known_classes = [\n", 239 | " \" <=50K\",\n", 240 | " \" >50K\"\n", 241 | " ]\n", 242 | "\n", 243 | " assert data[\"high_income\"].isin(known_classes).all()\n", 244 | "\n", 245 | "\n", 246 | "def test_column_ranges(data):\n", 247 | "\n", 248 | " ranges = {\n", 249 | " \"age\": (17, 90),\n", 250 | " \"fnlwgt\": (1.228500e+04, 1.484705e+06),\n", 251 | " \"education_num\": (1, 16),\n", 252 | " \"capital_gain\": (0, 99999),\n", 253 | " \"capital_loss\": (0, 4356),\n", 254 | " \"hours_per_week\": (1, 99)\n", 255 | " }\n", 256 | "\n", 257 | " for col_name, (minimum, maximum) in ranges.items():\n", 258 | "\n", 259 | " assert data[col_name].dropna().between(minimum, maximum).all(), (\n", 260 | " f\"Column {col_name} failed the test. 
Should be between {minimum} and {maximum}, \"\n", 261 | " f\"instead min={data[col_name].min()} and max={data[col_name].max()}\"\n", 262 | " )" 263 | ], 264 | "execution_count": null, 265 | "outputs": [] 266 | }, 267 | { 268 | "cell_type": "markdown", 269 | "metadata": { 270 | "id": "XTBnZ3-vVe0p" 271 | }, 272 | "source": [ 273 | "Now lets run pytest" 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "metadata": { 279 | "id": "DXBQkMc8VeD8" 280 | }, 281 | "source": [ 282 | "!pytest . -vv" 283 | ], 284 | "execution_count": null, 285 | "outputs": [] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "source": [ 290 | "# close the run\n", 291 | "# waiting a while after run the previous cell before execute this\n", 292 | "run.finish()" 293 | ], 294 | "metadata": { 295 | "id": "5284u1A7euMF" 296 | }, 297 | "execution_count": null, 298 | "outputs": [] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "source": [ 303 | "" 304 | ], 305 | "metadata": { 306 | "id": "FpvJLXccv8Kz" 307 | }, 308 | "execution_count": null, 309 | "outputs": [] 310 | } 311 | ] 312 | } -------------------------------------------------------------------------------- /lessons/week_02/sources/data_segregation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "data_segregation.ipynb", 7 | "provenance": [], 8 | "collapsed_sections": [] 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "0I4pgzLVtBTP" 23 | }, 24 | "source": [ 25 | "# 1.0 An end-to-end classification problem (Data Segregation)\n", 26 | "\n" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "id": "Dh34gim6KPtT" 33 | }, 34 | "source": [ 35 | "## 1.1 Dataset description" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": { 41 | "id": "iE8OJoDZ5AFK" 42 | }, 43 | "source": [ 44 | "We'll be looking at individual income in the United States. The **data** is from the **1994 census**, and contains information on an individual's **marital status**, **age**, **type of work**, and more. The **target column**, or what we want to predict, is whether individuals make less than or equal to 50k a year, or more than **50k a year**.\n", 45 | "\n", 46 | "You can download the data from the [University of California, Irvine's website](http://archive.ics.uci.edu/ml/datasets/Adult).\n", 47 | "\n", 48 | "Let's take the following steps:\n", 49 | "\n", 50 | "1. ETL (done)\n", 51 | "2. Data Checks (done)\n", 52 | "3. Data Segregation\n", 53 | "\n", 54 | "
" 55 | ] 56 | }, 57 | { 58 | "cell_type": "markdown", 59 | "metadata": { 60 | "id": "7UpxKxU1Ej7f" 61 | }, 62 | "source": [ 63 | "## 1.2 Install, load libraries and setup wandb" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": { 70 | "id": "t82KewAPWCYe" 71 | }, 72 | "outputs": [], 73 | "source": [ 74 | "!pip install wandb" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": { 81 | "id": "LASaVZuhRJlL" 82 | }, 83 | "outputs": [], 84 | "source": [ 85 | "import logging\n", 86 | "import tempfile\n", 87 | "import pandas as pd\n", 88 | "import os\n", 89 | "import wandb\n", 90 | "from sklearn.model_selection import train_test_split" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "source": [ 96 | "# Login to Weights & Biases\n", 97 | "!wandb login --relogin" 98 | ], 99 | "metadata": { 100 | "id": "QZXcN54GkP25" 101 | }, 102 | "execution_count": null, 103 | "outputs": [] 104 | }, 105 | { 106 | "cell_type": "markdown", 107 | "source": [ 108 | "## 1.3 Data Segregation" 109 | ], 110 | "metadata": { 111 | "id": "sa1PvzZYF4J2" 112 | } 113 | }, 114 | { 115 | "cell_type": "code", 116 | "source": [ 117 | "# global variables\n", 118 | "\n", 119 | "# ratio used to split train and test data\n", 120 | "test_size = 0.30\n", 121 | "\n", 122 | "# seed used to reproduce purposes\n", 123 | "seed = 41\n", 124 | "\n", 125 | "# reference (column) to stratify the data\n", 126 | "stratify = \"high_income\"\n", 127 | "\n", 128 | "# name of the input artifact\n", 129 | "artifact_input_name = \"decision_tree/preprocessed_data.csv:latest\"\n", 130 | "\n", 131 | "# type of the artifact\n", 132 | "artifact_type = \"segregated_data\"" 133 | ], 134 | "metadata": { 135 | "id": "6Box9NiNKCgM" 136 | }, 137 | "execution_count": null, 138 | "outputs": [] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "source": [ 143 | "# configure logging\n", 144 | "logging.basicConfig(level=logging.INFO,\n", 145 | " format=\"%(asctime)s %(message)s\",\n", 146 | " datefmt='%d-%m-%Y %H:%M:%S')\n", 147 | "\n", 148 | "# reference for a logging obj\n", 149 | "logger = logging.getLogger()\n", 150 | "\n", 151 | "# initiate wandb project\n", 152 | "run = wandb.init(project=\"decision_tree\", job_type=\"split_data\")\n", 153 | "\n", 154 | "logger.info(\"Downloading and reading artifact\")\n", 155 | "artifact = run.use_artifact(artifact_input_name)\n", 156 | "artifact_path = artifact.file()\n", 157 | "df = pd.read_csv(artifact_path)\n", 158 | "\n", 159 | "# Split firstly in train/test, then we further divide the dataset to train and validation\n", 160 | "logger.info(\"Splitting data into train and test\")\n", 161 | "splits = {}\n", 162 | "\n", 163 | "splits[\"train\"], splits[\"test\"] = train_test_split(df,\n", 164 | " test_size=test_size,\n", 165 | " random_state=seed,\n", 166 | " stratify=df[stratify])\n", 167 | "\n", 168 | "# Save the artifacts. 
We use a temporary directory so we do not leave any trace behind\n", 169 | "with tempfile.TemporaryDirectory() as tmp_dir:\n", 170 | "\n", 171 | " for split, df in splits.items():\n", 172 | "\n", 173 | " # Make the artifact name from the name of the split plus the provided root\n", 174 | " artifact_name = f\"{split}.csv\"\n", 175 | "\n", 176 | " # Get the path on disk within the temp directory\n", 177 | " temp_path = os.path.join(tmp_dir, artifact_name)\n", 178 | "\n", 179 | " logger.info(f\"Uploading the {split} dataset to {artifact_name}\")\n", 180 | "\n", 181 | " # Save then upload to W&B\n", 182 | " df.to_csv(temp_path,index=False)\n", 183 | "\n", 184 | " artifact = wandb.Artifact(name=artifact_name,\n", 185 | " type=artifact_type,\n", 186 | " description=f\"{split} split of dataset {artifact_input_name}\",\n", 187 | " )\n", 188 | " artifact.add_file(temp_path)\n", 189 | "\n", 190 | " logger.info(\"Logging artifact\")\n", 191 | " run.log_artifact(artifact)\n", 192 | "\n", 193 | " # This waits for the artifact to be uploaded to W&B. If you\n", 194 | " # do not add this, the temp directory might be removed before\n", 195 | " # W&B had a chance to upload the datasets, and the upload\n", 196 | " # might fail\n", 197 | " artifact.wait()" 198 | ], 199 | "metadata": { 200 | "id": "4tha7oPLF58G" 201 | }, 202 | "execution_count": null, 203 | "outputs": [] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "source": [ 208 | "# close the run\n", 209 | "# waiting a while after run the previous cell before execute this\n", 210 | "run.finish()" 211 | ], 212 | "metadata": { 213 | "id": "s1IcDCzUO57y" 214 | }, 215 | "execution_count": null, 216 | "outputs": [] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "source": [ 221 | "" 222 | ], 223 | "metadata": { 224 | "id": "s-decyPSPZ_d" 225 | }, 226 | "execution_count": null, 227 | "outputs": [] 228 | } 229 | ] 230 | } -------------------------------------------------------------------------------- /lessons/week_02/sources/eda.ipynb: -------------------------------------------------------------------------------- 1 | {"cells":[{"cell_type":"markdown","metadata":{"id":"0I4pgzLVtBTP"},"source":["# 1.0 An end-to-end classification problem (ETL)\n","\n"]},{"cell_type":"markdown","metadata":{"id":"Dh34gim6KPtT"},"source":["## 1.1 Dataset description"]},{"cell_type":"markdown","metadata":{"id":"iE8OJoDZ5AFK"},"source":["\n","\n","We'll be looking at individual income in the United States. The **data** is from the **1994 census**, and contains information on an individual's **marital status**, **age**, **type of work**, and more. The **target column**, or what we want to predict, is whether individuals make less than or equal to 50k a year, or more than **50k a year**.\n","\n","You can download the data from the [University of California, Irvine's website](http://archive.ics.uci.edu/ml/datasets/Adult).\n","\n","Let's take the following steps:\n","\n","1. Load Libraries\n","2. Fetch Data, including EDA\n","3. Pre-procesing\n","4. Data Segregation\n","\n","
"]},{"cell_type":"markdown","metadata":{"id":"7UpxKxU1Ej7f"},"source":["## 1.2 Install and load libraries"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"e3Zq4gmzWK6z"},"outputs":[],"source":["!pip install pandas-profiling==3.1.0"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"t82KewAPWCYe"},"outputs":[],"source":["!pip install wandb"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"LASaVZuhRJlL"},"outputs":[],"source":["import wandb\n","import matplotlib.pyplot as plt\n","import seaborn as sns\n","import pandas as pd\n","import numpy as np\n","from pandas_profiling import ProfileReport\n","import tempfile\n","import os"]},{"cell_type":"markdown","metadata":{"id":"Z74pHa-qHVrT"},"source":["## 1.3 Exploratory Data Analysis (EDA)"]},{"cell_type":"markdown","metadata":{"id":"MxzlOezZLWQ_"},"source":["### 1.3.1 Login wandb\n"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"xvG-dYfiVwM9"},"outputs":[],"source":["# Login to Weights & Biases\n","!wandb login --relogin"]},{"cell_type":"markdown","metadata":{"id":"18RAS5kFXPAe"},"source":["### 1.3.2 Download raw_data artifact from Wandb"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"5kKY7pRNSGo9"},"outputs":[],"source":["# save_code tracking all changes of the notebook and sync with Wandb\n","run = wandb.init(project=\"decision_tree\", save_code=True)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"H6eyFp6XTLyf"},"outputs":[],"source":["# donwload the latest version of artifact raw_data.csv\n","artifact = run.use_artifact(\"decision_tree/raw_data.csv:latest\")\n","\n","# create a dataframe from the artifact\n","df = pd.read_csv(artifact.file())"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"SWMTKGEUYHeK"},"outputs":[],"source":["df.head()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"4WkUfOnkYI5l"},"outputs":[],"source":["df.info()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"0dVXWRhmGmOn"},"outputs":[],"source":["df.describe()"]},{"cell_type":"markdown","metadata":{"id":"PnBYLNwlBFrO"},"source":["### 1.3.3 Pandas Profilling"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"vGOWSZJoI-s_"},"outputs":[],"source":["ProfileReport(df, title=\"Pandas Profiling Report\", explorative=True)"]},{"cell_type":"markdown","metadata":{"id":"N3yBP7hxM0q1"},"source":["### 1.3.4 EDA Manually"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"JN2R1QPW4ZJo"},"outputs":[],"source":["# There are duplicated rows\n","df.duplicated().sum()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"5Qscf9h04gO_"},"outputs":[],"source":["# Delete duplicated rows\n","df.drop_duplicates(inplace=True)\n","df.duplicated().sum()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"yCPerUdKl-yz"},"outputs":[],"source":["# what the sex column can help us?\n","pd.crosstab(df.high_income,df.sex,margins=True,normalize=False)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"Qrk81pTpUPi-"},"outputs":[],"source":["# income vs [sex & race]?\n","pd.crosstab(df.high_income,[df.sex,df.race],margins=True)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"WTcnUzDtW7pX"},"outputs":[],"source":["%matplotlib inline\n","\n","sns.catplot(x=\"sex\", \n"," hue=\"race\", \n"," col=\"high_income\",\n"," data=df, kind=\"count\",\n"," height=4, aspect=.7)\n","plt.show()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"iiNAy2H2mOZT"},"outputs":[],"source":["g = 
sns.catplot(x=\"sex\", \n"," hue=\"workclass\", \n"," col=\"high_income\",\n"," data=df, kind=\"count\",\n"," height=4, aspect=.7)\n","\n","g.savefig(\"HighIncome_Sex_Workclass.png\", dpi=100)\n","\n","run.log(\n"," {\n"," \"High_Income vs Sex vs Workclass\": wandb.Image(\"HighIncome_Sex_Workclass.png\")\n"," }\n"," )"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"w_-oFseHmmBc"},"outputs":[],"source":["df.isnull().sum()"]},{"cell_type":"code","source":["run.finish()"],"metadata":{"id":"cHDWOxTji0P0"},"execution_count":null,"outputs":[]}],"metadata":{"colab":{"collapsed_sections":[],"name":"eda.ipynb","provenance":[]},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.12"}},"nbformat":4,"nbformat_minor":0} -------------------------------------------------------------------------------- /lessons/week_02/sources/fetch_data.ipynb: -------------------------------------------------------------------------------- 1 | {"cells":[{"cell_type":"markdown","metadata":{"id":"0I4pgzLVtBTP","tags":[]},"source":["# 1.0 An end-to-end classification problem (ETL)\n","\n"]},{"cell_type":"markdown","metadata":{"id":"Dh34gim6KPtT","tags":[]},"source":["## 1.1 Dataset description"]},{"cell_type":"markdown","metadata":{"id":"iE8OJoDZ5AFK"},"source":["\n","\n","We'll be looking at individual income in the United States. The **data** is from the **1994 census**, and contains information on an individual's **marital status**, **age**, **type of work**, and more. The **target column**, or what we want to predict, is whether individuals make less than or equal to 50k a year, or more than **50k a year**.\n","\n","You can download the data from the [University of California, Irvine's website](http://archive.ics.uci.edu/ml/datasets/Adult).\n","\n","Let's take the following steps:\n","\n","1. Load Libraries\n","2. Fetch Data, including EDA\n","3. Pre-procesing\n","4. Data Segregation\n","\n","
"]},{"cell_type":"markdown","metadata":{"id":"7UpxKxU1Ej7f"},"source":["## 1.2 Install and load libraries"]},{"cell_type":"code","source":["!pip install wandb"],"metadata":{"id":"Y6MAZ5-e7kgA"},"execution_count":null,"outputs":[]},{"cell_type":"code","execution_count":2,"metadata":{"id":"LASaVZuhRJlL","tags":[],"executionInfo":{"status":"ok","timestamp":1649100830251,"user_tz":180,"elapsed":914,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}}},"outputs":[],"source":["import wandb\n","import pandas as pd"]},{"cell_type":"markdown","metadata":{"id":"Z74pHa-qHVrT"},"source":["## 1.3 Fetch Data"]},{"cell_type":"markdown","metadata":{"id":"MxzlOezZLWQ_"},"source":["### 1.3.1 Create the raw_data artifact"]},{"cell_type":"code","execution_count":3,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":339},"id":"i_n2KZu0usUv","outputId":"7160274a-1f05-456b-fda3-023f55353d51","tags":[],"executionInfo":{"status":"ok","timestamp":1649100856527,"user_tz":180,"elapsed":1404,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" age workclass fnlwgt education education_num \\\n","0 39 State-gov 77516 Bachelors 13 \n","1 50 Self-emp-not-inc 83311 Bachelors 13 \n","2 38 Private 215646 HS-grad 9 \n","3 53 Private 234721 11th 7 \n","4 28 Private 338409 Bachelors 13 \n","\n"," marital_status occupation relationship race sex \\\n","0 Never-married Adm-clerical Not-in-family White Male \n","1 Married-civ-spouse Exec-managerial Husband White Male \n","2 Divorced Handlers-cleaners Not-in-family White Male \n","3 Married-civ-spouse Handlers-cleaners Husband Black Male \n","4 Married-civ-spouse Prof-specialty Wife Black Female \n","\n"," capital_gain capital_loss hours_per_week native_country high_income \n","0 2174 0 40 United-States <=50K \n","1 0 0 13 United-States <=50K \n","2 0 0 40 United-States <=50K \n","3 0 0 40 United-States <=50K \n","4 0 0 40 Cuba <=50K "],"text/html":["\n","
\n","
\n","
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryhigh_income
039State-gov77516Bachelors13Never-marriedAdm-clericalNot-in-familyWhiteMale2174040United-States<=50K
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
238Private215646HS-grad9DivorcedHandlers-cleanersNot-in-familyWhiteMale0040United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
428Private338409Bachelors13Married-civ-spouseProf-specialtyWifeBlackFemale0040Cuba<=50K
\n","
\n"," \n"," \n"," \n","\n"," \n","
\n","
\n"," "]},"metadata":{},"execution_count":3}],"source":["# columns used \n","columns = ['age', 'workclass', 'fnlwgt', 'education', 'education_num',\n"," 'marital_status', 'occupation', 'relationship', 'race', \n"," 'sex','capital_gain', 'capital_loss', 'hours_per_week',\n"," 'native_country','high_income']\n","# importing the dataset\n","income = pd.read_csv(\"https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data\",\n"," header=None,\n"," names=columns)\n","income.head()"]},{"cell_type":"code","execution_count":4,"metadata":{"id":"j3Otnz1-UYre","tags":[],"executionInfo":{"status":"ok","timestamp":1649100915009,"user_tz":180,"elapsed":465,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}}},"outputs":[],"source":["income.to_csv(\"raw_data.csv\",index=False)"]},{"cell_type":"code","execution_count":5,"metadata":{"tags":[],"colab":{"base_uri":"https://localhost:8080/"},"id":"sgSoPVLZ6-ps","executionInfo":{"status":"ok","timestamp":1649100978757,"user_tz":180,"elapsed":40646,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}},"outputId":"ffcebab8-e0cc-4833-b4f2-d71658dbe66d"},"outputs":[{"output_type":"stream","name":"stdout","text":["\u001b[34m\u001b[1mwandb\u001b[0m: You can find your API key in your browser here: https://wandb.ai/authorize\n","\u001b[34m\u001b[1mwandb\u001b[0m: Paste an API key from your profile and hit enter, or press ctrl+c to quit: \n","\u001b[34m\u001b[1mwandb\u001b[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc\n"]}],"source":["# Login to Weights & Biases\n","!wandb login --relogin"]},{"cell_type":"code","execution_count":6,"metadata":{"id":"l7lg2qtMUFGW","tags":[],"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649101124875,"user_tz":180,"elapsed":15017,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}},"outputId":"a7ac4446-02f6-4e45-ccc1-e249f3131cc0"},"outputs":[{"output_type":"stream","name":"stdout","text":["\u001b[34m\u001b[1mwandb\u001b[0m: Uploading file raw_data.csv to: \"ivanovitchm/decision_tree/raw_data.csv:latest\" (raw_data)\n","\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mivanovitchm\u001b[0m (use `wandb login --relogin` to force relogin)\n","\u001b[34m\u001b[1mwandb\u001b[0m: Tracking run with wandb version 0.12.11\n","\u001b[34m\u001b[1mwandb\u001b[0m: Run data is saved locally in \u001b[35m\u001b[1m/content/wandb/run-20220404_193832-32qj2hph\u001b[0m\n","\u001b[34m\u001b[1mwandb\u001b[0m: Run \u001b[1m`wandb offline`\u001b[0m to turn off syncing.\n","\u001b[34m\u001b[1mwandb\u001b[0m: Syncing run \u001b[33mquiet-snow-1\u001b[0m\n","\u001b[34m\u001b[1mwandb\u001b[0m: ⭐️ View project at \u001b[34m\u001b[4mhttps://wandb.ai/ivanovitchm/decision_tree\u001b[0m\n","\u001b[34m\u001b[1mwandb\u001b[0m: 🚀 View run at \u001b[34m\u001b[4mhttps://wandb.ai/ivanovitchm/decision_tree/runs/32qj2hph\u001b[0m\n","Artifact uploaded, use this artifact in a run by adding:\n","\n"," artifact = run.use_artifact(\"ivanovitchm/decision_tree/raw_data.csv:latest\")\n","\n","\n","\u001b[34m\u001b[1mwandb\u001b[0m: Waiting for W&B process to finish... 
\u001b[32m(success).\u001b[0m\n","\u001b[34m\u001b[1mwandb\u001b[0m: \n","\u001b[34m\u001b[1mwandb\u001b[0m: Synced \u001b[33mquiet-snow-1\u001b[0m: \u001b[34m\u001b[4mhttps://wandb.ai/ivanovitchm/decision_tree/runs/32qj2hph\u001b[0m\n","\u001b[34m\u001b[1mwandb\u001b[0m: Synced 5 W&B file(s), 0 media file(s), 1 artifact file(s) and 0 other file(s)\n","\u001b[34m\u001b[1mwandb\u001b[0m: Find logs at: \u001b[35m\u001b[1m./wandb/run-20220404_193832-32qj2hph/logs\u001b[0m\n"]}],"source":["# Send the raw_data.csv to the Wandb storing it as an artifact\n","!wandb artifact put \\\n"," --name decision_tree/raw_data.csv \\\n"," --type raw_data \\\n"," --description \"The raw data from 1994 US Census\" raw_data.csv"]},{"cell_type":"code","source":[""],"metadata":{"id":"e8Z6xmn0Ffaj"},"execution_count":null,"outputs":[]}],"metadata":{"colab":{"collapsed_sections":[],"name":"fetch_data.ipynb","provenance":[]},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.12"}},"nbformat":4,"nbformat_minor":0} -------------------------------------------------------------------------------- /lessons/week_02/sources/preprocessing.ipynb: -------------------------------------------------------------------------------- 1 | {"cells":[{"cell_type":"markdown","metadata":{"id":"0I4pgzLVtBTP"},"source":["# 1.0 An end-to-end classification problem (ETL)\n","\n"]},{"cell_type":"markdown","metadata":{"id":"Dh34gim6KPtT"},"source":["## 1.1 Dataset description"]},{"cell_type":"markdown","metadata":{"id":"iE8OJoDZ5AFK"},"source":["\n","\n","We'll be looking at individual income in the United States. The **data** is from the **1994 census**, and contains information on an individual's **marital status**, **age**, **type of work**, and more. The **target column**, or what we want to predict, is whether individuals make less than or equal to 50k a year, or more than **50k a year**.\n","\n","You can download the data from the [University of California, Irvine's website](http://archive.ics.uci.edu/ml/datasets/Adult).\n","\n","Let's take the following steps:\n","\n","1. Load Libraries\n","2. Fetch Data, including EDA\n","3. Pre-procesing\n","4. Data Segregation\n","\n","
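For reference, fetch_data.ipynb uploads `raw_data.csv` through the `wandb artifact put` CLI call shown above. A minimal sketch of the equivalent upload through the Python API is given below; it mirrors the artifact pattern that preprocessing.ipynb uses, assumes `wandb login` has already succeeded, and the `job_type="fetch_data"` label is an illustrative choice rather than something the notebook sets.

```python
# Hedged sketch: Python-API equivalent of `wandb artifact put` in fetch_data.ipynb.
# Assumes raw_data.csv exists locally and that `wandb login` has been run.
import wandb

run = wandb.init(project="decision_tree", job_type="fetch_data")  # job_type is an assumed label

artifact = wandb.Artifact(name="raw_data.csv",
                          type="raw_data",
                          description="The raw data from 1994 US Census")
artifact.add_file("raw_data.csv")
run.log_artifact(artifact)
run.finish()
```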
"]},{"cell_type":"markdown","metadata":{"id":"7UpxKxU1Ej7f"},"source":["## 1.2 Install and load libraries"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"t82KewAPWCYe"},"outputs":[],"source":["!pip install wandb"]},{"cell_type":"code","execution_count":20,"metadata":{"id":"LASaVZuhRJlL","executionInfo":{"status":"ok","timestamp":1649111549347,"user_tz":180,"elapsed":427,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}}},"outputs":[],"source":["import wandb\n","import pandas as pd"]},{"cell_type":"markdown","metadata":{"id":"Z74pHa-qHVrT"},"source":["## 1.3 Preprocessing"]},{"cell_type":"markdown","metadata":{"id":"MxzlOezZLWQ_"},"source":["### 1.3.1 Login wandb\n"]},{"cell_type":"code","execution_count":21,"metadata":{"id":"xvG-dYfiVwM9","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649111573515,"user_tz":180,"elapsed":21048,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}},"outputId":"2e7d5d03-12c7-47e3-c29f-ff93a0587be6"},"outputs":[{"output_type":"stream","name":"stdout","text":["\u001b[34m\u001b[1mwandb\u001b[0m: You can find your API key in your browser here: https://wandb.ai/authorize\n","\u001b[34m\u001b[1mwandb\u001b[0m: Paste an API key from your profile and hit enter, or press ctrl+c to quit: \n","\u001b[34m\u001b[1mwandb\u001b[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc\n"]}],"source":["# Login to Weights & Biases\n","!wandb login --relogin"]},{"cell_type":"markdown","source":["### 1.3.2 Artifacts"],"metadata":{"id":"3GrAiPvGm0kl"}},{"cell_type":"code","source":["input_artifact=\"decision_tree/raw_data.csv:latest\"\n","artifact_name=\"preprocessed_data.csv\"\n","artifact_type=\"clean_data\"\n","artifact_description=\"Data after preprocessing\""],"metadata":{"id":"dBiQMbchm78L","executionInfo":{"status":"ok","timestamp":1649111604334,"user_tz":180,"elapsed":539,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}}},"execution_count":22,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"18RAS5kFXPAe"},"source":["### 1.3.3 Setup your wandb project and clean the dataset"]},{"cell_type":"markdown","source":["After the fetch step the raw data artifact was generated.\n","Now, we need to pre-processing the raw data to create a new artfiact (clean_data)."],"metadata":{"id":"C6YkQ8SOn3qo"}},{"cell_type":"code","execution_count":23,"metadata":{"id":"5kKY7pRNSGo9","colab":{"base_uri":"https://localhost:8080/","height":69},"executionInfo":{"status":"ok","timestamp":1649111639094,"user_tz":180,"elapsed":5526,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}},"outputId":"eab22db6-2470-41ae-dd2d-a058ca3b9073"},"outputs":[{"output_type":"display_data","data":{"text/plain":[""],"text/html":["Tracking run with wandb version 0.12.11"]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":[""],"text/html":["Run data is saved locally in /content/wandb/run-20220404_223353-n93uggvj"]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":[""],"text/html":["Syncing run likely-wind-6 to Weights & Biases (docs)
"]},"metadata":{}}],"source":["# create a new job_type\n","run = wandb.init(project=\"decision_tree\", job_type=\"process_data\")"]},{"cell_type":"code","execution_count":24,"metadata":{"id":"H6eyFp6XTLyf","executionInfo":{"status":"ok","timestamp":1649111658361,"user_tz":180,"elapsed":2624,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}}},"outputs":[],"source":["# donwload the latest version of artifact raw_data.csv\n","artifact = run.use_artifact(input_artifact)\n","\n","# create a dataframe from the artifact\n","df = pd.read_csv(artifact.file())"]},{"cell_type":"code","execution_count":25,"metadata":{"id":"SWMTKGEUYHeK","executionInfo":{"status":"ok","timestamp":1649111669513,"user_tz":180,"elapsed":548,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}}},"outputs":[],"source":["# Delete duplicated rows\n","df.drop_duplicates(inplace=True)\n","\n","# Generate a \"clean data file\"\n","df.to_csv(artifact_name,index=False)"]},{"cell_type":"code","execution_count":26,"metadata":{"id":"4WkUfOnkYI5l","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649111718114,"user_tz":180,"elapsed":435,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}},"outputId":"2454d010-301f-4c7a-a8d2-c104c692cf13"},"outputs":[{"output_type":"execute_result","data":{"text/plain":[""]},"metadata":{},"execution_count":26}],"source":["# Create a new artifact and configure with the necessary arguments\n","artifact = wandb.Artifact(name=artifact_name,\n"," type=artifact_type,\n"," description=artifact_description)\n","artifact.add_file(artifact_name)"]},{"cell_type":"code","execution_count":27,"metadata":{"id":"0dVXWRhmGmOn","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649111732059,"user_tz":180,"elapsed":430,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}},"outputId":"4742801e-ff1c-497c-9554-15ae996514ac"},"outputs":[{"output_type":"execute_result","data":{"text/plain":[""]},"metadata":{},"execution_count":27}],"source":["# Upload the artifact to Wandb\n","run.log_artifact(artifact)"]},{"cell_type":"code","source":["# close the run\n","# waiting a while after run the previous cell before execute this\n","run.finish()"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":104,"referenced_widgets":["5a6fb731545e418cad4d849f125a7b75","cc467ffc17814d0595bcbefc7da605c0","343d716007a04ebaa529b0416e5fa984","494329a9abcc46a9a82b2c231e917752","5f92efb7fa3742f3b46eff74fc48ec2f","46aaa8f54fd6438a83af95b4413ca42e","778a30fa4e6b424a80f3c067f43499b2","1c60b15036a64ebd95f57faf6c22c15d"]},"id":"mqRqmqbDp3Zc","executionInfo":{"status":"ok","timestamp":1649111754546,"user_tz":180,"elapsed":5106,"user":{"displayName":"Ivanovitch Silva","userId":"06428777505436195303"}},"outputId":"d20a1136-0fb7-464d-b958-e567709b883d"},"execution_count":28,"outputs":[{"output_type":"stream","name":"stdout","text":["\n"]},{"output_type":"display_data","data":{"text/plain":[""],"text/html":["Waiting for W&B process to finish... 
(success)."]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["VBox(children=(Label(value='3.633 MB of 3.633 MB uploaded (0.000 MB deduped)\\r'), FloatProgress(value=1.0, max…"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"5a6fb731545e418cad4d849f125a7b75"}},"metadata":{}},{"output_type":"display_data","data":{"text/plain":[""],"text/html":["Synced likely-wind-6: https://wandb.ai/ivanovitchm/decision_tree/runs/n93uggvj
Synced 4 W&B file(s), 0 media file(s), 1 artifact file(s) and 0 other file(s)"]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":[""],"text/html":["Find logs at: ./wandb/run-20220404_223353-n93uggvj/logs"]},"metadata":{}}]},{"cell_type":"code","source":[""],"metadata":{"id":"i2jFX9TtqPLV"},"execution_count":null,"outputs":[]}],"metadata":{"colab":{"collapsed_sections":[],"name":"preprocessing.ipynb","provenance":[]},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.12"},"widgets":{"application/vnd.jupyter.widget-state+json":{"5a6fb731545e418cad4d849f125a7b75":{"model_module":"@jupyter-widgets/controls","model_name":"VBoxModel","model_module_version":"1.5.0","state":{"_dom_classes":[],"_model_module":"@jupyter-widgets/controls","_model_module_version":"1.5.0","_model_name":"VBoxModel","_view_count":null,"_view_module":"@jupyter-widgets/controls","_view_module_version":"1.5.0","_view_name":"VBoxView","box_style":"","children":["IPY_MODEL_cc467ffc17814d0595bcbefc7da605c0","IPY_MODEL_343d716007a04ebaa529b0416e5fa984"],"layout":"IPY_MODEL_494329a9abcc46a9a82b2c231e917752"}},"cc467ffc17814d0595bcbefc7da605c0":{"model_module":"@jupyter-widgets/controls","model_name":"LabelModel","model_module_version":"1.5.0","state":{"_dom_classes":[],"_model_module":"@jupyter-widgets/controls","_model_module_version":"1.5.0","_model_name":"LabelModel","_view_count":null,"_view_module":"@jupyter-widgets/controls","_view_module_version":"1.5.0","_view_name":"LabelView","description":"","description_tooltip":null,"layout":"IPY_MODEL_5f92efb7fa3742f3b46eff74fc48ec2f","placeholder":"​","style":"IPY_MODEL_46aaa8f54fd6438a83af95b4413ca42e","value":"3.640 MB of 3.640 MB uploaded (0.000 MB 
deduped)\r"}},"343d716007a04ebaa529b0416e5fa984":{"model_module":"@jupyter-widgets/controls","model_name":"FloatProgressModel","model_module_version":"1.5.0","state":{"_dom_classes":[],"_model_module":"@jupyter-widgets/controls","_model_module_version":"1.5.0","_model_name":"FloatProgressModel","_view_count":null,"_view_module":"@jupyter-widgets/controls","_view_module_version":"1.5.0","_view_name":"ProgressView","bar_style":"","description":"","description_tooltip":null,"layout":"IPY_MODEL_778a30fa4e6b424a80f3c067f43499b2","max":1,"min":0,"orientation":"horizontal","style":"IPY_MODEL_1c60b15036a64ebd95f57faf6c22c15d","value":1}},"494329a9abcc46a9a82b2c231e917752":{"model_module":"@jupyter-widgets/base","model_name":"LayoutModel","model_module_version":"1.2.0","state":{"_model_module":"@jupyter-widgets/base","_model_module_version":"1.2.0","_model_name":"LayoutModel","_view_count":null,"_view_module":"@jupyter-widgets/base","_view_module_version":"1.2.0","_view_name":"LayoutView","align_content":null,"align_items":null,"align_self":null,"border":null,"bottom":null,"display":null,"flex":null,"flex_flow":null,"grid_area":null,"grid_auto_columns":null,"grid_auto_flow":null,"grid_auto_rows":null,"grid_column":null,"grid_gap":null,"grid_row":null,"grid_template_areas":null,"grid_template_columns":null,"grid_template_rows":null,"height":null,"justify_content":null,"justify_items":null,"left":null,"margin":null,"max_height":null,"max_width":null,"min_height":null,"min_width":null,"object_fit":null,"object_position":null,"order":null,"overflow":null,"overflow_x":null,"overflow_y":null,"padding":null,"right":null,"top":null,"visibility":null,"width":null}},"5f92efb7fa3742f3b46eff74fc48ec2f":{"model_module":"@jupyter-widgets/base","model_name":"LayoutModel","model_module_version":"1.2.0","state":{"_model_module":"@jupyter-widgets/base","_model_module_version":"1.2.0","_model_name":"LayoutModel","_view_count":null,"_view_module":"@jupyter-widgets/base","_view_module_version":"1.2.0","_view_name":"LayoutView","align_content":null,"align_items":null,"align_self":null,"border":null,"bottom":null,"display":null,"flex":null,"flex_flow":null,"grid_area":null,"grid_auto_columns":null,"grid_auto_flow":null,"grid_auto_rows":null,"grid_column":null,"grid_gap":null,"grid_row":null,"grid_template_areas":null,"grid_template_columns":null,"grid_template_rows":null,"height":null,"justify_content":null,"justify_items":null,"left":null,"margin":null,"max_height":null,"max_width":null,"min_height":null,"min_width":null,"object_fit":null,"object_position":null,"order":null,"overflow":null,"overflow_x":null,"overflow_y":null,"padding":null,"right":null,"top":null,"visibility":null,"width":null}},"46aaa8f54fd6438a83af95b4413ca42e":{"model_module":"@jupyter-widgets/controls","model_name":"DescriptionStyleModel","model_module_version":"1.5.0","state":{"_model_module":"@jupyter-widgets/controls","_model_module_version":"1.5.0","_model_name":"DescriptionStyleModel","_view_count":null,"_view_module":"@jupyter-widgets/base","_view_module_version":"1.2.0","_view_name":"StyleView","description_width":""}},"778a30fa4e6b424a80f3c067f43499b2":{"model_module":"@jupyter-widgets/base","model_name":"LayoutModel","model_module_version":"1.2.0","state":{"_model_module":"@jupyter-widgets/base","_model_module_version":"1.2.0","_model_name":"LayoutModel","_view_count":null,"_view_module":"@jupyter-widgets/base","_view_module_version":"1.2.0","_view_name":"LayoutView","align_content":null,"align_items":null,"align_self":null,"border":null,"bo
ttom":null,"display":null,"flex":null,"flex_flow":null,"grid_area":null,"grid_auto_columns":null,"grid_auto_flow":null,"grid_auto_rows":null,"grid_column":null,"grid_gap":null,"grid_row":null,"grid_template_areas":null,"grid_template_columns":null,"grid_template_rows":null,"height":null,"justify_content":null,"justify_items":null,"left":null,"margin":null,"max_height":null,"max_width":null,"min_height":null,"min_width":null,"object_fit":null,"object_position":null,"order":null,"overflow":null,"overflow_x":null,"overflow_y":null,"padding":null,"right":null,"top":null,"visibility":null,"width":null}},"1c60b15036a64ebd95f57faf6c22c15d":{"model_module":"@jupyter-widgets/controls","model_name":"ProgressStyleModel","model_module_version":"1.5.0","state":{"_model_module":"@jupyter-widgets/controls","_model_module_version":"1.5.0","_model_name":"ProgressStyleModel","_view_count":null,"_view_module":"@jupyter-widgets/base","_view_module_version":"1.2.0","_view_name":"StyleView","bar_color":null,"description_width":""}}}}},"nbformat":4,"nbformat_minor":0} -------------------------------------------------------------------------------- /lessons/week_02/sources/test.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "id": "0I4pgzLVtBTP" 7 | }, 8 | "source": [ 9 | "# 1.0 An end-to-end classification problem (Testing)\n", 10 | "\n" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": { 16 | "id": "Dh34gim6KPtT" 17 | }, 18 | "source": [ 19 | "## 1.1 Dataset description" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": { 25 | "id": "iE8OJoDZ5AFK" 26 | }, 27 | "source": [ 28 | "We'll be looking at individual income in the United States. The **data** is from the **1994 census**, and contains information on an individual's **marital status**, **age**, **type of work**, and more. The **target column**, or what we want to predict, is whether individuals make less than or equal to 50k a year, or more than **50k a year**.\n", 29 | "\n", 30 | "You can download the data from the [University of California, Irvine's website](http://archive.ics.uci.edu/ml/datasets/Adult).\n", 31 | "\n", 32 | "Let's take the following steps:\n", 33 | "\n", 34 | "1. ETL (done)\n", 35 | "2. Data Checks (done)\n", 36 | "3. Data Segregation (done)\n", 37 | "4. Training (done)\n", 38 | "5. Test\n", 39 | "\n", 40 | "
\n" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "metadata": { 46 | "id": "7UpxKxU1Ej7f" 47 | }, 48 | "source": [ 49 | "## 1.2 Install, load libraries" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": null, 55 | "metadata": { 56 | "id": "t82KewAPWCYe" 57 | }, 58 | "outputs": [], 59 | "source": [ 60 | "!pip install wandb" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": null, 66 | "metadata": { 67 | "id": "NreRnbvI8lL5" 68 | }, 69 | "outputs": [], 70 | "source": [ 71 | "import logging\n", 72 | "import pandas as pd\n", 73 | "import wandb\n", 74 | "import joblib\n", 75 | "from sklearn.base import BaseEstimator, TransformerMixin\n", 76 | "from sklearn.metrics import fbeta_score, precision_score, recall_score, accuracy_score\n", 77 | "from sklearn.metrics import classification_report\n", 78 | "from sklearn.metrics import confusion_matrix\n", 79 | "from sklearn.metrics import ConfusionMatrixDisplay\n", 80 | "import matplotlib.pyplot as plt" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": { 87 | "id": "QZXcN54GkP25" 88 | }, 89 | "outputs": [], 90 | "source": [ 91 | "# Login to Weights & Biases\n", 92 | "!wandb login --relogin" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": { 98 | "id": "8ueX2KClcICb" 99 | }, 100 | "source": [ 101 | "## 1.3 Test evaluation" 102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "metadata": { 107 | "id": "rU-ssUv-gx8K" 108 | }, 109 | "source": [ 110 | "### 1.3.1 Definition of the base classes" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": { 116 | "id": "VeqPYBtbg7DF" 117 | }, 118 | "source": [ 119 | "This is necessary in order to ```joblib.load()```see the previous definitions used in the Train Pipeline." 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": null, 125 | "metadata": { 126 | "id": "vQ4naPLtgVAm" 127 | }, 128 | "outputs": [], 129 | "source": [ 130 | "class FeatureSelector(BaseEstimator, TransformerMixin):\n", 131 | " # Class Constructor\n", 132 | " def __init__(self, feature_names):\n", 133 | " self.feature_names = feature_names\n", 134 | "\n", 135 | " # Return self nothing else to do here\n", 136 | " def fit(self, X, y=None):\n", 137 | " return self\n", 138 | "\n", 139 | " # Method that describes what this custom transformer need to do\n", 140 | " def transform(self, X, y=None):\n", 141 | " return X[self.feature_names]\n", 142 | "\n", 143 | "# Handling categorical features\n", 144 | "class CategoricalTransformer(BaseEstimator, TransformerMixin):\n", 145 | " # Class constructor method that takes one boolean as its argument\n", 146 | " def __init__(self, new_features=True, colnames=None):\n", 147 | " self.new_features = new_features\n", 148 | " self.colnames = colnames\n", 149 | "\n", 150 | " # Return self nothing else to do here\n", 151 | " def fit(self, X, y=None):\n", 152 | " return self\n", 153 | "\n", 154 | " def get_feature_names_out(self):\n", 155 | " return self.colnames.tolist()\n", 156 | "\n", 157 | " # Transformer method we wrote for this transformer\n", 158 | " def transform(self, X, y=None):\n", 159 | " df = pd.DataFrame(X, columns=self.colnames)\n", 160 | "\n", 161 | " # Remove white space in categorical features\n", 162 | " df = df.apply(lambda row: row.str.strip())\n", 163 | "\n", 164 | " # customize feature?\n", 165 | " # How can I identify what needs to be modified? 
EDA!!!!\n", 166 | " if self.new_features:\n", 167 | "\n", 168 | " # minimize the cardinality of native_country feature\n", 169 | " # check cardinality using df.native_country.unique()\n", 170 | " df.loc[df['native_country'] != 'United-States','native_country'] = 'non_usa'\n", 171 | "\n", 172 | " # replace ? with Unknown\n", 173 | " edit_cols = ['native_country', 'occupation', 'workclass']\n", 174 | " for col in edit_cols:\n", 175 | " df.loc[df[col] == '?', col] = 'unknown'\n", 176 | "\n", 177 | " # decrease the cardinality of education feature\n", 178 | " hs_grad = ['HS-grad', '11th', '10th', '9th', '12th']\n", 179 | " elementary = ['1st-4th', '5th-6th', '7th-8th']\n", 180 | " # replace\n", 181 | " df['education'].replace(to_replace=hs_grad,value='HS-grad',inplace=True)\n", 182 | " df['education'].replace(to_replace=elementary,value='elementary_school',inplace=True)\n", 183 | "\n", 184 | " # adjust marital_status feature\n", 185 | " married = ['Married-spouse-absent','Married-civ-spouse','Married-AF-spouse']\n", 186 | " separated = ['Separated', 'Divorced']\n", 187 | "\n", 188 | " # replace\n", 189 | " df['marital_status'].replace(to_replace=married, value='Married', inplace=True)\n", 190 | " df['marital_status'].replace(to_replace=separated, value='Separated', inplace=True)\n", 191 | "\n", 192 | " # adjust workclass feature\n", 193 | " self_employed = ['Self-emp-not-inc', 'Self-emp-inc']\n", 194 | " govt_employees = ['Local-gov', 'State-gov', 'Federal-gov']\n", 195 | "\n", 196 | " # replace elements in list.\n", 197 | " df['workclass'].replace(to_replace=self_employed,value='Self_employed',inplace=True)\n", 198 | " df['workclass'].replace(to_replace=govt_employees,value='Govt_employees',inplace=True)\n", 199 | "\n", 200 | " # update column names\n", 201 | " self.colnames = df.columns\n", 202 | "\n", 203 | " return df\n", 204 | " \n", 205 | "# transform numerical features\n", 206 | "class NumericalTransformer(BaseEstimator, TransformerMixin):\n", 207 | " # Class constructor method that takes a model parameter as its argument\n", 208 | " # model 0: minmax\n", 209 | " # model 1: standard\n", 210 | " # model 2: without scaler\n", 211 | " def __init__(self, model=0, colnames=None):\n", 212 | " self.model = model\n", 213 | " self.colnames = colnames\n", 214 | " self.scaler = None\n", 215 | "\n", 216 | " # Fit is used only to learn statistical about Scalers\n", 217 | " def fit(self, X, y=None):\n", 218 | " df = pd.DataFrame(X, columns=self.colnames)\n", 219 | " # minmax\n", 220 | " if self.model == 0:\n", 221 | " self.scaler = MinMaxScaler()\n", 222 | " self.scaler.fit(df)\n", 223 | " # standard scaler\n", 224 | " elif self.model == 1:\n", 225 | " self.scaler = StandardScaler()\n", 226 | " self.scaler.fit(df)\n", 227 | " return self\n", 228 | "\n", 229 | " # return columns names after transformation\n", 230 | " def get_feature_names_out(self):\n", 231 | " return self.colnames\n", 232 | "\n", 233 | " # Transformer method we wrote for this transformer\n", 234 | " # Use fitted scalers\n", 235 | " def transform(self, X, y=None):\n", 236 | " df = pd.DataFrame(X, columns=self.colnames)\n", 237 | "\n", 238 | " # update columns name\n", 239 | " self.colnames = df.columns.tolist()\n", 240 | "\n", 241 | " # minmax\n", 242 | " if self.model == 0:\n", 243 | " # transform data\n", 244 | " df = self.scaler.transform(df)\n", 245 | " elif self.model == 1:\n", 246 | " # transform data\n", 247 | " df = self.scaler.transform(df)\n", 248 | " else:\n", 249 | " df = df.values\n", 250 | "\n", 251 | " return df" 252 | 
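One detail worth flagging in the class definitions above: `NumericalTransformer.fit` references `MinMaxScaler` and `StandardScaler`, but this notebook's import cell does not import them. That is harmless for the test workflow, because the pipeline loaded with `joblib` is already fitted and only its transform/predict path is used, but re-fitting these transformers inside this notebook would additionally require the import below.

```python
# Needed only if the transformers defined above are re-fitted in this notebook;
# pure inference with the already-fitted, joblib-loaded pipeline does not call fit().
from sklearn.preprocessing import MinMaxScaler, StandardScaler
```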
] 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "metadata": { 257 | "id": "8N4nmtemhLLZ" 258 | }, 259 | "source": [ 260 | "### 1.3.2 Evaluation" 261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": null, 266 | "metadata": { 267 | "id": "E-2z7Fq7cdbX" 268 | }, 269 | "outputs": [], 270 | "source": [ 271 | "# global variables\n", 272 | "\n", 273 | "# name of the artifact related to test dataset\n", 274 | "artifact_test_name = \"decision_tree/test.csv:latest\"\n", 275 | "\n", 276 | "# name of the model artifact\n", 277 | "artifact_model_name = \"decision_tree/model_export:latest\"\n", 278 | "\n", 279 | "# name of the target encoder artifact\n", 280 | "artifact_encoder_name = \"decision_tree/target_encoder:latest\"" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": null, 286 | "metadata": { 287 | "id": "tOh7odFBdO88" 288 | }, 289 | "outputs": [], 290 | "source": [ 291 | "# configure logging\n", 292 | "logging.basicConfig(level=logging.INFO,\n", 293 | " format=\"%(asctime)s %(message)s\",\n", 294 | " datefmt='%d-%m-%Y %H:%M:%S')\n", 295 | "\n", 296 | "# reference for a logging obj\n", 297 | "logger = logging.getLogger()" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": null, 303 | "metadata": { 304 | "id": "O177vAixdW9i" 305 | }, 306 | "outputs": [], 307 | "source": [ 308 | "# initiate the wandb project\n", 309 | "run = wandb.init(project=\"decision_tree\",job_type=\"test\")" 310 | ] 311 | }, 312 | { 313 | "cell_type": "code", 314 | "execution_count": null, 315 | "metadata": { 316 | "id": "siFQuqmvdiRO" 317 | }, 318 | "outputs": [], 319 | "source": [ 320 | "logger.info(\"Downloading and reading test artifact\")\n", 321 | "test_data_path = run.use_artifact(artifact_test_name).file()\n", 322 | "df_test = pd.read_csv(test_data_path)\n", 323 | "\n", 324 | "# Extract the target from the features\n", 325 | "logger.info(\"Extracting target from dataframe\")\n", 326 | "x_test = df_test.copy()\n", 327 | "y_test = x_test.pop(\"high_income\")" 328 | ] 329 | }, 330 | { 331 | "cell_type": "code", 332 | "execution_count": null, 333 | "metadata": { 334 | "id": "pdXTWnBfdwbE" 335 | }, 336 | "outputs": [], 337 | "source": [ 338 | "# Takes a look at test set\n", 339 | "x_test.head()" 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": null, 345 | "metadata": { 346 | "id": "irBhj9UXdybG" 347 | }, 348 | "outputs": [], 349 | "source": [ 350 | "# Take a look at the target variable\n", 351 | "y_test.head()" 352 | ] 353 | }, 354 | { 355 | "cell_type": "code", 356 | "execution_count": null, 357 | "metadata": { 358 | "id": "dwIVe_-GeBFz" 359 | }, 360 | "outputs": [], 361 | "source": [ 362 | "# Extract the encoding of the target variable\n", 363 | "logger.info(\"Extracting the encoding of the target variable\")\n", 364 | "encoder_export_path = run.use_artifact(artifact_encoder_name).file()\n", 365 | "le = joblib.load(encoder_export_path)" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": null, 371 | "metadata": { 372 | "id": "GtdxHiF7e8oJ" 373 | }, 374 | "outputs": [], 375 | "source": [ 376 | "# transform y_train\n", 377 | "y_test = le.transform(y_test)\n", 378 | "logger.info(\"Classes [0, 1]: {}\".format(le.inverse_transform([0, 1])))" 379 | ] 380 | }, 381 | { 382 | "cell_type": "code", 383 | "execution_count": null, 384 | "metadata": { 385 | "id": "QRJ_dDWNfD0Z" 386 | }, 387 | "outputs": [], 388 | "source": [ 389 | "# target variable after the encoding\n", 390 | "y_test" 391 | ] 392 
| }, 393 | { 394 | "cell_type": "code", 395 | "execution_count": null, 396 | "metadata": { 397 | "id": "hubXniw5fGc_" 398 | }, 399 | "outputs": [], 400 | "source": [ 401 | "# Download inference artifact\n", 402 | "logger.info(\"Downloading and load the exported model\")\n", 403 | "model_export_path = run.use_artifact(artifact_model_name).file()\n", 404 | "pipe = joblib.load(model_export_path)" 405 | ] 406 | }, 407 | { 408 | "cell_type": "code", 409 | "execution_count": null, 410 | "metadata": { 411 | "id": "7AmX-6-Sf0v0" 412 | }, 413 | "outputs": [], 414 | "source": [ 415 | "# predict\n", 416 | "logger.info(\"Infering\")\n", 417 | "predict = pipe.predict(x_test)\n", 418 | "\n", 419 | "# Evaluation Metrics\n", 420 | "logger.info(\"Test Evaluation metrics\")\n", 421 | "fbeta = fbeta_score(y_test, predict, beta=1, zero_division=1)\n", 422 | "precision = precision_score(y_test, predict, zero_division=1)\n", 423 | "recall = recall_score(y_test, predict, zero_division=1)\n", 424 | "acc = accuracy_score(y_test, predict)\n", 425 | "\n", 426 | "logger.info(\"Test Accuracy: {}\".format(acc))\n", 427 | "logger.info(\"Test Precision: {}\".format(precision))\n", 428 | "logger.info(\"Test Recall: {}\".format(recall))\n", 429 | "logger.info(\"Test F1: {}\".format(fbeta))\n", 430 | "\n", 431 | "run.summary[\"Acc\"] = acc\n", 432 | "run.summary[\"Precision\"] = precision\n", 433 | "run.summary[\"Recall\"] = recall\n", 434 | "run.summary[\"F1\"] = fbeta" 435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "execution_count": null, 440 | "metadata": { 441 | "id": "HRdA8Djahueu" 442 | }, 443 | "outputs": [], 444 | "source": [ 445 | "# Compare the accuracy, precision, recall with previous ones\n", 446 | "print(classification_report(y_test,predict))" 447 | ] 448 | }, 449 | { 450 | "cell_type": "code", 451 | "execution_count": null, 452 | "metadata": { 453 | "id": "DuM6sg3wh6k_" 454 | }, 455 | "outputs": [], 456 | "source": [ 457 | "fig_confusion_matrix, ax = plt.subplots(1,1,figsize=(7,4))\n", 458 | "ConfusionMatrixDisplay(confusion_matrix(predict,y_test,labels=[1,0]),\n", 459 | " display_labels=[\">50k\",\"<=50k\"]).plot(values_format=\".0f\",ax=ax)\n", 460 | "\n", 461 | "ax.set_xlabel(\"True Label\")\n", 462 | "ax.set_ylabel(\"Predicted Label\")\n", 463 | "plt.show()" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": null, 469 | "metadata": { 470 | "id": "QXis7klgiEWG" 471 | }, 472 | "outputs": [], 473 | "source": [ 474 | "# Uploading figures\n", 475 | "logger.info(\"Uploading figures\")\n", 476 | "run.log(\n", 477 | " {\n", 478 | " \"confusion_matrix\": wandb.Image(fig_confusion_matrix),\n", 479 | " # \"other_figure\": wandb.Image(other_fig)\n", 480 | " }\n", 481 | ")" 482 | ] 483 | }, 484 | { 485 | "cell_type": "code", 486 | "execution_count": null, 487 | "metadata": { 488 | "id": "MFC4iWBmiTKe" 489 | }, 490 | "outputs": [], 491 | "source": [ 492 | "run.finish()" 493 | ] 494 | }, 495 | { 496 | "cell_type": "code", 497 | "execution_count": null, 498 | "metadata": { 499 | "id": "ln5TlQjURzBp" 500 | }, 501 | "outputs": [], 502 | "source": [] 503 | } 504 | ], 505 | "metadata": { 506 | "colab": { 507 | "collapsed_sections": [], 508 | "name": "test.ipynb", 509 | "provenance": [], 510 | "toc_visible": true 511 | }, 512 | "kernelspec": { 513 | "display_name": "Python 3 (ipykernel)", 514 | "language": "python", 515 | "name": "python3" 516 | }, 517 | "language_info": { 518 | "codemirror_mode": { 519 | "name": "ipython", 520 | "version": 3 521 | }, 522 | "file_extension": ".py", 523 | 
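A small note on the evaluation cells above: `fbeta_score(..., beta=1)` is exactly the F1 score, so the value logged as "Test F1" is the same number `sklearn.metrics.f1_score` would report. A self-contained check with made-up labels (not the census data):

```python
# Toy illustration with made-up labels: fbeta_score(beta=1) coincides with f1_score.
from sklearn.metrics import f1_score, fbeta_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

assert fbeta_score(y_true, y_pred, beta=1) == f1_score(y_true, y_pred)
print(f1_score(y_true, y_pred))  # 0.8 for these labels
```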
"mimetype": "text/x-python", 524 | "name": "python", 525 | "nbconvert_exporter": "python", 526 | "pygments_lexer": "ipython3", 527 | "version": "3.8.13" 528 | } 529 | }, 530 | "nbformat": 4, 531 | "nbformat_minor": 4 532 | } 533 | -------------------------------------------------------------------------------- /lessons/week_05/deploy_ml.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_05/deploy_ml.pdf -------------------------------------------------------------------------------- /lessons/week_10/Datasets/data.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_10/Datasets/data.mat -------------------------------------------------------------------------------- /lessons/week_10/Datasets/test_catvnoncat.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_10/Datasets/test_catvnoncat.h5 -------------------------------------------------------------------------------- /lessons/week_10/Datasets/train_catvnoncat.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_10/Datasets/train_catvnoncat.h5 -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig10_dimension_1output.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\texthidden}[2]{\ifnum #2=4 $z_{1,k}$\else $z_{1,#2}$ \fi} 23 | \newcommand{\textout}[2]{\ifnum #2=1 $z_{2,j}$ \else $z_{2,#2}$ \fi} 24 | 25 | \newcommand{\w}[4] {\ifnum #2=4 \ifnum #4=1 $w_{k,j}^{(2)}$ \else $w_{k,#4}^{(2)}$ \fi \else \ifnum #4=2 $w_{#2,j}^{(2)}$ \else $w_{#2,#4}^{(2)}$ \fi \fi} 26 | \newcommand{\ww}[4] {\ifnum #4=4 $w_{n,k}^{(1)}$ \else $w_{n,#4}^{(1)}$ \fi} 27 | 28 | \newcommand{\ws}[4] {} 29 | 30 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 31 | \hiddenlayer[count=4, title={}, bias=True, text=\texthidden] 32 | \linklayers[not from={3}] 33 | \setdefaultlinklabel{\ww} 34 | \foreach \n in {4}{ 35 | \foreach \m in {1,2,3,4}{ 36 | \link[style={}, labelpos=midway, from layer=0, from node=\n, to layer=1, to 
node=\m] 37 | } 38 | } 39 | \setdefaultlinklabel{\ws} 40 | \outputlayer[count=1, title={}, bias=False, text=\textout] 41 | \linklayers 42 | \setdefaultlinklabel{\w} 43 | \foreach \o in {0,1,2,3,4}{ 44 | \foreach \p in {1}{ 45 | \link[style={}, labelpos=midway, from layer=1, from node=\o, to layer=2, to node=\p] 46 | } 47 | } 48 | 49 | 50 | % draw dots 51 | \path (L0-2) -- node{$\vdots$} (L0-4); 52 | \draw [->,>=stealth,gray] (L2-1) -- node [black,midway,above] {$\hat{y}_1=g(z_{2,j})$} +(2,0); 53 | %\draw [->,>=stealth,gray] (L2-2) -- node [black,midway,above] {$\hat{y}_2=g(z_{2,j})$} +(2,0); 54 | \end{neuralnetwork} 55 | 56 | \end{document} 57 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_1_perceptron_N.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\textsum}[2]{$\sum$} 23 | \newcommand{\textout}[2]{$\hat{y}$} 24 | 25 | \newcommand{\w}[4] {\ifnum #2=4 $w_n$ \else $w_#2$ \fi} 26 | \newcommand{\ws}[4] {} 27 | \setdefaultlinklabel{\w} 28 | 29 | \inputlayer[count=4, title={Inputs}, bias=False, exclude={3}, text=\textinput] 30 | \hiddenlayer[count=1, title={Sum}, bias=False, text=\textsum] 31 | \linklayers[not from={3}] 32 | \hiddenlayer[count=1, title={Non-Linearity}, bias=False, text=\textactivation] 33 | \setdefaultlinklabel{\ws} 34 | \linklayers 35 | \outputlayer[count=1, title={Output}, text=\textout] 36 | \linklayers 37 | 38 | % draw dots 39 | \path (L0-2) -- node{$\vdots$} (L0-4); 40 | \end{neuralnetwork} 41 | 42 | \end{document} 43 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_2_perceptron_bias.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ 
\else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\textsum}[2]{$\sum$} 23 | \newcommand{\textout}[2]{$\hat{y}$} 24 | 25 | \newcommand{\w}[4] {\ifnum #2=4 $w_n$ \else $w_#2$ \fi} 26 | \newcommand{\ws}[4] {} 27 | \setdefaultlinklabel{\w} 28 | 29 | \inputlayer[count=4, title={Inputs}, bias=True, exclude={3}, text=\textinput] 30 | \hiddenlayer[count=1, title={Sum}, bias=False, text=\textsum] 31 | \linklayers[not from={3}] 32 | \hiddenlayer[count=1, title={Non-Linearity}, bias=False, text=\textactivation] 33 | \setdefaultlinklabel{\ws} 34 | \linklayers 35 | \outputlayer[count=1, title={Output}, text=\textout] 36 | \linklayers 37 | 38 | % draw dots 39 | \path (L0-2) -- node{$\vdots$} (L0-4); 40 | \end{neuralnetwork} 41 | 42 | \end{document} 43 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_3_perceptron_example.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \begin{document} 7 | 8 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2cm, layerspacing=50] 9 | \newcommand{\textinput}[2]{\ifnum #2=3 $x_n$ \else $x_#2$ \fi} 10 | \newcommand{\textactivation}[2]{$g(.)$} 11 | \newcommand{\textsum}[2]{$\sum$} 12 | \newcommand{\textout}[2]{$\hat{y}$} 13 | 14 | \newcommand{\w}[4] {\ifnum #2=0 $1$ \else \ifnum #2=1 $3$ \else -2 \fi \fi} 15 | \newcommand{\ws}[4] {} 16 | \setdefaultlinklabel{\w} 17 | 18 | \inputlayer[count=2, title={}, bias=True, text=\textinput] 19 | \hiddenlayer[count=1, title={}, bias=False, text=\textsum] 20 | \linklayers 21 | \hiddenlayer[count=1, title={}, bias=False, text=\textactivation] 22 | \setdefaultlinklabel{\ws} 23 | \linklayers 24 | \outputlayer[count=1, title={}, text=\textout] 25 | \linklayers 26 | 27 | \end{neuralnetwork} 28 | 29 | \end{document} -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_4_perceptron_z.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\textsum}[2]{$\sum$} 23 | \newcommand{\textout}[2]{$z$} 24 | 25 | \newcommand{\w}[4] {\ifnum #2=4 $w_n$ \else $w_#2$ \fi} 26 | \newcommand{\ws}[4] {} 27 | \setdefaultlinklabel{\w} 28 | 29 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 30 | \outputlayer[count=1, title={}, 
bias=False, text=\textout] 31 | \linklayers[not from={3}] 32 | 33 | % draw dots 34 | \path (L0-2) -- node{$\vdots$} (L0-4); 35 | \draw [->,>=stealth,gray] (L1-1) -- node [black,midway,above] {$\hat{y}=g(z)$} +(1.6,0); 36 | \end{neuralnetwork} 37 | 38 | \end{document} 39 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_5_perceptron_z2.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\textsum}[2]{$\sum$} 23 | \newcommand{\textout}[2]{$z_{#2}$} 24 | 25 | \newcommand{\w}[4] {\ifnum #2=4 $w_{n,#4}$ \else $w_{#2,#4}$ \fi} 26 | \newcommand{\ws}[4] {} 27 | \setdefaultlinklabel{\w} 28 | 29 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 30 | \outputlayer[count=2, title={}, bias=False, text=\textout] 31 | %\linklayers[not from={3}] 32 | 33 | \foreach \n in {0,1,2,4}{ 34 | \foreach \m in {1,2}{ 35 | \link[style={}, labelpos=near start, from layer=0, from node=\n, to layer=1, to node=\m] 36 | } 37 | } 38 | 39 | % draw dots 40 | \path (L0-2) -- node{$\vdots$} (L0-4); 41 | \draw [->,>=stealth,gray] (L1-1) -- node [black,midway,above] {$\hat{y}_1=g(z_1)$} +(1.9,0); 42 | \draw [->,>=stealth,gray] (L1-2) -- node [black,midway,above] {$\hat{y}_2=g(z_2)$} +(1.9,0); 43 | \end{neuralnetwork} 44 | 45 | \end{document} 46 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_6_one_hidden.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\texthidden}[2]{$z_{1,#2}$} 23 | \newcommand{\textout}[2]{$z_{2,#2}$} 24 | 25 | \newcommand{\w}[4] {\ifnum #2=4 
$w_{n,#4}$ \else $w_{#2,#4}$ \fi} 26 | \newcommand{\ws}[4] {} 27 | \setdefaultlinklabel{\ws} 28 | 29 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 30 | \hiddenlayer[count=4, title={}, bias=True, text=\texthidden] 31 | \linklayers[not from={3}] 32 | \outputlayer[count=2, title={}, bias=False, text=\textout] 33 | \linklayers 34 | 35 | 36 | 37 | % draw dots 38 | \path (L0-2) -- node{$\vdots$} (L0-4); 39 | \draw [->,>=stealth,gray] (L2-1) -- node [black,midway,above] {$\hat{y}_1=g(z_{2,1})$} +(2,0); 40 | \draw [->,>=stealth,gray] (L2-2) -- node [black,midway,above] {$\hat{y}_2=g(z_{2,2})$} +(2,0); 41 | \end{neuralnetwork} 42 | 43 | \end{document} 44 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_6_one_hidden_B.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\texthidden}[2]{$z_{1,#2}$} 23 | \newcommand{\textout}[2]{$z_{2,#2}$} 24 | 25 | \newcommand{\w}[4] {$w_{#2,#4}^{(2)}$} 26 | \newcommand{\ww}[4] {\ifnum #2=4 $w_{n,#4}^{(1)}$ \else $w_{#2,#4}^{(1)}$ \fi} 27 | 28 | \newcommand{\ws}[4] {} 29 | 30 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 31 | \hiddenlayer[count=4, title={}, bias=True, text=\texthidden] 32 | \linklayers[not from={3}] 33 | \setdefaultlinklabel{\ww} 34 | \foreach \n in {4}{ 35 | \foreach \m in {1,2,3,4}{ 36 | \link[style={}, labelpos=midway, from layer=0, from node=\n, to layer=1, to node=\m] 37 | } 38 | } 39 | \setdefaultlinklabel{\ws} 40 | \outputlayer[count=2, title={}, bias=False, text=\textout] 41 | \linklayers 42 | \setdefaultlinklabel{\w} 43 | \foreach \o in {0,4}{ 44 | \foreach \p in {1,2}{ 45 | \link[style={}, labelpos=midway, from layer=1, from node=\o, to layer=2, to node=\p] 46 | } 47 | } 48 | 49 | 50 | 51 | % draw dots 52 | \path (L0-2) -- node{$\vdots$} (L0-4); 53 | \draw [->,>=stealth,gray] (L2-1) -- node [black,midway,above] {$\hat{y}_1=g(z_{2,1})$} +(2,0); 54 | \draw [->,>=stealth,gray] (L2-2) -- node [black,midway,above] {$\hat{y}_2=g(z_{2,2})$} +(2,0); 55 | \end{neuralnetwork} 56 | 57 | \end{document} 58 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_7_two_hidden.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | 
\usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\texthiddena}[2]{$z_{1,#2}$} 23 | \newcommand{\texthiddenb}[2]{$z_{2,#2}$} 24 | \newcommand{\textout}[2]{$z_{3,#2}$} 25 | 26 | \newcommand{\w}[4] {\ifnum #2=4 $w_{n,#4}$ \else $w_{#2,#4}$ \fi} 27 | \newcommand{\ws}[4] {} 28 | \setdefaultlinklabel{\ws} 29 | 30 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 31 | \hiddenlayer[count=4, title={}, bias=True, text=\texthiddena] 32 | \linklayers[not from={3}] 33 | \hiddenlayer[count=4, title={}, bias=True, text=\texthiddenb] 34 | \linklayers 35 | \outputlayer[count=2, title={}, bias=False, text=\textout] 36 | \linklayers 37 | 38 | 39 | 40 | % draw dots 41 | \path (L0-2) -- node{$\vdots$} (L0-4); 42 | \draw [->,>=stealth,gray] (L3-1) -- node [black,midway,above] {$\hat{y}_1=g(z_{3,1})$} +(2,0); 43 | \draw [->,>=stealth,gray] (L3-2) -- node [black,midway,above] {$\hat{y}_2=g(z_{3,2})$} +(2,0); 44 | \end{neuralnetwork} 45 | 46 | \end{document} 47 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_8_onehidden.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\texthiddena}[2]{$z_{1,#2}$} 23 | \newcommand{\texthiddenb}[2]{$z_{2,#2}$} 24 | \newcommand{\textout}[2]{$z_{2,#2}$} 25 | 26 | \newcommand{\w}[4] {\ifnum #2=4 $w_{n,#4}$ \else $w_{#2,#4}$ \fi} 27 | \newcommand{\ws}[4] {} 28 | \setdefaultlinklabel{\ws} 29 | 30 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 31 | \hiddenlayer[count=4, title={}, bias=True, text=\texthiddena] 32 | \linklayers[not from={3}] 33 | \outputlayer[count=1, title={}, bias=False, text=\textout] 34 | \linklayers 35 | 36 | 37 | 38 | % draw dots 39 | \path (L0-2) -- node{$\vdots$} (L0-4); 40 | \draw [->,>=stealth,gray] (L2-1) -- node [black,midway,above] {$\hat{y}_1=g(z_{2,1})$} +(2,0); 41 | \end{neuralnetwork} 42 | 43 | \end{document} 44 | 
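The `.tex` sources in this folder draw a single perceptron (z = w·x + bias, ŷ = g(z)) and small fully connected networks with one or two hidden layers. As a reading aid, here is a minimal NumPy sketch of the computation those figures depict; the activation g is not fixed by the figures, so a sigmoid is assumed, and the weights below are either taken from fig_3 or generated at random.

```python
# Minimal sketch of the forward passes drawn in the Perceptron figures.
# g is assumed to be a sigmoid; the figures leave the activation unspecified.
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

# single perceptron (fig_3 uses bias=1 and weights 3, -2): y_hat = g(1 + 3*x1 - 2*x2)
x = np.array([0.5, 1.0])
w = np.array([3.0, -2.0])
b = 1.0
y_hat = g(w @ x + b)

# one hidden layer (fig_6 / fig_8): y_hat = g(W2 @ g(W1 @ x + b1) + b2)
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 2, 4, 2
W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)
y_hat_mlp = g(W2 @ g(W1 @ x + b1) + b2)
print(y_hat, y_hat_mlp)
```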
-------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/fig_9_dimension.tex: -------------------------------------------------------------------------------- 1 | % create a file name latexmkrc and copy this code on it: 2 | % END { system('convert -density 300 output.pdf myImage.png'); } 3 | 4 | \documentclass[convert={density=300,size=1080x800,outext=.png}]{standalone} 5 | \usepackage{neuralnetwork} 6 | \usepackage{tikz} 7 | \usepackage{xpatch} 8 | \makeatletter 9 | 10 | % source: https://tex.stackexchange.com/questions/406167/generalize-neural-network 11 | % \linklayers have \nn@lastnode instead of \lastnode, 12 | % patch it to replace the former with the latter, and similar for thisnode 13 | \xpatchcmd{\linklayers}{\nn@lastnode}{\lastnode}{}{} 14 | \xpatchcmd{\linklayers}{\nn@thisnode}{\thisnode}{}{} 15 | \makeatother 16 | 17 | \begin{document} 18 | 19 | \begin{neuralnetwork}[height=1,maintitleheight=1cm,layertitleheight=2.5cm, layerspacing=50] 20 | \newcommand{\textinput}[2]{\ifnum #2=4 $x_n$ \else $x_#2$ \fi} 21 | \newcommand{\textactivation}[2]{$g(.)$} 22 | \newcommand{\texthidden}[2]{\ifnum #2=4 $z_{1,k}$\else $z_{1,#2}$ \fi} 23 | \newcommand{\textout}[2]{\ifnum #2=2 $z_{2,j}$ \else $z_{2,#2}$ \fi} 24 | 25 | \newcommand{\w}[4] {\ifnum #2=4 \ifnum #4=2 $w_{k,j}^{(2)}$ \else $w_{k,#4}^{(2)}$ \fi \else \ifnum #4=2 $w_{#2,j}^{(2)}$ \else $w_{#2,#4}^{(2)}$ \fi \fi} 26 | \newcommand{\ww}[4] {\ifnum #4=4 $w_{n,k}^{(1)}$ \else $w_{n,#4}^{(1)}$ \fi} 27 | 28 | \newcommand{\ws}[4] {} 29 | 30 | \inputlayer[count=4, title={}, bias=True, exclude={3}, text=\textinput] 31 | \hiddenlayer[count=4, title={}, bias=True, text=\texthidden] 32 | \linklayers[not from={3}] 33 | \setdefaultlinklabel{\ww} 34 | \foreach \n in {4}{ 35 | \foreach \m in {1,2,3,4}{ 36 | \link[style={}, labelpos=midway, from layer=0, from node=\n, to layer=1, to node=\m] 37 | } 38 | } 39 | \setdefaultlinklabel{\ws} 40 | \outputlayer[count=2, title={}, bias=False, text=\textout] 41 | \linklayers 42 | \setdefaultlinklabel{\w} 43 | \foreach \o in {0,4}{ 44 | \foreach \p in {1,2}{ 45 | \link[style={}, labelpos=midway, from layer=1, from node=\o, to layer=2, to node=\p] 46 | } 47 | } 48 | 49 | 50 | % draw dots 51 | \path (L0-2) -- node{$\vdots$} (L0-4); 52 | \draw [->,>=stealth,gray] (L2-1) -- node [black,midway,above] {$\hat{y}_1=g(z_{2,1})$} +(2,0); 53 | \draw [->,>=stealth,gray] (L2-2) -- node [black,midway,above] {$\hat{y}_2=g(z_{2,j})$} +(2,0); 54 | \end{neuralnetwork} 55 | 56 | \end{document} 57 | -------------------------------------------------------------------------------- /lessons/week_10/Figures/Perceptron/latexmkrc: -------------------------------------------------------------------------------- 1 | END { system('convert -density 300 output.pdf myImage.png'); } -------------------------------------------------------------------------------- /lessons/week_10/Week 10 Introduction to Deep Learning and TensorFlow.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_10/Week 10 Introduction to Deep Learning and TensorFlow.pdf -------------------------------------------------------------------------------- /lessons/week_12/Better Generalization I.ipynb: -------------------------------------------------------------------------------- 1 | {"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Lesson #03_04 Task #01 
Better Generalization I.ipynb","provenance":[],"collapsed_sections":["-CE5LY7lu0WS","MMxX_wG4lbGU","uB3_YIlEx1Do","tkf7WAAbx1EI","q2cF42K2FPD3"]},"kernelspec":{"name":"python3","display_name":"Python 3"},"accelerator":"GPU"},"cells":[{"cell_type":"markdown","metadata":{"id":"-CE5LY7lu0WS"},"source":["# 1 - Introduction"]},{"cell_type":"markdown","metadata":{"id":"C-EGMZ_68Hk9"},"source":["There is not a lot of code required, but we are going to step over it slowly so that you will know how to create your own models in the future. The steps you are going to cover in this practical assignment are as follows:\n","\n","1. Load Data\n","2. Define Model\n","3. Compile Model\n","4. Fit Model\n","5. Evaluate Model\n","6. Tie It All Together\n","7. Make Predictions"]},{"cell_type":"markdown","metadata":{"id":"Otpv6Hr1svIK"},"source":["## Import packages"]},{"cell_type":"code","metadata":{"id":"VJ5o8smyChPk"},"source":["!pip install mlxtend==0.17.3"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Nf8JEmDoXgMq"},"source":["# Load the TensorBoard notebook extension\n","%load_ext tensorboard"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"85HpkSMAhJWa"},"source":["# Clear any logs from previous runs\n","!rm -rf logs"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"88qrGJpfsx9X"},"source":["import tensorflow as tf\n","import numpy as np\n","import matplotlib.pyplot as plt\n","import sklearn\n","import sklearn.datasets\n","import scipy.io\n","import time\n","from time import gmtime, strftime\n","import datetime\n","import os\n","import pytz\n","from mlxtend.plotting import plot_decision_regions\n","\n","%matplotlib inline\n","plt.rcParams['figure.figsize'] = (7.0, 4.0) # set default size of plots\n","plt.rcParams['image.interpolation'] = 'nearest'\n","plt.rcParams['image.cmap'] = 'gray'"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"TyO9yST1Fl_M"},"source":["tf.__version__"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"MMxX_wG4lbGU"},"source":["# 2 - Overfit multilayer perceptron"]},{"cell_type":"code","metadata":{"id":"FsEw115Yl9Gq"},"source":["class MyCustomCallback(tf.keras.callbacks.Callback):\n","\n"," def on_train_begin(self, batch, logs=None):\n"," self.begins = time.time()\n"," print('Training: begins at {}'.format(datetime.datetime.now(pytz.timezone('America/Fortaleza')).strftime(\"%a, %d %b %Y %H:%M:%S\")))\n","\n"," def on_train_end(self, logs=None):\n"," print('Training: ends at {}'.format(datetime.datetime.now(pytz.timezone('America/Fortaleza')).strftime(\"%a, %d %b %Y %H:%M:%S\")))\n"," print('Duration: {:.2f} seconds'.format(time.time() - self.begins)) "],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"-plq4rJxlZam"},"source":["# overfit mlp for the moons dataset\n","from sklearn.datasets import make_moons\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.layers import Dense\n","import matplotlib.pyplot as plt\n","\n","# generate 2d classification dataset\n","x, y = make_moons(n_samples=100, noise=0.2, random_state=1)\n","\n","# split into train and test sets\n","n_train = 30\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model = Sequential()\n","model.add(Dense(500, input_dim=2, activation='relu'))\n","model.add(Dense(1, activation='sigmoid'))\n","model.compile(loss='binary_crossentropy', optimizer='adam', 
metrics=['accuracy'])\n","\n","# callbacks tensorboard\n","logdir = os.path.join(\"logs\", datetime.datetime.now().strftime(\"%Y%m%d-%H%M%S\"))\n","tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=100)\n","\n","# fit model\n","history = model.fit(train_x, train_y,\n"," validation_data=(test_x, test_y), \n"," epochs=4000, verbose=0,batch_size=32,\n"," callbacks=[MyCustomCallback(),tensorboard_callback])\n","\n","# evaluate the model\n","_, train_acc = model.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model.evaluate(test_x, test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history.history['loss'], label='train')\n","plt.plot(history.history['val_loss'], label='test')\n","plt.legend()\n","\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history.history['accuracy'], label='train')\n","plt.plot(history.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"h2vPSeVpm8ua"},"source":["from mlxtend.plotting import plot_decision_regions\n","# Plot decision boundary\n","plot_decision_regions(test_x,test_y.squeeze(), clf=model,zoom_factor=2.0)\n","plt.title(\"Model without regularization\")\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Nto21ovNn4bk"},"source":["# Start TensorBoard within the notebook using magics\n","%tensorboard --logdir logs"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"uB3_YIlEx1Do"},"source":["# 3 - L2 Regularization\n"]},{"cell_type":"markdown","metadata":{"id":"4l02mx3Q8251"},"source":["\n","The standard way to avoid overfitting is called **L2 regularization**. 
It consists of appropriately modifying your cost function, from:\n","$$J = -\\frac{1}{m} \\sum\\limits_{i = 1}^{m} \\left( \\small y^{(i)}\\log\\left(\\hat{y}^{(i)}\\right) + (1-y^{(i)})\\log\\left(1- \\hat{y}^{(i)}\\right) \\right) \\tag{1}$$\n","\n","\n","To:\n","$$J_{regularized} = \\small \\underbrace{-\\frac{1}{m} \\sum\\limits_{i = 1}^{m} \\large{(}\\small y^{(i)}\\log\\left(\\hat{y}^{(i)}\\right) + (1-y^{(i)})\\log\\left(1- \\hat{y}^{(i)}\\right) \\large{)} }_\\text{cross-entropy cost} + \\underbrace{\\frac{1}{m} \\frac{\\lambda}{2} \\sum\\limits_l\\sum\\limits_k\\sum\\limits_j W_{k,j}^{[l]2} }_\\text{L2 regularization cost} \\tag{2}$$\n"]},{"cell_type":"code","metadata":{"id":"qfKzscuQqGvK"},"source":["# mlp with weight regularization for the moons dataset\n","from sklearn.datasets import make_moons\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.regularizers import l2\n","import matplotlib.pyplot as plt\n","\n","# generate 2d classification dataset\n","x, y = make_moons(n_samples=100, noise=0.2, random_state=1)\n","\n","# split into train and test sets\n","n_train = 30\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model_l2 = Sequential()\n","model_l2.add(Dense(500, input_dim=2, activation='relu',\n"," kernel_regularizer=l2(0.001)))\n","model_l2.add(Dense(1, activation='sigmoid'))\n","model_l2.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])\n","\n","# callbacks tensorboard\n","logdir = os.path.join(\"logs\", datetime.datetime.now().strftime(\"%Y%m%d-%H%M%S\"))\n","tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=100)\n","\n","# fit model\n","history_l2 = model_l2.fit(train_x, train_y, \n"," validation_data=(test_x, test_y),\n"," epochs=4000, verbose=0,\n"," callbacks=[MyCustomCallback(),tensorboard_callback])\n","\n","# evaluate the model\n","_, train_acc = model_l2.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model_l2.evaluate(test_x, test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history_l2.history['loss'], label='train')\n","plt.plot(history_l2.history['val_loss'], label='test')\n","plt.legend()\n","\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history_l2.history['accuracy'], label='train')\n","plt.plot(history_l2.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"8k4vCne3wxaR"},"source":["from mlxtend.plotting import plot_decision_regions\n","# Plot decision boundary\n","plot_decision_regions(test_x,test_y.squeeze(), clf=model_l2,zoom_factor=2.0)\n","plt.title(\"Model with regularization\")\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"ICM4TUHNbYyO"},"source":["# Start TensorBoard within the notebook using magics\n","%tensorboard --logdir logs"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"tkf7WAAbx1EI"},"source":["# 4 - Dropout\n"]},{"cell_type":"markdown","metadata":{"id":"RlCshhJ9-mnl"},"source":["\n","Finally, **dropout** is a widely used regularization technique that is specific to deep learning. 
\n","**It randomly shuts down some neurons in each iteration.** Watch these two animations to see what this means!\n","\n"," \n","\n","
\n","
\n","
Figure 1: Drop-out on the second hidden layer.
At each iteration, you shut down (= set to zero) each neuron of a layer with probability $1 - keep\\_prob$ or keep it with probability $keep\\_prob$ (50% here). The dropped neurons don't contribute to the training in both the forward and backward propagations of the iteration.
\n","\n","
\n","
Figure 2: Drop-out on the first and third hidden layers.
$1^{st}$ layer: we shut down on average 40% of the neurons. $3^{rd}$ layer: we shut down on average 20% of the neurons.
\n","\n","\n","When you shut some neurons down, you actually modify your model. The idea behind drop-out is that at each iteration, you train a different model that uses only a subset of your neurons. With dropout, your neurons thus become less sensitive to the activation of one other specific neuron, because that other neuron might be shut down at any time. \n"]},{"cell_type":"code","metadata":{"id":"FpbWii871hzS"},"source":["# mlp with weight regularization for the moons dataset\n","from sklearn.datasets import make_moons\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.layers import Dropout\n","from tensorflow.keras.regularizers import l2\n","import matplotlib.pyplot as plt\n","\n","# generate 2d classification dataset\n","x, y = make_moons(n_samples=100, noise=0.2, random_state=1)\n","\n","# split into train and test sets\n","n_train = 30\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model_dropout = Sequential()\n","model_dropout.add(Dense(500, input_dim=2, activation='relu'))\n","model_dropout.add(Dropout(0.4))\n","model_dropout.add(Dense(1, activation='sigmoid'))\n","model_dropout.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])\n","\n","# callbacks tensorboard\n","logdir = os.path.join(\"logs\", datetime.datetime.now().strftime(\"%Y%m%d-%H%M%S\"))\n","tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=100)\n","\n","# fit model\n","history_dropout = model_dropout.fit(train_x, train_y, \n"," validation_data=(test_x, test_y),\n"," epochs=4000, verbose=0,\n"," callbacks=[MyCustomCallback(),tensorboard_callback])\n","\n","# evaluate the model\n","_, train_acc = model_dropout.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model_dropout.evaluate(test_x, test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history_dropout.history['loss'], label='train')\n","plt.plot(history_dropout.history['val_loss'], label='test')\n","plt.legend()\n","\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history_dropout.history['accuracy'], label='train')\n","plt.plot(history_dropout.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Z-nFSIaGqU-a"},"source":["from mlxtend.plotting import plot_decision_regions\n","# Plot decision boundary\n","plot_decision_regions(test_x,test_y.squeeze(), clf=model_dropout,zoom_factor=2.0)\n","plt.title(\"Model with dropout\")\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"5eoZoKAa2rnJ"},"source":["# Start TensorBoard within the notebook using magics\n","%tensorboard --logdir logs"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"q2cF42K2FPD3"},"source":["# 5 - L2 vs Dropout"]},{"cell_type":"code","metadata":{"id":"53fGLuKeMFgG"},"source":["def print_analysis(titles,history,loss=True):\n"," if loss:\n"," func = \"loss\"\n"," func_val = \"val_loss\"\n"," else:\n"," func = \"binary_accuracy\"\n"," func_val = \"val_binary_accuracy\"\n","\n"," f, axs = plt.subplots(1,len(titles),figsize=(12,6))\n"," \n"," for i, title in enumerate(titles):\n"," axs[i].set_title(title)\n"," 
axs[i].plot(history[i].history[func])\n"," axs[i].plot(history[i].history[func_val])\n"," axs[i].set_ylabel(func)\n"," axs[i].set_xlabel('epoch')\n"," axs[i].legend(['train', 'test'], loc='best')\n"," \n"," plt.tight_layout()\n"," plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"i1Q68J32Nrcj"},"source":["titles = ['Model without regularization','Model with regularization L2','Model with dropout']\n","hist = [history,history_l2,history_dropout]\n","print_analysis(titles,hist,loss=True)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"WyCRtKAIPgIY"},"source":["def print_regions(titles,models):\n","\n"," f, axs = plt.subplots(1,len(titles),figsize=(12,4))\n"," \n"," for i, title in enumerate(titles):\n"," plot_decision_regions(test_x,test_y.squeeze(), clf=models[i],zoom_factor=2.0,ax=axs[i])\n"," axs[i].set_title(title)\n"," plt.tight_layout()\n"," plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"668zLQ3BQFNA"},"source":["models = [model,model_l2,model_dropout]\n","print_regions(titles,models)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"3ZnGlZK43WBi"},"source":["# 6 - Force Small Weights with Weight Constraints"]},{"cell_type":"code","metadata":{"id":"loAB8_wRURLy"},"source":["# mlp overfit on the moons dataset with a unit norm constraint\n","from sklearn.datasets import make_moons\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.constraints import unit_norm\n","import matplotlib.pyplot as plt\n","import os\n","\n","# generate 2d classification dataset\n","x, y = make_moons(n_samples=100, noise=0.2, random_state=1)\n","\n","# split into train and test\n","n_train = 30\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model = Sequential()\n","model.add(Dense(500, input_dim=2, activation='relu', kernel_constraint=unit_norm()))\n","#kernel_constraint=tf.keras.constraints.min_max_norm(min_value=-0.2, max_value=1.0)))\n","model.add(Dense(1, activation='sigmoid'))\n","model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])\n","\n","# callbacks tensorboard\n","logdir = os.path.join(\"logs\", datetime.datetime.now().strftime(\"%Y%m%d-%H%M%S\"))\n","tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=100)\n","\n","# fit model\n","history = model.fit(train_x, train_y,\n"," validation_data=(test_x, test_y),\n"," epochs=4000, verbose=0,\n"," callbacks=[MyCustomCallback(),tensorboard_callback])\n","\n","# evaluate the model\n","_, train_acc = model.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model.evaluate(test_x, test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history.history['loss'], label='train')\n","plt.plot(history.history['val_loss'], label='test')\n","plt.legend()\n","\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history.history['accuracy'], label='train')\n","plt.plot(history.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"BHChR2tCUrpv"},"source":["from mlxtend.plotting import plot_decision_regions\n","# Plot decision 
boundary\n","plot_decision_regions(test_x,test_y.squeeze(), clf=model,zoom_factor=2.0)\n","plt.title(\"Model with weights constraints\")\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"PWus4c-GaYYH"},"source":["# Start TensorBoard within the notebook using magics\n","%tensorboard --logdir logs"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"FtdmGRF3iLPB"},"source":["filter = tf.keras.constraints.UnitNorm()\n","data = np.arange(3).reshape(3, 1).astype(np.float32)\n","print(data)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"2l86cAzCib0f"},"source":["filter(data)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"iaLEf9vqiiGQ"},"source":["np.linalg.norm(filter(data))"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"ngq7qmwWlvxm"},"source":["np.linalg.norm(data)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"ymjSWu26l5pC"},"source":["data/np.linalg.norm(data)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"I7_8PhubebIR"},"source":["filter = tf.keras.constraints.UnitNorm()\n","data = np.arange(6).reshape(3, 2).astype(np.float32)\n","data"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"-l_ODs3RevKr"},"source":["filter(data)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"eCFtK1cqo8wI"},"source":["np.linalg.norm(filter(data),axis=0)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Ha4YL7P1e2Kk"},"source":["np.linalg.norm(data,axis=0)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"fTtiuqSBfsBC"},"source":["data/np.linalg.norm(data,axis=0)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"O10e_wAiaxnv"},"source":["# mlp overfit on the moons dataset with a unit norm constraint\n","from sklearn.datasets import make_moons\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.constraints import unit_norm\n","import matplotlib.pyplot as plt\n","import os\n","\n","# generate 2d classification dataset\n","x, y = make_moons(n_samples=100, noise=0.2, random_state=1)\n","\n","# split into train and test\n","n_train = 30\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model = Sequential()\n","model.add(Dense(500, input_dim=2, activation='relu', \n"," kernel_constraint=tf.keras.constraints.min_max_norm(min_value=-0.2, max_value=1.0)))\n","model.add(Dense(1, activation='sigmoid'))\n","model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])\n","\n","# callbacks tensorboard\n","logdir = os.path.join(\"logs\", datetime.datetime.now().strftime(\"%Y%m%d-%H%M%S\"))\n","tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=100)\n","\n","\n","# fit model\n","history = model.fit(train_x, train_y,\n"," validation_data=(test_x, test_y),\n"," epochs=4000, verbose=0,\n"," callbacks=[MyCustomCallback(),tensorboard_callback])\n","\n","# evaluate the model\n","_, train_acc = model.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model.evaluate(test_x, test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history.history['loss'], 
label='train')\n","plt.plot(history.history['val_loss'], label='test')\n","plt.legend()\n","\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history.history['accuracy'], label='train')\n","plt.plot(history.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Hpt_bysOqgoP"},"source":["from mlxtend.plotting import plot_decision_regions\n","# Plot decision boundary\n","plot_decision_regions(test_x,test_y.squeeze(), clf=model,zoom_factor=2.0)\n","plt.title(\"Model with weights constraints\")\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"gYjUEqSarISm"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"XEVcChn-mXAd"},"source":[""],"execution_count":null,"outputs":[]}]} -------------------------------------------------------------------------------- /lessons/week_12/Better Generalizaton vs Better Learning.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_12/Better Generalizaton vs Better Learning.pdf -------------------------------------------------------------------------------- /lessons/week_14/Hyperparameter Tuning and Batch Normalization.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ivanovitchm/ppgeecmachinelearning/ec32e114013d044419593d4da7b9647439024501/lessons/week_14/Hyperparameter Tuning and Batch Normalization.pdf -------------------------------------------------------------------------------- /lessons/week_14/Task #01 Hyperparameter Tuning using Keras Tuner.ipynb: -------------------------------------------------------------------------------- 1 | {"nbformat":4,"nbformat_minor":0,"metadata":{"accelerator":"GPU","colab":{"name":"Week #06 Task #01 Hyperparameter Tuning using Keras Tuner.ipynb","provenance":[],"collapsed_sections":[]},"kernelspec":{"display_name":"Python 3","name":"python3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"7DLkQmFwhbq4"},"source":["# A brief recap about DL Pipeline"]},{"cell_type":"markdown","metadata":{"id":"VH2FWhOPNzZO"},"source":["- Define the task\n"," - Frame the problem\n"," - Collect a dataset\n"," - Understand your data\n"," - Choose a measure of success\n","- Develop a model\n"," - Prepare the data\n"," - Choose an evaluation protocol\n"," - Beat a baseline\n"," - Scale up: develop a model that overfits\n"," - Regularize and tune your model\n","- Deploy your model\n"," - Explain your work to stakeholders and set expectations\n"," - Ship an inference model\n"," - Deploying a model as a rest API\n"," - Deploying a model on device\n"," - Deploying a model in the browser\n"," - Monitor your model in the wild\n"," - Maintain your model\n"]},{"cell_type":"markdown","metadata":{"id":"cTShha8tLAvY"},"source":["# 1.0 Baseline Model"]},{"cell_type":"markdown","metadata":{"id":"RD1hMKLZEObd"},"source":["## 1.1 Import Libraries"]},{"cell_type":"markdown","metadata":{"id":"4MR6rLenKnry"},"source":["Install and import the Keras Tuner."]},{"cell_type":"code","metadata":{"id":"rEYDnz5LKra8"},"source":["# pip install -q (quiet)\n","!pip install git+https://github.com/keras-team/keras-tuner.git 
-q"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"jtGJ6_Pi2yG9"},"source":["import tensorflow as tf\n","import numpy as np\n","import matplotlib.pyplot as plt\n","import h5py\n","import time\n","import datetime\n","import pytz\n","import IPython\n","import keras_tuner as kt"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"ccfIzxFaHeOb"},"source":["print('TF version:', tf.__version__)\n","print('KT version:', kt.__version__)\n","print('GPU devices:', tf.config.list_physical_devices('GPU'))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"0WGGeZQGESGP"},"source":["## 1.2 Utils Functions"]},{"cell_type":"code","metadata":{"id":"hw9z0s-QqSCd"},"source":["# download train_catvnoncat.h5\n","!gdown https://drive.google.com/uc?id=1ZPWKlEATuDjFtZJPgHCc5SURrcKaVP9Z\n","\n","# download test_catvnoncat.h5\n","!gdown https://drive.google.com/uc?id=1ndRNAwidOqEgqDHBurA0PGyXqHBlvzz-"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"wJYkR70dEGHh"},"source":["def load_dataset():\n"," # load the train data\n"," train_dataset = h5py.File('train_catvnoncat.h5', \"r\")\n","\n"," # your train set features\n"," train_set_x_orig = np.array(train_dataset[\"train_set_x\"][:]) \n","\n"," # your train set labels\n"," train_set_y_orig = np.array(train_dataset[\"train_set_y\"][:]) \n","\n"," # load the test data\n"," test_dataset = h5py.File('test_catvnoncat.h5', \"r\")\n","\n"," # your test set features\n"," test_set_x_orig = np.array(test_dataset[\"test_set_x\"][:]) \n","\n"," # your test set labels \n"," test_set_y_orig = np.array(test_dataset[\"test_set_y\"][:]) \n","\n"," # the list of classes\n"," classes = np.array(test_dataset[\"list_classes\"][:]) \n","\n"," # reshape the test data\n"," train_set_y_orig = train_set_y_orig.reshape((train_set_y_orig.shape[0],1))\n"," test_set_y_orig = test_set_y_orig.reshape((test_set_y_orig.shape[0],1))\n","\n"," return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"exIeT2zMHhxa"},"source":["## 1.3 Load Dataset"]},{"cell_type":"code","metadata":{"id":"mjQYsrdPHSyh"},"source":["# Loading the data (cat/non-cat)\n","train_set_x_orig, train_y, test_set_x_orig, test_y, classes = load_dataset()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"6ihL83MbHlhc"},"source":["# Reshape the training and test examples\n","train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0],-1)\n","test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0],-1)\n","\n","# Standardize the dataset\n","train_x = train_set_x_flatten/255\n","test_x = test_set_x_flatten/255"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"sQn-Brt-IhK_"},"source":["print (\"train_x shape: \" + str(train_x.shape))\n","print (\"train_y shape: \" + str(train_y.shape))\n","print (\"test_x shape: \" + str(test_x.shape))\n","print (\"test_y shape: \" + str(test_y.shape))"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Xa9dcSRx1__x"},"source":["# visualize a sample data\n","index = 13\n","plt.imshow(train_set_x_orig[index])"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"5zLrb3ORIlo_"},"source":["## 1.4 Model"]},{"cell_type":"code","metadata":{"id":"zJutkmXnIpG6"},"source":["class MyCustomCallback(tf.keras.callbacks.Callback):\n","\n"," def on_train_begin(self, batch, 
logs=None):\n"," self.begins = time.time()\n"," print('Training: begins at {}'.format(datetime.datetime.now(pytz.timezone('America/Fortaleza')).strftime(\"%a, %d %b %Y %H:%M:%S\")))\n","\n"," def on_train_end(self, logs=None):\n"," print('Training: ends at {}'.format(datetime.datetime.now(pytz.timezone('America/Fortaleza')).strftime(\"%a, %d %b %Y %H:%M:%S\")))\n"," print('Duration: {:.2f} seconds'.format(time.time() - self.begins)) "],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"SyfcUdH36lGG"},"source":["# Instantiate a simple classification model\n","model = tf.keras.Sequential([\n"," tf.keras.layers.Dense(8, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(8, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(1, activation=tf.nn.sigmoid, dtype='float64')\n","])\n","\n","# Instantiate a logistic loss function that expects integer targets.\n","loss = tf.keras.losses.BinaryCrossentropy()\n","\n","# Instantiate an accuracy metric.\n","accuracy = tf.keras.metrics.BinaryAccuracy()\n","\n","# Instantiate an optimizer.\n","optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)\n","\n","# configure the optimizer, loss, and metrics to monitor.\n","model.compile(optimizer=optimizer, loss=loss, metrics=[accuracy])\n","\n","# training \n","history = model.fit(x=train_x,\n"," y=train_y,\n"," batch_size=32,\n"," epochs=500,\n"," validation_data=(test_x,test_y),\n"," callbacks=[MyCustomCallback()],\n"," verbose=1)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"fugflUT5JtCe"},"source":["loss, acc = model.evaluate(x=train_x,y=train_y, batch_size=32)\n","print('Train loss: %.4f - acc: %.4f' % (loss, acc))\n","\n","loss_, acc_ = model.evaluate(x=test_x,y=test_y, batch_size=32)\n","print('Test loss: %.4f - acc: %.4f' % (loss_, acc_))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"q6CsWRrqJzFQ"},"source":["# 2.0 Hyperparameter Tuning using Keras-Tuner"]},{"cell_type":"markdown","metadata":{"id":"yff1zTsrLJ4J"},"source":["The [Keras Tuner](https://github.com/keras-team/keras-tuner) is a library that helps you pick the optimal set of hyperparameters for your TensorFlow program. The process of selecting the right set of hyperparameters for your machine learning (ML) application is called **hyperparameter tuning** or **hypertuning**. \n","\n","Hyperparameters are the variables that govern the training process and the topology of an ML model. These variables remain constant over the training process and directly impact the performance of your ML program. Hyperparameters are of two types:\n","1. **Model hyperparameters** which influence model selection such as the number and width of hidden layers\n","2. **Algorithm hyperparameters** which influence the speed and quality of the learning algorithm such as the learning rate for Stochastic Gradient Descent (SGD) and the number of nearest neighbors for a k Nearest Neighbors (KNN) classifier, among others.\n"]},{"cell_type":"markdown","metadata":{"id":"K5YEL2H2Ax3e"},"source":["## 2.1 Define the model\n"]},{"cell_type":"markdown","metadata":{"id":"md7FhKoYcuj7"},"source":["\n","When you build a model for hypertuning, you also define the hyperparameter search space in addition to the model architecture. 
The model you set up for hypertuning is called a **hypermodel**.\n","\n","You can define a hypermodel through two approaches:\n","\n","* By using a model builder function\n","* By subclassing the `HyperModel` class of the Keras Tuner API\n","\n","You can also use two pre-defined `HyperModel` classes - [HyperXception](https://keras-team.github.io/keras-tuner/documentation/hypermodels/#hyperxception-class) and [HyperResNet](https://keras-team.github.io/keras-tuner/documentation/hypermodels/#hyperresnet-class) for computer vision applications.\n","\n","In this section, you use a model builder function to define the image classification model. The model builder function returns a compiled model and uses hyperparameters you define inline to hypertune the model."]},{"cell_type":"code","metadata":{"id":"M3Iosz7dctGf"},"source":["def model_builder(hp):\n"," # Instantiate a simple classification model\n"," model = tf.keras.Sequential()\n","\n"," # Tune the number of units in the first Dense layer\n"," # Choose an optimal value between 8-32\n"," hp_units = hp.Int('units', min_value = 8, max_value = 32, step = 8)\n"," model.add(tf.keras.layers.Dense(hp_units, activation=tf.nn.relu, dtype='float64'))\n"," model.add(tf.keras.layers.Dense(8, activation=tf.nn.relu, dtype='float64'))\n"," model.add(tf.keras.layers.Dense(1, activation=tf.nn.sigmoid, dtype='float64'))\n","\n"," # Instantiate a logistic loss function that expects integer targets.\n"," loss = tf.keras.losses.BinaryCrossentropy()\n","\n"," # Instantiate an accuracy metric.\n"," accuracy = tf.keras.metrics.BinaryAccuracy()\n","\n"," # Instantiate an optimizer.\n"," optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)\n","\n"," # configure the optimizer, loss, and metrics to monitor.\n"," model.compile(optimizer=optimizer, loss=loss, metrics=[accuracy])\n","\n"," return model"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"L8bYqKpDeBlB"},"source":["## 2.2 Instantiate the tuner and perform hypertuning\n"]},{"cell_type":"markdown","metadata":{"id":"pYX1lhvDeDoz"},"source":["\n","Instantiate the tuner to perform the hypertuning. The Keras Tuner has [four tuners available](https://keras-team.github.io/keras-tuner/documentation/tuners/) - `RandomSearch`, `Hyperband`, `BayesianOptimization`, and `Sklearn`. \n","\n","Notice that in previous subsection we're not fitting there, and we're returning the compiled model. Let's continue to build out the rest of our program first, then we'll make things more dynamic. Adding the dynamic bits will all happen in the **model_builder** function, but we will need some other code that will use this function now. To start, we're going to import **RandomSearch** and after that we'll first define our tuner."]},{"cell_type":"code","metadata":{"id":"BOzyY-KLhQd5"},"source":["from keras_tuner.tuners import RandomSearch"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"V4JXxAYAe67h"},"source":["# path to store results\n","LOG_DIR = f\"{int(time.time())}\""],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Ikjt5YbnepK5"},"source":["tuner = RandomSearch(model_builder,\n"," objective='val_binary_accuracy',\n"," max_trials=4, # how many model configurations would you like to test?\n"," executions_per_trial=1, # how many trials per variation? 
(same model could perform differently)\n"," directory=LOG_DIR,\n"," project_name=\"my_first_tuner\"\n"," )"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"d8LgEGSAhewb"},"source":["- Your objective here probably should be **validation accuracy**, but you can choose from other things like **val_loss** for example.\n","- **max_trials** allows you limit how many tests will be run. If you put 10 here, you will get 10 different tests (provided you've specified enough variability for 10 different combinations, anyway).\n","- **executions_per_trial** might be 1, but you might also do many more like 3,5, or even 10.\n","\n","Basically, if you're just hunting for a model that works, then you should just do 1 trial per variation. If you're attempting to seek out 1-3% on **validation accuracy**, then you should run 3+ trials most likely per model, because each time a model runs, you should see some variation in final values. So this will just depend on what kind of a search you're doing (just trying to find something that works vs fine tuning...or anything in between)."]},{"cell_type":"markdown","metadata":{"id":"fJwb5NZgiWXt"},"source":["Run the hyperparameter search. The arguments for the search method are the same as those used for [`tf.keras.model.fit`](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit).\n","\n","Before running the hyperparameter search, define a callback to clear the training outputs at the end of every training step."]},{"cell_type":"code","metadata":{"id":"nK6XNozXlJK7"},"source":["class ClearTrainingOutput(tf.keras.callbacks.Callback):\n"," def on_train_end(*args, **kwargs):\n"," IPython.display.clear_output(wait = True)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"PyNf7zRakAcr"},"source":["tuner.search(train_x,\n"," train_y, \n"," epochs = 500, \n"," verbose=1,\n"," batch_size=32,\n"," validation_data = (test_x, test_y),\n"," callbacks = [ClearTrainingOutput()])"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Th3JZnM5kX_k"},"source":["# print a summary of results\n","tuner.results_summary(num_trials=10)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"ICBtBvvSm7AE"},"source":["# best hyperparameters is a dictionary\n","tuner.get_best_hyperparameters()[0].values"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"_O9cyI_kK0US"},"source":["# search space summary\n","tuner.search_space_summary()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"xfJ3aiJ5p9iv"},"source":["print(f\"\"\"The hyperparameter search is complete. The optimal number of units in the first densely-connected\n","layer is {tuner.get_best_hyperparameters()[0].values.get('units')}\"\"\")"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"uvyvW9oZwkgJ"},"source":["## 2.3 Playing with search space"]},{"cell_type":"markdown","metadata":{"id":"-cIpO1GkxTgZ"},"source":["We also can play with the search space in order to contain conditional hyperparameters. Below, we have a **for loop** creating a **tunable number of layers**, which themselves involve a tunable **units** parameter. This can be pushed to any level of parameter interdependency, including recursion. 
Note that all parameter names should be unique (here, in the loop over **i**, we name the inner parameters **'units_'** + **str(i)**)."]},{"cell_type":"code","metadata":{"id":"5ZPJ9lSLyEfg"},"source":["def model_builder_all(hp):\n"," # Instantiate a simple classification model\n"," model = tf.keras.Sequential()\n"," \n"," # Create a tunable number of layers 1,2,3,4\n"," for i in range(hp.Int('num_layers', 1, 4)):\n","\n"," # Tune the number of units in the Dense layer\n"," # Choose an optimal value between 8-32\n"," model.add(tf.keras.layers.Dense(units=hp.Int('units_' + str(i),\n"," min_value = 8,\n"," max_value = 32,\n"," step = 8),\n"," # Tune the activation functions\n"," activation= hp.Choice('dense_activation_' + str(i),\n"," values=['relu', 'tanh'],\n"," default='relu'),\n"," dtype='float64'))\n","\n"," model.add(tf.keras.layers.Dense(1, activation=tf.nn.sigmoid, dtype='float64'))\n","\n"," # Instantiate a logistic loss function that expects integer targets.\n"," loss = tf.keras.losses.BinaryCrossentropy()\n","\n"," # Instantiate an accuracy metric.\n"," accuracy = tf.keras.metrics.BinaryAccuracy()\n","\n"," optimizer = hp.Choice('optimizer', ['adam', 'SGD'])\n"," if optimizer == 'adam':\n"," opt = tf.keras.optimizers.Adam(learning_rate=hp.Float('lrate_adam',\n"," min_value=1e-4,\n"," max_value=1e-2, \n"," sampling='LOG'))\n"," else:\n"," opt = tf.keras.optimizers.SGD(learning_rate=hp.Float('lrate_sgd',\n"," min_value=1e-4,\n"," max_value=1e-2, \n"," sampling='LOG'))\n","\n"," # configure the optimizer, loss, and metrics to monitor.\n"," model.compile(optimizer=opt, loss=loss, metrics=[accuracy])\n","\n"," return model"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"3GuC7Drk2ZLC"},"source":["# path to store results\n","LOG_DIR = f\"{int(time.time())}\""],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Xc7BSy102ZLI"},"source":["tuner_ = RandomSearch(model_builder_all,\n"," objective='val_binary_accuracy',\n"," max_trials=20, # how many model configurations would you like to test?\n"," executions_per_trial=1, # how many trials per variation? 
(same model could perform differently)\n"," directory=LOG_DIR,\n"," project_name=\"my_first_tuner\"\n"," )"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"p39oHKAl2aXK"},"source":["tuner_.search(train_x,\n"," train_y, \n"," epochs = 500,\n"," # verbose = 0 (silent) \n"," verbose=0,\n"," batch_size=32,\n"," validation_data = (test_x, test_y),\n"," callbacks = [ClearTrainingOutput()])"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"1CNxpcdg3phN"},"source":["tuner_.results_summary()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"IPU-lmv_3AHD"},"source":["tuner_.get_best_hyperparameters()[0].values"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"3TUz0_MtWf9e"},"source":["tuner_.search_space_summary()"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"NmAuZCk8gAbP"},"source":["## 2.4 Retrain the model with the optimal hyperparameters"]},{"cell_type":"code","metadata":{"id":"pZi_vqSUcX5p"},"source":["# Build the model with the optimal hyperparameters and train it on the data\n","best_hps = tuner_.get_best_hyperparameters()[0]\n","model = tuner_.hypermodel.build(best_hps)\n","model.fit(train_x, train_y, epochs = 500, validation_data = (test_x, test_y),batch_size=32)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"8simDMO93_Cb"},"source":["loss_, acc_ = model.evaluate(x=test_x,y=test_y, batch_size=32)\n","print('Test loss: %.3f - acc: %.3f' % (loss_, acc_))"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"TyDo-xOti6_4"},"source":["model.summary()"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"m25iq5vkgpg0"},"source":["Exercise\n","\n","Hyperparameter tuning is a time-consuming task. The previous result was not so good. You can try to improve it the tuning considering:\n","- Other [Tuners](https://keras-team.github.io/keras-tuner/documentation/tuners/): BayesianOptimization, Hyperband\n","- Evaluate **max_trials** ranges over 100 or more?\n","- **executions_per_trial** values in [2,3]?\n","- How about you write an article on Medium about Keras Tuner?"]},{"cell_type":"code","metadata":{"id":"4BOmmZWxiInr"},"source":["# PUT YOUR CODE HERE"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"5QL9wGBN363h"},"source":["# 3.0 References"]},{"cell_type":"markdown","metadata":{"id":"2Hfqjn0sXQVh"},"source":["1. https://www.kaggle.com/fchollet/moa-keras-kerastuner-best-practices/\n","2. https://www.kaggle.com/fchollet/titanic-keras-kerastuner-best-practices\n","3. https://www.kaggle.com/fchollet/keras-kerastuner-best-practices\n","4. https://pythonprogramming.net/keras-tuner-optimizing-neural-network-tutorial/\n","5. https://github.com/keras-team/keras-tuner\n","6. 
https://machinelearningmastery.com/autokeras-for-classification-and-regression/"]}]} -------------------------------------------------------------------------------- /lessons/week_14/Task #02 Hyperparameter Tuning using Weights and Biases.ipynb: -------------------------------------------------------------------------------- 1 | {"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Week #06 Task #02 Hyperparameter Tuning using Weights and Biases.ipynb","provenance":[],"collapsed_sections":[],"authorship_tag":"ABX9TyNepLrtPIfGc4cSKw0HDKrn"},"kernelspec":{"name":"python3","display_name":"Python 3"},"accelerator":"GPU"},"cells":[{"cell_type":"markdown","metadata":{"id":"RIf7eFgwyZ06"},"source":["# 1 Import libraries"]},{"cell_type":"code","metadata":{"id":"L4OeKOISEobo"},"source":["import tensorflow as tf\n","import numpy as np\n","import matplotlib.pyplot as plt\n","import h5py\n","import time\n","import datetime\n","import pytz\n","import IPython"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"aRdkrAFXxYcb"},"source":["print('TF version:', tf.__version__)\n","print('GPU devices:', tf.config.list_physical_devices('GPU'))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"MSK_tRj1yipL"},"source":["# 2 Data load and preprocessing"]},{"cell_type":"code","metadata":{"id":"z7Rhm9scwVx1"},"source":["# download train_catvnoncat.h5\n","!gdown https://drive.google.com/uc?id=1ZPWKlEATuDjFtZJPgHCc5SURrcKaVP9Z\n","\n","# download test_catvnoncat.h5\n","!gdown https://drive.google.com/uc?id=1ndRNAwidOqEgqDHBurA0PGyXqHBlvzz-"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"evhmAkbKxYy_"},"source":["def load_dataset():\n"," # load the train data\n"," train_dataset = h5py.File('train_catvnoncat.h5', \"r\")\n","\n"," # your train set features\n"," train_set_x_orig = np.array(train_dataset[\"train_set_x\"][:]) \n","\n"," # your train set labels\n"," train_set_y_orig = np.array(train_dataset[\"train_set_y\"][:]) \n","\n"," # load the test data\n"," test_dataset = h5py.File('test_catvnoncat.h5', \"r\")\n","\n"," # your test set features\n"," test_set_x_orig = np.array(test_dataset[\"test_set_x\"][:]) \n","\n"," # your test set labels \n"," test_set_y_orig = np.array(test_dataset[\"test_set_y\"][:]) \n","\n"," # the list of classes\n"," classes = np.array(test_dataset[\"list_classes\"][:]) \n","\n"," # reshape the test data\n"," train_set_y_orig = train_set_y_orig.reshape((train_set_y_orig.shape[0],1))\n"," test_set_y_orig = test_set_y_orig.reshape((test_set_y_orig.shape[0],1))\n","\n"," return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"mjQYsrdPHSyh"},"source":["# Loading the data (cat/non-cat)\n","train_set_x_orig, train_y, test_set_x_orig, test_y, classes = load_dataset()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"6ihL83MbHlhc"},"source":["# Reshape the training and test examples\n","train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0],-1)\n","test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0],-1)\n","\n","# Standardize the dataset\n","train_x = train_set_x_flatten/255\n","test_x = test_set_x_flatten/255"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"sQn-Brt-IhK_"},"source":["print (\"train_x shape: \" + str(train_x.shape))\n","print (\"train_y shape: \" + str(train_y.shape))\n","print (\"test_x shape: \" 
+ str(test_x.shape))\n","print (\"test_y shape: \" + str(test_y.shape))"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"ATILeNhddrcC"},"source":["# visualize a sample modified data\n","index = 13\n","plt.imshow(train_x[index].reshape(64,64,3))"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Xa9dcSRx1__x"},"source":["# visualize a sample raw data\n","index = 13\n","plt.imshow(train_set_x_orig[index])"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"DvACru73ysxG"},"source":["class MyCustomCallback(tf.keras.callbacks.Callback):\n","\n"," def on_train_begin(self, batch, logs=None):\n"," self.begins = time.time()\n"," print('Training: begins at {}'.format(datetime.datetime.now(pytz.timezone('America/Fortaleza')).strftime(\"%a, %d %b %Y %H:%M:%S\")))\n","\n"," def on_train_end(self, logs=None):\n"," print('Training: ends at {}'.format(datetime.datetime.now(pytz.timezone('America/Fortaleza')).strftime(\"%a, %d %b %Y %H:%M:%S\")))\n"," print('Duration: {:.2f} seconds'.format(time.time() - self.begins)) "],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"iyHu7kEeyrh-"},"source":["# 3 Base Model"]},{"cell_type":"code","metadata":{"id":"HPkKk-ZayvfU"},"source":["# Instantiate a simple classification model\n","model = tf.keras.Sequential([\n"," tf.keras.layers.Dense(8, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(8, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(1, activation=tf.nn.sigmoid, dtype='float64')\n","])\n","\n","# Instantiate a logistic loss function that expects integer targets.\n","loss = tf.keras.losses.BinaryCrossentropy()\n","\n","# Instantiate an accuracy metric.\n","accuracy = tf.keras.metrics.BinaryAccuracy()\n","\n","# Instantiate an optimizer.\n","optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)\n","\n","# configure the optimizer, loss, and metrics to monitor.\n","model.compile(optimizer=optimizer, loss=loss, metrics=[accuracy])\n","\n","# training \n","history = model.fit(x=train_x,\n"," y=train_y,\n"," batch_size=32,\n"," epochs=500,\n"," validation_data=(test_x,test_y),\n"," callbacks=[MyCustomCallback()],\n"," verbose=1)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"nLdVTFjGyyDe"},"source":["loss, acc = model.evaluate(x=train_x,y=train_y, batch_size=32)\n","print('Train loss: %.4f - acc: %.4f' % (loss, acc))\n","\n","loss_, acc_ = model.evaluate(x=test_x,y=test_y, batch_size=32)\n","print('Test loss: %.4f - acc: %.4f' % (loss_, acc_))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"RooPTo2ezBBh"},"source":["# 4 Hyperparameter Tuning "]},{"cell_type":"code","metadata":{"id":"ESSHH5_UzQ3o"},"source":["%%capture\n","!pip install wandb==0.10.17"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"Y_Tjh1Sbz1tJ"},"source":["!wandb login --relogin"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"kKxNLU83GV4q"},"source":["## 4.1 Monitoring a neural network"]},{"cell_type":"code","metadata":{"id":"Hi7lXapOz50x"},"source":["import wandb\n","from wandb.keras import WandbCallback\n","from tensorflow.keras.callbacks import EarlyStopping\n","\n","# Default values for hyperparameters\n","defaults = dict(layer_1 = 8,\n"," layer_2 = 8,\n"," learn_rate = 0.001,\n"," batch_size = 32,\n"," epoch = 500)\n","\n","wandb.init(project=\"week06\", config= defaults, name=\"week06_run_01\")\n","config = 
wandb.config"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"-bKBiY1q1QdJ"},"source":["# Instantiate a simple classification model\n","model = tf.keras.Sequential([\n"," tf.keras.layers.Dense(config.layer_1, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(config.layer_2, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(1, activation=tf.nn.sigmoid, dtype='float64')\n","])\n","\n","# Instantiate a logistic loss function that expects integer targets.\n","loss = tf.keras.losses.BinaryCrossentropy()\n","\n","# Instantiate an accuracy metric.\n","accuracy = tf.keras.metrics.BinaryAccuracy()\n","\n","# Instantiate an optimizer.\n","optimizer = tf.keras.optimizers.SGD(learning_rate=config.learn_rate)\n","\n","# configure the optimizer, loss, and metrics to monitor.\n","model.compile(optimizer=optimizer, loss=loss, metrics=[accuracy])"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"TuzJG3XG2jjw"},"source":["%%wandb\n","# Add WandbCallback() to the fit function\n","model.fit(x=train_x,\n"," y=train_y,\n"," batch_size=config.batch_size,\n"," epochs=config.epoch,\n"," validation_data=(test_x,test_y),\n"," callbacks=[WandbCallback(log_weights=True)],\n"," verbose=1)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"vh562csI3WFO"},"source":["## 4.2 Sweeps"]},{"cell_type":"code","metadata":{"id":"mo_cE96gG8Tq"},"source":[" # The sweep calls this function with each set of hyperparameters\n","def train():\n"," # Default values for hyper-parameters we're going to sweep over\n"," defaults = dict(layer_1 = 8,\n"," layer_2 = 8,\n"," learn_rate = 0.001,\n"," batch_size = 32,\n"," epoch = 500)\n"," \n"," # Initialize a new wandb run\n"," wandb.init(project=\"week06\", config= defaults)\n","\n"," # Config is a variable that holds and saves hyperparameters and inputs\n"," config = wandb.config\n"," \n"," # Instantiate a simple classification model\n"," model = tf.keras.Sequential([\n"," tf.keras.layers.Dense(config.layer_1, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(config.layer_2, activation=tf.nn.relu, dtype='float64'),\n"," tf.keras.layers.Dense(1, activation=tf.nn.sigmoid, dtype='float64')\n"," ])\n","\n"," # Instantiate a logistic loss function that expects integer targets.\n"," loss = tf.keras.losses.BinaryCrossentropy()\n","\n"," # Instantiate an accuracy metric.\n"," accuracy = tf.keras.metrics.BinaryAccuracy()\n","\n"," # Instantiate an optimizer.\n"," optimizer = tf.keras.optimizers.SGD(learning_rate=config.learn_rate)\n","\n"," # configure the optimizer, loss, and metrics to monitor.\n"," model.compile(optimizer=optimizer, loss=loss, metrics=[accuracy]) \n","\n"," model.fit(train_x, train_y, batch_size=config.batch_size,\n"," epochs=config.epoch,\n"," validation_data=(test_x, test_y),\n"," callbacks=[WandbCallback(),\n"," EarlyStopping(patience=100)]\n"," ) "],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"S6whtzZ1eior"},"source":["# See the source code in order to see other parameters\n","# https://github.com/wandb/client/tree/master/wandb/sweeps"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"1Ibov1wrLgS9"},"source":["# Configure the sweep – specify the parameters to search through, the search strategy, the optimization metric et all.\n","sweep_config = {\n"," 'method': 'random', #grid, random\n"," 'metric': {\n"," 'name': 'binary_accuracy',\n"," 'goal': 'maximize' \n"," },\n"," 'parameters': 
{\n"," 'layer_1': {\n"," 'max': 32,\n"," 'min': 8,\n"," 'distribution': 'int_uniform',\n"," },\n"," 'layer_2': {\n"," 'max': 32,\n"," 'min': 8,\n"," 'distribution': 'int_uniform',\n"," },\n"," 'learn_rate': {\n"," 'min': -4,\n"," 'max': -2,\n"," 'distribution': 'log_uniform', \n"," },\n"," 'epoch': {\n"," 'values': [300,400,600]\n"," },\n"," 'batch_size': {\n"," 'values': [32,64]\n"," }\n"," }\n","}"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"1rCRIA2HMG1Y"},"source":["# Initialize a new sweep\n","# Arguments:\n","# – sweep_config: the sweep config dictionary defined above\n","# – entity: Set the username for the sweep\n","# – project: Set the project name for the sweep\n","sweep_id = wandb.sweep(sweep_config, entity=\"ivanovitchm\", project=\"week06\")"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"1V0Cobv4MahM"},"source":["# Initialize a new sweep\n","# Arguments:\n","# – sweep_id: the sweep_id to run - this was returned above by wandb.sweep()\n","# – function: function that defines your model architecture and trains it\n","wandb.agent(sweep_id = sweep_id, function=train,count=20)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"P_dpK2SHjIlu"},"source":["### 4.2.1 Restore a model\n","\n","Restore a file, such as a model checkpoint, into your local run folder to access in your script.\n","\n","See [the restore docs](https://docs.wandb.com/library/restore) for more details."]},{"cell_type":"code","metadata":{"id":"-ikTvdN61mm0"},"source":["%%capture\n","!pip install wandb==0.10.17"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"V8dR38tFEpby"},"source":["!pip install wandb"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"xi-Fv1nfAU6q","colab":{"base_uri":"https://localhost:8080/","height":35},"executionInfo":{"status":"ok","timestamp":1628246847147,"user_tz":180,"elapsed":799,"user":{"displayName":"Ivanovitch Silva","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Git9r91cROvzBPiAlvwQtPMEFxLz44uDidMPM-PrQ=s64","userId":"06428777505436195303"}},"outputId":"cff675dc-f568-495d-d032-92d48ea5d47a"},"source":[" import wandb\n"," wandb.__version__"],"execution_count":2,"outputs":[{"output_type":"execute_result","data":{"application/vnd.google.colaboratory.intrinsic+json":{"type":"string"},"text/plain":["'0.11.2'"]},"metadata":{"tags":[]},"execution_count":2}]},{"cell_type":"code","metadata":{"id":"5zhf0Gix1nYD","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1628246856152,"user_tz":180,"elapsed":5630,"user":{"displayName":"Ivanovitch Silva","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Git9r91cROvzBPiAlvwQtPMEFxLz44uDidMPM-PrQ=s64","userId":"06428777505436195303"}},"outputId":"7e4ab300-d488-4807-a689-15b628270563"},"source":["!wandb login"],"execution_count":3,"outputs":[{"output_type":"stream","text":["\u001b[34m\u001b[1mwandb\u001b[0m: You can find your API key in your browser here: https://wandb.ai/authorize\n","\u001b[34m\u001b[1mwandb\u001b[0m: Paste an API key from your profile and hit enter: \n","\u001b[34m\u001b[1mwandb\u001b[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"0LB6j3O-jIsd"},"source":["# restore the raw model file \"model-best.h5\" from a specific run by user \"ivanovitchm\"\n","# in project \"lesson04\" from run \"sqdv5ccj\"\n","best_model = wandb.restore('model-best.h5', 
run_path=\"ivanovitchm/week06/cbwfq70j\")"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"wo_JI5RPHzKu"},"source":["# restore the model for tf.keras\n","model = tf.keras.models.load_model(best_model.name)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"aeM9gLcrDiz7"},"source":["# execute the loss and accuracy using the test dataset\n","loss_, acc_ = model.evaluate(x=test_x,y=test_y, batch_size=64)\n","print('Test loss: %.3f - acc: %.3f' % (loss_, acc_))"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"a55_JCKuR4kJ"},"source":["# source: https://github.com/wandb/awesome-dl-projects/blob/master/ml-tutorial/EMNIST_Dense_Classification.ipynb\n","import seaborn as sns\n","from sklearn.metrics import confusion_matrix\n","\n","predictions = np.greater_equal(model.predict(test_x),0.5).astype(int)\n","cm = confusion_matrix(y_true = test_y, y_pred = predictions)\n","\n","plt.figure(figsize=(6,6));\n","sns.heatmap(cm, annot=True)\n","plt.savefig('confusion_matrix.png', bbox_inches='tight')\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"wTu0f6DiR7oW"},"source":["wandb.init(project=\"week06\")\n","wandb.log({\"image_confusion_matrix\": [wandb.Image('confusion_matrix.png')]})"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"XM5q_a_zqMY0"},"source":["# visualize the images and instances with error\n","# ground-truth\n","print(\"Ground-truth\\n\",test_y[~np.equal(predictions,test_y)])\n","\n","# predictions\n","print(\"Predictions\\n\",predictions[~np.equal(predictions,test_y)])"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"9iJtJ-wWvvl5"},"source":["# Images predicted as non-cat\n","fig, ax = plt.subplots(2,6,figsize=(10,6))\n","wrong_images = (~np.equal(predictions,test_y)).astype(int)\n","index = np.where(wrong_images == 1)[0]\n","\n","for i,value in enumerate(index):\n"," ax[i//6,i%6].imshow(test_x[value].reshape(64,64,3))\n","plt.savefig('wrong_predictions.png', bbox_inches='tight')"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"LnywcEUHxvL_"},"source":["wandb.log({\"wrong_predictions\": [wandb.Image('wrong_predictions.png')]})"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jhBR6ePBvHy7"},"source":["# 5 References"]},{"cell_type":"markdown","metadata":{"id":"Hb3mCmDDvJjw"},"source":["1. https://github.com/wandb/awesome-dl-projects\n","2. https://docs.wandb.ai/app/features/panels/parameter-importance\n","3. 
https://wandb.ai/wandb/DistHyperOpt/reports/Modern-Scalable-Hyperparameter-Tuning-Methods--VmlldzoyMTQxODM"]}]} -------------------------------------------------------------------------------- /lessons/week_14/Task #03 Batch Normalization.ipynb: -------------------------------------------------------------------------------- 1 | {"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Week #06 Track #03 Batch Normalization.ipynb","provenance":[],"collapsed_sections":[],"toc_visible":true,"authorship_tag":"ABX9TyNGK9TN93fpHnMO0nkGhzBz"},"kernelspec":{"name":"python3","display_name":"Python 3"},"accelerator":"GPU"},"cells":[{"cell_type":"markdown","metadata":{"id":"r6LB4RyNr0pn"},"source":["# 1 - Accelerate Learning with Batch Normalization"]},{"cell_type":"markdown","metadata":{"id":"MwJEaq7SsJYQ"},"source":["**Training deep neural networks** with tens of layers is challenging because such networks can be **sensitive to the initial random weights** and to the configuration of the learning algorithm. \n","\n","One possible reason for this difficulty is: \n","\n","> the distribution of the inputs to layers deep in the network may change after\neach minibatch when the weights are updated. \n","\n","This can cause the learning algorithm to chase a moving target forever. This change in the distribution of inputs to layers in the network is referred to by the technical name **internal covariate shift**. \n","\n","**Batch normalization** is a technique for training very deep neural networks that standardizes the inputs to a layer for each minibatch. This has the effect of stabilizing the learning process and dramatically reducing the number of training epochs required to train deep networks. This section describes the batch normalization method used to accelerate the training of deep neural networks. After\nreading this section, you will know:\n","\n","- Deep neural networks are challenging to train, not least because the input from prior layers can change after weight updates.\n","\n","- Batch normalization is a technique to standardize the inputs to a network, applied to either the activations of a prior layer or inputs directly.\n","\n","- Batch normalization accelerates training, in some cases by halving the number of epochs (or better), and provides some regularization effect, reducing generalization error."]},{"cell_type":"markdown","metadata":{"id":"OH9mCk8fvRAH"},"source":["## 1.1 Batch Normalization"]},{"cell_type":"markdown","metadata":{"id":"m0DvdGy5EH1d"},"source":["Training deep neural networks, e.g., networks with tens of hidden layers, is challenging. One aspect of this challenge is that the model is updated layer-by-layer backward from the output to the input using an **estimate of error that assumes the weights in the layers prior to the current\nlayer are fixed**.\n","\n","> Very deep models involve the composition of several functions or layers. The gradient tells how to update each parameter, under the assumption that the other layers do not change. In practice, we update all of the layers simultaneously.\n","\n","**Because all layers are changed during an update**, the update procedure is forever chasing a moving target. For example, the weights of a layer are updated given an expectation that the prior layer outputs values with a given distribution. 
This distribution is likely changed after the\n","weights of the prior layer are updated.\n","\n","\n","> Training Deep Neural Networks is complicated by the fact that **the distribution of each layer's inputs changes during training as the parameters of the previous layers changes**. This slows down the training by **requiring lower learning rates** and **careful parameter initialization**, making it notoriously hard to train models with saturating nonlinearities."]},{"cell_type":"markdown","metadata":{"id":"YUR6FVtmE0My"},"source":["## 1.2 Standardize Layer Inputs"]},{"cell_type":"markdown","metadata":{"id":"6pCPWZmKF0yV"},"source":["Batch normalization, or **batch norm** for short, is [proposed as a technique](https://arxiv.org/pdf/1502.03167.pdf) to help coordinate the update of multiple layers in the model.\n","\n","> Batch normalization provides an elegant way of reparametrizing almost any deep network. The reparametrization significantly **reduces the problem of coordinating updates across many layers**.\n","\n","It does this by scaling the layer's output, specifically by **standardizing the activations of each input variable per minibatch**, such as the activations of a node from the previous layer. Recall that standardization refers to rescaling data to have a **mean of zero** and a **standard deviation of one**, e.g., a standard Gaussian.\n","\n","Standardizing the activations of the prior layer means that assumptions the subsequent layer **makes about the spread and distribution of inputs during the weight update will not change**, at least not dramatically. This has the effect of stabilizing and speeding-up the training process of deep neural networks.\n","\n","> Batch normalization acts to standardize only the mean and variance of each unit in order to stabilize learning but allows the relationships between units and the nonlinear statistics of a single unit to change.\n","\n","Normalizing the inputs to the layer affects the model's training, dramatically reducing the number of epochs required. **It can also have a regularizing effect**, reducing generalization error much like the use of activation regularization.\n","\n","Although **reducing internal covariate shift** was a motivation in the development of the method,\n","there is some suggestion that instead batch normalization is effective because it smooths and, in\n","turn, **simplifies the optimization function that is being solved when training the network**.\n","\n","> According to a [recent paper](https://arxiv.org/pdf/1805.11604.pdf), BatchNorm impacts network training fundamentally: **it makes the landscape of the corresponding optimization problem be significantly more smooth**. This ensures, in particular, that the gradients are more predictive and thus allow for the use of a more extensive range of learning rates and faster network convergence."]},{"cell_type":"markdown","metadata":{"id":"I-o0htyLGIZT"},"source":["## 1.3 How to Standardize Layer Inputs"]},{"cell_type":"markdown","metadata":{"id":"KwnLPVyc9mLL"},"source":["Batch normalization can be **implemented during training by calculating each input variable's mean and standard deviation to a layer per minibatch** and using these statistics to perform the standardization. 
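As a quick illustration (a minimal sketch, not part of the original notebook, using hypothetical names), the per-minibatch standardization followed by the learned scale and shift can be written as:\n","\n","```python\n","import numpy as np\n","\n","def batch_norm_train_step(xb, gamma, beta, eps=1e-5):\n","    # xb: minibatch of activations with shape (batch, features)\n","    mu = xb.mean(axis=0)                     # per-feature minibatch mean\n","    var = xb.var(axis=0)                     # per-feature minibatch variance\n","    x_hat = (xb - mu) / np.sqrt(var + eps)   # standardize to zero mean, unit variance\n","    return gamma * x_hat + beta              # learned scale (Gamma) and shift (Beta)\n","```\n","\n","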
Alternately, a running average of mean and standard deviation can be\n","maintained across mini-batches but may result in unstable training.\n","\n","This standardization of inputs may be applied to input variables for the first hidden layer or the activations from a hidden layer for deeper layers. In practice, it is common to allow the layer to learn two new parameters, namely a new mean and standard deviation, **Beta** and\n","**Gamma** respectively, that allow the automatic scaling and shifting of the standardized layer inputs. The model learns these parameters as part of the training process.\n","\n","> Note that simply normalizing each input of a layer may change what the layer can represent. **These parameters are learned along with the original model parameters and restore the network's representation power**.\n","\n","Significantly the backpropagation algorithm is updated to operate upon the transformed inputs, and error is also used to update the new scale and shifting parameters learned by the model. The standardization is applied to the inputs to the layer, namely the input variables or the output of the activation function from the last layer. Given the choice of activation function, the input distribution to the layer may be pretty non-Gaussian. In this case, there may be a benefit in standardizing the summed activation before the activation function in the previous layer.\n","\n","\n","> **We add the BN transform immediately before the nonlinearity**. We could have also normalized the layer inputs *u*, but since *u* is likely the output of another nonlinearity, the shape of its distribution is likely to change during training, and constraining its first and second moments would not eliminate the covariate shift."]},{"cell_type":"markdown","metadata":{"id":"s_hUcvfK_1Fc"},"source":["## 1.4 Tips for Using Batch Normalization"]},{"cell_type":"markdown","metadata":{"id":"7_bQaaBhLA_0"},"source":["This section provides tips and suggestions for using batch normalization with your own neural networks.\n","\n","**Use With Different Network Types**\n","\n","> Batch normalization is a general technique that can be used to normalize the inputs to a layer. It can be used with most network types, such as **Multilayer Perceptrons**, **Convolutional Neural Networks**, and **Recurrent Neural Networks**.\n","\n","\n","**Probably Use Before the Activation**\n","\n","> Batch normalization may be used on the inputs to the layer before or after the activation function in the previous layer. It may be more **appropriate after the activation function for s-shaped functions** like the hyperbolic tangent and logistic function. It may be appropriate **before the activation function** for activations that may result in non-Gaussian distributions like\n","the **rectified linear activation function**, the modern default for most network types.\n","\n","The goal of Batch Normalization is to achieve a stable distribution of activation values throughout training. In experiments conducted in the [original paper]((https://arxiv.org/pdf/1502.03167.pdf)), authors applied it before the nonlinearity since matching the first and second moments is more likely to result in a stable distribution.\n","\n","**Use Large Learning Rates**\n","\n","> Using batch normalization makes the network more stable during training. 
This may allow the use of much larger than normal learning rates, which may further speed up the learning process.\n","\n","**Less Sensitive to Weight Initialization**\n","\n","> Deep neural networks can be quite sensitive to the technique used to initialize the weights before training. The stability that batch normalization brings to training can make deep networks less sensitive to the choice of weight initialization method.\n","\n","**Do not Use With Dropout**\n","\n","> Batch normalization offers some regularization effect, reducing generalization error, perhaps no longer requiring dropout for regularization.\n","\n","Further, it may not be a good idea to use batch normalization and dropout in the same network. The reason is that the statistics used to normalize the prior layer's activations may become noisy given the random dropping out of nodes during the dropout procedure."]},{"cell_type":"markdown","metadata":{"id":"WPSCaslMNf-1"},"source":["## 1.5 Batch Normalization Case Study"]},{"cell_type":"code","metadata":{"id":"Hwvh-fPgTd7Q"},"source":["# scatter plot of the circles dataset with points colored by class\n","from sklearn.datasets import make_circles\n","import numpy as np\n","import matplotlib.pyplot as plt\n","\n","# generate circles\n","x, y = make_circles(n_samples=1000, noise=0.1, random_state=1)\n","\n","# select indices of points with each class label\n","for i in range(2):\n","\tsamples_ix = np.where(y == i)\n","\tplt.scatter(x[samples_ix, 0], x[samples_ix, 1], label=str(i))\n","plt.legend()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"D0ZGE8hHUfBd"},"source":["### 1.5.1 Multilayer Perceptron Model"]},{"cell_type":"code","metadata":{"id":"HPNBZYQGzjeq"},"source":["%%capture\n","!pip install wandb==0.10.17"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"mhJxFQyFzmea"},"source":["!wandb login"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"zfmevaH6UNns"},"source":["# mlp for the two circles problem\n","from sklearn.datasets import make_circles\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.optimizers import SGD\n","import matplotlib.pyplot as plt\n","import wandb\n","from wandb.keras import WandbCallback"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"xrS0KFjJgQxl"},"source":["# Default values for hyperparameters\n","defaults = dict(layer_1 = 50,\n"," learn_rate = 0.01,\n"," batch_size = 32,\n"," epoch = 100)\n","\n","wandb.init(project=\"week06_bn\", \n"," config= defaults, \n"," name=\"week06_bn_run_01\")\n","config = wandb.config"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"QWWarzqNgPDf"},"source":["%%wandb\n","\n","# generate 2d classification dataset\n","x, y = make_circles(n_samples=1000, noise=0.1, random_state=1)\n","\n","# split into train and test\n","n_train = 500\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model = Sequential()\n","model.add(Dense(config.layer_1, input_dim=2, \n"," activation='relu', \n"," kernel_initializer='he_uniform'))\n","model.add(Dense(1, activation='sigmoid'))\n","opt = SGD(learning_rate=config.learn_rate, momentum=0.9)\n","model.compile(loss='binary_crossentropy', \n"," optimizer=opt, metrics=['accuracy'])\n","\n","# fit model\n","history = model.fit(train_x, train_y, \n"," validation_data=(test_x, 
test_y), \n"," epochs=config.epoch, verbose=0, \n"," batch_size=config.batch_size,\n"," callbacks=[WandbCallback(log_weights=True,\n"," log_gradients=True,\n"," training_data=(train_x,train_y))])\n","\n","# for more elaborate results please see the project in wandb"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"BM68FYWygrql"},"source":["# evaluate the model\n","_, train_acc = model.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model.evaluate(test_x, test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history.history['loss'], label='train')\n","plt.plot(history.history['val_loss'], label='test')\n","plt.legend()\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history.history['accuracy'], label='train')\n","plt.plot(history.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"574ahaQXcmfs"},"source":["model.summary()"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Pgf9llT9VF2P"},"source":["### 1.5.2 Multilayer Perceptron with Batch Normalization"]},{"cell_type":"markdown","metadata":{"id":"bpQNrJNZWipY"},"source":["The model introduced in the previous section can be updated to add batch normalization. The expectation is that batch normalization would accelerate the training process, offering similar or better classification accuracy in fewer training epochs. Batch normalization is also reported as providing a subtle form of regularization, meaning that it may also offer a slight reduction in generalization error demonstrated by a small increase in classification accuracy on the holdout test dataset. A new BatchNormalization layer can be added to the model after the hidden layer before the output layer. 
Specifically, after the activation function of the last hidden layer."]},{"cell_type":"code","metadata":{"id":"tuKBhad8W6yv"},"source":["# mlp for the two circles problem with batchnorm after activation function\n","from sklearn.datasets import make_circles\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.layers import BatchNormalization\n","from tensorflow.keras.optimizers import SGD\n","import matplotlib.pyplot as plt\n","import wandb\n","from wandb.keras import WandbCallback"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"4PMvo2EmhZhH"},"source":["# Default values for hyperparameters\n","defaults = dict(layer_1 = 50,\n"," learn_rate = 0.01,\n"," batch_size = 32,\n"," epoch = 100)\n","\n","wandb.init(project=\"week06_bn\", config= defaults, name=\"week06_bn_run_02\")\n","config = wandb.config"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"NHw-g-THhQWv"},"source":["%%wandb\n","\n","# mlp for the two circles problem with batchnorm after activation function\n","from sklearn.datasets import make_circles\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.layers import BatchNormalization\n","from tensorflow.keras.optimizers import SGD\n","import matplotlib.pyplot as plt\n","\n","# generate 2d classification dataset\n","x, y = make_circles(n_samples=1000, noise=0.1, random_state=1)\n","\n","# split into train and test\n","n_train = 500\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model = Sequential()\n","model.add(Dense(config.layer_1, \n"," input_dim=2, \n"," activation='relu', kernel_initializer='he_uniform'))\n","model.add(BatchNormalization())\n","model.add(Dense(1, activation='sigmoid'))\n","opt = SGD(learning_rate=config.learn_rate, momentum=0.9)\n","model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])\n","\n","# fit model\n","history = model.fit(train_x, train_y,\n"," validation_data=(test_x, test_y), \n"," epochs=config.epoch, verbose=0,\n"," batch_size=config.batch_size,\n"," callbacks=[WandbCallback(log_weights=True,\n"," log_gradients=True,\n"," training_data=(train_x,train_y))])"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"btAlfnKkhoqM"},"source":["# evaluate the model\n","_, train_acc = model.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model.evaluate(test_x, test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history.history['loss'], label='train')\n","plt.plot(history.history['val_loss'], label='test')\n","plt.legend()\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history.history['accuracy'], label='train')\n","plt.plot(history.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"rPOC2Z-uc6tU"},"source":["# tensorflow.kera use non-trainable params with batch normalization\n","# in order to maintain auxiliary variables used in inference\n","model.summary()"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"oksQ49r9XsZT"},"source":["In this case, we can see the model's comparable 
performance on both the train and test sets of about 84% accuracy, very similar to what we saw in the previous section, if not a little bit better.\n","\n","A graph of the learning curves is also created, showing classification accuracy on each training epoch's train and test sets. In this case, we can see that the model has learned the problem faster than the model in the previous section without batch normalization. Specifically,\n","**we can see that classification accuracy on the train and test datasets leaps above 80% within the first 20 epochs instead of 30-to-40 epochs in the model without batch normalization**. The plot also shows the effect of batch normalization during training. We can see lower performance\n","on the training dataset than on the test dataset at the end of the training run. This is likely the effect of the statistics being collected and updated over each minibatch.\n"]},{"cell_type":"markdown","metadata":{"id":"eDa2D7dCY6f8"},"source":["We can also try a variation of the model where batch normalization is applied prior to the activation function of the hidden layer, instead of after the activation function."]},{"cell_type":"code","metadata":{"id":"Hcawb4h2ZJy7"},"source":["# mlp for the two circles problem with batchnorm before activation function\n","from sklearn.datasets import make_circles\n","from tensorflow.keras.models import Sequential\n","from tensorflow.keras.layers import Dense\n","from tensorflow.keras.layers import Activation\n","from tensorflow.keras.layers import BatchNormalization\n","from tensorflow.keras.optimizers import SGD\n","import matplotlib.pyplot as plt"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"edqyvj9cipVV"},"source":["# Default values for hyperparameters\n","defaults = dict(layer_1 = 50,\n"," learn_rate = 0.01,\n"," batch_size = 32,\n"," epoch = 100)\n","\n","wandb.init(project=\"week06_bn\", config= defaults, name=\"week06_bn_run_03\")\n","config = wandb.config"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"MUtckY_8iJzq"},"source":["%%wandb\n","# generate 2d classification dataset\n","x, y = make_circles(n_samples=1000, noise=0.1, random_state=1)\n","\n","# split into train and test\n","n_train = 500\n","train_x, test_x = x[:n_train, :], x[n_train:, :]\n","train_y, test_y = y[:n_train], y[n_train:]\n","\n","# define model\n","model = Sequential()\n","model.add(Dense(config.layer_1, input_dim=2, kernel_initializer='he_uniform'))\n","model.add(BatchNormalization())\n","model.add(Activation('relu'))\n","model.add(Dense(1, activation='sigmoid'))\n","opt = SGD(learning_rate=config.learn_rate, momentum=0.9)\n","model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])\n","\n","# fit model\n","history = model.fit(train_x, train_y, \n"," validation_data=(test_x, test_y), \n"," epochs=config.epoch, verbose=0,\n"," callbacks=[WandbCallback(log_weights=True,\n"," log_gradients=True,\n"," training_data=(train_x,train_y))])"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"kIaCk10iTnVX"},"source":["# tensorflow.keras uses non-trainable params with batch normalization\n","# in order to maintain auxiliary variables used in inference\n","model.summary()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"rFTRwvw9iXXs"},"source":["# evaluate the model\n","_, train_acc = model.evaluate(train_x, train_y, verbose=0)\n","_, test_acc = model.evaluate(test_x, 
test_y, verbose=0)\n","print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))\n","\n","# plot loss learning curves\n","plt.subplot(211)\n","plt.title('Cross-Entropy Loss', pad=-40)\n","plt.plot(history.history['loss'], label='train')\n","plt.plot(history.history['val_loss'], label='test')\n","plt.legend()\n","# plot accuracy learning curves\n","plt.subplot(212)\n","plt.title('Accuracy', pad=-40)\n","plt.plot(history.history['accuracy'], label='train')\n","plt.plot(history.history['val_accuracy'], label='test')\n","plt.legend()\n","plt.tight_layout()\n","plt.show()"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"eP4yiqS9Zqe0"},"source":["In this case, we can see the model's comparable performance on the train and test datasets, but slightly worse than the model without batch normalization.\n","\n","The line plot of the learning curves on the train and test sets also tells a different story. The plot shows the model learning perhaps at the same pace as the model without batch normalization, but the model's performance on the training dataset is much worse, hovering around 70% to 75% accuracy, again likely an effect of the statistics collected and used over each minibatch. At least for this model configuration on this specific dataset, it appears that batch normalization is more effective after the rectified linear activation function."]},{"cell_type":"markdown","metadata":{"id":"H8arWJv1ajh7"},"source":["### 1.5.3 Extensions"]},{"cell_type":"markdown","metadata":{"id":"5ABzuop7atwz"},"source":["This section lists some ideas for extending the case study that you may wish to explore.\n","\n","- **Without Beta and Gamma**: update the example to not use the beta and gamma parameters in the batch normalization layer and compare results.\n","- **Without Momentum**: update the example not to use momentum in the batch normalization layer during training and compare results.\n","- **Input Layer**: update the example to use batch normalization after the input to the model and compare results."]}]} --------------------------------------------------------------------------------