├── .gitignore
└── README.md


/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

## :wave: Welcome to ML Software Developer 101!

Welcome to the beginning of your journey to becoming an ML Engineer (MLE)! :tada: Follow these steps to get your development environment teed up! After you've finished this setup, feel free to go through the associated `Whodunit?`! 🕵️‍♀️


## :books: Quick Review
We will be using some terminal commands, so let's make sure you know what they are and what they do!

| Command | Stands For | Description |
| ----------- | ----------- | -------------|
| `ls` | list | lists files and directories in the present working directory |
| `ls -a` | list all | lists hidden files as well |
| `cd {dirname}` | change directory | changes to the given directory |
| `cd ~` | change directory home | navigates to the HOME directory |
| `cd ..` | change directory up | moves one level up |
| `cat {filename}` | concatenate | displays the file content |
| `sudo` | superuser do | allows regular users to run programs with the security privileges of the superuser or root |
| `mv {filename} {newfilename}` | move | moves the file, or renames it to the new filename |
| `clear` | clear | clears the terminal screen |
| `mkdir {dirname}` | make directory | creates a new directory in the present working directory or at the specified path |
| `rm {filename}` | remove | removes the file with the given filename |
| `touch {filename}.{ext}` | touch | creates a new empty file |
| `rmdir {dirname}` | remove directory | deletes an empty directory |
| `ssh {username}@{ip-address or hostname}` | secure shell | logs in to a remote Linux machine using SSH |
| `CTRL + SHIFT + C` | copy | keyboard shortcut for copying from the terminal |
| `CTRL + SHIFT + V` | paste | keyboard shortcut for pasting into the terminal |
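Since the rest of the course is in Python, it may help to see a few of the commands above mirrored with the standard library's `pathlib`. This is only a rough sketch: the directory and file names (`demo`, `notes.txt`, `todo.txt`, `.hidden`) are made up for illustration, and everything happens inside a throwaway temporary directory.

```python
import tempfile
from pathlib import Path
from typing import List


def tour(root: Path) -> List[str]:
    """Mirror a few shell commands from the table inside `root`."""
    project = root / "demo"                       # hypothetical directory name
    project.mkdir()                               # mkdir demo
    notes = project / "notes.txt"
    notes.touch()                                 # touch notes.txt
    hidden = project / ".hidden"
    hidden.touch()                                # touch .hidden
    renamed = notes.rename(project / "todo.txt")  # mv notes.txt todo.txt
    # ls -a (without the `.` and `..` entries)
    listing = sorted(p.name for p in project.iterdir())
    renamed.unlink()                              # rm todo.txt
    hidden.unlink()                               # rm .hidden
    project.rmdir()                               # rmdir demo (must be empty)
    return listing


with tempfile.TemporaryDirectory() as tmp:
    listing = tour(Path(tmp))
    print(listing)
```

Note that `rmdir` (and `Path.rmdir`) refuses to delete a directory that still contains files, which is why the sketch removes both files first.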

## :hammer_and_wrench: Tools We'll Be Using
We will also be using a few tools such as `git`, `conda`, and `pip`.
### Git

Git is a free and open source distributed version control system designed to handle everything from small to very large projects. These are the commands we will be using with `git`:

`git clone` -> clone a remote repository to your local computer

`git add` -> stage files for the next commit

`git commit -m {message}` -> commit the staged changes with a message

`git push` -> push commits to the remote repository
### Conda & Pip

Conda is an open-source, cross-platform, language-agnostic package manager and environment management system. `pip` is Python's package management system. We will use `pip` within `conda` environments to manage our package installations. `conda` comes with Anaconda, which is a convenient way to set up your Python programming environment: it includes an environment management tool (`conda`) plus extra packages that are commonly used in data science and ML.

Some commands we will use in this lesson when it comes to `conda` and `pip`:

`conda create --name mle-course python=3.8 pip` -> This creates a virtual environment. A virtual environment is a Python environment in which the Python interpreter, libraries, and scripts are isolated from those installed in other environments and from any libraries installed on the system. In other words, it lets you keep each project's code, dependencies, and libraries separate from other projects. Here you are asking for an environment named `mle-course` that uses `python` version 3.8 and has `pip` installed as its package manager. The `conda` command does the work of creating the virtual environment and manages your environments for you.

`conda activate mle-course` -> This activates the virtual environment you made with the above command for your current terminal session.

`pip install numpy pandas matplotlib` -> This installs the three packages listed: `numpy`, `pandas`, and `matplotlib`. `numpy` is used for scientific computing, `pandas` is used for data analysis, and `matplotlib` is used for plotting. `pip` is the Python package manager, and you are telling it to `install` the listed packages into your active environment.
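One way to confirm that the environment is active and the packages landed in it is a small check script. The sketch below uses only the standard library; the function name `check_environment` is hypothetical, and the default package list simply repeats the three packages installed above.

```python
import importlib.util
import sys
from typing import Dict, Iterable


def check_environment(
    packages: Iterable[str] = ("numpy", "pandas", "matplotlib"),
) -> Dict[str, bool]:
    """Report which packages can be imported by the current interpreter."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # With the conda environment active, these paths should point inside it.
    print("Interpreter:", sys.executable)
    print("Environment prefix:", sys.prefix)
    print("Python version:", sys.version.split()[0])
    for name, found in check_environment().items():
        print(f"{name:<12} {'installed' if found else 'MISSING'}")
```

If the interpreter path does not mention your environment name, the environment is probably not activated in the current terminal session.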
### Jupyter Notebooks

Jupyter Notebooks are an incredibly useful tool for experimentation, iteration, exploration, and even production at some companies!

They have the file extension `.ipynb` (IPYthon NoteBook).

You can learn more about Jupyter and their notebooks [here!](https://jupyter.org/)

In order to use a notebook, you'll first want to make sure you've installed `jupyter` in your environment:

1. `conda activate {environment name}` (for example, `mle-course`)
2. `pip install jupyter`

From here, you can navigate to any folder containing a `.ipynb` file and run the command `jupyter notebook`. This should launch a server and provide you with a link. Open the link in your browser to get started in your notebook!

Be sure to terminate the server when you are done! Closing the webpage does not stop the server, so you'll need to stop it manually, for example by pressing Ctrl+C in the terminal where it is running.

## :rocket: Let's Get Started!
Let's start off by setting up our environment! Review the environment setup instructions for the local environment that you'll be using in this course.
### Windows

* Install [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/install) using PowerShell

```powershell
wsl --install -d Ubuntu-20.04
```
* Install [Windows Terminal](https://www.microsoft.com/en-us/p/windows-terminal/9n0dx20hk701?activetab=pivot:overviewtab) (You can even make it your [default!](https://devblogs.microsoft.com/commandline/windows-terminal-as-your-default-command-line-experience/))
* Install [Ubuntu](https://www.microsoft.com/en-us/p/ubuntu/9pdxgncfsczv?activetab=pivot:overviewtab)

(If you find yourself getting stuck on the WSL2 install, [here](https://www.youtube.com/watch?v=VMZH9Pj2dXw&ab_channel=StefanRows) is a link to video instructions)

Give it a test drive!

![WindowsTerminal](https://user-images.githubusercontent.com/72572922/160048214-37f08855-8b29-4c13-9d25-e0f69806f752.jpg)

Continue by installing the following tools using [Windows Terminal](https://www.microsoft.com/en-us/p/windows-terminal/9n0dx20hk701?activetab=pivot:overviewtab) to set up your environment. When the installer asks whether to initialize Anaconda by running `conda init`, answer `yes`.

| Tool | Purpose | Command |
| :-------- | :-------- | :-------- |
| :snake: **Anaconda** | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh`<br>`bash Anaconda3-2021.11-Linux-x86_64.sh`<br>`source ~/.bashrc` |
| :octocat: **Git** | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
### Linux (Debian/Ubuntu)

Open a terminal using Ctrl+Alt+T. Enter the following commands in the terminal to set up your environment. When the installer asks whether to initialize Anaconda by running `conda init`, answer `yes`.

| Tool | Purpose | Command |
| :-------- | :-------- | :-------- |
| :snake: **Anaconda** | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh`<br>`bash Anaconda3-2021.11-Linux-x86_64.sh`<br>`source ~/.bashrc` |
| :octocat: **Git** | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
### macOS

To get started, we need to install the macOS package manager, Homebrew :beer:, so that we can download the tools we'll be using in the course. If you don't already have Homebrew installed, follow these steps:

1. Open a terminal by pressing ⌘+Space and typing `terminal`.

2. Install Homebrew using the command below, following the prompts:

   `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"`

3. Update Homebrew (this may take a few minutes):

   `git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core fetch --unshallow`

   `git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-cask fetch`

4. Install the `wget` command to continue following along:

   `brew install wget`

Enter the following commands in the terminal to set up your environment. When the installer asks whether to initialize Anaconda by running `conda init`, answer `yes`.

| Tool | Purpose | Command |
| :-------- | :-------- | :-------- |
| :snake: **Anaconda** | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-MacOSX-x86_64.sh`<br>`bash Anaconda3-2021.11-MacOSX-x86_64.sh`<br>`source ~/.bashrc` |
| :octocat: **Git** | Version Control | `brew install git` |

## Let's Make Sure That GitHub is Ready to Roll!

If you don't already have one, make an account on [GitHub](https://github.com/).
### GitHub SSH Setup

Secure Shell Protocol (SSH) provides a secure communication channel over an unsecured network. Let's set it up!

1. Generate a Private/Public SSH Key Pair.

   ```console
   ssh-keygen -o -t rsa -C "your email address for github"
   ```

2. Save the key pair. The default location `~/.ssh/id_rsa` is fine!

3. At the prompt, type in a secure passphrase.

4. Copy the contents of the public key that we will share with GitHub.

   * For WSL:

     ```console
     clip.exe < ~/.ssh/id_rsa.pub
     ```

   * For macOS:

     ```console
     pbcopy < ~/.ssh/id_rsa.pub
     ```

   * For Linux:

     ```console
     xclip -sel c < ~/.ssh/id_rsa.pub
     ```

5. Go to your GitHub account and go to `Settings`.

6. Under `Access`, click on the `SSH and GPG keys` tab on the left.

   ![Access Section](../img/github_access_section.png)

7. Click on the `New SSH Key` button.

   ![New SSH Key](../img/github_new_ssh_key.png)

8. Name the key, and paste the public key that you copied. Click the `Add SSH Key` button.

   ![Add SSH Key](../img/github_add_ssh_key.png)
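Before pasting, it can be reassuring to check that what you copied really is the public key. A public key file is a single line of the form `<type> <base64 data> [comment]`. The sketch below is a hypothetical helper (the names `PublicKey` and `parse_public_key` are made up here) that splits that line into its parts; it assumes the default `~/.ssh/id_rsa.pub` location from step 2.

```python
from pathlib import Path
from typing import NamedTuple


class PublicKey(NamedTuple):
    key_type: str   # e.g. "ssh-rsa"
    body: str       # base64-encoded key material
    comment: str    # usually the email passed to `ssh-keygen -C`


def parse_public_key(line: str) -> PublicKey:
    """Split one line of a .pub file into type, key body, and comment."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-key> [comment]'")
    comment = parts[2] if len(parts) == 3 else ""
    return PublicKey(parts[0], parts[1], comment)


if __name__ == "__main__":
    key_path = Path.home() / ".ssh" / "id_rsa.pub"  # default location assumed
    if key_path.exists():
        key = parse_public_key(key_path.read_text())
        print(f"type={key.key_type} comment={key.comment!r} length={len(key.body)}")
    else:
        print(f"No public key found at {key_path}")
```

Only the `.pub` file is shared with GitHub; the private key (`~/.ssh/id_rsa`) stays on your machine.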
### Viewing the Repositories

Log in and click on the user icon at the top right, then go to `repositories`.
### Creating a New Repository

When viewing the repositories page, click on `New` and proceed to create your repo.
**Filling Repository Details**

Create the repository by inputting the following:
* `Repo name`
* `Repo description`
* Make repo `public`
* Add a `README`
* Add `.gitignore` (Python template)
* Add `license` (choose MIT)

Then click `Create Repository`.
### Clone Your Repo

1. Open your terminal and use `cd` to navigate to the place where you would like to keep all your files for this class.

   ```console
   cd {directory name}
   ```

2. Once there, make a top level directory using `mkdir`.

   ```console
   mkdir {directory name}
   ```

3. `cd` into it and make another directory called `code`.

   ```console
   cd {directory name}
   ```

   ```console
   mkdir code
   ```

4. `cd` into it and run your `git clone {your repo url}` command.

   ```console
   cd code
   ```

   ```console
   git clone {your repo url}
   ```

5. Now let's get into our directory so we can access the contents of the repo!

   ```console
   cd {your repo name}
   ```
### Adding The FourthBrain Whodunit? Content to Your Repo

1. Check your git remotes.

   ```console
   git remote -v
   ```

   At this point, you should see only your own repo, listed as the `origin` remote with both a fetch and a push URL.

2. Let's set up our global configuration:

   ```console
   git config --global user.email "your email address"
   ```

   ```console
   git config --global user.name "your name"
   ```

3. Let's add a local branch for development.

   ```console
   git checkout -b LocalDev
   ```

   You can change anything here in this branch! Stage your changes:

   ```console
   git add .
   ```

   Commit the changes on the new branch.

   ```console
   git commit -m "Adding a LocalDev branch."
   ```

4. Let's merge our local changes into `main` and push them to our remote repo.

   ```console
   git checkout main
   ```

   ```console
   git merge LocalDev
   ```

   ```console
   git push origin main
   ```

5. Add the Whodunit (WD) repo as an extra remote repo:

   ```console
   git remote add WD git@github.com:FourthBrain/whodunit.git
   ```

Let's check our remote repos:

```console
git remote -v
```

At this point, you should have access to both your own repo and the FourthBrain repo, and should see something like this:

```console
WD      git@github.com:FourthBrain/whodunit.git (fetch)
WD      git@github.com:FourthBrain/whodunit.git (push)
origin  git@github.com:rafatisina/TestRepo.git (fetch)
origin  git@github.com:rafatisina/TestRepo.git (push)
```

Let's update our local repos:

```console
git fetch --all
```

Make a new branch for the Whodunit material (WDBranch).

```console
git checkout --track -b WDBranch WD/main
```

You should see something like this:

```console
Branch 'WDBranch' set up to track remote branch 'main' from 'WD'.
```

You can visually check whether you are in that branch:

```console
git log --all --graph
```

Now let's push our updated local repo to our remote repo!

```console
git checkout main
```

```console
git merge WDBranch --allow-unrelated-histories
```

If there are any conflicts, you'll need to resolve them and then commit the result.

```console
git add .
```

```console
git commit -m "message-here"
```

```console
git push origin main
```

From now on, after each release, follow these steps to update your repo with new content:

```console
git fetch --all
git checkout WDBranch
git merge --ff-only @{u}
git add .
git commit -m "branch is updated"
git checkout main
git merge WDBranch --allow-unrelated-histories
```

Git will ask you for a merge commit message explaining why this change is necessary. Add a message, then push:

```console
git push origin main
```
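If you want to double-check the remote setup from Python, the output of `git remote -v` is easy to parse. The following is a minimal sketch, not part of the course material: the helper names `parse_remotes` and `current_remotes` are hypothetical, and the sample text simply repeats the example output shown above.

```python
import subprocess
from typing import Dict


def parse_remotes(output: str) -> Dict[str, Dict[str, str]]:
    """Turn `git remote -v` output into {remote: {"fetch": url, "push": url}}."""
    remotes: Dict[str, Dict[str, str]] = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, url, kind = parts
        remotes.setdefault(name, {})[kind.strip("()")] = url
    return remotes


def current_remotes() -> Dict[str, Dict[str, str]]:
    """Run `git remote -v` in the current directory and parse the result."""
    result = subprocess.run(
        ["git", "remote", "-v"], capture_output=True, text=True, check=True
    )
    return parse_remotes(result.stdout)


# Sample output copied from the walkthrough above.
sample = (
    "WD\tgit@github.com:FourthBrain/whodunit.git (fetch)\n"
    "WD\tgit@github.com:FourthBrain/whodunit.git (push)\n"
    "origin\tgit@github.com:rafatisina/TestRepo.git (fetch)\n"
    "origin\tgit@github.com:rafatisina/TestRepo.git (push)\n"
)
remotes = parse_remotes(sample)
print(sorted(remotes))
```

Calling `current_remotes()` from inside your cloned repo should show both `origin` and `WD` once step 5 is complete.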

## Bringing it all together with Jupyter notebooks!
### Jupyter notebooks

1. First, make sure that you are in your repo's main directory. Then navigate to the MLE-8 folder of your repo. `HINT:` You can use `pwd` to see the directory you're currently in.

2. Navigate to the `notebooks` folder within the `software-dev-for-ml-101` folder.

   ```console
   cd software-dev-for-ml-101/notebooks
   ```

3. Activate the conda environment that you created above.

   ```console
   conda activate {environment name}
   ```

4. Run the `jupyter notebook` command.

5. A new window should open in your browser with the Jupyter Server. If not, copy and paste the given link into your browser.

6. Open the `unix-conda-pip.ipynb` notebook and go through the demo.

Note: JupyterLab is an acceptable alternative to Jupyter Notebooks if you prefer it!

# :detective: Whodunit?
Now let's practice what you have learned by playing the [Whodunit?](https://github.com/FourthBrain/whodunit) game!

### That's it for now! And so it begins.... :)

--------------------------------------------------------------------------------