├── .gitignore
└── README.md
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | pip-wheel-metadata/
24 | share/python-wheels/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 | MANIFEST
29 |
30 | # PyInstaller
31 | # Usually these files are written by a python script from a template
32 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
33 | *.manifest
34 | *.spec
35 |
36 | # Installer logs
37 | pip-log.txt
38 | pip-delete-this-directory.txt
39 |
40 | # Unit test / coverage reports
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .coverage
45 | .coverage.*
46 | .cache
47 | nosetests.xml
48 | coverage.xml
49 | *.cover
50 | *.py,cover
51 | .hypothesis/
52 | .pytest_cache/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95 | __pypackages__/
96 |
97 | # Celery stuff
98 | celerybeat-schedule
99 | celerybeat.pid
100 |
101 | # SageMath parsed files
102 | *.sage.py
103 |
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 |
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 |
117 | # Rope project settings
118 | .ropeproject
119 |
120 | # mkdocs documentation
121 | /site
122 |
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 |
128 | # Pyre type checker
129 | .pyre/
130 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
4 |
5 |
6 |
7 |
8 | ## :wave: Welcome to ML Software Developer 101!
9 |
10 | Welcome to the beginning of your journey to becoming an ML Engineer (MLE)! :tada: Follow these steps to get your development environment teed up! After you've finished this setup, feel free to go through the associated `Whodunit?`! 🕵️‍♀️
11 |
12 |
13 | ## :books: Quick Review
14 | We will be using some terminal commands, so let's make sure you know what they are and what they do!
15 |
16 | | Command | Stands For | Description |
17 | | ----------- | ----------- | -------------|
18 | | `ls` | list | lists all files and directories in the present working directory |
19 | | `ls -a` | list all | lists hidden files as well |
20 | | `cd {dirname}` | change directory | to change to a particular directory |
21 | | `cd ~` | change directory home | navigate to HOME directory |
22 | | `cd ..` | change directory up | move one level up |
23 | | `cat {filename}` | concatenate | displays the file content |
24 | | `sudo` | superuser do | allows regular users to run programs with the security privileges of the superuser or root |
25 | | `mv {filename} {newfilename}` | move | renames the file to new filename |
26 | | `clear` | clear | clears the terminal screen |
27 | | `mkdir {dirname}` | make directory | create new directory in present working directory or at specified path |
28 | | `rm {filename}` | remove | remove file with given filename |
29 | | `touch {filename}.{ext}` | touch | create new empty file |
30 | | `rmdir {dirname}` | remove directory | deletes an empty directory |
31 | | `ssh {username}@{ip-address} or {hostname}` | secure shell | log in to a remote Linux machine using SSH |
32 | | `CTRL + SHIFT + C` | copy | keyboard shortcut for copying from terminal |
33 | | `CTRL + SHIFT + V` | paste | keyboard shortcut for pasting into terminal |
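
To get a feel for the commands in the table, here is a small practice run. It is only a sketch: the directory and file names (`notes`, `draft.txt`, `.hidden`) are made up for illustration, and it works inside a temporary directory created with `mktemp -d` so nothing in your own folders is touched. The `echo ... >` redirect is not in the table; it is used here just to put some text into a file.

```shell
#!/bin/sh
# Practice the commands from the table inside a throwaway directory.
set -e
practice_dir="$(mktemp -d)"   # temporary directory, safe to delete
cd "$practice_dir"
mkdir notes                   # make a new directory
cd notes                      # move into it
touch draft.txt .hidden       # create two empty files, one of them hidden
ls                            # shows draft.txt only
ls -a                         # also shows .hidden (plus . and ..)
echo "hello MLE" > draft.txt  # put some text in the file
mv draft.txt final.txt        # rename draft.txt to final.txt
cat final.txt                 # display the file content
file_text="$(cat final.txt)"  # keep the content in a variable for later
rm final.txt .hidden          # remove both files
cd ..                         # move one level up
rmdir notes                   # works because notes is now empty
remaining="$(ls -A)"          # -A lists everything except . and ..
echo "left over: '$remaining'"
```

Note that `rmdir` only succeeds because the files were removed first; it refuses to delete a directory that still has content.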
34 |
35 |
36 |
37 | ## :hammer_and_wrench: Tools We'll Be Using
38 | We will also be using a few tools such as `git`, `conda`, and `pip`.
39 |
40 | Git
41 |
42 | Git is a free and open source distributed version control system designed to handle everything from small to very large projects. These are the commands we will be using with `git`:
43 |
44 | `git clone` -> clone a remote repository to your local computer
45 |
46 | `git add` -> add files to a commit
47 |
48 | `git commit -m {message}` -> commit changes with a message
49 |
50 | `git push` -> push commit to remote repository
51 |
52 |
53 |
54 | Conda & Pip
55 |
56 | Conda is an open-source, cross-platform, language-agnostic package manager and environment management system. `pip` is Python's package installer. We will use `pip` within `conda` environments to manage our package installations. `conda` comes with Anaconda, which is a convenient way to set up your Python programming environment: it includes an environment management tool (`conda`) along with extra packages that are commonly used in data science and ML.
57 |
58 | Some commands we will use in this lesson when it comes to `conda` and `pip`:
59 |
60 | `conda create --name mle-course python=3.8 pip` -> This creates a virtual environment. A virtual environment is a Python environment in which the Python interpreter, libraries, and scripts are isolated from those installed in other environments and from any libraries installed on the system. So basically, this allows you to keep all your project's code/dependencies/libraries separated from other projects. You are specifically saying to create said environment with the name `mle-course`, use `python` version 3.8, and install `pip` into it so that packages you install with `pip` land in this environment. The command `conda` invokes the underlying logic to actually make the virtual environment and manages said environments for you.
61 |
62 | `conda activate mle-course` -> This activates the virtual environment you made with the above command for your current terminal session.
63 |
64 | `pip install numpy pandas matplotlib` -> This installs the three packages mentioned - `numpy`, `pandas`, and `matplotlib`. `numpy` is used for scientific computing, `pandas` is used for data analysis, and `matplotlib` is used for data graphics. `pip` is the Python package manager and you are telling it to `install` the listed packages to your environment.
65 |
66 |
67 |
68 | Jupyter Notebooks
69 |
70 | Jupyter Notebooks are an incredibly useful tool for experimentation, iteration, exploration, and even production at some companies!
71 |
72 | They have the file extension `.ipynb` (IPYthon NoteBook).
73 |
74 | You can learn more about Jupyter and their notebooks [here!](https://jupyter.org/)
75 |
76 | In order to use a notebook, you'll first want to make sure you've installed `jupyter` in your environment:
77 |
78 | 1. `conda activate mle-course`
79 | 2. `pip install jupyter`
80 |
81 | From here, you can navigate to any folder containing a `.ipynb` file, and run the command `jupyter notebook`. This should launch a server, and provide you with a link. Navigate to the link in your browser in order to get started in your notebook!
82 |
83 | Be sure to terminate the server when you are done! Closing the webpage does not stop the server, so you'll need to stop it manually: press `CTRL + C` in the terminal where it is running, or shut it down from the Jupyter page before you close it.
84 |
85 |
86 |
87 |
88 |
89 | ## :rocket: Let's Get Started!
90 | Let's start off by setting up our environment! Review the environment setup instructions for the local environment that you'll be using in this course.
91 |
92 | Windows
93 |
94 |
95 | * Install [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/install) using PowerShell
96 |
97 | ```powershell
98 | wsl --install -d Ubuntu-20.04
99 | ```
100 | * Install [Windows Terminal](https://www.microsoft.com/en-us/p/windows-terminal/9n0dx20hk701?activetab=pivot:overviewtab) (You can even make it your [default!](https://devblogs.microsoft.com/commandline/windows-terminal-as-your-default-command-line-experience/))
101 | * Install [Ubuntu](https://www.microsoft.com/en-us/p/ubuntu/9pdxgncfsczv?activetab=pivot:overviewtab)
102 |
103 | (If you find yourself getting stuck on the WSL2 install, [here](https://www.youtube.com/watch?v=VMZH9Pj2dXw&ab_channel=StefanRows) is a link to video instructions)
104 |
105 | Give it a test drive!
106 |
107 | 
108 |
109 | Continue by installing the following tools using [Windows Terminal](https://www.microsoft.com/en-us/p/windows-terminal/9n0dx20hk701?activetab=pivot:overviewtab) to set up your environment. When prompted, make sure to add `conda` to `init`.
110 |
111 | | Tool | Purpose | Command |
112 | | :-------- | :-------- | :------------------------------------------------------------------------------------------------ |
113 | | :snake: **Anaconda** | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh`<br>`bash Anaconda3-2021.11-Linux-x86_64.sh`<br>`source ~/.bashrc` |
114 | | :octocat: **Git** | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
115 |
116 |
117 |
118 |
119 | Linux (Debian/Ubuntu)
120 |
121 | Open a terminal using Ctrl+Shift+T. Enter the following commands in the terminal to set up your environment. When prompted, make sure to add `conda` to `init`.
122 | | Tool | Purpose | Command |
123 | | :-------- | :-------- | :------------------------------------------------------------------------------------------------ |
124 | | :snake: **Anaconda** | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh`<br>`bash Anaconda3-2021.11-Linux-x86_64.sh`<br>`source ~/.bashrc` |
125 | | :octocat: **Git** | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
126 |
127 |
128 |
129 |
130 | macOS
131 |
132 | To get started, we need to download the macOS package manager, Homebrew :beer:, so that we can download the tools we'll be using in the course. If you don't already have Homebrew installed, follow these steps:
133 |
134 | 1. Open terminal using ⌘+Space and type `terminal`.
135 |
136 | 2. Install Homebrew using the command below, following the command prompts:
137 |
138 | `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"`
139 |
140 | 3. Update Homebrew (This may take a few minutes)
141 |
142 | `git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core fetch --unshallow`
143 |
144 | `git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-cask fetch`
145 |
146 | 4. Install the `wget` command to continue following along
147 | `brew install wget`
148 |
149 | Enter the following commands in the terminal to set up your environment. When prompted, make sure to add `conda` to `init`.
150 |
151 | | Tool | Purpose | Command |
152 | | :-------- | :-------- | :------------------------------------------------------------------------------------------------ |
153 | | :snake: **Anaconda** | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2021.11-MacOSX-x86_64.sh`<br>`bash Anaconda3-2021.11-MacOSX-x86_64.sh`<br>`source ~/.bashrc` |
154 | | :octocat: **Git** | Version Control | `brew install git` |
155 |
156 |
157 |
158 |
159 |
160 | ## Let's Make Sure That GitHub is Ready to Roll!
161 |
162 | If you don't already have one, make an account on [GitHub](https://github.com/).
163 |
164 |
165 | GitHub SSH Setup
166 | Secure Shell Protocol (SSH) provides a secure communication channel over an unsecured network. Let's set it up!
167 |
168 |
169 |
170 | 1. Generate a Private/Public SSH Key Pair.
171 |
172 | ```console
173 | ssh-keygen -o -t rsa -C "your email address for github"
174 | ```
175 |
176 | 2. Save the key pair. The default location `~/.ssh/id_rsa` is fine!
177 |
178 |
179 | 3. At the prompt, type in a secure passphrase.
180 | 4. Copy the contents of the public key that we will share with GitHub.
181 |
182 |
183 | * For WSL:
184 |
185 | ```console
186 | clip.exe < ~/.ssh/id_rsa.pub
187 | ```
188 |
189 | * For macOS:
190 | ```console
191 | pbcopy < ~/.ssh/id_rsa.pub
192 | ```
193 |
194 | * For Linux:
195 | ```console
196 | xclip -sel c < ~/.ssh/id_rsa.pub
197 | ```
198 |
199 | 5. Go to your GitHub account and go to `Settings`.
200 |
201 | 6. Under `Access`, click on the `SSH and GPG keys` tab on the left.
202 |
203 | 
204 |
205 | 7. Click on the `New SSH Key` button.
206 |
207 | 
208 |
209 | 8. Name the key, and paste the public key that you copied. Click the `Add SSH Key` button
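
If you are unsure which of the two generated files is safe to share, the sketch below creates a disposable key pair in a temporary directory so you can look at both halves. The email address and the temporary location are placeholders, and the empty passphrase (`-N ""`) is acceptable only because this demo key is deleted at the end; keep the passphrase on your real key.

```shell
#!/bin/sh
# Throwaway key pair for inspection only; do not upload this key anywhere.
set -e
key_dir="$(mktemp -d)"
# -f sets the output path, -N "" means no passphrase (demo key only), -q keeps it quiet
ssh-keygen -q -t rsa -C "demo@example.com" -f "$key_dir/id_rsa" -N ""
ls "$key_dir"                           # id_rsa (private) and id_rsa.pub (public)
# The first field of a public key line is the key type
key_type="$(cut -d ' ' -f 1 "$key_dir/id_rsa.pub")"
echo "public key type: $key_type"
ssh-keygen -l -f "$key_dir/id_rsa.pub"  # print the key's fingerprint
rm -r "$key_dir"                        # clean up the demo key
```

Only the `.pub` file is ever pasted into GitHub. Once your real key has been added, running `ssh -T git@github.com` is a quick way to check that GitHub recognizes it.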
210 |
211 |
212 | 
213 |
214 |
215 |
216 |
217 | Viewing the Repositories
218 |
219 | Log in and click on the top right user icon, then go to `repositories`.
220 |
221 |
222 |
223 |
224 |
225 |
226 |
227 |
228 | Creating a New Repository
229 |
230 | When viewing the repository page, click on `New` and proceed to create your repo.
231 |
232 |
233 |
234 |
235 |
236 |
237 | **Filling Repository Details**
238 |
239 | Create the repository by inputting the following:
240 | * `Repo name`
241 | * `Repo description`
242 | * Make repo `public`
243 | * Add a `README`
244 | * Add `.gitignore` (Python template)
245 | * Add `license` (choose MIT)
246 |
247 | Then click `Create Repository`.
248 |
249 |
250 |
251 |
252 |
253 |
254 |
255 |
256 |
257 | Clone Your Repo
258 |
259 | 1. Open your terminal and navigate to a place where you would like to make a directory to hold all your files for this class using the command `cd`.
260 |
261 |
262 | ```console
263 | cd {directory name}
264 | ```
265 |
266 | 2. Once there, make a top level directory using `mkdir`.
267 |
268 | ```console
269 | mkdir {directory name}
270 | ```
271 |
272 | 3. `cd` into it and make another directory called `code`.
273 |
274 | ```console
275 | cd {directory name}
276 | ```
277 |
278 | ```console
279 | mkdir code
280 | ```
281 |
282 | 4. `cd` into it and run your `git clone {your repo url}` command.
283 |
284 | ```console
285 | cd code
286 | ```
287 |
288 | ```console
289 | git clone {your repo url}
290 | ```
291 |
292 | 5. Now let's get into our directory so we can access the contents of the repo!
293 |
294 | ```console
295 | cd {your repo name}
296 | ```
297 |
298 |
299 |
300 |
301 | Adding The FourthBrain Whodunit? Content to Your Repo
302 |
303 | 1. Check your remote git.
304 |
305 | ```console
306 | git remote -v
307 | ```
308 |
309 | At this point, you should see only your own repo, listed as the `origin` remote with both fetch and push URLs.
310 |
311 | 2. Let's set up our global configuration:
312 |
313 | ```console
314 | git config --global user.email "your email address"
315 | ```
316 |
317 | ```console
318 | git config --global user.name "your name"
319 | ```
320 |
321 | 3. Let's add a local branch for development.
322 |
323 | ```console
324 | git checkout -b LocalDev
325 | ```
326 |
327 | You can change anything here in this branch! Make a change (for example, edit the `README`), then stage it:
328 |
329 | ```console
330 | git add .
331 | ```
332 |
333 | Commit the staged changes.
334 |
335 | ```console
336 | git commit -m "Adding a LocalDev branch."
337 | ```
338 |
339 | 4. Let's push our local changes to our remote repo.
340 |
341 | ```console
342 | git checkout main
343 | ```
344 |
345 | ```console
346 | git merge LocalDev
347 | ```
348 |
349 | ```console
350 | git push origin main
351 | ```
352 |
353 |
354 | 5. Add the Whodunit (WD) repo as an extra remote repo:
355 |
356 | ```console
357 | git remote add WD git@github.com:FourthBrain/whodunit.git
358 | ```
359 |
360 | Let's check our remote repos:
361 |
362 | ```console
363 | git remote -v
364 | ```
365 |
366 | At this point, you should have access to both your own repo and FourthBrain's, and should see something like this:
367 |
368 | ```console
369 | WD git@github.com:FourthBrain/whodunit.git (fetch)
370 | WD git@github.com:FourthBrain/whodunit.git (push)
371 | origin git@github.com:rafatisina/TestRepo.git (fetch)
372 | origin git@github.com:rafatisina/TestRepo.git (push)
373 | ```
374 |
375 | Let's update our local repos:
376 |
377 | ```console
378 | git fetch --all
379 | ```
380 |
381 | Make a new branch for the Whodunit material (WDBranch).
382 | ```console
383 | git checkout --track -b WDBranch WD/main
384 | ```
385 |
386 | You should see something like this:
387 |
388 | ```console
389 | Branch 'WDBranch' set up to track remote branch 'main' from 'WD'.
390 | ```
391 |
392 | You can visually check whether you are in that branch:
393 |
394 | ```console
395 | git log --all --graph
396 | ```
397 |
398 | Now let's push our updated local repo to our remote repo!
399 |
400 | ```console
401 | git checkout main
402 | ```
403 |
404 | ```console
405 | git merge WDBranch --allow-unrelated-histories
406 | ```
407 |
408 | If there are any conflicts, you'll need to resolve them, then stage and commit the result.
409 | ```console
410 | git add .
411 | ```
412 |
413 | ```console
414 | git commit -m "message-here"
415 | ```
416 |
417 | ```console
418 | git push origin main
419 | ```
420 |
421 | From now on, after each release, follow these steps to update your repo with new content:
422 | ```console
423 | git fetch --all
424 | git checkout WDBranch
425 | git merge --ff-only @{u}
426 | git add .
427 | git commit -m "branch is updated"
428 | git checkout main
429 | git merge WDBranch --allow-unrelated-histories
430 | ```
431 |
432 | Git will open an editor asking for a merge commit message that explains why this merge is necessary. Add a message, then save and close the editor.
433 |
434 | ```console
435 | git push origin main
436 | ```
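
If you would like to rehearse the branch-and-merge flow before running it on your real repo, the sketch below replays it in a throwaway repository. Everything in it is a stand-in: the temporary directory, the placeholder name and email, and the one-line `README.md`. It never contacts GitHub, so there is no `git push` step.

```shell
#!/bin/sh
# Practice run in a throwaway repo; nothing here touches your real repo.
set -e
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git symbolic-ref HEAD refs/heads/main      # name the first branch main, whatever git's default is
git config user.email "you@example.com"    # placeholder identity, stored in this repo only
git config user.name "Your Name"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git checkout --quiet -b LocalDev           # create and switch to the development branch
echo "a change made on LocalDev" >> README.md
git add .
git commit --quiet -m "Edit README on LocalDev"
git checkout --quiet main                  # go back to main
git merge --quiet LocalDev                 # bring the LocalDev commit into main
git log --oneline                          # both commits are now on main
current_branch="$(git rev-parse --abbrev-ref HEAD)"
commit_count="$(git rev-list --count HEAD)"
echo "on $current_branch with $commit_count commits"
```

Because `main` had no new commits of its own, this merge is a simple fast-forward and git does not ask for a merge message.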
437 |
438 |
439 |
440 |
441 |
442 | ## Bringing it all together with Jupyter notebooks!
443 |
444 | Jupyter notebooks
445 |
446 | 1. First, make sure that you are in your repo's main directory. Then navigate to the MLE-8 folder of your repo. `HINT:` You can use `pwd` to see the directory you're currently in.
447 |
448 | 2. Navigate to the `notebooks` folder within the `software-dev-for-ml-101` folder.
449 |
450 | ```console
451 | cd software-dev-for-ml-101/notebooks
452 | ```
453 |
454 | 3. Activate your conda environment that you created above.
455 |
456 | ```console
457 | conda activate mle-course
458 | ```
459 |
460 | 4. Run the `jupyter notebook` command.
461 |
462 | 5. A new window should open in your browser with the Jupyter Server. If not, copy and paste the given link into your browser.
463 |
464 | 6. Open the `unix-conda-pip.ipynb` notebook and go through the demo.
465 |
466 | Note: JupyterLab is an acceptable alternative to Jupyter Notebooks if you prefer JupyterLab!
467 |
468 |
469 |
470 |
471 |
472 | # :detective: Whodunit?
473 | Now let's practice what you have learned by playing the [Whodunit?](https://github.com/FourthBrain/whodunit) game!
474 |
475 | ### That's it for now! And so it begins.... :)
476 |
--------------------------------------------------------------------------------