├── LICENSES ├── CODE_OF_CONDUCT.md ├── LICENSE ├── LICENSE-CODE └── SECURITY.md ├── PythonForDataProfessionals ├── 00 Pre-Requisites.md ├── 01 Overview and Course Setup.md ├── 02 Programming Basics.md ├── 03 Working with Data.md ├── 04 Environments and Deployment.md ├── Python for Data Professionals.pyproj ├── Python for Data Professionals_hdi_settings.json ├── assets │ ├── MLCheatSheet.png │ ├── NoStarchPressPython.pdf │ ├── NumpyPythonCheatSheet.pdf │ ├── PandasCheatSheet.pdf │ ├── Python3CheatSheet.pdf │ └── UseCases.png ├── code │ ├── 01_OverviewAndCourseSetup.py │ ├── 02_ProgrammingBasics.py │ ├── 03_WorkingWithData.py │ └── 04_EnvrionmentsAndDeployment.py ├── data │ └── CATelcoCustomerChurnTrainingSample.csv ├── graphics │ ├── AnalyticsAreas.png │ ├── DataScience.png │ ├── MLCapabilities.png │ ├── MatPlotLib.png │ ├── SmallBuck.png │ ├── aml-logo.png │ ├── brain.png │ ├── check.png │ ├── checkbox.png │ ├── checkmark.jpg │ ├── cortanalogo.png │ ├── files.jpg │ ├── ggplot.png │ ├── keyboard.jpg │ ├── microsoftlogo.png │ ├── pin.jpg │ ├── solutions-microsoft-logo-small.png │ ├── tdsp.png │ └── thinking.jpg ├── html │ └── 00 Pre-Requisites.html └── notebooks │ ├── .ipynb_checkpoints │ ├── 00 Pre-Requisites-checkpoint.ipynb │ ├── 01 Overview and Setup-checkpoint.ipynb │ ├── 02 Programming Basics-checkpoint.ipynb │ ├── 03 Working with Data-checkpoint.ipynb │ └── 04 Environments and Deployment-checkpoint.ipynb │ ├── 00 Pre-Requisites.ipynb │ ├── 01 Overview and Setup.ipynb │ ├── 02 Programming Basics.ipynb │ ├── 03 Working with Data.ipynb │ └── 04 Environments and Deployment.ipynb ├── README.md ├── SECURITY.md └── graphics ├── AnalyticsAreas.png ├── DataScience.png ├── MLCapabilities.png ├── MatPlotLib.png ├── SmallBuck.png ├── aml-logo.png ├── brain.png ├── check.png ├── checkbox.png ├── checkmark.jpg ├── cortanalogo.png ├── files.jpg ├── ggplot.png ├── keyboard.jpg ├── microsoftlogo.png ├── pin.jpg ├── solutions-microsoft-logo-small.png ├── tdsp.png └── 
thinking.jpg /LICENSES/CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Microsoft Open Source Code of Conduct 2 | 3 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 4 | 5 | Resources: 6 | 7 | - [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/) 8 | - [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) 9 | - Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns 10 | -------------------------------------------------------------------------------- /LICENSES/LICENSE: -------------------------------------------------------------------------------- 1 | Attribution 4.0 International 2 | 3 | ======================================================================= 4 | 5 | Creative Commons Corporation ("Creative Commons") is not a law firm and 6 | does not provide legal services or legal advice. Distribution of 7 | Creative Commons public licenses does not create a lawyer-client or 8 | other relationship. Creative Commons makes its licenses and related 9 | information available on an "as-is" basis. Creative Commons gives no 10 | warranties regarding its licenses, any material licensed under their 11 | terms and conditions, or any related information. Creative Commons 12 | disclaims all liability for damages resulting from their use to the 13 | fullest extent possible. 14 | 15 | Using Creative Commons Public Licenses 16 | 17 | Creative Commons public licenses provide a standard set of terms and 18 | conditions that creators and other rights holders may use to share 19 | original works of authorship and other material subject to copyright 20 | and certain other rights specified in the public license below. The 21 | following considerations are for informational purposes only, are not 22 | exhaustive, and do not form part of our licenses. 
23 | 24 | Considerations for licensors: Our public licenses are 25 | intended for use by those authorized to give the public 26 | permission to use material in ways otherwise restricted by 27 | copyright and certain other rights. Our licenses are 28 | irrevocable. Licensors should read and understand the terms 29 | and conditions of the license they choose before applying it. 30 | Licensors should also secure all rights necessary before 31 | applying our licenses so that the public can reuse the 32 | material as expected. Licensors should clearly mark any 33 | material not subject to the license. This includes other CC- 34 | licensed material, or material used under an exception or 35 | limitation to copyright. More considerations for licensors: 36 | wiki.creativecommons.org/Considerations_for_licensors 37 | 38 | Considerations for the public: By using one of our public 39 | licenses, a licensor grants the public permission to use the 40 | licensed material under specified terms and conditions. If 41 | the licensor's permission is not necessary for any reason--for 42 | example, because of any applicable exception or limitation to 43 | copyright--then that use is not regulated by the license. Our 44 | licenses grant only permissions under copyright and certain 45 | other rights that a licensor has authority to grant. Use of 46 | the licensed material may still be restricted for other 47 | reasons, including because others have copyright or other 48 | rights in the material. A licensor may make special requests, 49 | such as asking that all changes be marked or described. 50 | Although not required by our licenses, you are encouraged to 51 | respect those requests where reasonable. 
More_considerations 52 | for the public: 53 | wiki.creativecommons.org/Considerations_for_licensees 54 | 55 | ======================================================================= 56 | 57 | Creative Commons Attribution 4.0 International Public License 58 | 59 | By exercising the Licensed Rights (defined below), You accept and agree 60 | to be bound by the terms and conditions of this Creative Commons 61 | Attribution 4.0 International Public License ("Public License"). To the 62 | extent this Public License may be interpreted as a contract, You are 63 | granted the Licensed Rights in consideration of Your acceptance of 64 | these terms and conditions, and the Licensor grants You such rights in 65 | consideration of benefits the Licensor receives from making the 66 | Licensed Material available under these terms and conditions. 67 | 68 | 69 | Section 1 -- Definitions. 70 | 71 | a. Adapted Material means material subject to Copyright and Similar 72 | Rights that is derived from or based upon the Licensed Material 73 | and in which the Licensed Material is translated, altered, 74 | arranged, transformed, or otherwise modified in a manner requiring 75 | permission under the Copyright and Similar Rights held by the 76 | Licensor. For purposes of this Public License, where the Licensed 77 | Material is a musical work, performance, or sound recording, 78 | Adapted Material is always produced where the Licensed Material is 79 | synched in timed relation with a moving image. 80 | 81 | b. Adapter's License means the license You apply to Your Copyright 82 | and Similar Rights in Your contributions to Adapted Material in 83 | accordance with the terms and conditions of this Public License. 84 | 85 | c. 
Copyright and Similar Rights means copyright and/or similar rights 86 | closely related to copyright including, without limitation, 87 | performance, broadcast, sound recording, and Sui Generis Database 88 | Rights, without regard to how the rights are labeled or 89 | categorized. For purposes of this Public License, the rights 90 | specified in Section 2(b)(1)-(2) are not Copyright and Similar 91 | Rights. 92 | 93 | d. Effective Technological Measures means those measures that, in the 94 | absence of proper authority, may not be circumvented under laws 95 | fulfilling obligations under Article 11 of the WIPO Copyright 96 | Treaty adopted on December 20, 1996, and/or similar international 97 | agreements. 98 | 99 | e. Exceptions and Limitations means fair use, fair dealing, and/or 100 | any other exception or limitation to Copyright and Similar Rights 101 | that applies to Your use of the Licensed Material. 102 | 103 | f. Licensed Material means the artistic or literary work, database, 104 | or other material to which the Licensor applied this Public 105 | License. 106 | 107 | g. Licensed Rights means the rights granted to You subject to the 108 | terms and conditions of this Public License, which are limited to 109 | all Copyright and Similar Rights that apply to Your use of the 110 | Licensed Material and that the Licensor has authority to license. 111 | 112 | h. Licensor means the individual(s) or entity(ies) granting rights 113 | under this Public License. 114 | 115 | i. Share means to provide material to the public by any means or 116 | process that requires permission under the Licensed Rights, such 117 | as reproduction, public display, public performance, distribution, 118 | dissemination, communication, or importation, and to make material 119 | available to the public including in ways that members of the 120 | public may access the material from a place and at a time 121 | individually chosen by them. 122 | 123 | j. 
Sui Generis Database Rights means rights other than copyright 124 | resulting from Directive 96/9/EC of the European Parliament and of 125 | the Council of 11 March 1996 on the legal protection of databases, 126 | as amended and/or succeeded, as well as other essentially 127 | equivalent rights anywhere in the world. 128 | 129 | k. You means the individual or entity exercising the Licensed Rights 130 | under this Public License. Your has a corresponding meaning. 131 | 132 | 133 | Section 2 -- Scope. 134 | 135 | a. License grant. 136 | 137 | 1. Subject to the terms and conditions of this Public License, 138 | the Licensor hereby grants You a worldwide, royalty-free, 139 | non-sublicensable, non-exclusive, irrevocable license to 140 | exercise the Licensed Rights in the Licensed Material to: 141 | 142 | a. reproduce and Share the Licensed Material, in whole or 143 | in part; and 144 | 145 | b. produce, reproduce, and Share Adapted Material. 146 | 147 | 2. Exceptions and Limitations. For the avoidance of doubt, where 148 | Exceptions and Limitations apply to Your use, this Public 149 | License does not apply, and You do not need to comply with 150 | its terms and conditions. 151 | 152 | 3. Term. The term of this Public License is specified in Section 153 | 6(a). 154 | 155 | 4. Media and formats; technical modifications allowed. The 156 | Licensor authorizes You to exercise the Licensed Rights in 157 | all media and formats whether now known or hereafter created, 158 | and to make technical modifications necessary to do so. The 159 | Licensor waives and/or agrees not to assert any right or 160 | authority to forbid You from making technical modifications 161 | necessary to exercise the Licensed Rights, including 162 | technical modifications necessary to circumvent Effective 163 | Technological Measures. For purposes of this Public License, 164 | simply making modifications authorized by this Section 2(a) 165 | (4) never produces Adapted Material. 166 | 167 | 5. 
Downstream recipients. 168 | 169 | a. Offer from the Licensor -- Licensed Material. Every 170 | recipient of the Licensed Material automatically 171 | receives an offer from the Licensor to exercise the 172 | Licensed Rights under the terms and conditions of this 173 | Public License. 174 | 175 | b. No downstream restrictions. You may not offer or impose 176 | any additional or different terms or conditions on, or 177 | apply any Effective Technological Measures to, the 178 | Licensed Material if doing so restricts exercise of the 179 | Licensed Rights by any recipient of the Licensed 180 | Material. 181 | 182 | 6. No endorsement. Nothing in this Public License constitutes or 183 | may be construed as permission to assert or imply that You 184 | are, or that Your use of the Licensed Material is, connected 185 | with, or sponsored, endorsed, or granted official status by, 186 | the Licensor or others designated to receive attribution as 187 | provided in Section 3(a)(1)(A)(i). 188 | 189 | b. Other rights. 190 | 191 | 1. Moral rights, such as the right of integrity, are not 192 | licensed under this Public License, nor are publicity, 193 | privacy, and/or other similar personality rights; however, to 194 | the extent possible, the Licensor waives and/or agrees not to 195 | assert any such rights held by the Licensor to the limited 196 | extent necessary to allow You to exercise the Licensed 197 | Rights, but not otherwise. 198 | 199 | 2. Patent and trademark rights are not licensed under this 200 | Public License. 201 | 202 | 3. To the extent possible, the Licensor waives any right to 203 | collect royalties from You for the exercise of the Licensed 204 | Rights, whether directly or through a collecting society 205 | under any voluntary or waivable statutory or compulsory 206 | licensing scheme. In all other cases the Licensor expressly 207 | reserves any right to collect such royalties. 208 | 209 | 210 | Section 3 -- License Conditions. 
211 | 212 | Your exercise of the Licensed Rights is expressly made subject to the 213 | following conditions. 214 | 215 | a. Attribution. 216 | 217 | 1. If You Share the Licensed Material (including in modified 218 | form), You must: 219 | 220 | a. retain the following if it is supplied by the Licensor 221 | with the Licensed Material: 222 | 223 | i. identification of the creator(s) of the Licensed 224 | Material and any others designated to receive 225 | attribution, in any reasonable manner requested by 226 | the Licensor (including by pseudonym if 227 | designated); 228 | 229 | ii. a copyright notice; 230 | 231 | iii. a notice that refers to this Public License; 232 | 233 | iv. a notice that refers to the disclaimer of 234 | warranties; 235 | 236 | v. a URI or hyperlink to the Licensed Material to the 237 | extent reasonably practicable; 238 | 239 | b. indicate if You modified the Licensed Material and 240 | retain an indication of any previous modifications; and 241 | 242 | c. indicate the Licensed Material is licensed under this 243 | Public License, and include the text of, or the URI or 244 | hyperlink to, this Public License. 245 | 246 | 2. You may satisfy the conditions in Section 3(a)(1) in any 247 | reasonable manner based on the medium, means, and context in 248 | which You Share the Licensed Material. For example, it may be 249 | reasonable to satisfy the conditions by providing a URI or 250 | hyperlink to a resource that includes the required 251 | information. 252 | 253 | 3. If requested by the Licensor, You must remove any of the 254 | information required by Section 3(a)(1)(A) to the extent 255 | reasonably practicable. 256 | 257 | 4. If You Share Adapted Material You produce, the Adapter's 258 | License You apply must not prevent recipients of the Adapted 259 | Material from complying with this Public License. 260 | 261 | 262 | Section 4 -- Sui Generis Database Rights. 
263 | 264 | Where the Licensed Rights include Sui Generis Database Rights that 265 | apply to Your use of the Licensed Material: 266 | 267 | a. for the avoidance of doubt, Section 2(a)(1) grants You the right 268 | to extract, reuse, reproduce, and Share all or a substantial 269 | portion of the contents of the database; 270 | 271 | b. if You include all or a substantial portion of the database 272 | contents in a database in which You have Sui Generis Database 273 | Rights, then the database in which You have Sui Generis Database 274 | Rights (but not its individual contents) is Adapted Material; and 275 | 276 | c. You must comply with the conditions in Section 3(a) if You Share 277 | all or a substantial portion of the contents of the database. 278 | 279 | For the avoidance of doubt, this Section 4 supplements and does not 280 | replace Your obligations under this Public License where the Licensed 281 | Rights include other Copyright and Similar Rights. 282 | 283 | 284 | Section 5 -- Disclaimer of Warranties and Limitation of Liability. 285 | 286 | a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE 287 | EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS 288 | AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF 289 | ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, 290 | IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, 291 | WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR 292 | PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, 293 | ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT 294 | KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT 295 | ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. 296 | 297 | b. 
TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE 298 | TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, 299 | NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, 300 | INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, 301 | COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR 302 | USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN 303 | ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR 304 | DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR 305 | IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. 306 | 307 | c. The disclaimer of warranties and limitation of liability provided 308 | above shall be interpreted in a manner that, to the extent 309 | possible, most closely approximates an absolute disclaimer and 310 | waiver of all liability. 311 | 312 | 313 | Section 6 -- Term and Termination. 314 | 315 | a. This Public License applies for the term of the Copyright and 316 | Similar Rights licensed here. However, if You fail to comply with 317 | this Public License, then Your rights under this Public License 318 | terminate automatically. 319 | 320 | b. Where Your right to use the Licensed Material has terminated under 321 | Section 6(a), it reinstates: 322 | 323 | 1. automatically as of the date the violation is cured, provided 324 | it is cured within 30 days of Your discovery of the 325 | violation; or 326 | 327 | 2. upon express reinstatement by the Licensor. 328 | 329 | For the avoidance of doubt, this Section 6(b) does not affect any 330 | right the Licensor may have to seek remedies for Your violations 331 | of this Public License. 332 | 333 | c. For the avoidance of doubt, the Licensor may also offer the 334 | Licensed Material under separate terms or conditions or stop 335 | distributing the Licensed Material at any time; however, doing so 336 | will not terminate this Public License. 337 | 338 | d. 
Sections 1, 5, 6, 7, and 8 survive termination of this Public 339 | License. 340 | 341 | 342 | Section 7 -- Other Terms and Conditions. 343 | 344 | a. The Licensor shall not be bound by any additional or different 345 | terms or conditions communicated by You unless expressly agreed. 346 | 347 | b. Any arrangements, understandings, or agreements regarding the 348 | Licensed Material not stated herein are separate from and 349 | independent of the terms and conditions of this Public License. 350 | 351 | 352 | Section 8 -- Interpretation. 353 | 354 | a. For the avoidance of doubt, this Public License does not, and 355 | shall not be interpreted to, reduce, limit, restrict, or impose 356 | conditions on any use of the Licensed Material that could lawfully 357 | be made without permission under this Public License. 358 | 359 | b. To the extent possible, if any provision of this Public License is 360 | deemed unenforceable, it shall be automatically reformed to the 361 | minimum extent necessary to make it enforceable. If the provision 362 | cannot be reformed, it shall be severed from this Public License 363 | without affecting the enforceability of the remaining terms and 364 | conditions. 365 | 366 | c. No term or condition of this Public License will be waived and no 367 | failure to comply consented to unless expressly agreed to by the 368 | Licensor. 369 | 370 | d. Nothing in this Public License constitutes or may be interpreted 371 | as a limitation upon, or waiver of, any privileges and immunities 372 | that apply to the Licensor or You, including from the legal 373 | processes of any jurisdiction or authority. 374 | 375 | 376 | ======================================================================= 377 | 378 | Creative Commons is not a party to its public 379 | licenses. 
Notwithstanding, Creative Commons may elect to apply one of 380 | its public licenses to material it publishes and in those instances 381 | will be considered the “Licensor.” The text of the Creative Commons 382 | public licenses is dedicated to the public domain under the CC0 Public 383 | Domain Dedication. Except for the limited purpose of indicating that 384 | material is shared under a Creative Commons public license or as 385 | otherwise permitted by the Creative Commons policies published at 386 | creativecommons.org/policies, Creative Commons does not authorize the 387 | use of the trademark "Creative Commons" or any other trademark or logo 388 | of Creative Commons without its prior written consent including, 389 | without limitation, in connection with any unauthorized modifications 390 | to any of its public licenses or any other arrangements, 391 | understandings, or agreements concerning use of licensed material. For 392 | the avoidance of doubt, this paragraph does not form part of the 393 | public licenses. 394 | 395 | Creative Commons may be contacted at creativecommons.org. -------------------------------------------------------------------------------- /LICENSES/LICENSE-CODE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) Microsoft Corporation. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE 22 | -------------------------------------------------------------------------------- /LICENSES/SECURITY.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## Security 4 | 5 | Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/). 6 | 7 | If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below. 8 | 9 | ## Reporting Security Issues 10 | 11 | **Please do not report security vulnerabilities through public GitHub issues.** 12 | 13 | Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report). 14 | 15 | If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com).
If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc). 16 | 17 | You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc). 18 | 19 | Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue: 20 | 21 | * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.) 22 | * Full paths of source file(s) related to the manifestation of the issue 23 | * The location of the affected source code (tag/branch/commit or direct URL) 24 | * Any special configuration required to reproduce the issue 25 | * Step-by-step instructions to reproduce the issue 26 | * Proof-of-concept or exploit code (if possible) 27 | * Impact of the issue, including how an attacker might exploit the issue 28 | 29 | This information will help us triage your report more quickly. 30 | 31 | If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://microsoft.com/msrc/bounty) page for more details about our active programs. 32 | 33 | ## Preferred Languages 34 | 35 | We prefer all communications to be in English. 36 | 37 | ## Policy 38 | 39 | Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://www.microsoft.com/en-us/msrc/cvd). 40 | 41 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/00 Pre-Requisites.md: -------------------------------------------------------------------------------- 1 | ![](graphics/solutions-microsoft-logo-small.png) 2 | 3 | # Python for Data Professionals 4 | 5 |

00 Pre-Requisites

6 | 7 | The "Python for Data Professionals" course is taught using Microsoft Windows, SQL Server, and Visual Studio. You can of course use the Python language on many platforms and in other distributions and with other tools, but using this configuration allows you to stay consistent for instruction during this course. Feel free to use other installations after you complete the course. 8 | 9 | *Note that all following activities must be completed prior to class - there will not be time to perform these operations during the course.* 10 | 11 |

Activity 1: Set up the Windows Operating System

12 | 13 | You have three options for setting up Microsoft Windows to complete this course. You can use a Local installation of Windows, a Virtual Machine on your local system, or a Virtual Machine stored in a Cloud provider such as Microsoft Azure. *(The third option is only for classrooms where you have reliable connections to the Internet)* 14 | 15 |

Option 1 - Local Installation

16 | 17 | - Install a recent version of Microsoft Windows. For this course, Windows 10 or any current version of Windows Server is acceptable. 18 | - Install all updates to the operating system. 19 | 20 |

Option 2 - Install Windows on a Local Virtual Machine Environment

21 | 22 | - Using your local system, [navigate to this resource](https://developer.microsoft.com/en-us/windows/downloads/virtual-machines) and follow the instructions there. 23 | 24 | **NOTE: Set this up as close to the class date as reasonably possible so that the system does not expire before the course ends - these are free licenses, but they have a time limit.** 25 | 26 | - You can also use whatever Hypervisor you like for your system and install a legal, registered copy of Microsoft Windows. 27 | 28 |

Option 3 - Use a Virtual Machine in a Cloud Provider

29 | 30 | - If you have access to the Internet, you can set up a [free Microsoft Azure Account](https://azure.microsoft.com/en-us/free/search/?&OCID=AID631184_SEM_bSHIQHtA&lnkd=Google_Azure_Brand&gclid=Cj0KCQjwpcLZBRCnARIsAMPBgF2myLWEk3Hllm2354GEs0rD1sDST_xcfkFGRdAE8toYZMalbQJ4M3YaAs9UEALw_wcB&dclid=CPDRgcv57tsCFVXE4Qodo-gLzg) and use a [Data Science Virtual Machine](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/provision-vm). Any size will do, and the free account provides enough resources for a single course. You will not need to install Anaconda, VSCode or SQL Server if you use this choice, as they are already installed for you. 31 | - Log in to the system and run [Windows Update](https://support.microsoft.com/en-us/help/4027667/windows-update-windows-10) 32 | 33 |

Activity 2: Install SQL Server 2017 with ML Services

34 | 35 | - [Navigate to this resource](https://www.microsoft.com/en-us/sql-server/sql-server-downloads), select **Developer** from the lower part of the page, and install the **Developer Edition**. Select all components for installation. 36 | 37 | - Run Windows Update and select the ["Install updates for other products" option](https://www.lifewire.com/how-to-change-windows-update-settings-2625778). Apply the latest updates to the classroom system. 38 | 39 |

Activity 3: Install Visual Studio with Machine Learning and Data Science workloads

40 | 41 | - On your classroom system, [install Visual Studio 2017](https://www.visualstudio.com/downloads/) - The free Community Edition is adequate for this course. 42 | 43 | - During the installation, select the "Data storage and processing" and "Data science and analytical applications" Workloads. *(NOTE: [In the Data Science Workload installation box, select ALL optional components on the Summary pane!](https://blogs.msdn.microsoft.com/visualstudio/2016/11/18/data-science-workloads-in-visual-studio-2017-rc/))* 44 | 45 | - Log in with a Live ID to Visual Studio, let the system load, and apply any updates. 46 | 47 | - After the updates complete, click the "R Tools" menu item and open the "Interactive R Window" option (this verifies that the Data Science Workload add-ins for R and Python are working). Type the following in that panel to ensure the installation was successful: 48 | 49 | `x <- 10` 50 | 51 | `x` 52 | 53 | You should see the result **\[1\] 10** returned. If not, open the Visual Studio Installer and select the "Repair" option. 54 | 55 |

For Further Study

56 | 57 | - Platforms supported: https://www.python.org/download/other/ 58 | 59 | - Installing Python: https://www.python.org/downloads/ 60 | 61 | - Installing Python using Anaconda: https://www.infoworld.com/article/3267976/python/anaconda-cpython-pypy-and-more-know-your-python-distributions.html 62 | 63 | Next, Continue to *01 Overview and Course Setup* 64 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/01 Overview and Course Setup.md: -------------------------------------------------------------------------------- 1 | ![](graphics/solutions-microsoft-logo-small.png) 2 | 3 | # Python for Data Professionals 4 | 5 | ## 01 Overview and Setup 6 | 7 | In this course you'll cover the basics of the Python language and environment from a Data Professional's perspective. While you will learn Python, you'll quickly cover topics that have a lot more depth available. In each section you'll get more references to go deeper, which you should follow up on. Also watch for links within the text - click on each one to explore that topic. 8 | 9 | Make sure you check out the **00 Pre-Requisites** page before you start. You'll need all of the items loaded there before you can proceed with the course. 10 | 11 | You'll cover these topics in the course: 12 | 13 |

14 | 15 |
16 |
Course Outline
17 |
1 - Overview and Course Setup (This section)
18 |
2 - Programming Basics
19 |
3 Working with Data
20 |
4 Deployment and Environments
21 |
22 | 23 |

24 | 25 |

Overview

26 | 27 | There are two main versions of Python - 2 and 3. So many programs were written for version 2 that it is still around, and version 3 was such an upgrade that programs for 2 don't always run in 3 and vice versa. For this course we'll do everything in version 3 - it's becoming the accepted standard for data professionals. 28 | 29 | You have a few ways of working with Python: 30 | 31 | - The Interactive Interpreter (Type `python` and the version number if it is in your path) 32 | - Writing code and running it in some graphical environment (Such as VSCode, Visual Studio, Spyder, PyCharm, IDLE, etc.) 33 | - Calling a `.py` script file from the `python` command 34 | 35 | When you're in command-mode, you'll see that the code looks more like a scripting language, meaning that some parentheses around functions might not be there. Programming-mode looks like a standard programming language environment - you'll normally use that within an Integrated Development Environment (IDE). 36 | 37 |

Activity: Verify Your Installation and Configure Python

38 | 39 | Open the **01_OverviewAndCourseSetup.py** file and run the code you see there. The exercises will be marked out using comments: 40 | 41 |
42 | # TODO - Section Number
43 | 
44 | 45 |

For Further Study

46 | 47 | - Version differences: https://wiki.python.org/moin/Python2orPython3 48 | - Development Environments: IDLE, tk, VSCode, PyCharm, Jupyter Notebooks, Documentation, Training Resources: https://www.python.org/doc/ 49 | - and https://docs.python.org/3/tutorial/index.html 50 | - The Official Python Documentation Course: https://docs.python.org/3/tutorial/index.html 51 | 52 | Next, Continue to *02 Programming Basics* -------------------------------------------------------------------------------- /PythonForDataProfessionals/02 Programming Basics.md: -------------------------------------------------------------------------------- 1 | ![](graphics/solutions-microsoft-logo-small.png) 2 | 3 | # Python for Data Professionals 4 | 5 | ## 02 Programming Basics 6 | 7 |

8 | 9 |
10 |
Course Outline
11 |
1 - Overview and Course Setup
12 |
2 - Programming Basics (This section)
13 |
2.1 - Getting help
14 |
2.2 Code Syntax and Structure
15 |
2.3 Variables
16 |
2.4 Operations and Functions
17 |
3 Working with Data
18 |
4 Deployment and Environments
19 |
20 | 21 |

22 | 23 | ## Programming Basics Overview 24 | 25 | From here on out, you'll focus on using Python in programming mode - you'll write code that you run from an IDE or a calling environment, not interactively from the command-line. As you work through this explanation, copy the code you see and run it to see the results. After you work through these copy-and-paste examples, you'll create your own code in the Activities that follow each section. 26 | 27 |

2.1 - Getting help

28 | 29 | The very first thing you should learn in any language is how to get help. You can [find the help documents on-line](https://docs.python.org/3/index.html), or simply type 30 | 31 | `help()` 32 | 33 | in your code. For help on a specific topic, put the topic in the parentheses: 34 | 35 | `help(str)` 36 | 37 | To see a list of topics, pass the word as a quoted string: 38 | 39 | `help('topics')` 40 | 41 |

2.2 Code Syntax and Structure

42 | 43 | Let's cover a few basics about how Python code is written. (For a full discussion, check out the [Style Guide for Python, called PEP 8](https://www.python.org/dev/peps/pep-0008/) ) Let's use the "Zen of Python" rules from Tim Peters for this course: 44 | 45 |
 46 | 
 47 |     Beautiful is better than ugly.
 48 |     Explicit is better than implicit.
 49 |     Simple is better than complex.
 50 |     Complex is better than complicated.
 51 |     Flat is better than nested.
 52 |     Sparse is better than dense.
 53 |     Readability counts.
 54 |     Special cases aren't special enough to break the rules.
 55 |     Although practicality beats purity.
 56 |     Errors should never pass silently.
 57 |     Unless explicitly silenced.
 58 |     In the face of ambiguity, refuse the temptation to guess.
 59 |     There should be one-- and preferably only one --obvious way to do it.
 60 |     Although that way may not be obvious at first unless you're Dutch.
 61 |     Now is better than never.
 62 |     Although never is often better than right now.
 63 |     If the implementation is hard to explain, it's a bad idea.
 64 |     If the implementation is easy to explain, it may be a good idea.
 65 |     Namespaces are one honking great idea -- let's do more of those!
 66 |     --Tim Peters
 67 | 
 68 | 
69 | 70 | In general, use standard coding practices - don't use keywords for variables, be consistent in your naming (camel-case, lower-case, etc.), comment your code clearly, understand the general syntax of your language, and follow the principles above. But the most important tip is to at least read PEP 8 and decide for yourself how well that fits into your Zen. 71 | 72 | There is one hard-and-fast rule for Python that you *do* need to be aware of: indentation. You **must** indent the body of classes, functions (or methods), loops, and conditions. You can use a tab or four spaces (spaces are the accepted way to do it) but in any case, you have to be consistent. If you use tabs, you always use tabs. If you use spaces, you have to use that throughout. It's best if you set your IDE to handle that for you, whichever way you go. 73 | 74 | Python code files have an extension of `.py`. 75 | 76 | Comments in Python start with the hash character: `#`. There are no block comments (and this makes us all sad) so each line you want to comment must have a `#` in front of it. Keep the lines short (80 characters or so) so that they don't fall off a single-line display like at the command line. 77 | 78 |

2.3 Variables

79 | 80 | Variables stand in for replaceable values. Python is dynamically typed, meaning you can declare a variable name and set it to a value at the same time, and Python infers the data type from the value. You use an `=` sign to assign values, and `==` to compare things. 81 | 82 | Quotes \" or ticks \' are fine, just be consistent. 83 | 84 | `# There are some keywords to be aware of, but x and y are always good choices.` 85 | 86 | `x = "Buck" # I'm a string.` 87 | 88 | `type(x)` 89 | 90 | `y = 10 # I'm an integer.` 91 | 92 | `type(y)` 93 | 94 | To change the type of a variable, just assign something else to it: 95 | 96 | `x = "Buck" # I'm a string.` 97 | 98 | `type(x)` 99 | 100 | `x = 10 # Now I'm an integer.` 101 | 102 | `type(x)` 103 | 104 | Or cast it by explicitly declaring the conversion: 105 | 106 | `x = "10"` 107 | 108 | `type(x)` 109 | 110 | `print(int(x))` 111 | 112 | To concatenate string values, use the `+` sign: 113 | 114 | `x = "Buck"` 115 | 116 | `y = " Woody"` 117 | 118 | `print(x + y)` 119 | 120 |
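The snippets above can be combined into one short script. This is a minimal sketch; the variable names (`name`, `count`, `label`, and so on) are made up for illustration. It also shows one thing the text implies but does not demonstrate: Python will not silently mix a string and an integer, so you convert explicitly.

```python
# Assigning a value creates the variable; type() reports the type Python inferred.
name = "Buck"
count = 10
print(type(name))
print(type(count))

# Explicit conversion from a string of digits to an integer.
value = "10"
value_as_int = int(value)
print(value_as_int + 5)

# Mixing types is not converted for you: "Buck" + 10 raises a TypeError.
try:
    mixed = name + count
except TypeError as err:
    mixed = None
    print("TypeError:", err)

# Convert explicitly with str() before concatenating.
label = name + " has " + str(count) + " items"
print(label)
```

If you rebind `count` to a string afterwards, `type(count)` simply reports the new type.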

2.4 Operations and Functions

121 | 122 | Python has the following operators: 123 | 124 | Arithmetic Operators 125 | Comparison (Relational) Operators 126 | Assignment Operators 127 | Logical Operators 128 | Bitwise Operators 129 | Membership Operators 130 | Identity Operators 131 | 132 | You have the standard operators and functions from most every language. Here are some of the tokens: 133 | 134 |
135 | 
136 |     !=                  *=                  <<                  ^  
137 |     "                   +                   <<=                 ^= 
138 |     """                 +=                  <=                  `
139 |     %                   ,                   <>                  __
140 |     %=                  -                   ==                     
141 |     &                   -=                  >                   b" 
142 |     &=                  .                   >=                  b' 
143 |     '                   ...                 >>                  j  
144 |     '''                 /                   >>=                 r" 
145 |     (                   //                  @                   r' 
146 |     )                   //=                 J                   |
147 |     *                   /=                  [                   |= 
148 |     **                  :                   \                   ~  
149 |     **=                 <                   ]                      
150 | 
151 | 
152 | 153 | Wait...that's it? That's all you're going to tell me? *(Hint: use what you've learned):* 154 | 155 | `help('symbols')` 156 | 157 | Walk through each of these operators carefully - you'll use them when you work with data in the next module. 158 | 159 |
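To make the operator families listed above concrete, here is a small sketch with one or two examples from each. The values `a`, `b`, `first`, and `second` are arbitrary choices for illustration.

```python
a, b = 7, 2

# Arithmetic: add, subtract, multiply, true divide, floor divide, modulo, power.
arithmetic = (a + b, a - b, a * b, a / b, a // b, a % b, a ** b)

# Comparison (relational) operators return True or False.
comparison = (a == b, a != b, a > b, a <= b)

# Logical operators combine boolean expressions.
logical = (a > 0 and b > 0, a < 0 or b > 0, not a > b)

# Bitwise operators work on the binary representation of integers.
bitwise = (a & b, a | b, a ^ b, a << 1, a >> 1)

# Membership operators test whether a value is inside a container.
membership = ("u" in "Buck", 3 not in [0, 1, 2])

# Identity operators test whether two names point to the same object.
first = [1, 2]
second = [1, 2]
identity = (first is second, first == second)

# Assignment operators update a variable in place.
total = a
total += b

print(arithmetic, comparison, logical, bitwise, membership, identity, total)
```

Note the difference between `is` and `==` in the identity example: the two lists hold equal values, but they are separate objects.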

Activity - Programming basics

160 | 161 | Open the **02_ProgrammingBasics.py** file and run the code you see there. The exercises will be marked out using comments: 162 | 163 | `# - Section Number` 164 | 165 |

For Further Study

166 | 167 | - The PEP - https://www.python.org/dev/peps/pep-0008/ 168 | - Introduction to the Python Coding Style - http://stackabuse.com/introduction-to-the-python-coding-style/ 169 | - The Microsoft Tutorial and samples for Python - https://code.visualstudio.com/docs/languages/python 170 | - Coding requirements and standards - PEP - https://www.python.org/dev/peps/pep-0008/ 171 | - Another free online self-paced course - https://www.w3schools.com/python/default.asp 172 | 173 | Next, Continue to *03 Working with Data* -------------------------------------------------------------------------------- /PythonForDataProfessionals/03 Working with Data.md: -------------------------------------------------------------------------------- 1 | ![](graphics/solutions-microsoft-logo-small.png) 2 | 3 | # Python for Data Professionals 4 | 5 | ## 03 Working with Data 6 | 7 |

8 | 9 |
10 |
Course Outline
11 |
1 - Overview and Course Setup
12 |
2 - Programming Basics
13 |
3 Working with Data (This section)
14 |
3.1 Data Types
15 |
3.2 Data Ingestion
16 |
3.3 Data Inspection
17 |
3.4 Graphing
18 |
3.5 Machine Learning and AI
19 |
4 Deployment and Environments
20 |
21 | 22 |

23 | 24 | Working with data is the main part of this course. This section will be quite a bit longer than what you have done so far, and the Activities will be harder. Remember to use the cheat-sheets and other references in the `./assets` course directory, because not everything you need to know will be in the course explanation. You'll need to dig a bit more and experiment, use the `help()` function, and do a bit of researching to figure out how to complete the Activities. 25 | 26 |

3.1 Data Types

27 | 28 | In most any language, after the Data Professional learns how to use help, they want to find out what data types the language supports, and how the language works with them. You covered the way Python works with data in the last module (under the topic *Operators*), so now you need to figure out the types of data Python can work with. 29 | 30 | Note that the data types you'll see next are the ones built-in to the Python language. Just like a Data Platform will often take a "primitive" data type and build on that with libraries, Python will do the same thing. You'll cover that in more depth in a moment. 31 | 32 | Python has 5 standard data type "families": 33 | 34 | - Numbers 35 | - Strings 36 | - Lists 37 | - Tuples 38 | - Dictionaries 39 | 40 |

Numbers

41 | 42 | Numbers contain the following types: 43 | 44 | - int (signed integers of any size, written in decimal, octal or hex) 45 | - long (Python 2 only - Python 3 folds long integers into int) 46 | - float (real floating point numbers) 47 | - complex (numbers with a real and an imaginary part, such as `3+4j`) 48 | 49 |
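A quick sketch of the numeric types in Python 3 follows. The variable names are illustrative only.

```python
whole = 42
big = 2 ** 100        # int has no fixed upper bound in Python 3
from_hex = 0xFF       # hexadecimal literal
from_octal = 0o17     # octal literal
real = 3.14
z = 3 + 4j            # complex number: real part 3, imaginary part 4

print(type(whole).__name__, type(real).__name__, type(z).__name__)
print(big)
print(z.real, z.imag, abs(z))
```

`abs()` on a complex number returns its magnitude, which is why a 3-4-5 triangle makes a handy check here.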

Strings

50 | 51 | Strings are sequences of Unicode characters within single quotes, double quotes, or if you want to span multiple lines, triple quotes. They are treated as an array of sorts, so if you do this: 52 | 53 | `myName = "Buck"` 54 | 55 | Then you can do this: 56 | 57 | `print(myName[0])` 58 | 59 | And you get back this: 60 | 61 |
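Because a string behaves like an array of characters, you can also slice it, not just index it. The sketch below uses a made-up value (`"Buck Woody"`) and shows indexing, slicing, a triple-quoted string, and two of the formatting options.

```python
my_name = "Buck Woody"

first_char = my_name[0]      # index from the front
last_char = my_name[-1]      # negative indexes count from the end
first_word = my_name[:4]     # slice: characters 0 up to (not including) 4

# Triple quotes let a string span more than one line.
multi_line = """Line one
Line two"""

# Two common ways to format values into a string.
greeting = "Hello, {}! You have {} messages.".format(first_word, 3)
greeting_f = f"Hello, {first_word}!"

print(first_char, last_char, first_word)
print(multi_line)
print(greeting)
print(greeting_f)
```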
B
62 | 63 | Oh, and there are all kinds of formatting options you have with strings. [Check those out here](https://pyformat.info/) 64 | 65 |

Lists

66 | 67 | Lists are arrays - and you're not limited to a single dimension. You define them by enclosing the values in square brackets. 68 | 69 | Here's a list: 70 | 71 | `myList = [0, 1, 2]` 72 | 73 | And now you can decorate that with data: 74 | 75 | `myList[0] = "One"` 76 | 77 | `myList[1] = "Two"` 78 | 79 | `myList[2] = "Three"` 80 | 81 | `print(myList)` 82 | 83 | `print(myList[2])` 84 | 85 | And so on. There are also ranges, loops, and methods you can use on lists - [more on that here](https://www.tutorialspoint.com/python/python_lists.htm) 86 | 87 |
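Here is a short sketch of a few of the list methods and techniques mentioned above: growing and shrinking a list, slicing, a second dimension, and a loop written as a list comprehension. The names and values are illustrative.

```python
my_list = ["One", "Two", "Three"]

my_list.append("Four")        # grow the list at the end
removed = my_list.pop(0)      # remove and return the first item
first_two = my_list[:2]       # slice the first two remaining items

# A list of lists gives you a second dimension.
grid = [[1, 2, 3], [4, 5, 6]]
center = grid[1][1]           # second row, second column

# A list comprehension loops over the list and builds a new one.
lengths = [len(item) for item in my_list]

print(my_list, removed, first_two, center, lengths)
```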

Tuples

88 | 89 | Tuples are similar to Lists, but are immutable - you can't extend, shrink, or change them after you create them. Think of them as a read-only SQL Table. You define a tuple with parentheses rather than square brackets. 90 | 91 | Use a Tuple when you want to "protect" the data structure so that no one changes the structure after you define it. You'll see some real-world examples throughout this course. 92 | 93 |
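The following sketch shows what "immutable" means in practice. The tuple `point` is a made-up example; trying to change it raises a `TypeError`, and the way around that is to build a new list from it.

```python
point = (3, 4)

# Unpack a tuple into separate variables.
x, y = point

# Tuples cannot be changed in place.
try:
    point[0] = 10
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# A one-item tuple needs a trailing comma.
single = (5,)

# To change the data, copy it into a list first.
as_list = list(point)
as_list.append(5)

print(x, y, single, as_list)
```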

Dictionaries

94 | 95 | Dictionaries are Key-Value Pair (KVP) data. You set these up with a curly-brace, the key, a colon, and then the value, like this: 96 | 97 | `myDict = {1: "Buck", 2: "Jane", 3: "Jim"}` 98 | 99 | Now you can look up a value by its key. For instance, to show the value for key 1, it's simply: 100 | 101 | `myDict[1]` 102 | 103 | Lookups only go from key to value (`myDict["Buck"]` raises a `KeyError`), so to find the key for Buck, you search the items: 104 | 105 | `[key for key, value in myDict.items() if value == "Buck"]` 106 | 107 | Dictionaries are used quite frequently in Python, so you should take some time to [read up on them here](https://docs.python.org/2/tutorial/datastructures.html#dictionaries) 108 | 109 |
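Here is a runnable sketch of the dictionary operations you are most likely to need. It reuses the sample data from above under the name `my_dict`; the other names are illustrative. A dictionary looks up values by key, so finding a key from a value takes a search or an inverted copy.

```python
my_dict = {1: "Buck", 2: "Jane", 3: "Jim"}

# Look up a value by its key.
value = my_dict[1]

# .get() returns a default instead of raising KeyError for a missing key.
missing = my_dict.get(99, "not found")

# Find the key (or keys) that hold a given value by searching the items.
keys_for_buck = [key for key, val in my_dict.items() if val == "Buck"]

# Or build an inverted dictionary if the values are unique.
by_name = {val: key for key, val in my_dict.items()}

# Add a new key-value pair.
my_dict[4] = "Ann"

print(value, missing, keys_for_buck, by_name["Jane"], len(my_dict))
```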

Activity - Programming basics

110 | 111 | Open the **03_WorkingWithData.py** file and enter the code you find for section 3.1. The exercises will be marked out using comments: 112 | 113 | `# - 3.1 ` 114 | 115 |

3.2 Side-track: Working with Libraries for Data

116 | 117 | Python includes most of the functions you need to read data from files, work with them in memory and so on in the base installation. However, you can add to the functions available to your code by using *Libraries*. Libraries are code someone else has written that you add in to your program from the start, using an `import` statement. You'll cover more information on working with Libraries (you'll also see them referred to as Modules or Packages) in a future lesson, but data "wrangling" (importing, manipulating and exporting) usually involves adding in at least one or two Libraries, so you'll cover that here. 118 | 119 |

NumPy

120 | 121 | *NOTE: You'll need to install both NumPy and Pandas before you can use them. You will cover that in a later lesson - your pre-requisites included this installation for now.* 122 | 123 | To work with numeric data, the first library you should become familiar with is *NumPy* (Numerical Python). The primary structure in NumPy is the *array*. 124 | 125 | To load the library, use the `import` statement with an optional "alias" of np: 126 | 127 | `import numpy as np` 128 | 129 | Now when you reference NumPy's methods and properties, you can use the shorter `np` label. 130 | 131 | It's simple enough to create and work with an array, now that you have the library loaded. This code creates a 2-dimensional NumPy array, and sets the values to integer: 132 | 133 | `x = np.array([(1,2,3), (4,5,6)], dtype = int)` 134 | 135 | The next important concept in NumPy is that the array is actually a block of memory plus the metadata that describes it, involving four main components: 136 | 137 | - *data* : A buffer that points to the first byte in the array 138 | 139 | - *dtype* : The type of the elements in the array 140 | 141 | - *shape* : The layout of the array 142 | 143 | - *strides* : The number of bytes skipped in memory to go to the next element of the array 144 | 145 | Here are those properties in action: 146 | 147 | `print(x.data)` 148 | 149 | `print(x.dtype)` 150 | 151 | `print(x.shape)` 152 | 153 | `print(x.strides)` 154 | 155 | Now you can use the array, mostly by doing maths on it. Here are a few examples: 156 | 157 | Add, subtract, multiply and divide x and a second array y of the same shape (for instance `y = np.array([(7,8,9), (10,11,12)], dtype = int)`): 158 | 159 | `np.add(x,y)` 160 | 161 | `np.subtract(x,y)` 162 | 163 | `np.multiply(x,y)` 164 | 165 | `np.divide(x,y)` 166 | 167 | You can experiment with a few more NumPy operations on your own in the Activities that follow. 168 | 169 |
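The arithmetic functions above need two arrays of compatible shape. As a minimal sketch, assuming NumPy is installed, the script below defines a second array `y` (its values are made up for illustration) and applies each function element by element.

```python
import numpy as np

x = np.array([(1, 2, 3), (4, 5, 6)], dtype=int)
y = np.array([(10, 20, 30), (40, 50, 60)], dtype=int)  # illustrative values

total = np.add(x, y)            # same result as x + y
difference = np.subtract(y, x)  # same result as y - x
product = np.multiply(x, y)     # element-wise, not matrix multiplication
quotient = np.divide(y, x)      # true division returns floats

# Aggregate down each column (axis 0).
col_sums = x.sum(axis=0)

print(x.shape)
print(total)
print(quotient)
print(col_sums)
```

The operators `+`, `-`, `*`, and `/` do the same element-wise work on arrays, so `x + y` and `np.add(x, y)` are interchangeable here.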

Activity - Programming with NumPy

170 | 171 | Open the **03_WorkingWithData.py** file and enter the code you find for section 3.1a. The exercises will be marked out using comments: 172 | 173 | `# - 3.1a` 174 | 175 |

Pandas

176 | 177 | The primary library you'll use in working with data in Python is *Pandas* (the *Python Data Analysis Library*). Pandas provides many methods and properties that you can work with for your data, and it also has other data structures that make it more efficient to work with data. 178 | 179 | Just as in NumPy, use the `import` statement to load the Pandas Library: 180 | 181 | `import pandas as pd` 182 | 183 | Once the library is in memory, you start using it by creating a *dataframe* - the primary object Pandas works with. A dataframe is a mixed-type structure that looks similar to a SQL Table, and is very efficient. You can assign almost any data to a dataframe - here's an example that creates a dataframe by reading a comma-separated file: 184 | 185 | `my_df = pd.read_csv('./data/data.csv')` 186 | 187 | This illustrates one way of ingesting data, and in a moment you'll see a few more. Pandas has a lot of data sources it can work with, from the Clipboard to various filetypes. Here's a short list: 188 | 189 | - Flat Files 190 | 191 | - Clipboard 192 | 193 | - Excel 194 | 195 | - JSON 196 | 197 | - HTML 198 | 199 | - HDFStore: PyTables (HDF5) 200 | 201 | - Feather 202 | 203 | - Parquet 204 | 205 | - SAS 206 | 207 | - SQL 208 | 209 | - Google BigQuery 210 | 211 | - STATA 212 | 213 | ...among others. 214 | 215 | Now with the dataframe (`my_df`) loaded, it's an object you can work with. If you just type the name of the dataframe, you'll get back the data in the "table". 216 | 217 | Pandas has a lot of functions that allow you to work with data after you've inspected it. To work with datasets like you would in an RDBMS, here are a few examples. 218 | 219 | Starting with an equivalent (kind of) to the SELECT statement in SQL, you can project a single column with the statement `my_df[col]`, which comes back as a Pandas *Series*. To project several columns, pass a list of column names, as in `my_df[[col1, col2]]`. These will come back as a new dataframe. 220 | 221 | If you want to use an ordinal position use `my_df.iloc[0]`. 
222 | 223 | If you know the index label you want, use `my_df.loc['index_one']`. 224 | 225 | If you want the whole row, use `my_df.iloc[0,:]` (for the first row). 226 | 227 | For a WHERE clause, use the comparison tokens you saw earlier. For instance, to get the months earlier than October, use `my_df[my_df[month] < 10]`. 228 | 229 | For ORDER BY, use the sort_values function. This command sorts the dataframe by the column *col1* in ascending order: `my_df.sort_values(col1,ascending=True)`. 230 | 231 | Moving on to JOIN operations, you have the ability to use multiple kinds of joins - for instance, the statement `my_df.merge(my_df2,on=col1,how='inner')` joins the two dataframes `my_df` and `my_df2` on the column *col1* (which must exist in both dataframes). 232 | 233 | There's a lot more you can do with Pandas, including a lot of data cleaning operations that you'll use for Machine Learning and other Data Science tasks. You'll experiment with this in your Activities. 234 | 235 | Want to learn more? Check this reference: https://pandas.pydata.org/pandas-docs/stable/tutorials.html 236 | 237 |
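To tie the SQL comparisons together, here is a small self-contained sketch, assuming Pandas is installed. The two dataframes (`sales` and `customers`) and their column names are invented for illustration and are built in memory, so no data file is needed. The join uses `merge`, which matches rows on a column that exists in both dataframes.

```python
import pandas as pd

# Hypothetical sample data, built in memory.
sales = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "month": [3, 11, 9, 12],
    "amount": [250.0, 100.0, 75.5, 310.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Buck", "Jane", "Jim"],
})

# SELECT month, amount FROM sales
projected = sales[["month", "amount"]]

# SELECT * FROM sales WHERE month < 10
before_october = sales[sales["month"] < 10]

# SELECT * FROM sales ORDER BY amount DESC
ordered = sales.sort_values("amount", ascending=False)

# SELECT * FROM sales INNER JOIN customers USING (customer_id)
joined = sales.merge(customers, on="customer_id", how="inner")

print(projected)
print(before_october)
print(ordered)
print(joined)
```

Customer 4 has no match in `customers`, so the inner join drops that row; changing `how` to `'left'` would keep it with a missing name.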

Activity - Programming with Pandas

238 | 239 | Open the **03_WorkingWithData.py** file and enter the code you find for section 3.1b. The exercises will be marked out using comments: 240 | 241 | `# - 3.1b` 242 | 243 | (Note: Use the Cheat-sheets in the `./assets` directory in the exercise that follows) 244 | 245 |

3.3 Data Ingestion

246 | 247 | Python has many ways to read data in (*sometimes into memory, sometimes streaming as it reads it*) built right in to the standard libraries. Other Libraries, such as Pandas and NumPy, have their own way of reading in data. 248 | 249 | In any case, the data is assigned to a data family or *structure*, which you learned about earlier. Depending on which Library you are using, you'll pick a data structure that makes the most sense for how you want to work with it. For instance, Pandas uses a dataframe as the primary data structure it works with. This is why it's important to know the data types, so that you understand what structure you need to perform your desired operations. 250 | 251 |

Reading from Files

252 | 253 | Many times the data you are looking for is in storage, either locally or remotely. *File-source* based data is loosely defined as whatever data the operating system can reach natively. 254 | 255 | *NOTE:* This means that when you write your code, it's important to know where it will run. Python is an *interpreted* language, which means that it will run on a given platform in a certain way. If you load data from a Windows file system, and it gets deployed to a Linux system, you need to make sure the file-paths check for validity. 256 | 257 | You've already seen how to read data with Pandas. For the built-in Python library, you most often use the csv reader on comma-separated value data. To use it, import the `csv` module. From there, you can use a "with" block to process the file. This example opens a file, uses an if statement to process each line, and if the line contains "carrot", prints the ingredient, the type of carrot (shredded, sliced, etc.), and the amount for the recipe: 258 | 259 |
260 | import csv
261 | with open('mydata.csv') as csvfile:
262 |     reader = csv.DictReader(csvfile)
263 |     for row in reader:
264 |         if row['ingredient'] == 'carrot':
265 |             print(row['ingredient'], row['type'], row['amount'])
266 | 
267 | 268 | (Note the indentation - very important!) 269 | 270 | The csv reader has a "dialect" modifier so that you can work with CSV files that are stored in a particular way - use the `help()` function to learn more. 271 | 272 | Reference: https://realpython.com/python-csv/ 273 | 274 |
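If you don't have a recipe file handy, you can try the same pattern against in-memory text. In this sketch, `io.StringIO` stands in for an open file, and the sample rows are invented for illustration. The last part shows one way to handle a file that uses a different separator, by passing a `delimiter` argument to the reader.

```python
import csv
import io

# io.StringIO behaves like an open text file; the rows are made-up sample data.
sample = io.StringIO(
    "ingredient,type,amount\n"
    "carrot,shredded,2 cups\n"
    "onion,diced,1 cup\n"
    "carrot,sliced,1 cup\n"
)

reader = csv.DictReader(sample)
carrots = [row for row in reader if row["ingredient"] == "carrot"]
for row in carrots:
    print(row["ingredient"], row["type"], row["amount"])

# A file that separates values with semicolons instead of commas.
semicolon_sample = io.StringIO("ingredient;amount\ncarrot;2\n")
semicolon_rows = list(csv.DictReader(semicolon_sample, delimiter=";"))
print(semicolon_rows[0]["amount"])
```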

Working with Data in Databases

275 | 276 | Python has Libraries available that allow you to connect to a Relational Database Management System (RDBMS). The `pyodbc` Library is one of the most widely used, and works well with Microsoft's SQL Server. You can read more about pyodbc and download it here: https://docs.microsoft.com/en-us/sql/connect/python/pyodbc/python-sql-driver-pyodbc?view=sql-server-2017 277 | 278 | Once you install it (more on installing Libraries later), you once again import it, and then set up your connection. You then use the connection to send a query, returning a dataset, or updating data if that's what you're going for. Here's an example: 279 | 280 |
281 |     import pyodbc
282 | 
283 |     server = 'tcp:myserver.database.windows.net'
284 |     # Some other example server values are
285 |     # server = 'localhost\sqlexpress' for a named instance
286 |     # server = 'myserver,port' to specify an alternate port
287 | 
288 |     database = 'mydb'
289 |     username = 'myusername'
290 |     password = 'mypassword'
291 | 
292 |     cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ 
293 |     password)
294 | 
295 |     cursor = cnxn.cursor()
296 | 
297 |     # Sample select query
298 |     cursor.execute("SELECT @@version;")
299 |     row = cursor.fetchone()
300 | 
301 |     while row: 
302 | 
303 |         print(row[0])
304 |         row = cursor.fetchone()
305 | 
306 |     # Sample insert query
307 | 
308 |     cursor.execute("INSERT SalesLT.Product (Name, ProductNumber, StandardCost, ListPrice, SellStartDate) OUTPUT INSERTED.ProductID "
309 |     "VALUES ('SQL Server Express New 20', 'SQLEXPRESS New 20', 0, 0, CURRENT_TIMESTAMP )")
310 | 
311 |     row = cursor.fetchone()
312 |     while row:
313 |         print('Inserted Product key is ' + str(row[0]))
314 |         row = cursor.fetchone()
315 | 
316 | 317 |
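If you don't have a SQL Server instance to connect to, you can practice the same connect, cursor, execute, and fetch pattern with `sqlite3`, which ships with Python. To be clear, this is a stand-in and not the pyodbc example above: the in-memory database, the `product` table, and its columns are invented for illustration. Both libraries follow the Python DB-API, so the cursor calls look alike, and both accept `?` placeholders for passing values safely.

```python
import sqlite3

# An in-memory database, used here as a stand-in for a real server.
cnxn = sqlite3.connect(":memory:")
cursor = cnxn.cursor()

# Hypothetical table for the example.
cursor.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, list_price REAL)"
)

# Pass values as parameters rather than building the SQL string by hand.
cursor.execute(
    "INSERT INTO product (name, list_price) VALUES (?, ?)",
    ("SQL Server Express New 20", 0.0),
)
cnxn.commit()                # make the insert permanent
new_id = cursor.lastrowid
print("Inserted Product key is " + str(new_id))

# Query the data back and loop over the result rows.
cursor.execute("SELECT id, name FROM product WHERE list_price = ?", (0.0,))
rows = cursor.fetchall()
for row in rows:
    print(row[0], row[1])

cnxn.close()
```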

Data in Other Sources

318 | 319 | Many other data sources, such as cloud databases and network streams, also have ways of connecting from Python. Even web pages can be used as data sources. One of the primary Libraries for working with web data is *Beautiful Soup*, [which you can find here](https://www.crummy.com/software/BeautifulSoup/). You normally need to connect to the web page first, so for that you use another import, using `requests`, or perhaps `urllib` (`urllib2` in Python 2). 320 | 321 | Here's an example of reading a web page and printing all of the text it contains: 322 | 323 |
324 |     from bs4 import BeautifulSoup
325 |     import requests
326 |     html_doc = requests.get("http://coolwebpage.com").text
327 |     soup = BeautifulSoup(html_doc, 'html.parser')
328 |     print(soup.get_text())
329 | 
330 | 331 |

Activity - Data Ingestion

332 | 333 | Open the **03_WorkingWithData.py** file and enter the code you find for section 3.2. The exercises will be marked out using comments: 334 | 335 | `# - 3.2` 336 | 337 |

3.4 Data Inspection

338 | 339 | After the data is loaded into a structure, the first step in analytics is to examine the data. You've already seen how to display the data using Pandas, and it's one of the best libraries for data exploration as well. 340 | 341 | Analytics professionals often start with the basics of the statistical layout of the numeric data in a dataset. If you want to see the basic statistics of your data stored in a dataframe called *my_df*, type `my_df.describe()`. 342 | 343 | You'll also want to determine the amount of data you're working with. To do that, type `my_df.shape` to get the number of rows and columns in a dataframe. 344 | 345 | Typing `my_df.head(n)` gives you the first `n` rows of the data, or use `my_df.tail(n)` to get the last `n` rows. 346 | 347 | Another way to see the "shape" of the data is to use `my_df.info()` to see the index, datatypes and memory information for the dataframe. 348 | 349 |
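Here is a minimal sketch of those inspection calls in one place, assuming Pandas is installed. The dataframe contents are invented for illustration.

```python
import pandas as pd

# Hypothetical data, built in memory.
my_df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6],
    "amount": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "region": ["East", "West", "East", "West", "East", "West"],
})

rows, cols = my_df.shape       # (number of rows, number of columns)
summary = my_df.describe()     # statistics for the numeric columns
first_two = my_df.head(2)      # first 2 rows
last_two = my_df.tail(2)       # last 2 rows

print(rows, cols)
print(summary)
print(first_two)
print(last_two)
my_df.info()                   # index, datatypes, and memory use
```

By default `describe()` skips the text column `region`; passing `include='all'` asks for a summary of every column.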

Activity - Data Inspection

350 | 351 | Open the **03_WorkingWithData.py** file and enter the code you find for section 3.3. The exercises will be marked out using comments: 352 | 353 | `# - 3.3` 354 | 355 |

3.4 Graphing

356 | 357 | Examining the data in tabular format won't give you all you need to evaluate and interpret it. It is very useful to display the data in a graphical format, and once again you'll turn to Libraries to do that. There are many Libraries for graphing data in Python, and more are written constantly. The primary Libraries you should be familiar with are MatPlotLib and ggplot. 358 | 359 |

Graphing Data with MatPlotLib

360 | 361 | MatPlotLib is quite old, but it's the most widely used graphical library for plotting in Python. It borrowed much of its design from a commercial industry standard called MATLAB. Many other Libraries are built on top of MatPlotLib or simply work alongside it. 362 | 363 | Take a look at an example of a histogram plot with MatPlotLib: 364 | 365 |
366 |     import matplotlib
367 |     from numpy.random import randn
368 |     import matplotlib.pyplot as plt
369 |     from matplotlib.ticker import FuncFormatter
370 | 
371 |     def to_percent(y, position):
372 |         # Ignore the passed in position. This has the effect of scaling the default
373 |         # tick locations.
374 |         s = str(100 * y)
375 | 
376 |         # The percent symbol needs escaping in latex
377 |         if matplotlib.rcParams['text.usetex'] is True:
378 |             return s + r'$\%$'
379 |         else:
380 |             return s + '%'
381 | 
382 |     x = randn(5000)
383 | 
384 |     # Make a normalized (density) histogram. It'll be multiplied by 100 later.
385 |     plt.hist(x, bins=50, density=True)
386 | 
387 |     # Create the formatter using the function to_percent. This multiplies all the
388 |     # default labels by 100, making them all percentages
389 |     formatter = FuncFormatter(to_percent)
390 | 
391 |     # Set the formatter
392 |     plt.gca().yaxis.set_major_formatter(formatter)
393 | 
394 |     plt.show()
395 | 
396 | 397 | ![](./graphics/MatPlotLib.png) 398 | 399 | Of course, MatPlotLib can do so much more. [Take a look at this reference from the documentation which goes deeper.](https://matplotlib.org/examples/index.html) 400 | 401 |

Graphing with ggplot

402 | 403 | The ggplot library is a Python port of the R language's *ggplot2* package. It follows the guidelines from the *Grammar of Graphics* reference work. The commands in ggplot layer the graphical components: you make a base graphic, and then add axes, lines, a trendline, coloring and more on top of it. 404 | 405 | Here's an example of a plot using the ggplot Library, with the mtcars sample dataset. Notice how it "builds" on the plot so that it's fairly easy to see how it represents each part: 406 |
407 |     from ggplot import *
408 | 
409 |     p = ggplot(aes(x='mpg'), data=mtcars)
410 |     p += geom_histogram()
411 |     p += xlab("Miles per Gallon")
412 |     p += ylab("# of Cars")
413 |     p
414 | 
415 | 416 | ![](./graphics/ggplot.png) 417 | 418 | [Check out the official documentation for many more examples.](https://github.com/yhat/ggpy/tree/master/docs) 419 | 420 |

Activity - Graphing

421 | 422 | Open the **03_WorkingWithData.py** file and enter the code you find for section 3.4. The exercises will be marked out using comments: 423 | 424 | `# - 3.4` 425 | 426 |

3.6 Altering Data

427 | 428 | Most data isn't "clean" by default. It's in the wrong format, is missing values, or isn't structured the way you need it. For this type of work, there are two basic tools you should learn: Regular Expressions and, once again, Pandas. You won't cover an exercise on data editing in this section; instead you'll see an example of that as part of a Machine Learning exercise. 429 | 430 | You can use Regular Expressions in Python to make a lot of your changes - you can read more about that here: https://docs.python.org/3/library/re.html 431 | 432 | But most of the time you'll be using Pandas to make those changes. You can read more about that here: https://tomaugspurger.github.io/modern-5-tidy.html 433 | 434 | And of course there are lots of other things to know about altering data. Read this resource for more: https://www.springboard.com/blog/data-wrangling/ 435 | 436 |
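As a small taste of cleaning with Regular Expressions, the sketch below uses the built-in `re` module on two made-up lists: phone numbers typed in different formats, and amounts with missing entries. The data and variable names are illustrative only.

```python
import re

# Phone numbers entered in inconsistent formats (made-up data).
raw_phones = ["(555) 123-4567", "555.123.4567", " 555 123 4567 "]

# Strip everything that is not a digit.
digits_only = [re.sub(r"\D", "", phone) for phone in raw_phones]

# Re-insert dashes using three capture groups.
pattern = re.compile(r"^(\d{3})(\d{3})(\d{4})$")
formatted = [pattern.sub(r"\1-\2-\3", digits) for digits in digits_only]

# Turn anything that is not a whole number into a missing value (None).
raw_amounts = ["10", "", "N/A", "25"]
cleaned = [
    int(value) if re.fullmatch(r"\d+", value) else None
    for value in raw_amounts
]

print(formatted)
print(cleaned)
```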

3.7 Machine Learning and AI

437 | 438 | A full course on Machine Learning (and one of its applications, Artificial Intelligence), is long and involved. Machine Learning involves evaluating data for *features* (columns) that can create *labels* (predictions or classifications). You do this by using a collection of historical data, and selecting the most predictive features and applying one or more algorithms to that data. You get back a *model* (which is kind of like a function) that you can send new data to for a prediction. This is a bit of an oversimplification of course, but it will serve you well as you work through this course. For a more comprehensive discussion on Data Science and Machine Learning with Python, check out this reference: https://notebooks.azure.com/jakevdp/libraries/PythonDataScienceHandbook 439 | 440 | There are a few "families" of problems you can solve with a Machine Learning Solution: 441 | 442 |

443 | 444 |

445 | 446 | While it's tempting to start with the algorithms and the outputs, it's actually more important to understand the general process of a Data Science project. To do that, you can use the Team Data Science Process - in fact, you have been studying many of these steps already: 447 | 448 |

449 | 450 |

451 | 452 | Each of these phases has a specific set of steps you follow to complete them: 453 | 454 |

455 | 456 |

Phase One - Business Understanding

457 | 458 | In the Business Understanding Phase the team determines the prediction or categorical work your organization wants to create. You'll also set up your project planning documents, locate your initial data source locations, and set up the environment you will use to create and operationalize your models. This phase involves a great deal of coordination among the team and the broader organization. 459 | 460 | Read the [Documentation Reference here](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/lifecycle-business-understanding) 461 | 462 |

463 | 464 |

Phase Two - Data Acquisition and Understanding

465 | 466 | Read the [Documentation Reference here](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/lifecycle-data) 467 | 468 | In the Data Acquisition and Understanding phase of the TDSP, you ingest or access data from various locations to answer the questions the organization has asked. In most cases, this data will be in multiple locations. Once the data is ingested into the system, you’ll need to examine it to see what it holds. All data needs cleaning, so after the inspection phase, you’ll replace missing values and add or change columns. You’ve already seen the Libraries you'll need to work with for Data Wrangling - Pandas being the most common in use. 469 | 470 |
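The inspect-then-clean steps described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: the three sample rows are made up (the column names are borrowed from the course's churn data), and the helper names `load_rows`, `count_missing` and `fill_missing` are hypothetical.

```python
import csv
import io

# Made-up sample rows; only the column names come from the course data set.
SAMPLE = """customerid,age,annualincome,state
1,34,52000,WA
2,,61000,OR
3,45,,WA
"""

def load_rows(text):
    """Ingest: parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def count_missing(rows):
    """Inspect: count the empty values in each column."""
    return {col: sum(1 for r in rows if r[col] == "") for col in rows[0]}

def fill_missing(rows, value="0"):
    """Clean: return new rows with empty values replaced by a default."""
    return [{k: (v if v != "" else value) for k, v in r.items()} for r in rows]

rows = load_rows(SAMPLE)
print(count_missing(rows))        # missing values before cleaning
clean_rows = fill_missing(rows)
print(count_missing(clean_rows))  # missing values after cleaning
```

With Pandas, the same inspection and fill are roughly `df.isnull().sum()` and `df.fillna(0)`, which is the approach the course code takes.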

471 |

Phase Three - Modeling

472 | 473 | In this phase, you will create the experiment runs, perform feature engineering, and run experiments with various settings and parameters. After selecting the best performing run, you will create a trained model and save it for operationalization in the next phase. This modeling is done with yet another set of Python Libraries - the most common being SciKit Learn and TensorFlow, among others. You'll see this in action in just a bit. 474 | 475 | Read the [Documentation Reference here](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/lifecycle-modeling) 476 | 477 |
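To make the "try several settings, keep the best run, save the model" loop concrete, here is a deliberately tiny sketch in plain Python. It is not SciKit Learn: the "model" is just a threshold on one feature, the data is made up, and the `predict`, `accuracy` and `fit` helpers are hypothetical names chosen for this illustration.

```python
import pickle

# Made-up history: (numberofcomplaints, churn) pairs - not the course data set.
train = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 1)]
test = [(0, 0), (2, 0), (3, 1), (5, 1), (1, 0)]

def predict(model, feature):
    """Predict churn (1) when the feature reaches the model's threshold."""
    return 1 if feature >= model["threshold"] else 0

def accuracy(model, rows):
    """Fraction of rows where the prediction matches the label."""
    hits = sum(1 for feature, label in rows if predict(model, feature) == label)
    return hits / len(rows)

def fit(rows, thresholds):
    """Try each setting and keep the one that scores best on the training rows."""
    candidates = [{"threshold": t} for t in thresholds]
    return max(candidates, key=lambda m: accuracy(m, rows))

model = fit(train, thresholds=(1, 2, 3, 4))
print("chosen threshold:", model["threshold"])
print("test accuracy:", accuracy(model, test))

# Save the selected model for the deployment phase (bytes could go to a .pkl file)
saved = pickle.dumps(model)
restored = pickle.loads(saved)
```

A real project would swap the threshold search for library estimators and a proper train/test split, but the shape of the process stays the same.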

478 |

Phase Four - Deployment

479 | 480 | In this phase you will take the trained model and any other necessary assets and deploy them to a system that will respond to API requests. 481 | 482 | Read the [Documentation Reference here](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/lifecycle-deployment) 483 | 484 |
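As a rough sketch of what "responding to API requests" means for a model, the function below takes a JSON request string and returns a JSON response string. Everything here is an assumption for illustration: the `MODEL` dictionary stands in for a trained model loaded from disk, and the `score` function name and the request fields are hypothetical.

```python
import json

# Stand-in for a trained model loaded from disk (hypothetical, for illustration only)
MODEL = {"threshold": 3}

def score(request_body):
    """Take a JSON request string and return a JSON response string."""
    record = json.loads(request_body)
    prediction = 1 if record["numberofcomplaints"] >= MODEL["threshold"] else 0
    return json.dumps({"churn": prediction})

print(score('{"customerid": 1, "numberofcomplaints": 4}'))
```

In a deployed system a web framework or service would call a function like this for every incoming request.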

485 |

Phase Five - Customer Acceptance

486 | 487 | The final phase involves testing the model predictions on real-world queries to ensure that it meets all requirements. In this phase you also document the project so that all parameters are well-known. Finally, a mechanism is created to re-train the model. 488 | 489 | Read the [Documentation Reference here](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/lifecycle-acceptance) 490 | 491 |

492 | 493 | As you can see, there are quite a few things to do to work with Python in a Data Science Machine Learning project. Rather than have you create an entire solution, there is one you can examine to see each phase. You'll do that next. 494 | 495 |

Activity - Machine Learning

496 | 497 | Now open the `/code/03_WorkingWithData.py` file and read the code-blocks you see there marked "Machine Learning". 498 | 499 | Don't worry too much about the math and the functions in the Machine Learning Libraries; just focus on the process. Then swing back around to the Data Science with Python reference for a deeper dive into this very large area. 500 | 501 | Want to see this in action? Check out this reference: https://tdsppython-buckwoodynotebooks.notebooks.azure.com/nb/notebooks/Instructor%20Notebook.ipynb 502 | 503 |

For Further Study

504 | 505 | - [Python Docs for Data Types](https://docs.python.org/2/tutorial/datastructures.html#) 506 | 507 | Next, Continue to *04 Environments and Deployment* -------------------------------------------------------------------------------- /PythonForDataProfessionals/04 Environments and Deployment.md: -------------------------------------------------------------------------------- 1 | ![](graphics/solutions-microsoft-logo-small.png) 2 | 3 | # Python for Data Professionals 4 | 5 | ## 04 Environments and Deployment 6 | 7 |

8 | 9 |
10 |
Course Outline
11 |
1 - Overview and Course Setup
12 |
2 - Programming Basics
13 |
3 - Working with Data
14 |
4 - Environments and Deployment (This section)
15 |
4.1 Conda
16 |
4.2 Pickling
17 |
4.3 SQL Server Machine Learning Services
18 |
19 | 20 |

21 | 22 | The main installation of Python - sometimes called "Core" or "base" - has a set of parameters it works with. Since it runs on many operating systems, these variables are set and altered in different ways. Here are the primary environment settings on the standard installation of Python: 23 | 24 | - PYTHONPATH - Adds directories to the search path the Python interpreter uses to locate the module files imported into a program. 25 | - PYTHONHOME - Changes the location of the standard Python libraries. 26 | - PYTHONSTARTUP - The path to an initialization file (for example `.pythonrc.py` ) containing Python source code. It is executed every time you start the interpreter in interactive mode. 27 | - PYTHONCASEOK - If set, Python ignores case in "import" statements. This only works on Windows and macOS. 28 | 29 | You can show the configuration variables for your installation by importing the base configuration system library, and then calling this function: 30 | 31 | `import sysconfig` 32 | 33 | `sysconfig.get_config_vars()` 34 | 35 | If you want to see just one variable, remember, the result is just a dictionary, so you can ask for a single name: 36 | 37 | `sysconfig.get_config_var('LIBDIR')` 38 | 39 |
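The following sketch pulls the calls above together using only the standard library. It also shows that the `PYTHON...` environment variables are read from the operating system environment rather than from `sysconfig`. Note that `LIBDIR` is not defined on every platform, in which case `get_config_var` returns `None`.

```python
import os
import sysconfig

# All build-time configuration settings, returned as a dictionary
config = sysconfig.get_config_vars()
print(type(config).__name__, len(config))

# Look up one entry; the result is None if the name is not defined on this platform
print(sysconfig.get_config_var("LIBDIR"))

# Where the standard library and installed packages live
paths = sysconfig.get_paths()
print(paths["stdlib"])
print(paths["purelib"])

# The environment variables listed above live in the OS environment
for name in ("PYTHONPATH", "PYTHONHOME", "PYTHONSTARTUP", "PYTHONCASEOK"):
    print(name, "=", os.environ.get(name, "<not set>"))
```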

40 | 41 |

4.1 pip and Conda

42 | 43 | To install new packages, you can build the source code manually, but that's not the way it's most often done. Typically you use a "package manager", and the most popular is "pip". The pip program installs and configures most of the libraries you will need for the base installation of Python. 44 | 45 | You probably already have the pip program. However, to install pip, you can use the [cURL](https://curl.haxx.se/download.html) program to get it: 46 | 47 | `curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py` 48 | 49 | Then use Python to run the script to install it: 50 | 51 | `python get-pip.py` 52 | 53 | From there, you can query the packages you have with this command, from the command-line in your operating system: 54 | 55 | `pip list` 56 | 57 | You can install a package using this command: 58 | 59 | `pip install SomePackage # latest version` 60 | 61 | `pip install SomePackage==1.0.4 # specific version` 62 | 63 | `pip install 'SomePackage>=1.0.4' # minimum version` 64 | 65 | And you can remove a package with this command: 66 | 67 | `pip uninstall SomePackage` 68 | 69 | There is a lot more that you can do with pip, and you can find out the list here: 70 | 71 | `pip` 72 | 73 | A more robust package manager, which even installs a distribution of Python for you along with other tools, is [Conda](https://conda.io/docs/user-guide/getting-started.html). For this course, you have installed Python using Conda, which not only has a package manager, but also isolates environments for you. This means that you can create a "boundary" of variables, package directories, and more around a name you specify. You can then switch to that environment to create your code, and that code will always have a consistent set of variables and packages. 
74 | 75 | To create a Conda environment, issue the following command: 76 | 77 | `conda create --name` 78 | 79 | For instance, this command creates a new environment called "bucktest" and installs the biology package called biopython: 80 | 81 | `conda create --name bucktest biopython` 82 | 83 | To see the environments, issue the following command: 84 | 85 | `conda info --envs` 86 | 87 | The one with the asterisk (*) is the one you are using now. To switch to another environment, issue the following command: 88 | 89 | `activate bucktest` (In Windows) 90 | 91 | `source activate bucktest` (Mac and Linux) 92 | 93 | And to see information about that environment, issue the following command: 94 | 95 | `conda list` 96 | 97 | or just `conda` to find out everything you can do with Conda. 98 | 99 | To install packages in that environment, use this command: 100 | 101 | `conda install biopython` 102 | 103 |
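From inside Python you can also check which environment your code is actually running in. The sketch below is one way to do that, assuming Conda's activation scripts have set the `CONDA_DEFAULT_ENV` and `CONDA_PREFIX` environment variables; outside an activated Conda environment those two values are simply `None`. The `describe_environment` helper is a hypothetical name used for this illustration.

```python
import os
import sys

def describe_environment():
    """Collect a few facts about the interpreter that is running this code."""
    return {
        "executable": sys.executable,                       # path to the python binary
        "prefix": sys.prefix,                               # root folder of this installation
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),   # name of the active Conda environment
        "conda_prefix": os.environ.get("CONDA_PREFIX"),     # folder of the active Conda environment
    }

for key, value in describe_environment().items():
    print(key, "=", value)
```

If you switch environments and run this again, the `prefix` and `conda_env` values should change to match.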

Activity - pip and Conda

104 | 105 | Now open the `/code/04_EnvironmentsAndDeployment.py` file and follow the instructions you see there for 4.1. 106 | 107 |

108 | 109 |

4.2 Pickling

110 | 111 | "Pickling" in Python means to serialize a Python object. Perhaps that isn't very helpful - what it really means is to take the output of whatever you did in Python and make it available again in another environment or program. It's a way of saving the "state" of an object so that it can be transferred and then re-loaded. 112 | 113 | It's best illustrated with some code: 114 | 115 | `import pickle` 116 | 117 | `a = ['1','2','3']` 118 | 119 | `PickleFileName = "picklefile"` 120 | 121 | `FileObject = open(PickleFileName,'wb')` 122 | 123 | `pickle.dump(a,FileObject)` 124 | 125 | `FileObject.close()` 126 | 127 | Now you can copy that file to a new computer, open Python, and work with it again as if you ran it there: 128 | 129 | `import pickle` 130 | 131 | `PickleFileName = "picklefile"` 132 | 133 | `FileObject = open(PickleFileName,'rb')` 134 | 135 | `b = pickle.load(FileObject)` 136 | 137 | `b` 138 | 139 | And now *a* equals *b*. Of course, your program would be much longer, most often a series of steps, which might for instance do a Machine Learning prediction. 140 | 141 | You can read a lot more about pickling here: https://wiki.python.org/moin/UsingPickle 142 | 143 |
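Here is the same round trip as one self-contained sketch. It uses `with` blocks so the files are closed automatically, and a temporary folder (an assumption made only so the example cleans up after itself) instead of a file in the working directory.

```python
import os
import pickle
import tempfile

original = ['1', '2', '3']

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "picklefile")

    # Write the object to disk in binary mode
    with open(path, "wb") as file_object:
        pickle.dump(original, file_object)

    # Read it back, also in binary mode
    with open(path, "rb") as file_object:
        restored = pickle.load(file_object)

print(restored == original)   # same contents
print(restored is original)   # but a separate object
```

One caution from the Python documentation: only unpickle files that come from a source you trust, because loading a pickle can execute arbitrary code.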

Activity - Pickle

144 | 145 | Now open the `/code/04_EnvironmentsAndDeployment.py` file and follow the instructions you see there for step 4.2. 146 | 147 |

4.3 Docker and Flask

148 | 149 | Two other abstraction levels are useful to think about. You're probably familiar with Virtual Machines - which use software to emulate hardware. This lets you install a complete new "computer" in a computer's OS. One level up from that abstraction layer is a *Container*, and Docker is the most common tool for building and running them. A Container packages just the operating system libraries and settings (most often from Linux) needed to operate a runtime - like Python - while sharing the kernel of the host system. This provides an even more consistent environment for your application, since it can also include settings and programs beyond the Python level. 150 | 151 | The *Flask* micro-framework for Python isn't technically an abstraction layer; it has more to do with serving your application up to a Web call. You'll often see Docker and Flask used together, so both are covered here for completeness. Once again, seeing some code is useful to understand - this example comes from the documentation site: 152 | 153 |
154 | 
155 | from flask import Flask
156 | app = Flask(__name__)
157 | 
158 | @app.route('/')
159 | def hello_world():
160 |     return 'Hello, World!'
161 | 
162 | 
163 | 164 | You can probably follow the layout of this code, but there are some specifics here. First, the code imports Flask itself. Next, it creates an instance of a Flask app, called "app" in this case. From there, the route is set to the base URL - just like the main page of a web site. And finally, a simple function returns the words "Hello, World!". 165 | 166 | So far, nothing is happening - the code is just on disk. However, once the code is saved as `hello.py`, you can "deploy" it on a running system with these commands (in Linux): 167 | 168 |
169 | $ export FLASK_APP=hello.py
170 | $ flask run
171 |  * Running on http://127.0.0.1:5000/
172 | 
173 | 174 | OK...so what? Well, in this case, you could open a Web Browser on that system and type in that URL - and you'll see "Hello, World!" pop up on the screen. Real applications are much more complicated and can handle POST and GET operations, among other things. But this is a very convenient way to serve up your Python application without having to tell your users to install and run Python. 175 | 176 | There's a lot more to both of these topics - read the references below to learn more. 177 | 178 |

179 | 180 |

4.3 Operationalizing Python in SQL Server Machine Learning Services

181 | 182 | SQL Server (2017 and higher) has a mechanism to run Python code by calling it in a Stored Procedure, which can work with a Pickle file or by running SQL Server code directly. The Python is run side-by-side with SQL Server, so as not to allow Python to interfere with SQL Server base processes. This Python extension is part of the SQL Server Machine Learning Services add-on to the relational database engine. It adds a Python execution environment, an Anaconda distribution with the Python 3.5 runtime and interpreter, standard libraries and tools, and the Microsoft product libraries for Python: [revoscalepy](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/revoscalepy-package) for analytics at scale and [microsoftml](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/microsoftml-package) for machine learning algorithms. Python runs in a separate process from SQL Server, to guarantee that database operations are not compromised. 183 | 184 | When you run Python "inside" SQL Server, you must encapsulate the Python script inside a special stored procedure, [sp_execute_external_script](https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql?view=sql-server-ver15). Here's an example of Python code running in SQL Server using a Stored Procedure: 185 | 186 |
187 | EXECUTE sp_execute_external_script @language = N'Python'
188 |     , @script = N'
189 | a = 1
190 | b = 2
191 | c = a/b
192 | d = a*b
193 | print(c, d)
194 | '
195 | 
196 | 197 | 198 | 199 | After the script has been embedded in the stored procedure, any application that can make a stored procedure call can initiate execution of the Python code. From there, SQL Server manages code execution in this process: 200 | 201 | 1. A request for the Python runtime is indicated by the parameter @language='Python' passed to the stored procedure. SQL Server sends this request to the launchpad service. In Linux, SQL uses a launchpadd service to communicate with a separate launchpad process for each user. See the Extensibility architecture diagram for details. 202 | 2. The launchpad service starts the appropriate launcher; in this case, PythonLauncher. 203 | 3. PythonLauncher starts the external Python35 process. 204 | 4. BxlServer coordinates with the Python runtime to manage exchanges of data, and storage of working results. 205 | 5. SQL Satellite manages communications about related tasks and processes with SQL Server. 206 | 6. BxlServer uses SQL Satellite to communicate status and results to SQL Server. 207 | 7. SQL Server gets results and closes related tasks and processes. 208 | 209 | You can see that process here: 210 | 211 |
212 | 213 |
214 | 215 |

Activity - Run Python in a SQL Server Stored Procedure

216 | 217 | - Ensure you have completed [the pre-requisites for the installation of SQL Server Machine Learning Services](https://docs.microsoft.com/en-us/sql/machine-learning/install/sql-machine-learning-services-windows-install?view=sql-server-ver15). 218 | - [Open this reference and follow the steps you see there](https://docs.microsoft.com/en-us/sql/machine-learning/tutorials/quickstart-python-create-script?view=sql-server-ver15). 219 | 220 |

221 | 222 |

For Further Study

223 | 224 | - [You can learn more about Docker here](https://www.fullstackpython.com/docker.html) 225 | - [More on Flask](http://flask.pocoo.org/) 226 | - [Creating a simple Flask application](http://containertutorials.com/docker-compose/flask-simple-app.html) 227 | - [More on SQL Server Machine Learning Services is here](https://docs.microsoft.com/en-us/sql/machine-learning/what-is-sql-server-machine-learning?view=sql-server-ver15) 228 | 229 | Congratulations! You now know the basics of working with Python and Data. As you can see, there's a lot more to learn - so use your new knowledge to expand on what you have learned. -------------------------------------------------------------------------------- /PythonForDataProfessionals/Python for Data Professionals.pyproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Debug 5 | 2.0 6 | {b7bafee5-b1f9-4942-b965-117a23fe690d} 7 | 8 | 9 | 10 | . 11 | . 12 | {888888a0-9f3d-457c-b088-3a5042f75d52} 13 | Standard Python launcher 14 | 15 | 16 | 17 | 18 | 19 | 10.0 20 | 21 | 22 | 23 | Content 24 | 00 Pre-Requisites.md 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/Python for Data Professionals_hdi_settings.json: -------------------------------------------------------------------------------- 1 | // workspace configuration template of HDInsight extension 2 | { 3 | /* example: 4 | "script_to_cluster": [{ 5 | "clusterName": "hdi_cluster_1", 6 | "filePath": "a.hql" 7 | }, 8 | { 9 | "clusterName": "hdi_cluster_2", 10 | "filePath": "src/b.py" 11 | }] 12 | */ 13 | "script_to_cluster": [{ 14 | 15 | }], 16 | /* more details from: https://github.com/cloudera/livy 17 | examples: 18 | "livy_conf": { 19 | "driverMemory": "1G", 20
| "driverCores": 2, 21 | "executorMemory": "512M", 22 | "executorCores": 10, 23 | "numExecutors": 5 24 | } 25 | */ 26 | "livy_conf": { 27 | 28 | }, 29 | /* examples: 30 | "additional_conf": { 31 | azure_environment: AzureChina // Only Azure or AzureChina works here 32 | } 33 | */ 34 | 35 | "additional_conf": { 36 | 37 | } 38 | } -------------------------------------------------------------------------------- /PythonForDataProfessionals/assets/MLCheatSheet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/assets/MLCheatSheet.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/assets/NoStarchPressPython.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/assets/NoStarchPressPython.pdf -------------------------------------------------------------------------------- /PythonForDataProfessionals/assets/NumpyPythonCheatSheet.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/assets/NumpyPythonCheatSheet.pdf -------------------------------------------------------------------------------- /PythonForDataProfessionals/assets/PandasCheatSheet.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/assets/PandasCheatSheet.pdf -------------------------------------------------------------------------------- 
/PythonForDataProfessionals/assets/Python3CheatSheet.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/assets/Python3CheatSheet.pdf -------------------------------------------------------------------------------- /PythonForDataProfessionals/assets/UseCases.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/assets/UseCases.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/code/01_OverviewAndCourseSetup.py: -------------------------------------------------------------------------------- 1 | # 01_OverviewAndCourseSetup.py 2 | # Purpose: Initial Course Setup and displaying versions 3 | # Author: Buck Woody 4 | # Credits and Sources: Inline 5 | # Last Updated: 27 June 2018 6 | 7 | # Check the Python Version and Information 8 | import platform 9 | python_version=platform.python_version() 10 | print(python_version) 11 | 12 | # - Fix this code so that it runs 13 | 14 | print "The Python Version is: " python_version 15 | 16 | # - Using "platform", what other information can you derive about this system? 
17 | 18 | # EOF: 01_OverviewAndCourseSetup.py -------------------------------------------------------------------------------- /PythonForDataProfessionals/code/02_ProgrammingBasics.py: -------------------------------------------------------------------------------- 1 | # 02_ProgrammingBasics.py 2 | # Purpose: General Programming exercises for Python 3 | # Author: Buck Woody 4 | # Credits and Sources: Inline 5 | # Last Updated: 27 June 2018 6 | 7 | # 2.1 Getting Help 8 | help() 9 | help(str) 10 | 11 | # - Write code to find help on help 12 | 13 | # 2.2 Code Syntax and Structure 14 | 15 | # - Python uses spaces to indicate code blocks. Fix the code below: 16 | x=10 17 | y=5 18 | if x > y: 19 | print(str(x) + " is greater than " + str(y)) 20 | 21 | # - Arguments on first line are forbidden when not using vertical alignment. Fix this code: 22 | foo = long_function_name(var_one, var_two, 23 | var_three, var_four) 24 | 25 | # operators sit far away from their operands. Fix this code: 26 | income = (gross_wages + 27 | taxable_interest + 28 | (dividends - qualified_dividends) - 29 | ira_deduction - 30 | student_loan_interest) 31 | 32 | # - The import statement should use separate lines for each effort. You can fix the code below 33 | # using separate lines or by using the "from" statement: 34 | import sys, os 35 | 36 | # - The following code has extra spaces in the wrong places. 
Fix this code: 37 | i=i+1 38 | submitted +=1 39 | x = x * 2 - 1 40 | hypot2 = x * x + y * y 41 | c = (a + b) * (a - b) 42 | 43 | # 2.3 Variables 44 | 45 | # - Add a line below x=3 that changes the variable x from int to a string 46 | x=3 47 | type(x) 48 | 49 | # - Write code that prints the string "This class is awesome" using variables: 50 | x="is awesome" 51 | y="This Class" 52 | 53 | # 2.4 Operations and Functions 54 | 55 | # - Use some basic operators to write the following code: 56 | # Assign two variables 57 | # Add them 58 | # Subtract 20 from each, add those values together, save that to a new variable 59 | # Create a new string variable with the text "The result of my operations are: " 60 | # Print out a single string on the screen with the result of the variables 61 | # showing that result. 62 | 63 | # EOF: 02_ProgrammingBasics.py -------------------------------------------------------------------------------- /PythonForDataProfessionals/code/03_WorkingWithData.py: -------------------------------------------------------------------------------- 1 | # 03_WorkingWithData.py 2 | # Purpose: Exercise files for Python for Data Professionals course, section 3 3 | # Author: Buck Woody 4 | # Credits and Sources: Inline 5 | # Last Updated: 02 July 2018 6 | 7 | # - 3.1 Data Types 8 | 9 | # Create a variable called MyName and set it to your name. 10 | # Print out the middle two letters of the variable: 11 | 12 | # Create a new variable of a 3-digit number. Print out the data type for the variable: 13 | 14 | # Change the previous variable to text. Print the data type for the variable: 15 | 16 | # Create a list structure with three numbers in it, add two of the numbers, print the result: 17 | 18 | # Create a Dictionary structure with three values using keys of 1, 2 and 3. 19 | # Query for the value of key 2: 20 | 21 | # - 3.1a NumPy Exercises 22 | # Create a NumPy 1-dimensional array consisting of three numbers. 23 | # Sum those numbers. 
24 | # Add three more numbers as an additional dimension to the array. 25 | # Sum the two dimensions over the rows. 26 | # Sum the two dimensions over the columns: 27 | 28 | 29 | # - 3.1b Pandas Exercises 30 | # Use the Pandas library, and alias it as pd: 31 | 32 | # Show the first five values of long_series: 33 | long_series = pd.Series(np.random.randn(1000)) 34 | 35 | # Read the file CATelcoCustomerChurnTrainingSample.csv from the ./data directory 36 | # into a Pandas Data Frame: 37 | 38 | # Explore the Data Frame you just created with Pandas: 39 | 40 | # - 3.2 Data Ingestion 41 | # Read customer data from the ./data/CATelcoCustomerChurnTrainingSample.csv 42 | # into a data frame called df using pandas: 43 | 44 | # Show the Data in the Data Frame: 45 | 46 | # - 3.3 Data Inspection 47 | # Ensure that you have 29 columns and 20,468 rows loaded 48 | print('There should be 20468 observations of 29 variables:') 49 | 50 | # Explore the df Dataframe, using at least a five-number statistical summary. 51 | # NOTE: Your exploration may be much different - you will show this data 52 | # using graphs in the next exercise. 53 | 54 | # Show the size and shape of the data: 55 | 56 | # Show the first and last 10 rows: 57 | 58 | # Show the dataframe structure: 59 | 60 | # Check for missing values: 61 | print('Missing values: ', '\n') 62 | 63 | # perform a simple statistical display: 64 | print('Dataframe Statistics: ', '\n') 65 | 66 | # - 3.4 Graphing 67 | # Using any graphical library or representation you like, create three separate graphs 68 | # that best illustrate the data layout of the dataframe you just created: 69 | 70 | # - 3.5 Machine Learning and AI 71 | # Review the following code, observing what it does. 
72 | 73 | # 1 - Setup - Get everything up to date, and add any pips you want here 74 | # Import Libraries for the Customer Churn Prediction Labs - Change for other uses 75 | 76 | # Serializing output/input 77 | import pickle 78 | 79 | # Libraries for training and scoring 80 | from sklearn.naive_bayes import GaussianNB 81 | from sklearn.tree import DecisionTreeClassifier 82 | from sklearn.metrics import accuracy_score 83 | from sklearn.model_selection import train_test_split 84 | from sklearn.preprocessing import LabelEncoder 85 | 86 | # Data and Numeric Manipulation 87 | import pandas as pd 88 | import numpy as np 89 | 90 | # Working with files 91 | import csv 92 | 93 | #/ 1 - Setup 94 | 95 | #2 - Read data and verify 96 | # Read customer data from a single file 97 | df = pd.read_csv('./data/CATelcoCustomerChurnTrainingSample.csv') 98 | 99 | # Ensure that you have 29 columns and 20,468 rows loaded 100 | print('There should be 20468 obervations of 29 variables:') 101 | print(df.shape, '\n') 102 | 103 | # Optional - Instead, read the data from source: 104 | # https://github.com/Azure/MachineLearningSamples-ChurnPrediction/blob/master/data/CATelcoCustomerChurnTrainingSample.csv 105 | #/ 2 - Read Data 106 | 107 | # 2.1 - Explore Data 108 | # Explore the df Dataframe, using at least a five-number statistical summary. 109 | # NOTE: Your exploration may be much different - experiment with graphics as well. 
110 | 111 | # Show the size and shape of data: 112 | print('The size of the data is: %d rows and %d columns' % df.shape, '\n') 113 | 114 | # Show the first and last 10 rows 115 | print('First ten rows of the data: ') 116 | print(df.head(10), '\n') 117 | print('Last ten rows of the data: ') 118 | print(df.tail(10), '\n') 119 | 120 | # Show the dataframe structure: 121 | print('Dataframe Structure: ', '\n') 122 | print(df.info(), '\n') 123 | 124 | # Check for missing values: 125 | print('Missing values: ', '\n') 126 | print(df.apply(lambda x: sum(x.isnull()),axis=0), '\n') 127 | 128 | # perform a simple statistical display: 129 | print('Dataframe Statistics: ', '\n') 130 | print(df.describe(), '\n') 131 | 132 | #/ 2.1 133 | 134 | # 3.0 - Customer Churn Prediction Experiment 135 | # For completeness of this example, let's re-import our libraries 136 | import pickle 137 | import pandas as pd 138 | import numpy as np 139 | import csv 140 | from sklearn.naive_bayes import GaussianNB 141 | from sklearn.tree import DecisionTreeClassifier 142 | from sklearn.metrics import accuracy_score 143 | from sklearn.model_selection import train_test_split 144 | from sklearn.preprocessing import LabelEncoder 145 | 146 | # We'll re-load the data as "CustomerDataFrame" 147 | CustomerDataFrame = pd.read_csv('data/CATelcoCustomerChurnTrainingSample.csv') 148 | 149 | # Fill all NA values with 0: 150 | CustomerDataFrame = CustomerDataFrame.fillna(0) 151 | 152 | # Drop all duplicate observations: 153 | CustomerDataFrame = CustomerDataFrame.drop_duplicates() 154 | 155 | # We don't need the 'year' or 'month' variables 156 | CustomerDataFrame = CustomerDataFrame.drop('year', axis=1) 157 | CustomerDataFrame = CustomerDataFrame.drop('month', axis=1) 158 | 159 | # Implement One-Hot Encoding for this model (https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/) 160 | columns_to_encode = list(CustomerDataFrame.select_dtypes(include=['category','object'])) 161 | dummies =
pd.get_dummies(CustomerDataFrame[columns_to_encode]) # 162 | 163 | # Drop the original categorical columns: 164 | CustomerDataFrame = CustomerDataFrame.drop(columns_to_encode, axis=1) # 165 | 166 | # Re-join the dummies frame to the original data: 167 | CustomerDataFrame = CustomerDataFrame.join(dummies) 168 | 169 | # Show the new columns in the joined dataframe: 170 | print(CustomerDataFrame.columns, '\n') 171 | 172 | # Experiment using Naive Bayes: 173 | nb_model = GaussianNB() 174 | random_seed = 42 175 | split_ratio = .3 176 | train, test = train_test_split(CustomerDataFrame, random_state = random_seed, test_size = split_ratio) 177 | 178 | target = train['churn'].values 179 | train = train.drop('churn', axis=1) 180 | train = train.values 181 | nb_model.fit(train, target) 182 | 183 | expected = test['churn'].values 184 | test = test.drop('churn', axis=1) 185 | predicted = nb_model.predict(test) 186 | 187 | # Print out the Naive Bayes Classification Accuracy: 188 | print("Naive Bayes Classification Accuracy", accuracy_score(expected, predicted)) 189 | 190 | # Experiment using Decision Trees: 191 | dt_model = DecisionTreeClassifier(min_samples_split=20, random_state=99) 192 | dt_model.fit(train, target) 193 | predicted = dt_model.predict(test) 194 | 195 | # Print out the Decision Tree Accuracy: 196 | print("Decision Tree Classification Accuracy", accuracy_score(expected, predicted)) 197 | 198 | #/ 3.0 199 | 200 | # 4.0a - Create the Model File 201 | # serialize the best performing model on disk 202 | print ("Serialize the model to a model.pkl file in the root") 203 | ModelFile = open('./model.pkl', 'wb') 204 | pickle.dump(dt_model, ModelFile) 205 | ModelFile.close() 206 | #/ 4.0a 207 | 208 | # 4.0b - Operationalization: Scoring the calls to the model 209 | # Prepare the web service definition before deploying 210 | # Import for the pickle 211 | import joblib 212 | 213 | # load the model file 214 | global model 215 | model = joblib.load('model.pkl')
216 | 217 | # Import for handling the JSON file 218 | import json 219 | import pandas as pd 220 | 221 | # Set up a sample "call" from a client: 222 | input_df = "{\"callfailurerate\": 0, \"education\": \"Bachelor or equivalent\", \"usesinternetservice\": \"No\", \"gender\": \"Male\", \"unpaidbalance\": 19, \"occupation\": \"Technology Related Job\", \"year\": 2015, \"numberofcomplaints\": 0, \"avgcallduration\": 663, \"usesvoiceservice\": \"No\", \"annualincome\": 168147, \"totalminsusedinlastmonth\": 15, \"homeowner\": \"Yes\", \"age\": 12, \"maritalstatus\": \"Single\", \"month\": 1, \"calldroprate\": 0.06, \"percentagecalloutsidenetwork\": 0.82, \"penaltytoswitch\": 371, \"monthlybilledamount\": 71, \"churn\": 0, \"numdayscontractequipmentplanexpiring\": 96, \"totalcallduration\": 5971, \"callingnum\": 4251078442, \"state\": \"WA\", \"customerid\": 1, \"customersuspended\": \"Yes\", \"numberofmonthunpaid\": 7, \"noadditionallines\": \"\\\\N\"}" 223 | 224 | # Cleanup 225 | input_df_encoded = json.loads(input_df) 226 | input_df_encoded = pd.DataFrame([input_df_encoded], columns=input_df_encoded.keys()) 227 | input_df_encoded = input_df_encoded.drop('year', axis=1) 228 | input_df_encoded = input_df_encoded.drop('month', axis=1) 229 | input_df_encoded = input_df_encoded.drop('churn', axis=1) 230 | 231 | # Pre-process scoring data consistent with training data 232 | columns_to_encode = ['customersuspended', 'education', 'gender', 'homeowner', 'maritalstatus', 'noadditionallines', 'occupation', 'state', 'usesinternetservice', 'usesvoiceservice'] 233 | dummies = pd.get_dummies(input_df_encoded[columns_to_encode]) 234 | input_df_encoded = input_df_encoded.join(dummies) 235 | input_df_encoded = input_df_encoded.drop(columns_to_encode, axis=1) 236 | 237 | columns_encoded = ['age', 'annualincome', 'calldroprate', 'callfailurerate', 'callingnum', 238 | 'customerid', 'monthlybilledamount', 'numberofcomplaints', 239 | 'numberofmonthunpaid', 'numdayscontractequipmentplanexpiring', 240 |
'penaltytoswitch', 'totalminsusedinlastmonth', 'unpaidbalance', 241 | 'percentagecalloutsidenetwork', 'totalcallduration', 'avgcallduration', 242 | 'customersuspended_No', 'customersuspended_Yes', 243 | 'education_Bachelor or equivalent', 'education_High School or below', 244 | 'education_Master or equivalent', 'education_PhD or equivalent', 245 | 'gender_Female', 'gender_Male', 'homeowner_No', 'homeowner_Yes', 246 | 'maritalstatus_Married', 'maritalstatus_Single', 'noadditionallines_\\N', 247 | 'occupation_Non-technology Related Job', 'occupation_Others', 248 | 'occupation_Technology Related Job', 'state_AK', 'state_AL', 'state_AR', 249 | 'state_AZ', 'state_CA', 'state_CO', 'state_CT', 'state_DE', 'state_FL', 250 | 'state_GA', 'state_HI', 'state_IA', 'state_ID', 'state_IL', 'state_IN', 251 | 'state_KS', 'state_KY', 'state_LA', 'state_MA', 'state_MD', 'state_ME', 252 | 'state_MI', 'state_MN', 'state_MO', 'state_MS', 'state_MT', 'state_NC', 253 | 'state_ND', 'state_NE', 'state_NH', 'state_NJ', 'state_NM', 'state_NV', 254 | 'state_NY', 'state_OH', 'state_OK', 'state_OR', 'state_PA', 'state_RI', 255 | 'state_SC', 'state_SD', 'state_TN', 'state_TX', 'state_UT', 'state_VA', 256 | 'state_VT', 'state_WA', 'state_WI', 'state_WV', 'state_WY', 257 | 'usesinternetservice_No', 'usesinternetservice_Yes', 258 | 'usesvoiceservice_No', 'usesvoiceservice_Yes'] 259 | 260 | # Now that they are encoded, some values will be "empty". 
Fill those with 0's: 261 | for column_encoded in columns_encoded: 262 | if column_encoded not in input_df_encoded.columns: 263 | input_df_encoded[column_encoded] = 0 264 | 265 | # Return final prediction (columns selected in columns_encoded order, which is assumed to match the training data) 266 | pred = model.predict(input_df_encoded[columns_encoded]) 267 | 268 | # (In production you would replace the print() statements here with some sort of return to JSON) 269 | print('JSON sent to the prediction Model:', '\n') 270 | print(input_df, '\n') 271 | print('For the JSON string sent from the client, the prediction is returned as more JSON (0 = No churn, 1 = Churn):', '\n') 272 | print(json.dumps(str(pred[0]))) 273 | 274 | #/ 4.0b 275 | 276 | # EOF: 03_WorkingWithData.py -------------------------------------------------------------------------------- /PythonForDataProfessionals/code/04_EnvrionmentsAndDeployment.py: -------------------------------------------------------------------------------- 1 | # 04_EnvironmentsAndDeployment.py 2 | # Purpose: Environmental settings and configurations 3 | # Author: Buck Woody 4 | # Credits and Sources: Inline 5 | # Last Updated: 07 July 2018 6 | 7 | # - 4.1 Show the main environment variables in the current Python environment. Which directory has the libraries? 8 | 9 | # - What else can you find in the sysconfig library? How would you find that out? 10 | 11 | # - Using conda commands, what libraries are currently loaded? 12 | # How would you install a new one? 13 | # What environment are you using now? 14 | 15 | # - 4.2 Create a program that has three text variables. Combine these three into another variable. 16 | # Load the pickle library and save the results of the first program as a pkl file. 17 | # Close the first program, and create another one that opens and reads the pkl file. 18 | # Combine the final variable from the last program with a new text variable from this program. 
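
One possible way to approach exercise 4.2 is sketched below. It is only a sketch: both "programs" are shown in a single script for brevity, and the variable names, the sample text, and the `combined_text.pkl` file name are assumptions made for illustration, not part of the course materials.

```python
import os
import pickle
import tempfile

# "First program": three text variables combined into a fourth.
first = "Python"
second = " for Data"
third = " Professionals"
combined = first + second + third

# Hypothetical file location; any writable path would do.
pkl_path = os.path.join(tempfile.gettempdir(), "combined_text.pkl")

# Save the combined value; the with-block closes the file automatically.
with open(pkl_path, "wb") as pkl_file:
    pickle.dump(combined, pkl_file)

# "Second program": open and read the pkl file, then add new text.
with open(pkl_path, "rb") as pkl_file:
    restored = pickle.load(pkl_file)

new_text = " - Course Complete"
final_text = restored + new_text
print(final_text)
```

In a real solution the two halves would live in separate `.py` files. Only unpickle files that come from a source you trust, because loading a pickle can execute arbitrary code.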
19 | 20 | # EOF: 04_EnvironmentsAndDeployment.py -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/AnalyticsAreas.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/AnalyticsAreas.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/DataScience.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/DataScience.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/MLCapabilities.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/MLCapabilities.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/MatPlotLib.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/MatPlotLib.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/SmallBuck.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/SmallBuck.png 
-------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/aml-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/aml-logo.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/brain.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/brain.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/check.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/check.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/checkbox.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/checkbox.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/checkmark.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/checkmark.jpg -------------------------------------------------------------------------------- 
/PythonForDataProfessionals/graphics/cortanalogo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/cortanalogo.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/files.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/files.jpg -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/ggplot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/ggplot.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/keyboard.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/keyboard.jpg -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/microsoftlogo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/microsoftlogo.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/pin.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/pin.jpg -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/solutions-microsoft-logo-small.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/solutions-microsoft-logo-small.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/tdsp.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/tdsp.png -------------------------------------------------------------------------------- /PythonForDataProfessionals/graphics/thinking.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/PythonForDataProfessionals/graphics/thinking.jpg -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/.ipynb_checkpoints/00 Pre-Requisites-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "

00 Pre-Requisites

\n", 14 | "\n", 15 | "This \"Python for Data Professionals\" course is taught using [Jupyter Notebooks](https://notebooks.azure.com/help). You'll be able to run the code samples you see by typing in the Python examples as described and clicking the \"Run\" button you see at the top of the screen. \n", 16 | "\n", 17 | "For the most part, there are no pre-requisites for this course using a Notebook. However, if you would like to learn this material on your own machine, you'll need Microsoft Windows, SQL Server, and Visual Studio. You can of course use the Python language on many platforms and in other distributions and with other tools, but using this configuration allows you to stay consistent for instruction during this course. Feel free to use other installations after you complete the course.\n", 18 | "\n", 19 | "Read over this section and then proceed to the next notebook." 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "

Activity 1: Set up the Windows Operating System

\n", 27 | "\n", 28 | "You have three options for setting up Microsoft Windows to complete this course. You can use a Local installation of Windows, a Virtual Machine on your local system, or a Virtual Machine stored in a Cloud provider such as Microsoft Azure. *(The third option is only for classrooms where you have reliable connections to the Internet)*\n", 29 | "\n", 30 | "

Option 1 - Local Installation

\n", 31 | "\n", 32 | "- Install a recent version of Microsoft Windows. For this course, Windows 10 or any current version of Windows Server is acceptable.\n", 33 | "- Install all updates to the operating system.\n", 34 | "\n", 35 | "

Option 2 - Install Windows on a Local Virtual Machine Environment

\n", 36 | "\n", 37 | "- Using your local system, [navigate to this resource](https://developer.microsoft.com/en-us/windows/downloads/virtual-machines) and follow the instructions there.\n", 38 | "\n", 39 | "**NOTE: Wait as long as reasonably possible to ensure that the system does not expire - these are free licenses, but they have a time limit**\n", 40 | "\n", 41 | "- You can also use whatever Hypervisor you like for your system and install a legal, registered copy of Microsoft Windows.\n", 42 | "\n", 43 | "

Option 3 - Use a Virtual Machine in a Cloud Provider

\n", 44 | "\n", 45 | "- If you have access to the Internet, you can set up a [free Microsoft Azure Account](https://azure.microsoft.com/en-us/free/search/?&OCID=AID631184_SEM_bSHIQHtA&lnkd=Google_Azure_Brand&gclid=Cj0KCQjwpcLZBRCnARIsAMPBgF2myLWEk3Hllm2354GEs0rD1sDST_xcfkFGRdAE8toYZMalbQJ4M3YaAs9UEALw_wcB&dclid=CPDRgcv57tsCFVXE4Qodo-gLzg) and use a [Data Science Virtual Machine](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/provision-vm). Any size will do, and the free account provides enough resources for a single course. You will not need to install Anaconda, VSCode or SQL Server if you use this choice, as they are already installed for you.\n", 46 | "- Log in to the system and run [Windows Update](https://support.microsoft.com/en-us/help/4027667/windows-update-windows-10)\n", 47 | "\n" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": {}, 53 | "source": [ 54 | "

Activity 2: Install SQL Server 2017 with ML Services

\n", 55 | "\n", 56 | "- [Navigate to this resource](https://www.microsoft.com/en-us/sql-server/sql-server-downloads), Select **Developer** from the lower part of the page, and install the **Developer Edition**. Select all components for installation.\n", 57 | "\n", 58 | "- Run Windows Update and select the [\"Install updates for other products\" option](https://www.lifewire.com/how-to-change-windows-update-settings-2625778). Apply the latest updates to the classroom system.\n", 59 | "\n", 60 | "

Activity 3: Install Visual Studio with Machine Learning and Data Science workloads

\n", 61 | "\n", 62 | "- On your classroom system, [install Visual Studio 2017](https://www.visualstudio.com/downloads/) - The free Community Edition is adequate for this course.\n", 63 | "\n", 64 | "- During the installation, select the \"Data storage and processing\" and \"Data science and analytical applications\" Workloads. *(NOTE: [In the Data Science Workload installation box, select ALL optional components on the Summary pane!](https://blogs.msdn.microsoft.com/visualstudio/2016/11/18/data-science-workloads-in-visual-studio-2017-rc/))*\n", 65 | "\n", 66 | "- Log in with a Live ID to Visual Studio, let the system load, and apply any updates.\n", 67 | "\n", 68 | "- After the updates complete, click the \"R Tools\" menu item and open the \"Interactive R Window\" option (this verifies that the Data Science workload add-ins for R and Python are working). Type the following in that panel to ensure the installation was successful:\n", 69 | "\n" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": {}, 76 | "outputs": [], 77 | "source": [ 78 | "x <- 10\n", 79 | "\n", 80 | "x\n" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": {}, 86 | "source": [ 87 | "You should see the result **\\[1\\]10** returned. If not, open the Visual Studio Installer and select the \"Repair\" option." 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "

For Further Study

\n", 95 | "\n", 96 | "- Platforms supported: https://www.python.org/download/other/ \n", 97 | "\n", 98 | "- Installing Python: https://www.python.org/downloads/\n", 99 | "\n", 100 | "- Installing Python using Anaconda: https://www.infoworld.com/article/3267976/python/anaconda-cpython-pypy-and-more-know-your-python-distributions.html\n", 101 | "\n", 102 | "Next, Continue to *01 Overview and Course Setup*" 103 | ] 104 | } 105 | ], 106 | "metadata": { 107 | "kernelspec": { 108 | "display_name": "Python 3", 109 | "language": "python", 110 | "name": "python3" 111 | }, 112 | "language_info": { 113 | "codemirror_mode": { 114 | "name": "ipython", 115 | "version": 3 116 | }, 117 | "file_extension": ".py", 118 | "mimetype": "text/x-python", 119 | "name": "python", 120 | "nbconvert_exporter": "python", 121 | "pygments_lexer": "ipython3", 122 | "version": "3.6.5" 123 | } 124 | }, 125 | "nbformat": 4, 126 | "nbformat_minor": 2 127 | } 128 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/.ipynb_checkpoints/01 Overview and Setup-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "## 01 Overview and Setup\n", 14 | "\n", 15 | "In this course you'll cover the basics of the Python language and environment from a Data Professional's perspective. While you will learn Python, you'll quickly cover topics that have a lot more depth available. In each section you'll get more references to go deeper, which you should follow up on. Also watch for links within the text - click on each one to explore that topic.\n", 16 | "\n", 17 | "Make sure you check out the **00 Pre-Requisites** page before you start. 
You'll need all of the items loaded there before you can proceed with the course.\n", 18 | "\n", 19 | "You'll cover these topics in the course:\n", 20 | "\n", 21 | "

\n", 22 | "\n", 23 | "
\n", 24 | "
Course Outline
\n", 25 | "
1 - Overview and Course Setup (This section)
\n", 26 | "
2 - Programming Basics
\n", 27 | "
3 - Working with Data
\n", 28 | "
4 - Environments and Deployment
\n", 29 | "
\n", 30 | "\n", 31 | "

" 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "metadata": {}, 37 | "source": [ 38 | "

Overview

\n", 39 | "\n", 40 | "There are two main versions of Python - 2 and 3. So many programs were written for version 2 that it is still around, and version 3 was such an upgrade that programs for 2 don't always run in 3 and vice versa. For this course we'll do everything in version 3 - it's becoming the accepted standard for data professionals.\n", 41 | "\n", 42 | "You have a few ways of working with Python:\n", 43 | "\n", 44 | "- The Interactive Interpreter (Type `python` and the version number if it is in your path)\n", 45 | "- Writing code and running it in some graphical environment (Such as VSCode, Visual Studio, Spyder, PyCharm, IDLE, etc.)\n", 46 | "- Calling a `.py` script file from the `python` command \n", 47 | "\n", 48 | "When you're in command-mode (the interactive interpreter), the code looks more like a scripting language: for example, typing an expression shows its value without a call to `print()`. Programming-mode looks like a standard programming language environment - you'll normally use that within an Integrated Development Environment (IDE)." 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "

Activity: Verify Your Installation and Configure Python

\n", 56 | "\n", 57 | "Open the **01_OverviewAndCourseSetup.py** file and run the code you see there. The exercises will be marked out using comments: \n", 58 | "\n", 59 | "
\n",
 60 |     "# TODO - Section Number\n",
 61 |     "
" 62 | ] 63 | }, 64 | { 65 | "cell_type": "code", 66 | "execution_count": null, 67 | "metadata": {}, 68 | "outputs": [], 69 | "source": [ 70 | "# 01_OverviewAndCourseSetup.py\n", 71 | "# Purpose: Initial Course Setup and displaying versions\n", 72 | "# Author: Buck Woody\n", 73 | "# Credits and Sources: Inline\n", 74 | "# Last Updated: 27 June 2018\n", 75 | "\n", 76 | "# Check the Python Version and Information\n", 77 | "import platform\n", 78 | "python_version=platform.python_version()\n", 79 | "print(python_version)\n", 80 | "\n", 81 | "# - Fix this code so that it runs\n", 82 | "\n", 83 | "print \"The Python Version is: \" python_version\n", 84 | "\n", 85 | "# - Using \"platform\", what other information can you derive about this system?\n", 86 | "\n", 87 | "# EOF: 01_OverviewAndCourseSetup.py" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "

For Further Study

\n", 95 | "\n", 96 | "- Version differences: https://wiki.python.org/moin/Python2orPython3 \n", 97 | "- Development Environments: IDLE, tk, VSCode, PyCharm, Jupyter Notebooks, Documentation, Training Resources: https://www.python.org/doc/\n", 98 | "- and https://docs.python.org/3/tutorial/index.html \n", 99 | "- The Official Python Documentation Course: https://docs.python.org/3/tutorial/index.html\n", 100 | "\n", 101 | "Next, Continue to *02 Programming Basics*" 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": null, 107 | "metadata": {}, 108 | "outputs": [], 109 | "source": [] 110 | } 111 | ], 112 | "metadata": { 113 | "kernelspec": { 114 | "display_name": "Python 3", 115 | "language": "python", 116 | "name": "python3" 117 | }, 118 | "language_info": { 119 | "codemirror_mode": { 120 | "name": "ipython", 121 | "version": 3 122 | }, 123 | "file_extension": ".py", 124 | "mimetype": "text/x-python", 125 | "name": "python", 126 | "nbconvert_exporter": "python", 127 | "pygments_lexer": "ipython3", 128 | "version": "3.6.5" 129 | } 130 | }, 131 | "nbformat": 4, 132 | "nbformat_minor": 2 133 | } 134 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/.ipynb_checkpoints/02 Programming Basics-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "## 02 Programming Basics\n", 14 | "\n", 15 | "

\n", 16 | "\n", 17 | "
\n", 18 | "
Course Outline
\n", 19 | "
1 - Overview and Course Setup
\n", 20 | "
2 - Programming Basics (This section)
\n", 21 | "
2.1 - Getting help
\n", 22 | "
2.2 Code Syntax and Structure
\n", 23 | "
2.3 Variables
\n", 24 | "
2.4 Operations and Functions
\n", 25 | "
3 - Working with Data
\n", 26 | "
4 - Environments and Deployment
\n", 27 | "
\n", 28 | "\n", 29 | "

" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "## Programming Basics Overview\n", 37 | "\n", 38 | "From here on out, you'll focus on using Python in programming mode - you'll write code that you run from an IDE or a calling environment, not interactively from the command-line. As you work through this explanation, copy the code you see and run it to see the results. After you work through these copy-and-paste examples, you'll create your own code in the Activities that follow each section." 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "

2.1 - Getting help

\n", 46 | "\n", 47 | "The very first thing you should learn in any language is how to get help. You can [find the help documents on-line](https://docs.python.org/3/index.html), or simply type\n", 48 | " \n", 49 | "`help()`\n", 50 | " \n", 51 | "in your code. For help on a specific topic, put the topic in the parentheses:\n", 52 | " \n", 53 | " `help(str)`\n", 54 | "\n", 55 | " To see a list of topics, type \n", 56 | "\n", 57 | " `help('topics')`" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "metadata": { 64 | "collapsed": true 65 | }, 66 | "outputs": [], 67 | "source": [ 68 | "# Try it:" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": {}, 74 | "source": [ 75 | "

2.2 Code Syntax and Structure

\n", 76 | "\n", 77 | "Let's cover a few basics about how Python code is written. (For a full discussion, check out the [Style Guide for Python, called PEP 8](https://www.python.org/dev/peps/pep-0008/) ) Let's use the \"Zen of Python\" rules from Tim Peters for this course:\n", 78 | "\n", 79 | "
\n",
 80 |     "\n",
 81 |     "    Beautiful is better than ugly.\n",
 82 |     "    Explicit is better than implicit.\n",
 83 |     "    Simple is better than complex.\n",
 84 |     "    Complex is better than complicated.\n",
 85 |     "    Flat is better than nested.\n",
 86 |     "    Sparse is better than dense.\n",
 87 |     "    Readability counts.\n",
 88 |     "    Special cases aren't special enough to break the rules.\n",
 89 |     "    Although practicality beats purity.\n",
 90 |     "    Errors should never pass silently.\n",
 91 |     "    Unless explicitly silenced.\n",
 92 |     "    In the face of ambiguity, refuse the temptation to guess.\n",
 93 |     "    There should be one-- and preferably only one --obvious way to do it.\n",
 94 |     "    Although that way may not be obvious at first unless you're Dutch.\n",
 95 |     "    Now is better than never.\n",
 96 |     "    Although never is often better than right now.\n",
 97 |     "    If the implementation is hard to explain, it's a bad idea.\n",
 98 |     "    If the implementation is easy to explain, it may be a good idea.\n",
 99 |     "    Namespaces are one honking great idea -- let's do more of those!\n",
100 |     "    --Tim Peters\n",
101 |     "\n",
102 |     "
\n", 103 | "\n", 104 | "In general, use standard coding practices - don't use keywords for variables, be consistent in your naming (camel-case, lower-case, etc.), comment your code clearly, understand the general syntax of your language, and follow the principles above. But the most important tip is to at least read PEP 8 and decide for yourself how well that fits into your Zen.\n", 105 | "\n", 106 | "There is one hard-and-fast rule for Python that you *do* need to be aware of: indentation. You **must** indent the code inside classes, functions (or methods), loops, and conditions. You can use a tab or four spaces (spaces are the accepted way to do it) but in any case, you have to be consistent. If you use tabs, you always use tabs. If you use spaces, you have to use them throughout. It's best if you set your IDE to handle that for you, whichever way you go.\n", 107 | "\n", 108 | "Python code files have an extension of `.py`. \n", 109 | "\n", 110 | "Comments in Python start with the hash symbol: `#`. There are no block comments (and this makes us all sad) so each line you want to comment must have a `#` in front of it. Keep the lines short (80 characters or so) so that they don't fall off a single-line display like at the command line." 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "

2.3 Variables

\n", 118 | "\n", 119 | "Variables stand in for replaceable values. Python is dynamically typed, meaning you can just declare a variable name and set it to a value at the same time, and Python infers the data type from the value. You use an `=` sign to assign values, and `==` to compare things.\n", 120 | "\n", 121 | "Double quotes \\\" or single quotes \\' are fine, just be consistent.\n", 122 | "\n", 123 | "`# There are some keywords to be aware of, but x and y are always good choices.`\n", 124 | "\n", 125 | "`x = \"Buck\" # I'm a string.`\n", 126 | "\n", 127 | "`type(x)`\n", 128 | "\n", 129 | "`y = 10 # I'm an integer.`\n", 130 | "\n", 131 | "`type(y)`\n", 132 | "\n", 133 | "To change the type of a variable, just assign it a different value:\n", 134 | "\n", 135 | "`x = \"Buck\" # I'm a string.`\n", 136 | "\n", 137 | "`type(x)`\n", 138 | "\n", 139 | "`x = 10 # Now I'm an integer.`\n", 140 | "\n", 141 | "`type(x)`\n", 142 | "\n", 143 | "Or cast it by explicitly declaring the conversion:\n", 144 | "\n", 145 | "`x = \"10\"`\n", 146 | "\n", 147 | "`type(x)`\n", 148 | "\n", 149 | "`print(int(x))`\n", 150 | "\n", 151 | "To concatenate string values, use the `+` sign:\n", 152 | "\n", 153 | "`x = \"Buck\"`\n", 154 | "\n", 155 | "`y = \" Woody\"`\n", 156 | "\n", 157 | "`print(x + y)`" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": null, 163 | "metadata": {}, 164 | "outputs": [], 165 | "source": [ 166 | "# Try it:\n" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "

2.4 Operations and Functions

\n", 174 | "\n", 175 | "Python has the following operators:\n", 176 | "\n", 177 | " Arithmetic Operators\n", 178 | " Comparison (Relational) Operators\n", 179 | " Assignment Operators\n", 180 | " Logical Operators\n", 181 | " Bitwise Operators\n", 182 | " Membership Operators\n", 183 | " Identity Operators\n", 184 | "\n", 185 | "You have the standard operators and functions from most every language. Here are some of the tokens:\n", 186 | "\n", 187 | "
\n",
188 |     "\n",
189 |     "    !=                  *=                  <<                  ^  \n",
190 |     "    \"                   +                   <<=                 ^= \n",
191 |     "    \"\"\"                 +=                  <=                  `\n",
192 |     "    %                   ,                   <>                  __\n",
193 |     "    %=                  -                   ==                     \n",
194 |     "    &                   -=                  >                   b\" \n",
195 |     "    &=                  .                   >=                  b' \n",
196 |     "    '                   ...                 >>                  j  \n",
197 |     "    '''                 /                   >>=                 r\" \n",
198 |     "    (                   //                  @                   r' \n",
199 |     "    )                   //=                 J                   |'\n",
200 |     "    *                   /=                  [                   |= \n",
201 |     "    **                  :                   \\                   ~  \n",
202 |     "    **=                 <                   ]                      \n",
203 |     "\n",
204 |     "
\n", 205 | "\n", 206 | "Wait...that's it? That's all you're going to tell me? *(Hint: use what you've learned):*\n", 207 | "\n", 208 | "`help('symbols')`\n", 209 | "\n", 210 | "Walk through each of these operators carefully - you'll use them when you work with data in the next module.\n" 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "metadata": { 217 | "collapsed": true 218 | }, 219 | "outputs": [], 220 | "source": [ 221 | "# Try it:" 222 | ] 223 | }, 224 | { 225 | "cell_type": "markdown", 226 | "metadata": {}, 227 | "source": [ 228 | "

Activity - Programming basics

\n", 229 | "\n", 230 | "Open the **02_ProgrammingBasics.py** file and run the code you see there. The exercises will be marked out using comments:\n", 231 | "\n", 232 | "`# - Section Number`" 233 | ] 234 | }, 235 | { 236 | "cell_type": "code", 237 | "execution_count": null, 238 | "metadata": {}, 239 | "outputs": [], 240 | "source": [ 241 | "# 02_ProgrammingBasics.py\n", 242 | "# Purpose: General Programming exercises for Python \n", 243 | "# Author: Buck Woody\n", 244 | "# Credits and Sources: Inline\n", 245 | "# Last Updated: 27 June 2018\n", 246 | "\n", 247 | "# 2.1 Getting Help\n", 248 | "help()\n", 249 | "help(str)\n", 250 | "\n", 251 | "# - Write code to find help on help\n", 252 | "\n", 253 | "# 2.2 Code Syntax and Structure\n", 254 | "\n", 255 | "# - Python uses spaces to indicate code blocks. Fix the code below:\n", 256 | "x=10\n", 257 | "y=5\n", 258 | "if x > y:\n", 259 | "print(str(x) + \" is greater than \" + str(y))\n", 260 | "\n", 261 | "# - Arguments on first line are forbidden when not using vertical alignment. Fix this code:\n", 262 | "foo = long_function_name(var_one, var_two,\n", 263 | " var_three, var_four)\n", 264 | "\n", 265 | "# operators sit far away from their operands. Fix this code:\n", 266 | "income = (gross_wages +\n", 267 | " taxable_interest +\n", 268 | " (dividends - qualified_dividends) -\n", 269 | " ira_deduction -\n", 270 | " student_loan_interest)\n", 271 | "\n", 272 | "# - The import statement should use separate lines for each effort. You can fix the code below \n", 273 | "# using separate lines or by using the \"from\" statement:\n", 274 | "import sys, os\n", 275 | "\n", 276 | "# - The following code has extra spaces in the wrong places. 
Fix this code:\n", 277 | "i=i+1\n", 278 | "submitted +=1\n", 279 | "x = x * 2 - 1\n", 280 | "hypot2 = x * x + y * y\n", 281 | "c = (a + b) * (a - b)\n", 282 | "\n", 283 | "# 2.3 Variables \n", 284 | "\n", 285 | "# - Add a line below x=3 that changes the variable x from int to a string\n", 286 | "x=3\n", 287 | "type(x)\n", 288 | "\n", 289 | "# - Write code that prints the string \"This class is awesome\" using variables:\n", 290 | "x=\"is awesome\"\n", 291 | "y=\"This Class\"\n", 292 | "\n", 293 | "# 2.4 Operations and Functions\n", 294 | "\n", 295 | "# - Use some basic operators to write the following code:\n", 296 | "# Assign two variables\n", 297 | "# Add them\n", 298 | "# Subtract 20 from each, add those values together, save that to a new variable\n", 299 | "# Create a new string variable with the text \"The result of my operations are: \"\n", 300 | "# Print out a single string on the screen with the result of the variables \n", 301 | "# showing that result. \n", 302 | "\n", 303 | "# EOF: 02_ProgrammingBasics.py" 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "

For Further Study

\n", 311 | "\n", 312 | "- The PEP - https://www.python.org/dev/peps/pep-0008/\n", 313 | "- Introduction to the Python Coding Style - http://stackabuse.com/introduction-to-the-python-coding-style/\n", 314 | "- The Microsoft Tutorial and samples for Python - https://code.visualstudio.com/docs/languages/python \n", 315 | "- Coding requirements and standards - PEP - https://www.python.org/dev/peps/pep-0008/\n", 316 | "- Another free online self-paced course - https://www.w3schools.com/python/default.asp \n", 317 | "\n", 318 | "Next, Continue to *03 Working with Data*" 319 | ] 320 | }, 321 | { 322 | "cell_type": "code", 323 | "execution_count": null, 324 | "metadata": {}, 325 | "outputs": [], 326 | "source": [] 327 | } 328 | ], 329 | "metadata": { 330 | "kernelspec": { 331 | "display_name": "Python 3", 332 | "language": "python", 333 | "name": "python3" 334 | }, 335 | "language_info": { 336 | "codemirror_mode": { 337 | "name": "ipython", 338 | "version": 3 339 | }, 340 | "file_extension": ".py", 341 | "mimetype": "text/x-python", 342 | "name": "python", 343 | "nbconvert_exporter": "python", 344 | "pygments_lexer": "ipython3", 345 | "version": "3.6.5" 346 | } 347 | }, 348 | "nbformat": 4, 349 | "nbformat_minor": 2 350 | } 351 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/.ipynb_checkpoints/04 Environments and Deployment-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "## 04 Environments and Deployment\n", 14 | "\n", 15 | "

\n", 16 | "\n", 17 | "
\n", 18 | "
Course Outline
\n", 19 | "
1 - Overview and Course Setup
\n", 20 | "
2 - Programming Basics
\n", 21 | "
3 Working with Data
\n", 22 | "
4 Deployment and Environments (This section)
\n", 23 | "
4.1 pip and Conda
\n", 24 | "
4.2 Pickling
\n", 25 | "
\n", 26 | "\n", 27 | "

" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": {}, 33 | "source": [ 34 | "The main installation of Python - sometimes called \"Core\" or \"base\" - has a set of parameters it works with. Since it runs on many operating systems, these variables are set and altered in different ways. Here are the primary environment settings on the standard installation of Python:\n", 35 | "\n", 36 | "- PYTHONPATH - Sets the location for the Python interpreter to locate the module files imported into a program.\n", 37 | "- PYTHONHOME - Changes the location of the standard Python libraries. \n", 38 | "- PYTHONSTARTUP - The path of an initialization file (for example `.pythonrc.py`) containing Python source code. It is executed every time you start the interpreter in interactive mode.\n", 39 | "- PYTHONCASEOK - For the Windows OS, find the first case-insensitive match in an \"import\" statement.\n", 40 | "\n", 41 | "You can show all of the build and installation configuration variables by importing the base configuration system library and calling its `get_config_vars` function:\n", 42 | "\n", 43 | "`import sysconfig`\n", 44 | "\n", 45 | "`sysconfig.get_config_vars()`\n", 46 | "\n", 47 | "The result is a dictionary, so if you want to see just one variable, ask for it by name:\n", 48 | "\n", 49 | "`sysconfig.get_config_var('LIBDIR')`" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 1, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "# Try it:" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": [ 65 | "

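As a minimal sketch of the `sysconfig` calls described above, the following reads the configuration as a dictionary and then looks up single values. `LIBDIR` is a build-time variable that may not be defined on every platform, in which case `None` is returned, so the sketch also uses `sysconfig.get_paths()`, a separate standard-library call that is not mentioned in the text above. The environment variables such as `PYTHONPATH` are read from the operating system instead.

```python
import os
import sysconfig

# get_config_vars() returns a dictionary of build and install settings.
config = sysconfig.get_config_vars()
print(type(config).__name__, "with", len(config), "entries")

# A single value can be read by name; names that are not defined return None.
libdir = sysconfig.get_config_var("LIBDIR")
print("LIBDIR:", libdir)

# get_paths() reports where the standard library and packages live.
stdlib_dir = sysconfig.get_paths()["stdlib"]
print("Standard library:", stdlib_dir)

# Environment variables such as PYTHONPATH come from the OS, not sysconfig.
python_path = os.environ.get("PYTHONPATH", "(not set)")
print("PYTHONPATH:", python_path)
```

The values printed depend entirely on the machine and the Python distribution, which is exactly why isolating them in an environment is useful.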
4.1 pip and Conda

\n", 66 | "\n", 67 | "To install new packages, you can build the source code manually, but that's not the way it's most often done. Typically you use a \"package manager\", and the most popular is \"pip\". The pip program installs and configures most of the libraries you will need for the base installation of Python.\n", 68 | "\n", 69 | "You probably already have the pip program. However, to install pip, you can use the [cURL](https://curl.haxx.se/download.html) program to get it:\n", 70 | "\n", 71 | "`curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`\n", 72 | "\n", 73 | "Then use Python to run the script to install it:\n", 74 | "\n", 75 | "`python get-pip.py`\n", 76 | "\n", 77 | "From there, you can query the packages you have with this command, from the command-line in your operating system:\n", 78 | "\n", 79 | "`pip list`\n", 80 | "\n", 81 | "You can install a package using this command:\n", 82 | "\n", 83 | "`pip install SomePackage # latest version`\n", 84 | "\n", 85 | "`pip install SomePackage==1.0.4 # specific version`\n", 86 | "\n", 87 | "`pip install 'SomePackage>=1.0.4' # minimum version`\n", 88 | "\n", 89 | "And you can remove a package with this command:\n", 90 | "\n", 91 | "`pip uninstall SomePackage`\n", 92 | "\n", 93 | "There is a lot more that you can do with pip, and you can find out the list here:\n", 94 | "\n", 95 | "`pip`\n", 96 | "\n", 97 | "A more robust package manager, which even installs a distribution of Python for you along with other tools, is [Conda](https://conda.io/docs/user-guide/getting-started.html). For this course, you have installed Python using Conda, which not only has a package manager, but also isolates environments for you. This means that you can create a \"boundary\" of variables, package directories, and more around a name you specify. 
You can then switch to that environment to create your code, and that code will always have a consistent set of variables and packages.\n", 98 | "\n", 99 | "To create a Conda environment, issue the following command, supplying the name you want:\n", 100 | "\n", 101 | "`conda create --name ENVIRONMENT_NAME`\n", 102 | "\n", 103 | "For instance, this command creates a new environment called \"bucktest\" and installs the biology package called biopython:\n", 104 | "\n", 105 | "`conda create --name bucktest biopython`\n", 106 | "\n", 107 | "To see the environments, issue the following command:\n", 108 | "\n", 109 | "`conda info --envs`\n", 110 | "\n", 111 | "The one with the asterisk (*) is the one you are using now. To switch to another environment, issue the following command:\n", 112 | "\n", 113 | "`activate bucktest` (In Windows)\n", 114 | "\n", 115 | "`source activate bucktest` (Mac and Linux)\n", 116 | "\n", 117 | "And to see the packages installed in that environment, issue the following command:\n", 118 | "\n", 119 | "`conda list`\n", 120 | "\n", 121 | "or just `conda` to find out everything you can do with Conda.\n", 122 | "\n", 123 | "To install packages in that environment, use this command:\n", 124 | "\n", 125 | "`conda install biopython`" 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "metadata": {}, 131 | "source": [ 132 | "

Activity - pip and Conda

\n", 133 | "\n", 134 | "Now open the `/code/04_EnvironmentsAndDeployment.py` file and follow the instructions you see there for 4.1." 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 2, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "# 04_EnvironmentsAndDeployments.py\n", 144 | "# Purpose: Environmental settings and configurations\n", 145 | "# Author: Buck Woody\n", 146 | "# Credits and Sources: Inline\n", 147 | "# Last Updated: 07 July 2018\n", 148 | "\n", 149 | "# - 4.1 Show the main environment variables in the current Python environment. Which directory has the libraries?\n", 150 | "\n", 151 | "# - What else can you find in the sysconfig library? How would you find that out?\n", 152 | "\n", 153 | "# - Using conda commands, what libraries are currently loaded? \n", 154 | "# How would you install a new one? \n", 155 | "# What environment are you using now?" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": {}, 161 | "source": [ 162 | "

4.2 Pickling

\n", 163 | "\n", 164 | "\"Pickling\" in Python means to serialize a Python object. Perhaps that isn't very helpful - what it really means is to take the output of whatever you did in Python and make it available again in another environment or program. It's a way of saving the \"state\" of a program so that it can be transferred and then re-loaded.\n", 165 | "\n", 166 | "It's best illustrated with some code:\n", 167 | "\n", 168 | "`import pickle`\n", 169 | "\n", 170 | "`a = ['1','2','3']`\n", 171 | "\n", 172 | "`PickleFileName = \"picklefile\"`\n", 173 | "\n", 174 | "`FileObject = open(PickleFileName,'wb')`\n", 175 | "\n", 176 | "`pickle.dump(a,FileObject)`\n", 177 | "\n", 178 | "`FileObject.close()`\n", 179 | "\n", 180 | "Now you can copy that file to a new computer, open Python, and work with it again as if you ran it there. Pickle files are binary, so open the file in `'rb'` mode:\n", 181 | "\n", 182 | "`import pickle`\n", 183 | "\n", 184 | "`PickleFileName = \"picklefile\"`\n", 185 | "\n", 186 | "`FileObject = open(PickleFileName,'rb')`\n", 187 | "\n", 188 | "`b = pickle.load(FileObject)`\n", 189 | "\n", 190 | "`b`\n", 191 | "\n", 192 | "And now *b* holds the same contents that *a* did. Of course, your program would be much longer, most often a series of steps, which might for instance do a Machine Learning prediction. \n", 193 | "\n", 194 | "You can read a lot more about pickling here: https://wiki.python.org/moin/UsingPickle" 195 | ] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "metadata": {}, 200 | "source": [ 201 | "
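Here is a self-contained sketch of the same pickle round trip, using `with` blocks so that the files are closed automatically. Pickle data is binary, so the file is written with `'wb'` and read with `'rb'`. The temporary directory and the lower-case variable names are assumptions made for this sketch so that it cleans up after itself; in practice the two halves would live in separate programs. Only load pickle files you trust, because unpickling can run arbitrary code.

```python
import os
import pickle
import tempfile

a = ['1', '2', '3']

with tempfile.TemporaryDirectory() as folder:
    pickle_file_name = os.path.join(folder, "picklefile")

    # Serialize the list to disk in binary mode.
    with open(pickle_file_name, "wb") as file_object:
        pickle.dump(a, file_object)

    # Read it back, as a second program would.
    with open(pickle_file_name, "rb") as file_object:
        b = pickle.load(file_object)

print(b)
print(a == b)
```

The loaded object is a new list with the same contents as the original, not the original object itself.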

Activity - Pickle

\n", 202 | "\n", 203 | "Now open the `/code/04_EnvironmentsAndDeployment.py` file and follow the instructions you see there for step 4.2." 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": 4, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "# - 4.2 Create a program that has three text variables. Combine these three into another variable. \n", 213 | "# Load the pickle library and save the results of the first program as a pkl file.\n", 214 | "# Close the first program, and create another one that opens and reads the pkl file.\n", 215 | "# Combine the final variable from the last program with a new text variable from this program. \n", 216 | "\n", 217 | "# EOF: 04_EnvironmentsAndDeployment.py" 218 | ] 219 | }, 220 | { 221 | "cell_type": "markdown", 222 | "metadata": {}, 223 | "source": [ 224 | "

4.3 Docker and Flask

\n", 225 | "\n", 226 | "Two other abstraction levels are useful to think about. You're probably familiar with Virtual Machines, which use software to emulate hardware. This lets you install a complete new \"computer\" inside a computer's OS. One level up from that abstraction layer is a *Container*, such as those built with Docker. A Container doesn't emulate hardware or carry its own kernel - it shares the host's kernel and packages only the operating system files (most often Linux) needed to operate a runtime, like Python. This provides an even more consistent environment for your application, since it can also include settings and programs above the Python level. \n", 227 | "\n", 228 | "The *Flask* micro-framework for Python isn't technically an abstraction layer; it has more to do with serving your application up to a Web call. You'll often see Docker and Flask used together, so you'll cover both here for completeness. Once again, seeing some code is useful - this example comes from the Flask documentation site:\n", 229 | "\n", 230 | "
\n",
231 |     "\n",
232 |     "from flask import Flask\n",
233 |     "app = Flask(__name__)\n",
234 |     "\n",
235 |     "@app.route('/')\n",
236 |     "def hello_world():\n",
237 |     "    return 'Hello, World!'\n",
238 |     "\n",
239 |     "
\n", 240 | "\n", 241 | "You can probably follow the layout of this code, but there are some specifics here. First, the code imports Flask itself. Next, it creates an instance of a Flask app, called \"app\" in this case. From there, the `@app.route('/')` line maps the base URL - just like the main page of a web site - to the function that follows it. And finally, that simple function returns the words \"Hello, World!\".\n", 242 | "\n", 243 | "So far, nothing is happening - the code is just on disk. However, assuming you saved it as `hello.py`, you can run the code on a system that has Flask installed with these commands (in Linux):\n", 244 | "\n", 245 | "
\n",
246 |     "$ export FLASK_APP=hello.py\n",
247 |     "$ flask run\n",
248 |     " * Running on http://127.0.0.1:5000/\n",
249 |     "
\n", 250 | "\n", 251 | "OK...so what? Well, in this case, you could open a Web Browser on that system and type in that URL - and you'll see \"Hello, World!\" pop up on the screen. Of course, real applications are much more complicated: they handle POST and GET operations, and much more. But this is a very convenient way to serve up your Python application without having to tell your users to install and run Python.\n", 252 | "\n", 253 | "Of course, there's a lot more to both of these topics - read the references below to learn more." 254 | ] 255 | }, 256 | { 257 | "cell_type": "markdown", 258 | "metadata": {}, 259 | "source": [ 260 | "

For Further Study

\n", 261 | "\n", 262 | "- More on Docker: https://www.fullstackpython.com/docker.html\n", 263 | "- More on Flask: http://flask.pocoo.org/\n", 264 | "- Creating a simple Flask application: http://containertutorials.com/docker-compose/flask-simple-app.html \n", 265 | "\n", 266 | "Congratulations! You now know the basics of working with Python and Data. As you can see, there's a lot more to learn - so use your new knowledge to expand on what you have learned. " 267 | ] 268 | } 269 | ], 270 | "metadata": { 271 | "kernelspec": { 272 | "display_name": "Python 3", 273 | "language": "python", 274 | "name": "python3" 275 | }, 276 | "language_info": { 277 | "codemirror_mode": { 278 | "name": "ipython", 279 | "version": 3 280 | }, 281 | "file_extension": ".py", 282 | "mimetype": "text/x-python", 283 | "name": "python", 284 | "nbconvert_exporter": "python", 285 | "pygments_lexer": "ipython3", 286 | "version": "3.6.5" 287 | } 288 | }, 289 | "nbformat": 4, 290 | "nbformat_minor": 2 291 | } 292 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/00 Pre-Requisites.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "

00 Pre-Requisites

\n", 14 | "\n", 15 | "This \"Python for Data Professionals\" course is taught using [Jupyter Notebooks](https://notebooks.azure.com/help). You'll be able to run the code samples you see by typing in the Python examples as described and clicking the \"Run\" button you see at the top of the screen. \n", 16 | "\n", 17 | "For the most part, there are no pre-requisites for this course using a Notebook. However, if you would like to learn this material on your own machine, you'll need Microsoft Windows, SQL Server, and Visual Studio. You can of course use the Python language on many platforms, in other distributions, and with other tools, but using this configuration keeps the instruction consistent during this course. Feel free to use other installations after you complete the course.\n", 18 | "\n", 19 | "Read over this section and then proceed to the next notebook." 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "

Activity 1: Set up the Windows Operating System

\n", 27 | "\n", 28 | "You have three options for setting up Microsoft Windows to complete this course. You can use a Local installation of Windows, a Virtual Machine on your local system, or a Virtual Machine stored in a Cloud provider such as Microsoft Azure. *(The third option is only for classrooms where you have reliable connections to the Internet)*\n", 29 | "\n", 30 | "

Option 1 - Local Installation

\n", 31 | "\n", 32 | "- Install a recent version of Microsoft Windows. For this course, Windows 10 or any current version of Windows Server is acceptable.\n", 33 | "- Install all updates to the operating system.\n", 34 | "\n", 35 | "

Option 2 - Install Windows on a Local Virtual Machine Environment

\n", 36 | "\n", 37 | "- Using your local system, [navigate to this resource](https://developer.microsoft.com/en-us/windows/downloads/virtual-machines) and follow the instructions there.\n", 38 | "\n", 39 | "**NOTE: Wait as long as reasonably possible before setting up this Virtual Machine, so that it does not expire during the course - these are free licenses, but they have a time limit**\n", 40 | "\n", 41 | "- You can also use whatever Hypervisor you like for your system and install a legal, registered copy of Microsoft Windows.\n", 42 | "\n", 43 | "

Option 3 - Use a Virtual Machine in a Cloud Provider

\n", 44 | "\n", 45 | "- If you have access to the Internet, you can set up a [free Microsoft Azure Account](https://azure.microsoft.com/en-us/free/search/?&OCID=AID631184_SEM_bSHIQHtA&lnkd=Google_Azure_Brand&gclid=Cj0KCQjwpcLZBRCnARIsAMPBgF2myLWEk3Hllm2354GEs0rD1sDST_xcfkFGRdAE8toYZMalbQJ4M3YaAs9UEALw_wcB&dclid=CPDRgcv57tsCFVXE4Qodo-gLzg) and use a [Data Science Virtual Machine](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/provision-vm). Any size will do, and the free account provides enough resources for a single course. You will not need to install Anaconda, VSCode or SQL Server if you use this choice, as they are already installed for you.\n", 46 | "- Log in to the system and run [Windows Update](https://support.microsoft.com/en-us/help/4027667/windows-update-windows-10)\n", 47 | "\n" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": {}, 53 | "source": [ 54 | "

Activity 2: Install SQL Server 2017 with ML Services

\n", 55 | "\n", 56 | "- [Navigate to this resource](https://www.microsoft.com/en-us/sql-server/sql-server-downloads), Select **Developer** from the lower part of the page, and install the **Developer Edition**. Select all components for installation.\n", 57 | "\n", 58 | "- Run Windows Update and select the [\"Install updates for other products\" option](https://www.lifewire.com/how-to-change-windows-update-settings-2625778). Apply the latest updates to the classroom system.\n", 59 | "\n", 60 | "

Activity 3: Install Visual Studio with Machine Learning and Data Science workloads

\n", 61 | "\n", 62 | "- On your classroom system, [install Visual Studio 2017](https://www.visualstudio.com/downloads/) - The free Community Edition is adequate for this course.\n", 63 | "\n", 64 | "- During the installation, select the \"Data storage and processing\" and \"Data science and analytical applications\" Workloads. *(NOTE: [In the Data Science Workload installation box, select ALL optional components on the Summary pane!](https://blogs.msdn.microsoft.com/visualstudio/2016/11/18/data-science-workloads-in-visual-studio-2017-rc/))*\n", 65 | "\n", 66 | "- Log in with a Live ID to Visual Studio, let the system load, and apply any updates.\n", 67 | "\n", 68 | "- After the updates complete, click the \"R Tools\" menu item and open the \"Interactive R Window\" option (This will verify that the Data Science Workloads add-ins, R and Python, are working). Type the following in that panel to ensure the installation was successful:\n", 69 | "\n" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": {}, 76 | "outputs": [], 77 | "source": [ 78 | "x <- 10\n", 79 | "\n", 80 | "x\n" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": {}, 86 | "source": [ 87 | "You should see the result **\\[1\\] 10** returned. If not, open the Visual Studio Installer and select the \"Repair\" option." 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "

For Further Study

\n", 95 | "\n", 96 | "- Platforms supported: https://www.python.org/download/other/ \n", 97 | "\n", 98 | "- Installing Python: https://www.python.org/downloads/\n", 99 | "\n", 100 | "- Installing Python using Anaconda: https://www.infoworld.com/article/3267976/python/anaconda-cpython-pypy-and-more-know-your-python-distributions.html\n", 101 | "\n", 102 | "Next, Continue to *01 Overview and Course Setup*" 103 | ] 104 | } 105 | ], 106 | "metadata": { 107 | "kernelspec": { 108 | "display_name": "Python 3", 109 | "language": "python", 110 | "name": "python3" 111 | }, 112 | "language_info": { 113 | "codemirror_mode": { 114 | "name": "ipython", 115 | "version": 3 116 | }, 117 | "file_extension": ".py", 118 | "mimetype": "text/x-python", 119 | "name": "python", 120 | "nbconvert_exporter": "python", 121 | "pygments_lexer": "ipython3", 122 | "version": "3.6.5" 123 | } 124 | }, 125 | "nbformat": 4, 126 | "nbformat_minor": 2 127 | } 128 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/01 Overview and Setup.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "## 01 Overview and Setup\n", 14 | "\n", 15 | "In this course you'll cover the basics of the Python language and environment from a Data Professional's perspective. While you will learn Python, you'll quickly cover topics that have a lot more depth available. In each section you'll get more references to go deeper, which you should follow up on. Also watch for links within the text - click on each one to explore that topic.\n", 16 | "\n", 17 | "Make sure you check out the **00 Pre-Requisites** page before you start. 
You'll need all of the items loaded there before you can proceed with the course.\n", 18 | "\n", 19 | "You'll cover these topics in the course:\n", 20 | "\n", 21 | "

\n", 22 | "\n", 23 | "
\n", 24 | "
Course Outline
\n", 25 | "
1 - Overview and Course Setup (This section)
\n", 26 | "
2 - Programming Basics
\n", 27 | "
3 Working with Data
\n", 28 | "
4 Deployment and Environments
\n", 29 | "
\n", 30 | "\n", 31 | "

" 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "metadata": {}, 37 | "source": [ 38 | "

Overview

\n", 39 | "\n", 40 | "There are two main versions of Python - 2 and 3. So many programs were written for version 2 that it is still around, and version 3 was such an upgrade that programs for 2 don't always run in 3 and vice versa. For this course we'll do everything in version 3 - it's becoming the accepted standard for data professionals.\n", 41 | "\n", 42 | "You have a few ways of working with Python:\n", 43 | "\n", 44 | "- The Interactive Interpreter (Type `python` and the version number if it is in your path)\n", 45 | "- Writing code and running it in some graphical environment (Such as VSCode, Visual Studio, Spyder, PyCharm, IDLE, etc.)\n", 46 | "- Calling a `.py` script file from the `python` command \n", 47 | "\n", 48 | "When you're in command-mode, Python behaves more like a scripting language: each line runs as soon as you enter it, and the value of an expression is echoed back without a `print` call. Programming-mode looks like a standard programming language environment - you'll normally use that within an Integrated Development Environment (IDE)." 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "

Activity: Verify Your Installation and Configure Python

\n", 56 | "\n", 57 | "Open the **01_OverviewAndCourseSetup.py** file and run the code you see there. The exercises will be marked out using comments: \n", 58 | "\n", 59 | "
\n",
 60 |     "# TODO - Section Number\n",
 61 |     "
" 62 | ] 63 | }, 64 | { 65 | "cell_type": "code", 66 | "execution_count": 1, 67 | "metadata": {}, 68 | "outputs": [ 69 | { 70 | "ename": "SyntaxError", 71 | "evalue": "Missing parentheses in call to 'print'. Did you mean print(\"The Python Version is: \" python_version)? (, line 14)", 72 | "output_type": "error", 73 | "traceback": [ 74 | "\u001b[1;36m File \u001b[1;32m\"\"\u001b[1;36m, line \u001b[1;32m14\u001b[0m\n\u001b[1;33m print \"The Python Version is: \" python_version\u001b[0m\n\u001b[1;37m ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m Missing parentheses in call to 'print'. Did you mean print(\"The Python Version is: \" python_version)?\n" 75 | ] 76 | } 77 | ], 78 | "source": [ 79 | "# 01_OverviewAndCourseSetup.py\n", 80 | "# Purpose: Initial Course Setup and displaying versions\n", 81 | "# Author: Buck Woody\n", 82 | "# Credits and Sources: Inline\n", 83 | "# Last Updated: 27 June 2018\n", 84 | "\n", 85 | "# Check the Python Version and Information\n", 86 | "import platform\n", 87 | "python_version=platform.python_version()\n", 88 | "print(python_version)\n", 89 | "\n", 90 | "# - Fix this code so that it runs\n", 91 | "\n", 92 | "print \"The Python Version is: \" python_version\n", 93 | "\n", 94 | "# - Using \"platform\", what other information can you derive about this system?\n", 95 | "\n", 96 | "# EOF: 01_OverviewAndCourseSetup.py" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "metadata": {}, 102 | "source": [ 103 | "

For Further Study

\n", 104 | "\n", 105 | "- Version differences: https://wiki.python.org/moin/Python2orPython3 \n", 106 | "- Development Environments: IDLE, tk, VSCode, PyCharm, Jupyter Notebooks, Documentation, Training Resources: https://www.python.org/doc/\n", 107 | "- and https://docs.python.org/3/tutorial/index.html \n", 108 | "- The Official Python Documentation Course: https://docs.python.org/3/tutorial/index.html\n", 109 | "\n", 110 | "Next, Continue to *02 Programming Basics*" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": null, 116 | "metadata": {}, 117 | "outputs": [], 118 | "source": [] 119 | } 120 | ], 121 | "metadata": { 122 | "kernelspec": { 123 | "display_name": "Python 3", 124 | "language": "python", 125 | "name": "python3" 126 | }, 127 | "language_info": { 128 | "codemirror_mode": { 129 | "name": "ipython", 130 | "version": 3 131 | }, 132 | "file_extension": ".py", 133 | "mimetype": "text/x-python", 134 | "name": "python", 135 | "nbconvert_exporter": "python", 136 | "pygments_lexer": "ipython3", 137 | "version": "3.6.5" 138 | } 139 | }, 140 | "nbformat": 4, 141 | "nbformat_minor": 2 142 | } 143 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/02 Programming Basics.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "## 02 Programming Basics\n", 14 | "\n", 15 | "

\n", 16 | "\n", 17 | "
\n", 18 | "
Course Outline
\n", 19 | "
1 - Overview and Course Setup
\n", 20 | "
2 - Programming Basics (This section)
\n", 21 | "
2.1 - Getting help
\n", 22 | "
2.2 Code Syntax and Structure
\n", 23 | "
2.3 Variables
\n", 24 | "
2.4 Operations and Functions
\n", 25 | "
3 Working with Data
\n", 26 | "
4 Deployment and Environments
\n", 27 | "
\n", 28 | "\n", 29 | "

" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "## Programming Basics Overview\n", 37 | "\n", 38 | "From here on out, you'll focus on using Python in programming mode - you'll write code that you run from an IDE or a calling environment, not interactively from the command-line. As you work through this explanation, copy the code you see and run it to see the results. After you work through these copy-and-paste examples, you'll create your own code in the Activities that follow each section." 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "

2.1 - Getting help

\n", 46 | "\n", 47 | "The very first thing you should learn in any language is how to get help. You can [find the help documents on-line](https://docs.python.org/3/index.html), or simply type\n", 48 | " \n", 49 | "`help()`\n", 50 | " \n", 51 | "in your code. For help on a specific topic, put the topic in the parentheses:\n", 52 | " \n", 53 | " `help(str)`\n", 54 | "\n", 55 | " To see a list of topics, pass the word as a string: \n", 56 | "\n", 57 | " `help('topics')`" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "metadata": {}, 64 | "outputs": [], 65 | "source": [ 66 | "# Try it:\n", 67 | "help('topics')" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "
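A small sketch of getting help from inside a script. `help()` prints its output, and topic names are passed as quoted strings. The second half uses `pydoc.render_doc`, a standard-library function that is not mentioned in the text above, to get the same help text back as a string so that you can search or trim it.

```python
import pydoc

# Topic and keyword names are strings, so they need quotes.
help('topics')

# render_doc returns the help text instead of printing it.
doc_text = pydoc.render_doc(str.upper, renderer=pydoc.plaintext)
first_lines = doc_text.splitlines()[:4]
print("\n".join(first_lines))
```

Passing an object such as `str` or `str.upper` works without quotes because those names already exist in Python; a bare word like `topics` does not, which is why it must be quoted.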

2.2 Code Syntax and Structure

\n", 75 | "\n", 76 | "Let's cover a few basics about how Python code is written. (For a full discussion, check out the [Style Guide for Python, called PEP 8](https://www.python.org/dev/peps/pep-0008/) ) Let's use the \"Zen of Python\" rules from Tim Peters for this course:\n", 77 | "\n", 78 | "
\n",
 79 |     "\n",
 80 |     "    Beautiful is better than ugly.\n",
 81 |     "    Explicit is better than implicit.\n",
 82 |     "    Simple is better than complex.\n",
 83 |     "    Complex is better than complicated.\n",
 84 |     "    Flat is better than nested.\n",
 85 |     "    Sparse is better than dense.\n",
 86 |     "    Readability counts.\n",
 87 |     "    Special cases aren't special enough to break the rules.\n",
 88 |     "    Although practicality beats purity.\n",
 89 |     "    Errors should never pass silently.\n",
 90 |     "    Unless explicitly silenced.\n",
 91 |     "    In the face of ambiguity, refuse the temptation to guess.\n",
 92 |     "    There should be one-- and preferably only one --obvious way to do it.\n",
 93 |     "    Although that way may not be obvious at first unless you're Dutch.\n",
 94 |     "    Now is better than never.\n",
 95 |     "    Although never is often better than right now.\n",
 96 |     "    If the implementation is hard to explain, it's a bad idea.\n",
 97 |     "    If the implementation is easy to explain, it may be a good idea.\n",
 98 |     "    Namespaces are one honking great idea -- let's do more of those!\n",
 99 |     "    --Tim Peters\n",
100 |     "\n",
101 |     "
\n", 102 | "\n", 103 | "In general, use standard coding practices - don't use keywords for variables, be consistent in your naming (camel-case, lower-case, etc.), comment your code clearly, understand the general syntax of your language, and follow the principles above. But the most important tip is to at least read PEP 8 and decide for yourself how well that fits into your Zen.\n", 104 | "\n", 105 | "There is one hard-and-fast rule for Python that you *do* need to be aware of: indentation. You **must** indent the body of your classes, functions (or methods), loops, and conditions. You can use a tab or four spaces (spaces are the accepted way to do it) but in any case, you have to be consistent. If you use tabs, you always use tabs. If you use spaces, you have to use them throughout. It's best if you set your IDE to handle that for you, whichever way you go.\n", 106 | "\n", 107 | "Python code files have an extension of `.py`. \n", 108 | "\n", 109 | "Comments in Python start with the hash-tag: `#`. There are no block comments (and this makes us all sad) so each line you want to comment must have a tag in front of that line. Keep the lines short (80 characters or so) so that they don't fall off a single-line display like at the command line." 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "metadata": {}, 115 | "source": [ 116 | "
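To make the indentation and comment rules concrete, here is a minimal sketch that uses four spaces per level. The function name `describe` and its contents are illustrative only.

```python
def describe(values):
    """Return a short description for each number in values."""
    results = []
    for value in values:            # the loop body is indented one level
        if value % 2 == 0:          # the condition body is indented again
            results.append(str(value) + " is even")
        else:
            results.append(str(value) + " is odd")
    return results


# Each comment line needs its own hash mark;
# there is no block-comment syntax.
for line in describe([1, 2, 3]):
    print(line)
```

Removing the indentation from any of the bodies above makes Python raise an `IndentationError` before the program runs at all.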

2.3 Variables

\n", 117 | "\n", 118 | "Variables stand in for replaceable values. Python is dynamically typed, meaning you can just declare a variable name and set it to a value at the same time, and Python will work out the data type from the value you assign. You use an `=` sign to assign values, and `==` to compare things.\n", 119 | "\n", 120 | "Quotes \\\" or ticks \\' are fine, just be consistent.\n", 121 | "\n", 122 | "`# There are some keywords to be aware of, but x and y are always good choices.`\n", 123 | "\n", 124 | "`x = \"Buck\" # I'm a string.`\n", 125 | "\n", 126 | "`type(x)`\n", 127 | "\n", 128 | "`y = 10 # I'm an integer.`\n", 129 | "\n", 130 | "`type(y)`\n", 131 | "\n", 132 | "To change the type a variable holds, just assign something else to it:\n", 133 | "\n", 134 | "`x = \"Buck\" # I'm a string.`\n", 135 | "\n", 136 | "`type(x)`\n", 137 | "\n", 138 | "`x = 10 # Now I'm an integer.`\n", 139 | "\n", 140 | "`type(x)`\n", 141 | "\n", 142 | "Or cast it by explicitly declaring the conversion:\n", 143 | "\n", 144 | "`x = \"10\"`\n", 145 | "\n", 146 | "`type(x)`\n", 147 | "\n", 148 | "`print(int(x))`\n", 149 | "\n", 150 | "To concatenate string values, use the `+` sign:\n", 151 | "\n", 152 | "`x = \"Buck\"`\n", 153 | "\n", 154 | "`y = \" Woody\"`\n", 155 | "\n", 156 | "`print(x + y)`" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": null, 162 | "metadata": {}, 163 | "outputs": [], 164 | "source": [ 165 | "# Try it:\n", 166 | "x = \"Buck\" # I'm a string.\n", 167 | "\n", 168 | "type(x)\n", 169 | "\n", 170 | "x = 10 # Now I'm an integer.\n", 171 | "\n", 172 | "type(x)" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": {}, 178 | "source": [ 179 | "
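The variable behavior covered in 2.3 can be tried in one short script. This is only a minimal sketch in Python 3 syntax; names such as `text_number` and `label` are made up for illustration and are not part of the course files.

```python
# Rebinding a name changes the type of the value it refers to.
x = "Buck"
print(type(x))      # the name x refers to a str
x = 10
print(type(x))      # the same name now refers to an int

# Explicit conversion (casting) with int() and str().
text_number = "10"
as_int = int(text_number)
print(as_int + 5)

# The + sign concatenates strings, so numbers are converted with str() first.
first = "Buck"
last = " Woody"
full_name = first + last
label = full_name + " is " + str(as_int)
print(label)
```

Trying `full_name + as_int` without the `str()` call raises a `TypeError`, which is a quick way to see that Python does not silently mix strings and integers.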

2.4 Operations and Functions

\n", 180 | "\n", 181 | "Python has the following operators:\n", 182 | "\n", 183 | " Arithmetic Operators\n", 184 | " Comparison (Relational) Operators\n", 185 | " Assignment Operators\n", 186 | " Logical Operators\n", 187 | " Bitwise Operators\n", 188 | " Membership Operators\n", 189 | " Identity Operators\n", 190 | "\n", 191 | "You have the standard operators and functions from most every language. Here are some of the tokens:\n", 192 | "\n", 193 | "
\n",
194 |     "\n",
195 |     "    !=                  *=                  <<                  ^  \n",
196 |     "    \"                   +                   <<=                 ^= \n",
197 |     "    \"\"\"                 +=                  <=                  `\n",
198 |     "    %                   ,                   <>                  __\n",
199 |     "    %=                  -                   ==                     \n",
200 |     "    &                   -=                  >                   b\" \n",
201 |     "    &=                  .                   >=                  b' \n",
202 |     "    '                   ...                 >>                  j  \n",
203 |     "    '''                 /                   >>=                 r\" \n",
204 |     "    (                   //                  @                   r' \n",
205 |     "    )                   //=                 J                   |'\n",
206 |     "    *                   /=                  [                   |= \n",
207 |     "    **                  :                   \\                   ~  \n",
208 |     "    **=                 <                   ]                      \n",
209 |     "\n",
210 |     "
\n", 211 | "\n", 212 | "Wait...that's it? That's all you're going to tell me? *(Hint: use what you've learned):*\n", 213 | "\n", 214 | "`help('symbols')`\n", 215 | "\n", 216 | "Walk through each of these operators carefully - you'll use them when you work with data in the next module.\n" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": null, 222 | "metadata": { 223 | "collapsed": true 224 | }, 225 | "outputs": [], 226 | "source": [ 227 | "# Try it:" 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "
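As a starting point for walking through the operator categories listed in 2.4, here is a small sketch with one or two operators from each group. The variable names are illustrative only, and the comments show the values you would expect from these specific inputs.

```python
a, b = 7, 2

# Arithmetic operators
total = a + b        # 9
quotient = a / b     # 3.5 (true division)
floor_q = a // b     # 3 (floor division)
remainder = a % b    # 1
power = a ** b       # 49

# Comparison and logical operators
is_bigger = a > b and a != b     # True

# Assignment operators
count = 0
count += 1           # count is now 1

# Bitwise operators (7 is 0b111, 2 is 0b010)
bits_and = a & b     # 2
shifted = a << 1     # 14

# Membership and identity operators
names = ["Buck", "Woody"]
has_buck = "Buck" in names       # True
nothing = None
is_none = nothing is None        # True

print(total, quotient, floor_q, remainder, power)
print(is_bigger, count, bits_and, shifted, has_buck, is_none)
```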

Activity - Programming basics

\n", 235 | "\n", 236 | "Open the **02_ProgrammingBasics.py** file and run the code you see there. The exercises will be marked out using comments:\n", 237 | "\n", 238 | "`# - Section Number`" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": null, 244 | "metadata": {}, 245 | "outputs": [], 246 | "source": [ 247 | "# 02_ProgrammingBasics.py\n", 248 | "# Purpose: General Programming exercises for Python \n", 249 | "# Author: Buck Woody\n", 250 | "# Credits and Sources: Inline\n", 251 | "# Last Updated: 27 June 2018\n", 252 | "\n", 253 | "# 2.1 Getting Help\n", 254 | "help()\n", 255 | "help(str)\n", 256 | "\n", 257 | "# - Write code to find help on help\n", 258 | "\n", 259 | "# 2.2 Code Syntax and Structure\n", 260 | "\n", 261 | "# - Python uses spaces to indicate code blocks. Fix the code below:\n", 262 | "x=10\n", 263 | "y=5\n", 264 | "if x > y:\n", 265 | "print(str(x) + \" is greater than \" + str(y))\n", 266 | "\n", 267 | "# - Arguments on first line are forbidden when not using vertical alignment. Fix this code:\n", 268 | "foo = long_function_name(var_one, var_two,\n", 269 | " var_three, var_four)\n", 270 | "\n", 271 | "# operators sit far away from their operands. Fix this code:\n", 272 | "income = (gross_wages +\n", 273 | " taxable_interest +\n", 274 | " (dividends - qualified_dividends) -\n", 275 | " ira_deduction -\n", 276 | " student_loan_interest)\n", 277 | "\n", 278 | "# - The import statement should use separate lines for each effort. You can fix the code below \n", 279 | "# using separate lines or by using the \"from\" statement:\n", 280 | "import sys, os\n", 281 | "\n", 282 | "# - The following code has extra spaces in the wrong places. 
Fix this code:\n", 283 | "i=i+1\n", 284 | "submitted +=1\n", 285 | "x = x * 2 - 1\n", 286 | "hypot2 = x * x + y * y\n", 287 | "c = (a + b) * (a - b)\n", 288 | "\n", 289 | "# 2.3 Variables \n", 290 | "\n", 291 | "# - Add a line below x=3 that changes the variable x from int to a string\n", 292 | "x=3\n", 293 | "type(x)\n", 294 | "\n", 295 | "# - Write code that prints the string \"This class is awesome\" using variables:\n", 296 | "x=\"is awesome\"\n", 297 | "y=\"This Class\"\n", 298 | "\n", 299 | "# 2.4 Operations and Functions\n", 300 | "\n", 301 | "# - Use some basic operators to write the following code:\n", 302 | "# Assign two variables\n", 303 | "# Add them\n", 304 | "# Subtract 20 from each, add those values together, save that to a new variable\n", 305 | "# Create a new string variable with the text \"The result of my operations are: \"\n", 306 | "# Print out a single string on the screen with the result of the variables \n", 307 | "# showing that result. \n", 308 | "\n", 309 | "# EOF: 02_ProgrammingBasics.py" 310 | ] 311 | }, 312 | { 313 | "cell_type": "markdown", 314 | "metadata": {}, 315 | "source": [ 316 | "

For Further Study

\n", 317 | "\n", 318 | "- The PEP - https://www.python.org/dev/peps/pep-0008/\n", 319 | "- Introduction to the Python Coding Style - http://stackabuse.com/introduction-to-the-python-coding-style/\n", 320 | "- The Microsoft Tutorial and samples for Python - https://code.visualstudio.com/docs/languages/python \n", 321 | "- Coding requirements and standards - PEP - https://www.python.org/dev/peps/pep-0008/\n", 322 | "- Another free online self-paced course - https://www.w3schools.com/python/default.asp \n", 323 | "\n", 324 | "Next, Continue to *03 Working with Data*" 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": null, 330 | "metadata": {}, 331 | "outputs": [], 332 | "source": [] 333 | } 334 | ], 335 | "metadata": { 336 | "kernelspec": { 337 | "display_name": "Python 3", 338 | "language": "python", 339 | "name": "python3" 340 | }, 341 | "language_info": { 342 | "codemirror_mode": { 343 | "name": "ipython", 344 | "version": 3 345 | }, 346 | "file_extension": ".py", 347 | "mimetype": "text/x-python", 348 | "name": "python", 349 | "nbconvert_exporter": "python", 350 | "pygments_lexer": "ipython3", 351 | "version": "3.6.5" 352 | } 353 | }, 354 | "nbformat": 4, 355 | "nbformat_minor": 2 356 | } 357 | -------------------------------------------------------------------------------- /PythonForDataProfessionals/notebooks/04 Environments and Deployment.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "![](../graphics/solutions-microsoft-logo-small.png)\n", 10 | "\n", 11 | "# Python for Data Professionals\n", 12 | "\n", 13 | "## 04 Environments and Deployment\n", 14 | "\n", 15 | "

\n", 16 | "\n", 17 | "
\n", 18 | "
Course Outline
\n", 19 | "
1 - Overview and Course Setup
\n", 20 | "
2 - Programming Basics
\n", 21 | "
3 Working with Data
\n", 22 | "
4 Deployment and Environments (This section)
\n", 23 | "
4.1 Conda
\n", 24 | "
4.2 Pickling
\n", 25 | "
\n", 26 | "\n", 27 | "

" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": {}, 33 | "source": [ 34 | "The main installation of Python - sometimes called \"Core\" or \"base\" - has a set of parameters it works with. Since it runs on many operating systems, these variables are set and altered in different ways. Here are the primary environment settings on the standard installation of Python:\n", 35 | "\n", 36 | "- PYTHONPATH - Sets the location for the Python interpreter to locate the module files imported into a program.\n", 37 | "- PYTHONHOME - The alternative module search path. \n", 38 | "- PYTHONSTARTUP - The initialization file path ( `.pythonrc.py` ) containing the Python source code. It is executed every time you start the interpreter.\n", 39 | "- PYTHONCASEOK - For the Windows OS, find the first case-insensitive match in an \"import\" statement.\n", 40 | "\n", 41 | "You can show all of the variables by importing the base configuration system library, and then calling a print statement:\n", 42 | "\n", 43 | "`import sysconfig`\n", 44 | "\n", 45 | "`sysconfig.get_config_vars()`\n", 46 | "\n", 47 | "If you want to see just one variable, remember, it's just an array:\n", 48 | "\n", 49 | "`sysconfig.get_config_var('LIBDIR')`" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 1, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "# Try it:" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": [ 65 | "

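The `sysconfig` calls described before this section can be explored a little further from a script. The sketch below assumes a standard CPython install; `get_config_vars()` returns a regular dictionary, and a key such as `LIBDIR` may come back as `None` on some platforms (Windows, for example). The variable names are illustrative.

```python
import os
import sysconfig

# get_config_vars() returns a dictionary of build and install settings.
config = sysconfig.get_config_vars()
print(type(config).__name__, "with", len(config), "entries")

# Show a handful of entries rather than the whole dictionary.
for name in sorted(config)[:5]:
    print(name, "=", config[name])

# Look up a single value; this can be None if the key is not defined
# for your platform.
libdir = sysconfig.get_config_var("LIBDIR")
print("LIBDIR:", libdir)

# get_path() reports install locations, such as the standard library folder.
stdlib_path = sysconfig.get_path("stdlib")
print("Standard library:", stdlib_path)

# Environment variables such as PYTHONPATH are read through os.environ.
python_path = os.environ.get("PYTHONPATH", "<not set>")
print("PYTHONPATH:", python_path)
```

Running `help(sysconfig)` lists the rest of what the module offers.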
4.1 pip and Conda

\n", 66 | "\n", 67 | "To install new packages, you can build the source code manually, but that's not the way it's most often done. Typically you use a \"package manager\", and the most popular is \"pip\". The pip program installs and configures most of the libraries you will need for the base installation of Python.\n", 68 | "\n", 69 | "You probably already have the pip program. However, to install pip, you can use the [cURL](https://curl.haxx.se/download.html) program to get it:\n", 70 | "\n", 71 | "`curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`\n", 72 | "\n", 73 | "Then use Python to run the script to install it:\n", 74 | "\n", 75 | "`python get-pip.py`\n", 76 | "\n", 77 | "From there, you can query the packages you have with this command, from the command-line in your operating system:\n", 78 | "\n", 79 | "`pip list`\n", 80 | "\n", 81 | "You can install a package using this command:\n", 82 | "\n", 83 | "`pip install SomePackage # latest version`\n", 84 | "\n", 85 | "`pip install SomePackage==1.0.4 # specific version`\n", 86 | "\n", 87 | "`pip install 'SomePackage>=1.0.4' # minimum version`\n", 88 | "\n", 89 | "And you can remove a package with this command:\n", 90 | "\n", 91 | "`pip uninstall SomePackage`\n", 92 | "\n", 93 | "There is a lot more that you can do with pip, and you can find out the list here:\n", 94 | "\n", 95 | "`pip`\n", 96 | "\n", 97 | "A more robust package manager, which even installs a distribution of Python for you along with other tools, is [Conda](https://conda.io/docs/user-guide/getting-started.html). For this course, you have installed Python using Conda, which not only has a package manager, but also isolates environments for you. This means that you can create a \"boundary\" of variables, package directories, and more around a name you specify. 
You can then switch to that environment to create your code, and that code will always have a consistent set of variables and packages.\n", 98 | "\n", 99 | "To create a Conda environment, issue the following command:\n", 100 | "\n", 101 | "`conda create --name`\n", 102 | "\n", 103 | "For instance, this command creates a new environment called \"bucktest\" and installs the biology package called biopython:\n", 104 | "\n", 105 | "`conda create --name bucktest biopython`\n", 106 | "\n", 107 | "To see the environments, issue the following command:\n", 108 | "\n", 109 | "`conda info --envs`\n", 110 | "\n", 111 | "The one with the asterisk (*) is the one you are using now. To switch to another environment, issue the following command:\n", 112 | "\n", 113 | "`activate bucktest` (In Windows)\n", 114 | "\n", 115 | "`source activate bucktest` (Mac and Linux)\n", 116 | "\n", 117 | "And to see information about that environment, issue the following command:\n", 118 | "\n", 119 | "`conda list`\n", 120 | "\n", 121 | "or just `conda` to find out everything you can do with Conda.\n", 122 | "\n", 123 | "To install packages in that environment, use this command:\n", 124 | "\n", 125 | "`conda install biopython`" 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "metadata": {}, 131 | "source": [ 132 | "
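Once an environment is active, you can also confirm from inside Python which interpreter and packages it is using. This is a rough sketch, assuming Python 3.8 or later (for `importlib.metadata`); it only reads information and does not replace `conda list` or `pip list`.

```python
import sys
from importlib import metadata

# The interpreter that is running, and the root folder of its environment.
print("Interpreter:", sys.executable)
print("Environment prefix:", sys.prefix)

# Collect the installed distributions visible to this interpreter.
installed = {}
for dist in metadata.distributions():
    name = dist.metadata["Name"]
    if name:  # skip entries with incomplete metadata
        installed[name] = dist.version

# Print the first few, sorted by name without regard to case.
for name in sorted(installed, key=str.lower)[:10]:
    print(name, installed[name])
```

If the prefix does not point at the environment you expected, the environment was probably not activated in the shell that launched Python.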

Activity - pip and Conda

\n", 133 | "\n", 134 | "Now open the `/code/04_EnvironmentsAndDeployment.py` file and follow the instructions you see there for 4.1." 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 2, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "# 04_EnvironmentsAndDeployments.py\n", 144 | "# Purpose: Environmental settings and configurations\n", 145 | "# Author: Buck Woody\n", 146 | "# Credits and Sources: Inline\n", 147 | "# Last Updated: 07 July 2018\n", 148 | "\n", 149 | "# - 4.1 Show the main environment variables in the current Python environment. Which directory has the libraries?\n", 150 | "\n", 151 | "# - What else can you find in the sysconfig library? How would you find that out?\n", 152 | "\n", 153 | "# - Using conda commands, what libraries are currently loaded? \n", 154 | "# How would you install a new one? \n", 155 | "# What environment are you using now?" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": {}, 161 | "source": [ 162 | "

4.2 Pickling

\n", 163 | "\n", 164 | "\"Pickling\" in Python means to serialize a Python object. Perhaps that isn't very helpful - what it really means is to take the output of whatever you did in Python and make it available again in another environment or program. It's a way of saving the \"state\" of a program so that it can be transferred and then re-loaded.\n", 165 | "\n", 166 | "It's best illustrated with some code:\n", 167 | "\n", 168 | "`import pickle`\n", 169 | "\n", 170 | "`a = ['1','2','3']`\n", 171 | "\n", 172 | "`PickleFileName = \"picklefile\"`\n", 173 | "\n", 174 | "`FileObject = open(PickleFileName,'wb')`\n", 175 | "\n", 176 | "`pickle.dump(a,FileObject)`\n", 177 | "\n", 178 | "`FileObject.close()`\n", 179 | "\n", 180 | "Now you can copy that file to a new computer, open Python, and work with it again as if you ran it there. Note that the file must be opened in binary mode (`'rb'`) to read it:\n", 181 | "\n", 182 | "`import pickle`\n", 183 | "\n", 184 | "`PickleFileName = \"picklefile\"`\n", 185 | "\n", 186 | "`FileObject = open(PickleFileName,'rb') ` \n", 187 | "\n", 188 | "`b = pickle.load(FileObject) ` \n", 189 | "\n", 190 | "`b`\n", 191 | "\n", 192 | "And now *a* equals *b*. Of course, your program would be much longer, most often a series of steps, which might for instance do a Machine Learning prediction. \n", 193 | "\n", 194 | "You can read a lot more about pickling here: https://wiki.python.org/moin/UsingPickle" 195 | ] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "metadata": {}, 200 | "source": [ 201 | "
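The save-and-reload round trip from 4.2 can also be written as a single script. This is a minimal sketch, not the course's own solution: it uses `with` blocks so the files are closed automatically, and it writes to a temporary folder (an assumption made here so the example cleans up after itself).

```python
import os
import pickle
import tempfile

a = ['1', '2', '3']

with tempfile.TemporaryDirectory() as folder:
    pickle_path = os.path.join(folder, "picklefile.pkl")

    # Serialize the list to disk; pickle files are written in binary mode.
    with open(pickle_path, "wb") as file_object:
        pickle.dump(a, file_object)

    # Load it back, again in binary mode.
    with open(pickle_path, "rb") as file_object:
        b = pickle.load(file_object)

print(b)
print(a == b)   # the contents match
print(a is b)   # but b is a new object, not the original list
```

One caution that applies to any use of pickle: only load pickle files from sources you trust, because loading a pickle can execute code.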

Activity - Pickle

\n", 202 | "\n", 203 | "Now open the `/code/04_EnvironmentsAndDeployment.py` file and follow the instructions you see there for step 4.2." 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": 4, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "# - 4.2 Create a program that has three text variables. Combine these three into another variable. \n", 213 | "# Load the pickle library and save the results of the first program as a pkl file.\n", 214 | "# Close the first program, and create another one that opens and reads the pkl file.\n", 215 | "# Combine the final variable from the last program with a new text variable from this program. \n", 216 | "\n", 217 | "# EOF: 04_EnvironmentsAndDeployment.py" 218 | ] 219 | }, 220 | { 221 | "cell_type": "markdown", 222 | "metadata": {}, 223 | "source": [ 224 | "

4.3 Docker and Flask

\n", 225 | "\n", 226 | "Two other abstraction levels are useful to think about. You're probably familiar with Virtual Machines - which use software to emulate hardware. This lets you install a complete new \"computer\" in a computer's OS. One level up from that abstraction layer is a *Container*. A Container does not emulate hardware; instead it shares the host's kernel and packages just enough of an operating system (most often Linux) to operate a runtime - like Python. This provides an even more consistent environment for your application, since it can also include settings and programs above the Python level. \n", 227 | "\n", 228 | "The *Flask* micro-framework for Python isn't technically an abstraction layer; it has more to do with serving your application up to a Web call. You'll often see Docker and Flask used together, so you'll cover both here for completeness. Once again, seeing some code is useful to understand - this example comes from the documentation site:\n", 229 | "\n", 230 | "
\n",
231 |     "\n",
232 |     "from flask import Flask\n",
233 |     "app = Flask(__name__)\n",
234 |     "\n",
235 |     "@app.route('/')\n",
236 |     "def hello_world():\n",
237 |     "    return 'Hello, World!'\n",
238 |     "\n",
239 |     "
\n", 240 | "\n", 241 | "You can probably follow the layout of this code, but there are some specifics here. First, the code imports Flask itself. Next, the code creates an instance of a Flask app, called \"app\" in this case. From there, the route is set to the base URL - just as with the home page of a web site. And finally, a simple function returns the words \"Hello, World!\".\n", 242 | "\n", 243 | "So far, nothing is happening - the code is just on disk. However, if the code is saved as `hello.py`, you can run it on a system with these commands (in Linux):\n", 244 | "\n", 245 | "
\n",
246 |     "$ export FLASK_APP=hello.py\n",
247 |     "$ flask run\n",
248 |     " * Running on http://127.0.0.1:5000/\n",
249 |     "
\n", 250 | "\n", 251 | "OK...so what? Well, in this case, you could open a Web Browser on that system and type in that URL - and you'll see \"Hello, World!\" pop up on the screen. Of course, real applications are much more complicated, can take POST and GET operations, and much more. But this is a very convenient way to serve up your Python application without having to tell your users to install and run Python.\n", 252 | "\n", 253 | "Of course, there's a lot more to both of these topics - read the references below to learn more." 254 | ] 255 | }, 256 | { 257 | "cell_type": "markdown", 258 | "metadata": {}, 259 | "source": [ 260 | "

For Further Study

\n", 261 | "\n", 262 | "- More on Docker: https://www.fullstackpython.com/docker.html\n", 263 | "- More on Flask: http://flask.pocoo.org/\n", 264 | "- Creating a simple Flask application: http://containertutorials.com/docker-compose/flask-simple-app.html \n", 265 | "\n", 266 | "Congratulations! You now know the basics of working with Python and Data. As you can see, there's a lot more to learn - so use your new knowledge to expand on what you have learned. " 267 | ] 268 | } 269 | ], 270 | "metadata": { 271 | "kernelspec": { 272 | "display_name": "Python 3", 273 | "language": "python", 274 | "name": "python3" 275 | }, 276 | "language_info": { 277 | "codemirror_mode": { 278 | "name": "ipython", 279 | "version": 3 280 | }, 281 | "file_extension": ".py", 282 | "mimetype": "text/x-python", 283 | "name": "python", 284 | "nbconvert_exporter": "python", 285 | "pygments_lexer": "ipython3", 286 | "version": "3.6.5" 287 | } 288 | }, 289 | "nbformat": 4, 290 | "nbformat_minor": 2 291 | } 292 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ![](graphics/microsoftlogo.png) 2 | 3 | # Lab: Python Basics for Data Professionals 4 | 5 | #### A Microsoft Course from the SQL Server team 6 | 7 |

8 | 9 |
10 | 11 |
About this lab
12 |
Business Applications of this lab
13 |
Technologies used in this lab
14 |
Before Taking this lab
15 |
lab Details
16 |
Related labs
17 |
Lab Modules
18 |
Next Steps
19 | 20 |
21 | 22 |

About this lab

23 | 24 | > NOTE: This course is in active re-development. [The course files are complete, and located here](https://github.com/microsoft/sqlworkshops-pythonfordatapros/tree/master/PythonForDataProfessionals), but this page is currently being worked on. 25 | 26 | Welcome to this Microsoft solutions lab on *Python Basics for the Data Professional*. In this lab, you'll learn basic Python structures, programming and data flow. You'll get resources to go much further in your learning journey, but this short lab will get you up and running quickly. 27 | 28 | The focus of this lab is to familiarize the database professional with the basics of Python, while implementing it in SQL Server Stored Procedures using SQL Server's Machine Learning Services. After this basic introduction, the professional can move on to more in-depth training in Python if desired. 29 | 30 | You'll start by setting up your system to work with Python, then move to understanding the course itself. From there, you will move through programming basics, working with data, and then on to understanding the concepts of Python environments and how to deploy Python code. 31 | 32 | This [github README.MD file](https://lab.github.com/githubtraining/introduction-to-github) explains how the workshop is laid out, what you will learn, and the technologies you will use in this solution. To download this Lab to your local computer, click the **Clone or Download** button you see at the top right side of this page. [More about that process is here](https://help.github.com/en/github/creating-cloning-and-archiving-repositories/cloning-a-repository). 33 | 34 | You can view all of the [courses and other labs our team has created at this link - open in a new tab to find out more.](https://microsoft.github.io/sqllabs/) 35 | 36 |

37 | 38 |

Learning Objectives

39 | 40 | In this lab you'll learn: 41 | - How to set up a Python environment for SQL Server using Machine Learning Services 42 | - The Basics of programming in Python including code syntax, getting help, variables, operators, and functions 43 | - Working with data structures, and understanding popular data libraries 44 | - Data Ingestion and access 45 | - Machine Learning in Python 46 | - Environments and code deployment 47 | 48 |
49 | 50 | The goal of this lab is to familiarize the data professional with Python environments and programming. 51 | 52 | The concepts and skills taught in this lab form the starting points for: 53 | 54 | - Data professionals who wish to include Python code in their data access and programming 55 | - Security professionals who wish to understand how to implement secure Python coding practices 56 | - Anyone interested in learning more about programming with Python and databases 57 | 58 |

59 | 60 |

Business Applications of this lab

61 | 62 | Businesses require the ability to securely access their data for many workloads, including various programming languages. Python (along with the R language) has emerged as a powerful tool for data ingestion, processing and analysis. Previously, Python programmers accessed various databases and retrieved data over a network connection like any application, but this often meant pulling large amounts of data over a potentially insecure network to bring multiple copies to each developer to work with locally. The SQL Server Machine Learning Services feature allows Python code to run inside a Stored Procedure in SQL Server, which then accesses data directly. This also allows the Python developer to create code locally, and then send that code on to the Database Administrator for installation on the server - the developer never has to touch the production server or data. 63 | 64 | This course explains how to work with Python, and then how to operationalize the code on a SQL Server. 65 | 66 | 67 |

68 | 69 |

Technologies used in this lab

70 | 71 | The solution includes the following technologies - although you are not limited to these, they form the basis of the lab. At the end of the lab you will learn how to extrapolate these components into other solutions. You will cover these at an overview level, with references to much deeper training provided. 72 | 73 | 74 | 75 | 76 | 77 |
Technology Description
Python*An Open-Source, multiple paradigm coding language with extensible packages
Microsoft SQL Server*A complete data platform, including a Relational Database Management System (RDBMS), Data Pipeline, Business Intelligence, Graph Database Processing, and other constructs to work securely with multiple forms of data, including structured, semi-structured and unstructured.
78 | 79 |

80 | 81 |

82 | 83 |

Before Taking this lab

84 | 85 | You'll need a local system that you are able to install software on. The lab demonstrations use Microsoft Windows as an operating system and all examples use Windows for the lab. Optionally, you can use a Microsoft Azure Virtual Machine (VM) to install the software on and work with the solution. 86 | 87 | This lab expects that you understand data structures and working with SQL Server and computer networks. This lab does not expect you to have any prior data science knowledge, but a basic knowledge of programming and statistics is helpful. 88 | 89 | If you are new to these, here are a few references you can complete prior to class: 90 | 91 | - [Microsoft SQL Server](https://docs.microsoft.com/en-us/sql/relational-databases/database-engine-tutorials?view=sql-server-ver15) 92 | - [Microsoft Azure](https://docs.microsoft.com/en-us/learn/paths/azure-fundamentals/) 93 | - [Basic Programming](https://www.khanacademy.org/computing/computer-programming/programming/intro-to-programming/v/programming-intro) 94 | 95 |

Setup

96 | 97 | A full prerequisites document is located here. These instructions should be completed before the lab starts, since you will not have time to cover these in class. Remember to turn off any Virtual Machines from the Azure Portal when not taking the class so that you do not incur charges (shutting down the machine in the VM itself is not sufficient). 98 | 99 |

100 | 101 |

lab Details

102 | 103 | This lab uses the Microsoft Windows operating system, although Linux is also supported once you have completed the exercises. 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 |
Primary Audience:Data Professionals tasked with implementing Big Data, Machine Learning and AI solutions
Secondary Audience: Security Architects and Developers
Level: 300
Type:In-Person
Length: 8-9 hours
114 | 115 |

116 | 117 |

Related labs

118 | 119 | - This course is also available in a zero-install, online Jupyter Notebook format. [You can find that here](https://notebooks.azure.com/BuckWoodyNoteBooks/projects/PythonDataProfessional). 120 | 121 |

122 | 123 |

lab Modules

124 | 125 | This is a modular lab, and in each section, you'll learn concepts, technologies and processes to help you complete the solution. 126 | 127 | 128 | 129 | 130 | 131 | 132 | 133 | 134 | 135 | 136 |
ModuleTopics
01 - Overview and Course Setup In this Module you will cover an overview of the Python language and set up your system for the course.
02 - Programming Basics This Module covers the commands and procedures for getting help in Python, code syntax and structure, variables, and operators and functions.
03 - Working with Data In this Module you will learn more about data types, ingestion, inspection, and graphing, with a brief introduction to Data Science with Python.
04 - Environments and Deployment In this Module you will learn more about Python environments such as Conda, and how to deploy your code using the "pickle" library.
137 | 138 |

139 | 140 |

Next Steps

141 | 142 | Next, Continue to 00 - Prerequisites 143 | 144 | 145 | # Contributing 146 | 147 | This project welcomes contributions and suggestions. Most contributions require you to agree to a 148 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us 149 | the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com. 150 | 151 | When you submit a pull request, a CLA bot will automatically determine whether you need to provide 152 | a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions 153 | provided by the bot. You will only need to do this once across all repos using our CLA. 154 | 155 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 156 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or 157 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. 158 | 159 | # Legal Notices 160 | 161 | ### License 162 | Microsoft and any contributors grant you a license to the Microsoft documentation and other content in this repository under the [Creative Commons Attribution 4.0 International Public License](https://creativecommons.org/licenses/by/4.0/legalcode), see [the LICENSE file](https://github.com/MicrosoftDocs/mslearn-tailspin-spacegame-web/blob/master/LICENSE), and grant you a license to any code in the repository under [the MIT License](https://opensource.org/licenses/MIT), see the [LICENSE-CODE file](https://github.com/MicrosoftDocs/mslearn-tailspin-spacegame-web/blob/master/LICENSE-CODE). 163 | 164 | Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation 165 | may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. 
166 | The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. 167 | Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653. 168 | 169 | Privacy information can be found at https://privacy.microsoft.com/en-us/ 170 | 171 | Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, 172 | or trademarks, whether by implication, estoppel or otherwise. 173 | 174 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## Security 4 | 5 | Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/). 6 | 7 | If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below. 8 | 9 | ## Reporting Security Issues 10 | 11 | **Please do not report security vulnerabilities through public GitHub issues.** 12 | 13 | Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report). 14 | 15 | If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). 
If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey). 16 | 17 | You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc). 18 | 19 | Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue: 20 | 21 | * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.) 22 | * Full paths of source file(s) related to the manifestation of the issue 23 | * The location of the affected source code (tag/branch/commit or direct URL) 24 | * Any special configuration required to reproduce the issue 25 | * Step-by-step instructions to reproduce the issue 26 | * Proof-of-concept or exploit code (if possible) 27 | * Impact of the issue, including how an attacker might exploit the issue 28 | 29 | This information will help us triage your report more quickly. 30 | 31 | If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs. 32 | 33 | ## Preferred Languages 34 | 35 | We prefer all communications to be in English. 36 | 37 | ## Policy 38 | 39 | Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd). 
40 | 41 | 42 | -------------------------------------------------------------------------------- /graphics/AnalyticsAreas.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/AnalyticsAreas.png -------------------------------------------------------------------------------- /graphics/DataScience.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/DataScience.png -------------------------------------------------------------------------------- /graphics/MLCapabilities.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/MLCapabilities.png -------------------------------------------------------------------------------- /graphics/MatPlotLib.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/MatPlotLib.png -------------------------------------------------------------------------------- /graphics/SmallBuck.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/SmallBuck.png -------------------------------------------------------------------------------- /graphics/aml-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/aml-logo.png 
-------------------------------------------------------------------------------- /graphics/brain.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/brain.png -------------------------------------------------------------------------------- /graphics/check.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/check.png -------------------------------------------------------------------------------- /graphics/checkbox.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/checkbox.png -------------------------------------------------------------------------------- /graphics/checkmark.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/checkmark.jpg -------------------------------------------------------------------------------- /graphics/cortanalogo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/cortanalogo.png -------------------------------------------------------------------------------- /graphics/files.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/files.jpg 
-------------------------------------------------------------------------------- /graphics/ggplot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/ggplot.png -------------------------------------------------------------------------------- /graphics/keyboard.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/keyboard.jpg -------------------------------------------------------------------------------- /graphics/microsoftlogo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/microsoftlogo.png -------------------------------------------------------------------------------- /graphics/pin.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/pin.jpg -------------------------------------------------------------------------------- /graphics/solutions-microsoft-logo-small.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/solutions-microsoft-logo-small.png -------------------------------------------------------------------------------- /graphics/tdsp.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/tdsp.png 
-------------------------------------------------------------------------------- /graphics/thinking.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/sqlworkshops-pythonfordatapros/e7a9fa3dadd492872812c32dc5c030359c3f3905/graphics/thinking.jpg --------------------------------------------------------------------------------