├── speech.wav ├── salaries.csv ├── README.md ├── 9-2_SciKit-Learn-Overview.ipynb ├── 6-3_Numpy_review.ipynb ├── 7-Pandas_Intro.ipynb ├── 7-1-Pandas_review.ipynb ├── 6-2_Numpy_2D.ipynb ├── 5-1_Dictionaries_v5.ipynb ├── 7-2_Intro_Data_Analysis.ipynb ├── 02-print.ipynb ├── 7-3_Data_Wrangling.ipynb ├── 01-Intro.ipynb ├── 03-String operations.ipynb └── 6-1_Numpy_1D.ipynb /speech.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/python-labs/HEAD/speech.wav -------------------------------------------------------------------------------- /salaries.csv: -------------------------------------------------------------------------------- 1 | Name,Salary,Age 2 | John,50000,34 3 | Sally,120000,45 4 | Alyssa,80000,27 -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # python-labs 2 | SRTTU: Labs for python 3 | 4 |
5 | 6 | This page belongs to the **Python Lab** course at **Shahid Rajaee Teacher Training University**, for the Bahman term of the 96-97 academic year. 7 |
8 | 9 | |Course number and group|Time| 10 | |----|---| 11 | | ۰۱-۰۲۷-۹۸-۹۰ |Tuesday 17:30-19:30| 12 | | ۰۲-۰۲۷-۹۸-۹۰ |Wednesday 17:30-19:30 | 13 | 14 |
15 | 16 | ***Which version of Python do we use?*** 17 |
18 | Two versions, 2 and 3, are currently in common use. This term's code will be based on ***Python 3***. 19 | 20 | ***What topics will we cover?*** 21 |
22 | The proposed syllabus for this term is listed below. 23 |
24 | We hope to cover at least everything through the end of Module 7; the remaining topics will depend on the pace of the class and on student interest. 25 |
26 | 27 | ***Course Syllabus:*** 28 | 29 | Module 1 - Python Basics 30 | 31 | How to install Python? 32 | Your first program 33 | Input and output 34 | Variables 35 | Operators 36 | Types 37 | Expressions and Variables 38 | String Operations 39 | 40 | Module 2 - Python Data Structures & Containers 41 | 42 | Lists 43 | Tuples 44 | Sets 45 | Dictionaries 46 | 47 | Module 3 - Python Programming Fundamentals 48 | 49 | Conditions and Branching 50 | Loops 51 | Functions 52 | Objects and Classes 53 | Functions and Packages 54 | 55 | Module 4 - Numpy 56 | 57 | Vectorization 58 | Arrays 59 | Array indexing 60 | Datatypes 61 | Array math 62 | Broadcasting 63 | 64 | Module 5 - Visualization with Matplotlib 65 | 66 | Plotting 67 | Subplots 68 | Images 69 | 70 | Module 6 - Working with Data in Python 71 | 72 | Reading files with open 73 | Writing files with open 74 | Loading data with Pandas 75 | Working with and Saving data with Pandas 76 | 77 | 78 | Module 7 - Importing Datasets 79 | 80 | Learning Objectives 81 | Understanding the Domain 82 | Understanding the Dataset 83 | Python package for data science 84 | Importing and Exporting Data in Python 85 | Basic Insights from Datasets 86 | 87 | Module 8 - Cleaning the Data 88 | 89 | Identify and Handle Missing Values 90 | Data Formatting 91 | Data Normalization Sets 92 | Binning 93 | Indicator variables 94 | 95 | Module 9 - Summarizing the Data Frame 96 | 97 | Descriptive Statistics 98 | Basics of Grouping 99 | ANOVA 100 | Correlation 101 | More on Correlation 102 | 103 | Module 10 - Model Development 104 | 105 | Simple and Multiple Linear Regression 106 | Model Evaluation Using Visualization 107 | Polynomial Regression and Pipelines 108 | R-squared and MSE for In-Sample Evaluation 109 | Prediction and Decision Making 110 | 111 | Module 11 - Model Evaluation 112 | 113 | Model Evaluation 114 | Over-fitting, Under-fitting and Model Selection 115 | Ridge Regression 116 | Grid Search 117 | Model Refinement 118 | 119 | Module 12 - More 
about Visualization Tools 120 | 121 | Introduction to Data Visualization 122 | Introduction to Matplotlib 123 | Basic Plotting with Matplotlib 124 | Dataset on Immigration to Canada 125 | Line Plots 126 | Area Plots 127 | Histograms 128 | Bar Charts 129 | Pie Charts 130 | Box Plots 131 | Scatter Plots 132 | Bubble Plots 133 | 134 | Module 13 - Advanced Visualization Tools 135 | 136 | Waffle Charts 137 | Word Clouds 138 | Seaborn and Regression Plots 139 | 140 | Module 14 - SciPy 141 | 142 | Image operations 143 | MATLAB files 144 | Distance between points 145 | 146 | Module 15 - Intro to Unit Testing and Test Driven Development in Python 147 | 148 | Overview of Unit Testing and Test Driven Development 149 | XUnit 150 | Assert Statements and Exceptions 151 | Test Fixtures 152 | PyTest Command Line Arguments 153 | 154 | 155 | 156 | 157 | -------------------------------------------------------------------------------- /9-2_SciKit-Learn-Overview.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "# SciKit Learn Preprocessing Overview" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": null, 20 | "metadata": { 21 | "collapsed": true 22 | }, 23 | "outputs": [], 24 | "source": [ 25 | "import numpy as np" 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": null, 31 | "metadata": { 32 | "collapsed": true 33 | }, 34 | "outputs": [], 35 | "source": [ 36 | "from sklearn.preprocessing import MinMaxScaler" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": { 43 | "collapsed": true 44 | }, 45 | "outputs": [], 46 | "source": [ 47 | "data = np.random.randint(0,100,(10,2))" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": null, 53 | 
"metadata": { 54 | "collapsed": false 55 | }, 56 | "outputs": [], 57 | "source": [ 58 | "data" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": { 65 | "collapsed": true 66 | }, 67 | "outputs": [], 68 | "source": [ 69 | "scaler_model = MinMaxScaler()" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": { 76 | "collapsed": false 77 | }, 78 | "outputs": [], 79 | "source": [ 80 | "scaler_model.fit(data)" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": { 87 | "collapsed": false 88 | }, 89 | "outputs": [], 90 | "source": [ 91 | "scaler_model.transform(data)" 92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": null, 97 | "metadata": { 98 | "collapsed": false 99 | }, 100 | "outputs": [], 101 | "source": [ 102 | "# In one step\n", 103 | "result = scaler_model.fit_transform(data)" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": { 110 | "collapsed": false 111 | }, 112 | "outputs": [], 113 | "source": [ 114 | "result" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": { 121 | "collapsed": false 122 | }, 123 | "outputs": [], 124 | "source": [ 125 | "import pandas as pd" 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": null, 131 | "metadata": { 132 | "collapsed": true 133 | }, 134 | "outputs": [], 135 | "source": [ 136 | "data = pd.DataFrame(data=np.random.randint(0,101,(50,4)),columns=['f1','f2','f3','label'])" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": null, 142 | "metadata": { 143 | "collapsed": false 144 | }, 145 | "outputs": [], 146 | "source": [ 147 | "data.head()" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": null, 153 | "metadata": { 154 | "collapsed": false 155 | }, 156 | "outputs": [], 157 | "source": [ 158 | "x = 
data[['f1','f2','f3']] # Alternatively x = data.drop('label',axis=1)\n", 159 | "y = data['label']" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": null, 165 | "metadata": { 166 | "collapsed": false 167 | }, 168 | "outputs": [], 169 | "source": [ 170 | "from sklearn.model_selection import train_test_split" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "metadata": { 177 | "collapsed": true 178 | }, 179 | "outputs": [], 180 | "source": [ 181 | "X_train, X_test, y_train, y_test = train_test_split(x,y,test_size=0.3,random_state=101)" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": null, 187 | "metadata": { 188 | "collapsed": false 189 | }, 190 | "outputs": [], 191 | "source": [ 192 | "X_train.shape" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "metadata": { 199 | "collapsed": false 200 | }, 201 | "outputs": [], 202 | "source": [ 203 | "X_test.shape" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": { 210 | "collapsed": false 211 | }, 212 | "outputs": [], 213 | "source": [ 214 | "y_train.shape" 215 | ] 216 | }, 217 | { 218 | "cell_type": "code", 219 | "execution_count": null, 220 | "metadata": { 221 | "collapsed": false 222 | }, 223 | "outputs": [], 224 | "source": [ 225 | "y_test.shape" 226 | ] 227 | }, 228 | { 229 | "cell_type": "markdown", 230 | "metadata": {}, 231 | "source": [ 232 | "## Great Job!" 233 | ] 234 | }, 235 | { 236 | "cell_type": "markdown", 237 | "metadata": { 238 | "collapsed": true 239 | }, 240 | "source": [ 241 | "
\n", 242 | "
Shahid Rajaee Teacher Training University
Python Workshop
Alireza Akhavanpour
96-97
\n", 243 | "
\n", 244 | "SRTTU.edu - PyLab.AkhavanPour.ir - AkhavanPour.ir\n", 245 | "
" 246 | ] 247 | } 248 | ], 249 | "metadata": { 250 | "anaconda-cloud": {}, 251 | "kernelspec": { 252 | "display_name": "Python [conda env:keras]", 253 | "language": "python", 254 | "name": "conda-env-keras-py" 255 | }, 256 | "language_info": { 257 | "codemirror_mode": { 258 | "name": "ipython", 259 | "version": 3 260 | }, 261 | "file_extension": ".py", 262 | "mimetype": "text/x-python", 263 | "name": "python", 264 | "nbconvert_exporter": "python", 265 | "pygments_lexer": "ipython3", 266 | "version": "3.5.4" 267 | } 268 | }, 269 | "nbformat": 4, 270 | "nbformat_minor": 2 271 | } 272 | -------------------------------------------------------------------------------- /6-3_Numpy_review.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": null, 13 | "metadata": { 14 | "collapsed": true 15 | }, 16 | "outputs": [], 17 | "source": [ 18 | "import numpy as np" 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "## Creating Arrays" 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": null, 31 | "metadata": { 32 | "collapsed": true 33 | }, 34 | "outputs": [], 35 | "source": [ 36 | "my_list = [0,1,2,3,4]" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": { 43 | "collapsed": false 44 | }, 45 | "outputs": [], 46 | "source": [ 47 | "type(my_list)" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": { 53 | "collapsed": true 54 | }, 55 | "source": [ 56 | "# arr = np.array(my_list)" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": { 63 | "collapsed": false 64 | }, 65 | "outputs": [], 66 | "source": [ 67 | "np.arange(0,10)" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | 
"metadata": { 74 | "collapsed": false 75 | }, 76 | "outputs": [], 77 | "source": [ 78 | "np.arange(0,10,2)" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": { 85 | "collapsed": false 86 | }, 87 | "outputs": [], 88 | "source": [ 89 | "np.zeros((5,5))+5\n" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": { 96 | "collapsed": false 97 | }, 98 | "outputs": [], 99 | "source": [ 100 | "np.ones((2,4))*7" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": null, 106 | "metadata": { 107 | "collapsed": false 108 | }, 109 | "outputs": [], 110 | "source": [ 111 | "np.random.randint(0,10)" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 117 | "metadata": { 118 | "collapsed": false 119 | }, 120 | "outputs": [], 121 | "source": [ 122 | "np.random.randint(0,10,(3,3))" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": null, 128 | "metadata": { 129 | "collapsed": false 130 | }, 131 | "outputs": [], 132 | "source": [ 133 | "np.linspace(0,10,6)" 134 | ] 135 | }, 136 | { 137 | "cell_type": "code", 138 | "execution_count": null, 139 | "metadata": { 140 | "collapsed": false 141 | }, 142 | "outputs": [], 143 | "source": [ 144 | "np.linspace(0,10,101)" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "metadata": {}, 150 | "source": [ 151 | "## Operations" 152 | ] 153 | }, 154 | { 155 | "cell_type": "code", 156 | "execution_count": null, 157 | "metadata": { 158 | "collapsed": true 159 | }, 160 | "outputs": [], 161 | "source": [ 162 | "np.random.seed(101) # watch video for details\n", 163 | "arr = np.random.randint(0,100,10)" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": null, 169 | "metadata": { 170 | "collapsed": false 171 | }, 172 | "outputs": [], 173 | "source": [ 174 | "arr" 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": null, 180 | 
"metadata": { 181 | "collapsed": true 182 | }, 183 | "outputs": [], 184 | "source": [ 185 | "arr2 = np.random.randint(0,100,10)" 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": null, 191 | "metadata": { 192 | "collapsed": false 193 | }, 194 | "outputs": [], 195 | "source": [ 196 | "arr2" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": null, 202 | "metadata": { 203 | "collapsed": false 204 | }, 205 | "outputs": [], 206 | "source": [ 207 | "arr.max()" 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": null, 213 | "metadata": { 214 | "collapsed": false 215 | }, 216 | "outputs": [], 217 | "source": [ 218 | "arr.min()" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": null, 224 | "metadata": { 225 | "collapsed": false 226 | }, 227 | "outputs": [], 228 | "source": [ 229 | "arr.mean()" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 235 | "metadata": { 236 | "collapsed": false 237 | }, 238 | "outputs": [], 239 | "source": [ 240 | "arr.argmin()" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": null, 246 | "metadata": { 247 | "collapsed": false 248 | }, 249 | "outputs": [], 250 | "source": [ 251 | "arr.argmax()" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": null, 257 | "metadata": { 258 | "collapsed": false 259 | }, 260 | "outputs": [], 261 | "source": [ 262 | "arr.reshape(2,5)" 263 | ] 264 | }, 265 | { 266 | "cell_type": "markdown", 267 | "metadata": {}, 268 | "source": [ 269 | "## Indexing" 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": null, 275 | "metadata": { 276 | "collapsed": true 277 | }, 278 | "outputs": [], 279 | "source": [ 280 | "mat = np.arange(0,100).reshape(10,10)" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": null, 286 | "metadata": { 287 | "collapsed": false 288 | }, 289 | "outputs": [], 290 | 
"source": [ 291 | "mat" 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": null, 297 | "metadata": { 298 | "collapsed": true 299 | }, 300 | "outputs": [], 301 | "source": [ 302 | "row = 0\n", 303 | "col = 1" 304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "metadata": { 310 | "collapsed": false 311 | }, 312 | "outputs": [], 313 | "source": [ 314 | "mat[row,col]" 315 | ] 316 | }, 317 | { 318 | "cell_type": "code", 319 | "execution_count": null, 320 | "metadata": { 321 | "collapsed": false 322 | }, 323 | "outputs": [], 324 | "source": [ 325 | "# With Slices\n", 326 | "mat[:,col:col+1]" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": null, 332 | "metadata": { 333 | "collapsed": false 334 | }, 335 | "outputs": [], 336 | "source": [ 337 | "mat[row,:]" 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": null, 343 | "metadata": { 344 | "collapsed": false 345 | }, 346 | "outputs": [], 347 | "source": [ 348 | "mat[0:3,0:3]" 349 | ] 350 | }, 351 | { 352 | "cell_type": "markdown", 353 | "metadata": {}, 354 | "source": [ 355 | "## Masking" 356 | ] 357 | }, 358 | { 359 | "cell_type": "code", 360 | "execution_count": null, 361 | "metadata": { 362 | "collapsed": false 363 | }, 364 | "outputs": [], 365 | "source": [ 366 | "a=mat > 50" 367 | ] 368 | }, 369 | { 370 | "cell_type": "code", 371 | "execution_count": null, 372 | "metadata": { 373 | "collapsed": false 374 | }, 375 | "outputs": [], 376 | "source": [ 377 | "mat[mat<30]\n" 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": {}, 383 | "source": [ 384 | "# Great Job!" 
385 | ] 386 | } 387 | ], 388 | "metadata": { 389 | "anaconda-cloud": {}, 390 | "kernelspec": { 391 | "display_name": "Python [conda env:virtual_platform]", 392 | "language": "python", 393 | "name": "conda-env-virtual_platform-py" 394 | }, 395 | "language_info": { 396 | "codemirror_mode": { 397 | "name": "ipython", 398 | "version": 3 399 | }, 400 | "file_extension": ".py", 401 | "mimetype": "text/x-python", 402 | "name": "python", 403 | "nbconvert_exporter": "python", 404 | "pygments_lexer": "ipython3", 405 | "version": "3.5.4" 406 | } 407 | }, 408 | "nbformat": 4, 409 | "nbformat_minor": 2 410 | } 411 | -------------------------------------------------------------------------------- /7-Pandas_Intro.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "colab_type": "text", 7 | "id": "view-in-github" 8 | }, 9 | "source": [ 10 | "\"Open" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": { 16 | "colab_type": "text", 17 | "id": "8GLCM_xrxJdc" 18 | }, 19 | "source": [ 20 | "Let's start by importing Pandas. Click on a cell to activate it. Press Shift + Enter to execute. Cells should be executed in order." 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 0, 26 | "metadata": { 27 | "colab": {}, 28 | "colab_type": "code", 29 | "id": "U_5zMEGZL96I" 30 | }, 31 | "outputs": [], 32 | "source": [ 33 | "import pandas as pd" 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "metadata": { 39 | "colab_type": "text", 40 | "id": "7qQ1n2fkx234" 41 | }, 42 | "source": [ 43 | "## Load / Create Data Frame\n", 44 | "We can read CSV or excel files to import data. 
We can also create dataframes programmatically like this:" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 0, 50 | "metadata": { 51 | "colab": {}, 52 | "colab_type": "code", 53 | "id": "yyFj31jcMmH6" 54 | }, 55 | "outputs": [], 56 | "source": [ 57 | "df = pd.DataFrame({'Artist':['Billie Holiday','Jimi Hendrix', 'Miles Davis', 'SIA'],\n", 58 | " 'Genre': ['Jazz', 'Rock', 'Jazz', 'Pop'],\n", 59 | " 'Listeners': [1300000, 2700000, 1500000, 2000000],\n", 60 | " 'Plays': [27000000, 70000000, 48000000, 74000000]})" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": { 66 | "colab_type": "text", 67 | "id": "wXj4OuOrxwJm" 68 | }, 69 | "source": [ 70 | " The variable `df` now contains a dataframe. Executing this cell shows the contents of the dataframe." 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": 0, 76 | "metadata": { 77 | "colab": {}, 78 | "colab_type": "code", 79 | "id": "WF4_LVYOO__m" 80 | }, 81 | "outputs": [], 82 | "source": [ 83 | "df" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": { 89 | "colab_type": "text", 90 | "id": "6LUcXFMKyF6z" 91 | }, 92 | "source": [ 93 | "## Selection\n", 94 | "We can select any column using its label:" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": 0, 100 | "metadata": { 101 | "colab": {}, 102 | "colab_type": "code", 103 | "id": "sTNHtQz3Mu_K" 104 | }, 105 | "outputs": [], 106 | "source": [ 107 | "df['Artist']" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": { 113 | "colab_type": "text", 114 | "id": "UeH1ClC3yOJy" 115 | }, 116 | "source": [ 117 | "We can select one or multiple rows using their numbers (inclusive of both bounding row numbers):" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": 0, 123 | "metadata": { 124 | "colab": {}, 125 | "colab_type": "code", 126 | "id": "41MpPCPwM0AR" 127 | }, 128 | "outputs": [], 129 | "source": [ 130 | "df.loc[1:3]" 131 | ] 132 | }, 133 
| { 134 | "cell_type": "code", 135 | "execution_count": 0, 136 | "metadata": { 137 | "colab": {}, 138 | "colab_type": "code", 139 | "id": "fhOMUtTNPnc0" 140 | }, 141 | "outputs": [], 142 | "source": [ 143 | "# Exercise: Select only row #0\n", 144 | "\n" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "metadata": { 150 | "colab_type": "text", 151 | "id": "7eeFjuAUyTDj" 152 | }, 153 | "source": [ 154 | "We can select any slice of the table by combining column labels and row numbers with loc:" 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": 0, 160 | "metadata": { 161 | "colab": {}, 162 | "colab_type": "code", 163 | "id": "p7EPN96RPHQt" 164 | }, 165 | "outputs": [], 166 | "source": [ 167 | "df.loc[1:3,['Artist']]" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 0, 173 | "metadata": { 174 | "colab": {}, 175 | "colab_type": "code", 176 | "id": "ov_6TtCvPubJ" 177 | }, 178 | "outputs": [], 179 | "source": [ 180 | "# Exercise: Select 'Artist' and 'Plays' for rows #1 and #2\n" 181 | ] 182 | }, 183 | { 184 | "cell_type": "markdown", 185 | "metadata": { 186 | "colab_type": "text", 187 | "id": "GeuS62DoyXlJ" 188 | }, 189 | "source": [ 190 | "## Filtering\n", 191 | "Now it gets more interesting. We can easily filter rows using the values of a specific column. 
For example, here are our jazz musicians:" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": 0, 197 | "metadata": { 198 | "colab": {}, 199 | "colab_type": "code", 200 | "id": "ax8qDMHfPQmP" 201 | }, 202 | "outputs": [], 203 | "source": [ 204 | "df[df['Genre'] == \"Jazz\" ]" 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": 0, 210 | "metadata": { 211 | "colab": {}, 212 | "colab_type": "code", 213 | "id": "Surgh6XVRmlT" 214 | }, 215 | "outputs": [], 216 | "source": [ 217 | "# Exercise: Select the row where the Genre is 'Rock'\n", 218 | "\n" 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": { 224 | "colab_type": "text", 225 | "id": "a33J488lyeFj" 226 | }, 227 | "source": [ 228 | "Here are the artists who have more than 1,800,000 listeners:" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 0, 234 | "metadata": { 235 | "colab": {}, 236 | "colab_type": "code", 237 | "id": "H1Ld-RXQPTLL" 238 | }, 239 | "outputs": [], 240 | "source": [ 241 | "df[df['Listeners'] > 1800000 ]" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": 0, 247 | "metadata": { 248 | "colab": {}, 249 | "colab_type": "code", 250 | "id": "8XLaLBKORva4" 251 | }, 252 | "outputs": [], 253 | "source": [ 254 | "# Exercise: Select the rows where 'Plays' is less than 50,000,000\n", 255 | "\n" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": { 261 | "colab_type": "text", 262 | "id": "MPeYsXStyuwA" 263 | }, 264 | "source": [ 265 | "## Grouping" 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": 0, 271 | "metadata": { 272 | "colab": {}, 273 | "colab_type": "code", 274 | "id": "qPG_jFSdPMF5" 275 | }, 276 | "outputs": [], 277 | "source": [ 278 | "df.groupby('Genre').sum()" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": 0, 284 | "metadata": { 285 | "colab": {}, 286 | "colab_type": "code", 287 | "id": 
"kb5smli-PVE3" 288 | }, 289 | "outputs": [], 290 | "source": [ 291 | "# Exercise: Group by Genre, use mean() as the aggregation function \n" 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": 0, 297 | "metadata": { 298 | "colab": {}, 299 | "colab_type": "code", 300 | "id": "g2IFgCF4Rg48" 301 | }, 302 | "outputs": [], 303 | "source": [ 304 | "# Exercise: Group by Genre, use max() as the aggregation function \n" 305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "source:\n", 312 | " https://github.com/jalammar/pandas-intro/blob/master/Pandas_Intro.ipynb" 313 | ] 314 | } 315 | ], 316 | "metadata": { 317 | "colab": { 318 | "collapsed_sections": [], 319 | "include_colab_link": true, 320 | "name": "Pandas Intro.ipynb", 321 | "provenance": [], 322 | "version": "0.3.2" 323 | }, 324 | "kernelspec": { 325 | "display_name": "Python 3", 326 | "language": "python", 327 | "name": "python3" 328 | }, 329 | "language_info": { 330 | "codemirror_mode": { 331 | "name": "ipython", 332 | "version": 3 333 | }, 334 | "file_extension": ".py", 335 | "mimetype": "text/x-python", 336 | "name": "python", 337 | "nbconvert_exporter": "python", 338 | "pygments_lexer": "ipython3", 339 | "version": "3.8.5" 340 | } 341 | }, 342 | "nbformat": 4, 343 | "nbformat_minor": 1 344 | } 345 | -------------------------------------------------------------------------------- /7-1-Pandas_review.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 3, 13 | "metadata": { 14 | "collapsed": true 15 | }, 16 | "outputs": [], 17 | "source": [ 18 | "import pandas as pd" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 4, 24 | "metadata": { 25 | "collapsed": false 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | 
"df = pd.read_csv('salaries.csv')" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 5, 35 | "metadata": { 36 | "collapsed": false 37 | }, 38 | "outputs": [ 39 | { 40 | "data": { 41 | "text/html": [ 42 | "
\n", 43 | "\n", 44 | " \n", 45 | " \n", 46 | " \n", 47 | " \n", 48 | " \n", 49 | " \n", 50 | " \n", 51 | " \n", 52 | " \n", 53 | " \n", 54 | " \n", 55 | " \n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | "
Name Salary Age
0 John 50000 34
1 Sally 120000 45
2 Alyssa 80000 27
\n", 73 | "
" 74 | ], 75 | "text/plain": [ 76 | " Name Salary Age\n", 77 | "0 John 50000 34\n", 78 | "1 Sally 120000 45\n", 79 | "2 Alyssa 80000 27" 80 | ] 81 | }, 82 | "execution_count": 5, 83 | "metadata": {}, 84 | "output_type": "execute_result" 85 | } 86 | ], 87 | "source": [ 88 | "df" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 6, 94 | "metadata": { 95 | "collapsed": false 96 | }, 97 | "outputs": [ 98 | { 99 | "data": { 100 | "text/plain": [ 101 | "0 John\n", 102 | "1 Sally\n", 103 | "2 Alyssa\n", 104 | "Name: Name, dtype: object" 105 | ] 106 | }, 107 | "execution_count": 6, 108 | "metadata": {}, 109 | "output_type": "execute_result" 110 | } 111 | ], 112 | "source": [ 113 | "df['Name']" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 7, 119 | "metadata": { 120 | "collapsed": false 121 | }, 122 | "outputs": [ 123 | { 124 | "data": { 125 | "text/plain": [ 126 | "0 50000\n", 127 | "1 120000\n", 128 | "2 80000\n", 129 | "Name: Salary, dtype: int64" 130 | ] 131 | }, 132 | "execution_count": 7, 133 | "metadata": {}, 134 | "output_type": "execute_result" 135 | } 136 | ], 137 | "source": [ 138 | "df['Salary']" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": 8, 144 | "metadata": { 145 | "collapsed": false 146 | }, 147 | "outputs": [ 148 | { 149 | "data": { 150 | "text/html": [ 151 | "
\n", 152 | "\n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | "
Name Age
0 John 34
1 Sally 45
2 Alyssa 27
\n", 178 | "
" 179 | ], 180 | "text/plain": [ 181 | " Name Age\n", 182 | "0 John 34\n", 183 | "1 Sally 45\n", 184 | "2 Alyssa 27" 185 | ] 186 | }, 187 | "execution_count": 8, 188 | "metadata": {}, 189 | "output_type": "execute_result" 190 | } 191 | ], 192 | "source": [ 193 | "df[['Name','Age']]" 194 | ] 195 | }, 196 | { 197 | "cell_type": "code", 198 | "execution_count": 9, 199 | "metadata": { 200 | "collapsed": false 201 | }, 202 | "outputs": [ 203 | { 204 | "data": { 205 | "text/plain": [ 206 | "0 34\n", 207 | "1 45\n", 208 | "2 27\n", 209 | "Name: Age, dtype: int64" 210 | ] 211 | }, 212 | "execution_count": 9, 213 | "metadata": {}, 214 | "output_type": "execute_result" 215 | } 216 | ], 217 | "source": [ 218 | "df['Age']" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": 10, 224 | "metadata": { 225 | "collapsed": false 226 | }, 227 | "outputs": [ 228 | { 229 | "data": { 230 | "text/plain": [ 231 | "35.333333333333336" 232 | ] 233 | }, 234 | "execution_count": 10, 235 | "metadata": {}, 236 | "output_type": "execute_result" 237 | } 238 | ], 239 | "source": [ 240 | "df['Age'].mean()" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": 11, 246 | "metadata": { 247 | "collapsed": false 248 | }, 249 | "outputs": [ 250 | { 251 | "data": { 252 | "text/plain": [ 253 | "0 True\n", 254 | "1 True\n", 255 | "2 False\n", 256 | "Name: Age, dtype: bool" 257 | ] 258 | }, 259 | "execution_count": 11, 260 | "metadata": {}, 261 | "output_type": "execute_result" 262 | } 263 | ], 264 | "source": [ 265 | "df['Age'] > 30" 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": 12, 271 | "metadata": { 272 | "collapsed": true 273 | }, 274 | "outputs": [], 275 | "source": [ 276 | "age_filter = df['Age'] > 30" 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": 13, 282 | "metadata": { 283 | "collapsed": false 284 | }, 285 | "outputs": [ 286 | { 287 | "data": { 288 | "text/plain": [ 289 | "0 True\n", 290 | "1 
True\n", 291 | "2 False\n", 292 | "Name: Age, dtype: bool" 293 | ] 294 | }, 295 | "execution_count": 13, 296 | "metadata": {}, 297 | "output_type": "execute_result" 298 | } 299 | ], 300 | "source": [ 301 | "age_filter" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": 14, 307 | "metadata": { 308 | "collapsed": false 309 | }, 310 | "outputs": [ 311 | { 312 | "data": { 313 | "text/html": [ 314 | "
\n", 315 | "\n", 316 | " \n", 317 | " \n", 318 | " \n", 319 | " \n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | "
Name Salary Age
2 Alyssa 80000 27
\n", 333 | "
" 334 | ], 335 | "text/plain": [ 336 | " Name Salary Age\n", 337 | "2 Alyssa 80000 27" 338 | ] 339 | }, 340 | "execution_count": 14, 341 | "metadata": {}, 342 | "output_type": "execute_result" 343 | } 344 | ], 345 | "source": [ 346 | "df[df['Age'] < 30]" 347 | ] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "execution_count": 15, 352 | "metadata": { 353 | "collapsed": false 354 | }, 355 | "outputs": [ 356 | { 357 | "data": { 358 | "text/html": [ 359 | "
\n", 360 | "\n", 361 | " \n", 362 | " \n", 363 | " \n", 364 | " \n", 365 | " \n", 366 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 370 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 378 | " \n", 379 | " \n", 380 | " \n", 381 | " \n", 382 | " \n", 383 | "
Name Salary Age
0 John 50000 34
1 Sally 120000 45
\n", 384 | "
" 385 | ], 386 | "text/plain": [ 387 | " Name Salary Age\n", 388 | "0 John 50000 34\n", 389 | "1 Sally 120000 45" 390 | ] 391 | }, 392 | "execution_count": 15, 393 | "metadata": {}, 394 | "output_type": "execute_result" 395 | } 396 | ], 397 | "source": [ 398 | "df[df['Age'] > 30]" 399 | ] 400 | }, 401 | { 402 | "cell_type": "markdown", 403 | "metadata": { 404 | "collapsed": true 405 | }, 406 | "source": [ 407 | "
\n", 408 | "
Shahid Rajaee Teacher Training University
Python Workshop
Alireza Akhavan Pour
96-97
\n", 409 | "
\n", 410 | "SRTTU.edu - PyLab.AkhavanPour.ir - AkhavanPour.ir\n", 411 | "
" 412 | ] 413 | } 414 | ], 415 | "metadata": { 416 | "anaconda-cloud": {}, 417 | "kernelspec": { 418 | "display_name": "Python [default]", 419 | "language": "python", 420 | "name": "python3" 421 | }, 422 | "language_info": { 423 | "codemirror_mode": { 424 | "name": "ipython", 425 | "version": 3 426 | }, 427 | "file_extension": ".py", 428 | "mimetype": "text/x-python", 429 | "name": "python", 430 | "nbconvert_exporter": "python", 431 | "pygments_lexer": "ipython3", 432 | "version": "3.5.2" 433 | } 434 | }, 435 | "nbformat": 4, 436 | "nbformat_minor": 2 437 | } 438 | -------------------------------------------------------------------------------- /6-2_Numpy_2D.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": null, 13 | "metadata": { 14 | "collapsed": true 15 | }, 16 | "outputs": [], 17 | "source": [ 18 | "\n", 19 | "import numpy as np \n", 20 | "import matplotlib.pyplot as plt\n" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": [ 27 | " Consider the list **a **, the list contains three nested lists **each of equal size**. 
" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": null, 33 | "metadata": { 34 | "collapsed": false 35 | }, 36 | "outputs": [], 37 | "source": [ 38 | "a=[[11,12,13],[21,22,23],[31,32,33]]\n", 39 | "a" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | " We can cast the list to a numpy array as follow" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": { 53 | "collapsed": false 54 | }, 55 | "outputs": [], 56 | "source": [ 57 | "#every elment if of the same type \n", 58 | "A=np.array(a)\n", 59 | "A" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "We can use the attribute **ndim** to obtain the number of axes or dimensions referred to as the rank. " 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": null, 72 | "metadata": { 73 | "collapsed": false 74 | }, 75 | "outputs": [], 76 | "source": [ 77 | "A.ndim" 78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": {}, 83 | "source": [ 84 | " Attribute **shape** returns a tuple corresponding to the size or number of each dimension." 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": { 91 | "collapsed": false 92 | }, 93 | "outputs": [], 94 | "source": [ 95 | "A.shape" 96 | ] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "metadata": {}, 101 | "source": [ 102 | "The total number of elements in the array is given by the attribute **size**." 
 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "metadata": { 109 | "collapsed": false 110 | }, 111 | "outputs": [], 112 | "source": [ 113 | "A.size" 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": {}, 119 | "source": [ 120 | "### Accessing different elements of a Numpy Array " 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": {}, 126 | "source": [ 127 | " We can use square brackets to access the different elements of the array. The correspondence between the square brackets, the nested list, and the rectangular representation is shown in the following figure for a 3x3 array: " 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "\n" 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": {}, 140 | "source": [ 141 | "We can access the element in the 2nd row and 3rd column as shown in the following figure:" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": {}, 147 | "source": [ 148 | "\n" 149 | ] 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "metadata": {}, 154 | "source": [ 155 | " We simply use the square brackets and the indices corresponding to the element we would like:" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": { 162 | "collapsed": false 163 | }, 164 | "outputs": [], 165 | "source": [ 166 | "A[1,2]" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | " We can also use the following notation to obtain the elements: " 174 | ] 175 | }, 176 | { 177 | "cell_type": "code", 178 | "execution_count": null, 179 | "metadata": { 180 | "collapsed": false 181 | }, 182 | "outputs": [], 183 | "source": [ 184 | "A[1][2]" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "metadata": {}, 190 | "source": [ 191 | " Consider the element shown in the following figure " 192 | ] 193 | }, 194 
| { 195 | "cell_type": "markdown", 196 | "metadata": {}, 197 | "source": [ 198 | "\n" 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "metadata": {}, 204 | "source": [ 205 | " We can access the element as follows: " 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | "metadata": { 212 | "collapsed": false 213 | }, 214 | "outputs": [], 215 | "source": [ 216 | "A[0][0]" 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": {}, 222 | "source": [ 223 | " We can also use slicing in numpy arrays. Consider the following figure; we would like to obtain the first two columns in the first row: " 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": {}, 229 | "source": [ 230 | "\n" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | " This can be done with the following syntax: " 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": null, 243 | "metadata": { 244 | "collapsed": false 245 | }, 246 | "outputs": [], 247 | "source": [ 248 | "A[0][0:2]" 249 | ] 250 | }, 251 | { 252 | "cell_type": "markdown", 253 | "metadata": {}, 254 | "source": [ 255 | "Similarly, we can obtain the first two rows of the 3rd column as follows:" 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": null, 261 | "metadata": { 262 | "collapsed": false 263 | }, 264 | "outputs": [], 265 | "source": [ 266 | "A[0:2,2]" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": { 272 | "collapsed": false 273 | }, 274 | "source": [ 275 | "Corresponding to the following figure: " 276 | ] 277 | }, 278 | { 279 | "cell_type": "markdown", 280 | "metadata": {}, 281 | "source": [ 282 | "\n" 283 | ] 284 | }, 285 | { 286 | "cell_type": "markdown", 287 | "metadata": {}, 288 | "source": [ 289 | "# Basic Operations " 290 | ] 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": {}, 295 | "source": [ 296 | " We 
can also add arrays; the process is identical to matrix addition. Matrix addition of **X** and **Y** is shown in the following figure:" 297 | ] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "metadata": {}, 302 | "source": [ 303 | "\n" 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | " The numpy arrays are given by **X** and **Y**:" 311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": null, 316 | "metadata": { 317 | "collapsed": false 318 | }, 319 | "outputs": [], 320 | "source": [ 321 | "X=np.array([[1,0],[0,1]]) \n", 322 | "X" 323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": null, 328 | "metadata": { 329 | "collapsed": false 330 | }, 331 | "outputs": [], 332 | "source": [ 333 | "Y=np.array([[2,1],[1,2]]) \n", 334 | "Y" 335 | ] 336 | }, 337 | { 338 | "cell_type": "markdown", 339 | "metadata": {}, 340 | "source": [ 341 | " We can add the numpy arrays as follows." 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": null, 347 | "metadata": { 348 | "collapsed": false 349 | }, 350 | "outputs": [], 351 | "source": [ 352 | "Z=X+Y\n", 353 | "Z" 354 | ] 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "metadata": {}, 359 | "source": [ 360 | " Multiplying a numpy array by a scalar is identical to multiplying a matrix by a scalar. If we multiply the matrix **Y** by the scalar 2, we simply multiply every element in the matrix by 2 as shown in the figure." 
361 | ] 362 | }, 363 | { 364 | "cell_type": "markdown", 365 | "metadata": {}, 366 | "source": [ 367 | "\n" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": {}, 373 | "source": [ 374 | " We can perform the same operation in numpy as follows " 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": null, 380 | "metadata": { 381 | "collapsed": false 382 | }, 383 | "outputs": [], 384 | "source": [ 385 | "Y=np.array([[2,1],[1,2]]) \n", 386 | "Y" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": null, 392 | "metadata": { 393 | "collapsed": false 394 | }, 395 | "outputs": [], 396 | "source": [ 397 | "Z=2*Y\n", 398 | "Z" 399 | ] 400 | }, 401 | { 402 | "cell_type": "markdown", 403 | "metadata": {}, 404 | "source": [ 405 | " Multiplication of two arrays corresponds to an element-wise product or Hadamard product. Consider matrix **X** and **Y**. The Hadamard product corresponds to multiplying each of the elements in the same position, i.e. multiplying elements contained in the same colour boxes together. The result is a new matrix that is the same size as matrix **Y** or **X**, as shown in the following figure." 
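To make the distinction between the element-wise (Hadamard) product and the matrix product concrete, here is a small sketch that is not part of the original notebook. It reuses the values of **X** and **Y** from this section and assumes a NumPy and Python version that support the `@` operator.

```python
import numpy as np

X = np.array([[1, 0], [0, 1]])
Y = np.array([[2, 1], [1, 2]])

# Element-wise (Hadamard) product: entries in the same position are multiplied
hadamard = X * Y
# Matrix product: X is the identity matrix, so the result equals Y
matmul = X @ Y

print(hadamard.tolist())  # [[2, 0], [0, 2]]
print(matmul.tolist())    # [[2, 1], [1, 2]]
```

The two results differ even for these small inputs, which is why `*` should not be read as matrix multiplication for arrays.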
406 | ] 407 | }, 408 | { 409 | "cell_type": "markdown", 410 | "metadata": {}, 411 | "source": [ 412 | "\n" 413 | ] 414 | }, 415 | { 416 | "cell_type": "markdown", 417 | "metadata": {}, 418 | "source": [ 419 | "We can perform element-wise product of the array **X** and **Y** as follows:" 420 | ] 421 | }, 422 | { 423 | "cell_type": "code", 424 | "execution_count": null, 425 | "metadata": { 426 | "collapsed": false 427 | }, 428 | "outputs": [], 429 | "source": [ 430 | "Y=np.array([[2,1],[1,2]]) \n", 431 | "Y" 432 | ] 433 | }, 434 | { 435 | "cell_type": "code", 436 | "execution_count": null, 437 | "metadata": { 438 | "collapsed": false 439 | }, 440 | "outputs": [], 441 | "source": [ 442 | "X=np.array([[1,0],[0,1]]) \n", 443 | "X" 444 | ] 445 | }, 446 | { 447 | "cell_type": "code", 448 | "execution_count": null, 449 | "metadata": { 450 | "collapsed": false 451 | }, 452 | "outputs": [], 453 | "source": [ 454 | "Z=X*Y\n", 455 | "Z" 456 | ] 457 | }, 458 | { 459 | "cell_type": "markdown", 460 | "metadata": {}, 461 | "source": [ 462 | " We can also perform matrix multiplication with the numpy arrays **A** and **B** as follows:" 463 | ] 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "metadata": {}, 468 | "source": [ 469 | " First, we define matrix **A** and **B**:" 470 | ] 471 | }, 472 | { 473 | "cell_type": "code", 474 | "execution_count": null, 475 | "metadata": { 476 | "collapsed": false 477 | }, 478 | "outputs": [], 479 | "source": [ 480 | "A=np.array([[0,1,1],[1,0,1]])\n", 481 | "A" 482 | ] 483 | }, 484 | { 485 | "cell_type": "code", 486 | "execution_count": null, 487 | "metadata": { 488 | "collapsed": false 489 | }, 490 | "outputs": [], 491 | "source": [ 492 | "B=np.array([[1,1],[1,1],[-1,1]])\n", 493 | "B" 494 | ] 495 | }, 496 | { 497 | "cell_type": "markdown", 498 | "metadata": {}, 499 | "source": [ 500 | "We use the numpy function **dot** to multiply the arrays together." 
501 | ] 502 | }, 503 | { 504 | "cell_type": "code", 505 | "execution_count": null, 506 | "metadata": { 507 | "collapsed": false 508 | }, 509 | "outputs": [], 510 | "source": [ 511 | "Z=np.dot(A,B)\n", 512 | "Z" 513 | ] 514 | }, 515 | { 516 | "cell_type": "code", 517 | "execution_count": null, 518 | "metadata": { 519 | "collapsed": false 520 | }, 521 | "outputs": [], 522 | "source": [ 523 | "np.sin(Z)" 524 | ] 525 | }, 526 | { 527 | "cell_type": "code", 528 | "execution_count": null, 529 | "metadata": { 530 | "collapsed": false 531 | }, 532 | "outputs": [], 533 | "source": [ 534 | "C=np.array([[1,1],[2,2],[3,3]])\n", 535 | "C" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": null, 541 | "metadata": { 542 | "collapsed": false 543 | }, 544 | "outputs": [], 545 | "source": [ 546 | "C.T" 547 | ] 548 | }, 549 | { 550 | "cell_type": "markdown", 551 | "metadata": { 552 | "collapsed": true 553 | }, 554 | "source": [ 555 | "#### About the Authors: \n", 556 | "\n", 557 | " [Joseph Santarcangelo]( https://www.linkedin.com/in/joseph-s-50398b136/) has a PhD in Electrical Engineering, his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD." 558 | ] 559 | }, 560 | { 561 | "cell_type": "markdown", 562 | "metadata": {}, 563 | "source": [ 564 | "Copyright © 2017 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). 
This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).​" 565 | ] 566 | }, 567 | { 568 | "cell_type": "code", 569 | "execution_count": null, 570 | "metadata": { 571 | "collapsed": true 572 | }, 573 | "outputs": [], 574 | "source": [] 575 | } 576 | ], 577 | "metadata": { 578 | "anaconda-cloud": {}, 579 | "kernelspec": { 580 | "display_name": "Python [default]", 581 | "language": "python", 582 | "name": "python3" 583 | }, 584 | "language_info": { 585 | "codemirror_mode": { 586 | "name": "ipython", 587 | "version": 3 588 | }, 589 | "file_extension": ".py", 590 | "mimetype": "text/x-python", 591 | "name": "python", 592 | "nbconvert_exporter": "python", 593 | "pygments_lexer": "ipython3", 594 | "version": "3.5.2" 595 | } 596 | }, 597 | "nbformat": 4, 598 | "nbformat_minor": 2 599 | } 600 | -------------------------------------------------------------------------------- /5-1_Dictionaries_v5.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "\n", 15 | "## Table of Contents\n", 16 | "\n", 17 | "\n", 18 | "
\n", 19 | "\n", 20 | "
  • Dictionaries
  • \n", 21 | "
    \n", 22 | "

    \n", 23 | "Estimated Time Needed: 20 min\n", 24 | "
    \n", 25 | "\n", 26 | "
    " 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "\n", 34 | "

    Dictionaries in Python

    \n" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": {}, 40 | "source": [ 41 | "A dictionary consists of keys and values. It is helpful to compare a dictionary to a list. Instead of the numerical indexes used by a list, dictionaries have keys. These keys are labels that are used to access values within a dictionary." 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": {}, 47 | "source": [ 48 | "\n", 49 | "

    A comparison of a dictionary to a list: instead of the numerical indexes of a list, a dictionary has keys.\n", 50 | "

    \n" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "metadata": {}, 56 | "source": [ 57 | " An example of a Dictionary **Dict**:" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "metadata": { 64 | "collapsed": false 65 | }, 66 | "outputs": [], 67 | "source": [ 68 | "Dict={\"key1\":1,\"key2\":\"2\",\"key3\":[3,3,3],\"key4\":(4,4,4),('key5'):5,(0,1):6}\n", 69 | "Dict" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": {}, 75 | "source": [ 76 | " The keys can be strings:" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "metadata": { 83 | "collapsed": false 84 | }, 85 | "outputs": [], 86 | "source": [ 87 | "Dict[\"key1\"]" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | " Keys can also be any immutable object such as a tuple: " 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "metadata": { 101 | "collapsed": false 102 | }, 103 | "outputs": [], 104 | "source": [ 105 | "Dict[(0,1)]" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "metadata": {}, 111 | "source": [ 112 | " Each key is separated from its value by a colon \"**:**\". Commas separate the items, and the whole dictionary is enclosed in curly braces. An empty dictionary without any items is written with just two curly braces, like this \"**{}**\"." 
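The rule that keys must be immutable can be checked directly. The following is a minimal sketch, not part of the original notebook; the dictionary `example` and the flag `list_key_ok` are hypothetical names introduced only for illustration.

```python
# Strings, numbers, and tuples are hashable, so they work as keys
example = {"key1": 1, (0, 1): 6, 3: "three"}
print(example[(0, 1)])  # 6

# A list is mutable, so using one as a key raises TypeError
try:
    example[[0, 1]] = 7
    list_key_ok = True
except TypeError:
    list_key_ok = False
print(list_key_ok)  # False

# An empty dictionary is written with just two curly braces
empty = {}
print(len(empty))  # 0
```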
 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "metadata": { 119 | "collapsed": false 120 | }, 121 | "outputs": [], 122 | "source": [ 123 | "release_year_dict = {\"Thriller\":\"1982\", \"Back in Black\":\"1980\", \\\n", 124 | " \"The Dark Side of the Moon\":\"1973\", \"The Bodyguard\":\"1992\", \\\n", 125 | " \"Bat Out of Hell\":\"1977\", \"Their Greatest Hits (1971-1975)\":\"1976\", \\\n", 126 | " \"Saturday Night Fever\":\"1977\", \"Rumours\":\"1977\"}\n", 127 | "release_year_dict" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "In summary, like a list, a dictionary holds a collection of elements. Each element is represented by a key and its corresponding value. Dictionaries are created with two curly braces containing keys and values separated by a colon. For every key there can be only a single value; however, multiple keys can hold the same value. Keys must be immutable objects such as strings, numbers, or tuples, but values can be any data type." 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": {}, 140 | "source": [ 141 | "It is helpful to visualize the dictionary as a table, as in figure 9. The first column represents the keys, the second column represents the values.\n" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": {}, 147 | "source": [ 148 | " \n", 149 | "

    Figure 9: Table representing a Dictionary \n", 150 | "\n", 151 | "

    \n" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "metadata": {}, 157 | "source": [ 158 | "You will need this dictionary for the next two questions:" 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": { 165 | "collapsed": false 166 | }, 167 | "outputs": [], 168 | "source": [ 169 | "soundtrack_dict = { \"The Bodyguard\":\"1992\", \"Saturday Night Fever\":\"1977\"}\n", 170 | "soundtrack_dict " 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "#### In the dictionary \"soundtrack_dict\" what are the keys?" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": null, 183 | "metadata": { 184 | "collapsed": true 185 | }, 186 | "outputs": [], 187 | "source": [] 188 | }, 189 | { 190 | "cell_type": "markdown", 191 | "metadata": {}, 192 | "source": [ 193 | "
    \n", 194 | "Click here for the solution\n", 195 | "\n", 196 | "
    \n", 197 | "
    \n", 198 | "```\n", 199 | "The Keys \"The Bodyguard\" and \"Saturday Night Fever\" \n", 200 | "```\n", 201 | "
    " 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "#### In the dictionary \"soundtrack_dict\" what are the values ?" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": {}, 214 | "source": [ 215 | "
    \n", 216 | "Click here for the solution\n", 217 | "\n", 218 | "
    \n", 219 | "
    \n", 220 | "```\n", 221 | "The values are \"1992\" and \"1977\"\n", 222 | "```\n", 223 | "
    " 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": {}, 229 | "source": [ 230 | "You can retrieve the values based on the names:" 231 | ] 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": null, 236 | "metadata": { 237 | "collapsed": false 238 | }, 239 | "outputs": [], 240 | "source": [ 241 | "release_year_dict['Thriller'] " 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "metadata": {}, 247 | "source": [ 248 | "This corresponds to: \n" 249 | ] 250 | }, 251 | { 252 | "cell_type": "markdown", 253 | "metadata": {}, 254 | "source": [ 255 | " \n", 256 | "

    \n", 257 | " Table used to represent accessing the value for \"Thriller\" \n", 258 | "\n", 259 | "

    \n" 260 | ] 261 | }, 262 | { 263 | "cell_type": "markdown", 264 | "metadata": {}, 265 | "source": [ 266 | "Similarly for The Bodyguard \n" 267 | ] 268 | }, 269 | { 270 | "cell_type": "code", 271 | "execution_count": null, 272 | "metadata": { 273 | "collapsed": false 274 | }, 275 | "outputs": [], 276 | "source": [ 277 | "release_year_dict['The Bodyguard'] " 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": {}, 283 | "source": [ 284 | " \n", 285 | "

    \n", 286 | "Accessing the value for the \"The Bodyguard\"\n", 287 | "\n", 288 | "

    \n", 289 | "\n", 290 | "\n" 291 | ] 292 | }, 293 | { 294 | "cell_type": "markdown", 295 | "metadata": {}, 296 | "source": [ 297 | "Now let us retrieve the keys of the dictionary using the method **`keys()`**:" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": null, 303 | "metadata": { 304 | "collapsed": false 305 | }, 306 | "outputs": [], 307 | "source": [ 308 | "release_year_dict.keys() " 309 | ] 310 | }, 311 | { 312 | "cell_type": "markdown", 313 | "metadata": {}, 314 | "source": [ 315 | " You can retrieve the values using the method **`values()`**:" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": null, 321 | "metadata": { 322 | "collapsed": false 323 | }, 324 | "outputs": [], 325 | "source": [ 326 | "release_year_dict.values() " 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": {}, 332 | "source": [ 333 | "We can add an entry:" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": null, 339 | "metadata": { 340 | "collapsed": false 341 | }, 342 | "outputs": [], 343 | "source": [ 344 | "release_year_dict['Graduation']='2007'\n", 345 | "release_year_dict" 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": {}, 351 | "source": [ 352 | "We can delete an entry: " 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": null, 358 | "metadata": { 359 | "collapsed": false 360 | }, 361 | "outputs": [], 362 | "source": [ 363 | "del(release_year_dict['Thriller'])\n", 364 | "del(release_year_dict['Graduation'])\n", 365 | "release_year_dict" 366 | ] 367 | }, 368 | { 369 | "cell_type": "markdown", 370 | "metadata": {}, 371 | "source": [ 372 | " We can check whether a key is in the dictionary: " 373 | ] 374 | }, 375 | { 376 | "cell_type": "code", 377 | "execution_count": null, 378 | "metadata": { 379 | "collapsed": false 380 | }, 381 | "outputs": [], 382 | "source": [ 383 | "'The Bodyguard' in release_year_dict" 384 | ] 
385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "#### The Albums 'Back in Black', 'The Bodyguard' and 'Thriller' have the following music recording sales in millions 50, 50 and 65 respectively:" 391 | ] 392 | }, 393 | { 394 | "cell_type": "markdown", 395 | "metadata": {}, 396 | "source": [ 397 | "##### a) Create a dictionary “album_sales_dict” where the keys are the album name and the sales in millions are the values. " 398 | ] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "execution_count": null, 403 | "metadata": { 404 | "collapsed": true 405 | }, 406 | "outputs": [], 407 | "source": [] 408 | }, 409 | { 410 | "cell_type": "markdown", 411 | "metadata": {}, 412 | "source": [ 413 | "
    \n", 414 | "Click here for the solution\n", 415 | "\n", 416 | "
    \n", 417 | "
    \n", 418 | "```\n", 419 | "album_sales_dict= { \"The Bodyguard\":50, \"Back in Black\":50,\"Thriller\":65}\n", 420 | "```\n", 421 | "
    " 422 | ] 423 | }, 424 | { 425 | "cell_type": "markdown", 426 | "metadata": {}, 427 | "source": [ 428 | "#### b) Use the dictionary to find the total sales of \"Thriller\":" 429 | ] 430 | }, 431 | { 432 | "cell_type": "code", 433 | "execution_count": null, 434 | "metadata": { 435 | "collapsed": false 436 | }, 437 | "outputs": [], 438 | "source": [] 439 | }, 440 | { 441 | "cell_type": "markdown", 442 | "metadata": {}, 443 | "source": [ 444 | "
    \n", 445 | "Click here for the solution\n", 446 | "\n", 447 | "
    \n", 448 | "
    \n", 449 | "```\n", 450 | "album_sales_dict[\"Thriller\"]\n", 451 | "```\n", 452 | "
    " 453 | ] 454 | }, 455 | { 456 | "cell_type": "markdown", 457 | "metadata": {}, 458 | "source": [ 459 | "#### c) Find the names of the albums from the dictionary using the method \"keys\":" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": null, 465 | "metadata": { 466 | "collapsed": false 467 | }, 468 | "outputs": [], 469 | "source": [] 470 | }, 471 | { 472 | "cell_type": "markdown", 473 | "metadata": {}, 474 | "source": [ 475 | "
    \n", 476 | "Click here for the solution\n", 477 | "\n", 478 | "
    \n", 479 | "
    \n", 480 | "```\n", 481 | "album_sales_dict.keys()\n", 482 | "```" 483 | ] 484 | }, 485 | { 486 | "cell_type": "markdown", 487 | "metadata": {}, 488 | "source": [ 489 | "#### d) Find the recording sales from the dictionary using the method \"values\":" 490 | ] 491 | }, 492 | { 493 | "cell_type": "code", 494 | "execution_count": null, 495 | "metadata": { 496 | "collapsed": false 497 | }, 498 | "outputs": [], 499 | "source": [] 500 | }, 501 | { 502 | "cell_type": "markdown", 503 | "metadata": {}, 504 | "source": [ 505 | "
    \n", 506 | "Click here for the solution\n", 507 | "\n", 508 | "
    \n", 509 | "
    \n", 510 | "```\n", 511 | "album_sales_dict.values()\n", 512 | "```\n", 513 | "
    " 514 | ] 515 | }, 516 | { 517 | "cell_type": "markdown", 518 | "metadata": {}, 519 | "source": [ 520 | "\n", 521 | "\n", 522 | "# About the Authors: \n", 523 | "\n", 524 | " [Joseph Santarcangelo]( https://www.linkedin.com/in/joseph-s-50398b136/) has a PhD in Electrical Engineering, his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.\n", 525 | "\n" 526 | ] 527 | }, 528 | { 529 | "cell_type": "markdown", 530 | "metadata": {}, 531 | "source": [ 532 | "
    \n", 533 | "Copyright © 2017 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/)." 534 | ] 535 | } 536 | ], 537 | "metadata": { 538 | "anaconda-cloud": {}, 539 | "kernelspec": { 540 | "display_name": "Python [default]", 541 | "language": "python", 542 | "name": "python3" 543 | }, 544 | "language_info": { 545 | "codemirror_mode": { 546 | "name": "ipython", 547 | "version": 3 548 | }, 549 | "file_extension": ".py", 550 | "mimetype": "text/x-python", 551 | "name": "python", 552 | "nbconvert_exporter": "python", 553 | "pygments_lexer": "ipython3", 554 | "version": "3.5.2" 555 | } 556 | }, 557 | "nbformat": 4, 558 | "nbformat_minor": 2 559 | } 560 | -------------------------------------------------------------------------------- /7-2_Intro_Data_Analysis.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Table of content\n", 15 | "\n", 16 | "
    \n", 17 | "
  • Data Acquisition\n", 18 | "
  • Basic Insight of Dataset
  • \n", 19 | "

    \n", 20 | "Estimated Time Needed: 10 min\n", 21 | "
    \n", 22 | "
    " 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "\n", 30 | "# Data Acquisition\n", 31 | "There are various formats for a dataset, .csv, .json, .xlsx etc. The dataset can be stored in different places, on your local machine or sometimes online.\n", 32 | "In this section, you will learn how to load a dataset into our Jupyter Notebook.\n", 33 | "In our case, the Automobile Dataset is an online source, and it is in CSV (comma separated value) format. Let's use this dataset as an example to practice data reading.\n", 34 | "data source: https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data\n", 35 | "data type: csv\n", 36 | "The Pandas Library is a useful tool that enables us to read various datasets into a data frame; \n" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": { 43 | "collapsed": true 44 | }, 45 | "outputs": [], 46 | "source": [ 47 | "# import pandas library\n", 48 | "import pandas as pd" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "## Read Data\n", 56 | "We use **\"pandas.read_csv()\"** function to read the csv file. In the bracket, we put the file path along with a quotation mark, so that pandas will read the file into a data frame from that address. The file path can be either an URL or your local file address.\n", 57 | "Because the data does not include headers, we can add an argument **\" headers = None\"** inside the **\"read_csv()\"** method, so that pandas will not automatically set the first row as a header.\n", 58 | "You can also assign the dataset to any variable you create." 
59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": { 65 | "collapsed": false 66 | }, 67 | "outputs": [], 68 | "source": [ 69 | "# import pandas library\n", 70 | "import pandas as pd\n", 71 | "# read the online file by the URL provides above, and assign it to variable \"df\"\n", 72 | "path=\"https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data\"\n", 73 | "\n", 74 | "df = pd.read_csv(path,header=None)\n", 75 | "print(\"Done\")" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "After reading the dataset, we can use the **`dataframe.head(n)`** method to check the top n rows of the dataframe; where n is an integer. Contrary to **`dataframe.head(n)`**, **`dataframe.tail(n)`** will show you the bottom n rows of the dataframe.\n" 83 | ] 84 | }, 85 | { 86 | "cell_type": "code", 87 | "execution_count": null, 88 | "metadata": { 89 | "collapsed": false 90 | }, 91 | "outputs": [], 92 | "source": [ 93 | "# show the first 5 rows using dataframe.head() method\n", 94 | "df.head(5)" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "metadata": {}, 100 | "source": [ 101 | "
    \n", 102 | "

    Question #1:

    \n", 103 | "Check the bottom 10 rows of data frame \"df\".\n", 104 | "
    " 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": null, 110 | "metadata": { 111 | "collapsed": false 112 | }, 113 | "outputs": [], 114 | "source": [ 115 | "\n" 116 | ] 117 | }, 118 | { 119 | "cell_type": "markdown", 120 | "metadata": {}, 121 | "source": [ 122 | "## Add Headers\n", 123 | "Take a look at our dataset; pandas automatically set the header by an integer from 0. \n", 124 | "
    \n", 125 | "To better describe our data, we can introduce a header. This information is available at: https://archive.ics.uci.edu/ml/datasets/Automobile
    \n", 126 | "

    \n", 127 | "
    Thus, we have to add headers manually.
    \n", 128 | "
    First, we create a list \"headers\" that includes all column names in order.
    \n", 129 | "
    Then, we use **`dataframe.columns = headers`** to replace the headers with the list we created.
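
As a minimal sketch of this two-step pattern, the toy frame below stands in for the real dataset. The name `toy`, its values, and the shortened three-name list are made up for illustration. A frame built without column names gets integer labels starting at 0, and assigning a list of the same length to `.columns` renames them:

```python
import pandas as pd

# A toy frame without column names (values are made up for illustration).
toy = pd.DataFrame([[3, 'alfa-romero', 'gas'],
                    [1, 'audi', 'diesel']])

# Without names, pandas labels the columns 0, 1, 2.
default_labels = list(toy.columns)
print(default_labels)

# Assign names; the list length must equal the number of columns.
toy_headers = ['symboling', 'make', 'fuel-type']
toy.columns = toy_headers
print(toy.columns.tolist())
```

If the list length does not match the number of columns, pandas raises a `ValueError`, which is a useful check that no header was forgotten.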
    " 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": { 136 | "collapsed": false 137 | }, 138 | "outputs": [], 139 | "source": [ 140 | "# create headers list\n", 141 | "headers = [\"symboling\",\"normalized-losses\",\"make\",\"fuel-type\",\"aspiration\", \"num-of-doors\",\"body-style\",\n", 142 | " \"drive-wheels\",\"engine-location\",\"wheel-base\", \"length\",\"width\",\"height\",\"curb-weight\",\"engine-type\",\n", 143 | " \"num-of-cylinders\", \"engine-size\",\"fuel-system\",\"bore\",\"stroke\",\"compression-ratio\",\"horsepower\",\n", 144 | " \"peak-rpm\",\"city-mpg\",\"highway-mpg\",\"price\"]\n", 145 | "headers" 146 | ] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | " We replace headers and recheck our data frame" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "metadata": { 159 | "collapsed": false 160 | }, 161 | "outputs": [], 162 | "source": [ 163 | "df.columns = headers\n", 164 | "df.head(10)" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "we can drop missing values along the column \"price\" as follows " 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": { 178 | "collapsed": false 179 | }, 180 | "outputs": [], 181 | "source": [ 182 | "df.dropna(subset=[\"price\"], axis=0)" 183 | ] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": {}, 188 | "source": [ 189 | "Now, we have successfully read the raw dataset and add the correct headers into the data frame." 190 | ] 191 | }, 192 | { 193 | "cell_type": "markdown", 194 | "metadata": {}, 195 | "source": [ 196 | "
    \n", 197 | "

    Question #2:

    \n", 198 | "Find the names of the columns of the dataframe\n", 199 | "
    " 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": { 206 | "collapsed": false 207 | }, 208 | "outputs": [], 209 | "source": [] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": {}, 214 | "source": [ 215 | "## Save Dataset\n", 216 | "Correspondingly, Pandas enables us to save the dataset to csv by using the **`dataframe.to_csv()`** method, you can add the file path and name along with quotation marks in the brackets.\n", 217 | "\n", 218 | "For example, if you would save the dataframe \"df\" as \"automobile.csv\" to your local machine, you may use the syntax below:\n", 219 | "~~~~\n", 220 | "df.to_csv(\"automobile.csv\")\n", 221 | "~~~~\n", 222 | "\n" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": {}, 228 | "source": [ 229 | " We can also read and save other file formats, we can use similar functions to **`pd.read_csv()`** and **`df.to_csv()`** for other data formats, the functions are listed in the following table:\n" 230 | ] 231 | }, 232 | { 233 | "cell_type": "markdown", 234 | "metadata": {}, 235 | "source": [ 236 | "## Read/Save Other Data Formats\n", 237 | "\n", 238 | "\n", 239 | "\n", 240 | "| Data Formate | Read | Save |\n", 241 | "| ------------- |:--------------:| ----------------:|\n", 242 | "| csv | `pd.read_csv()` |`df.to_csv()` |\n", 243 | "| json | `pd.read_json()` |`df.to_json()` |\n", 244 | "| excel | `pd.read_excel()`|`df.to_excel()` |\n", 245 | "| hdf | `pd.read_hdf()` |`df.to_hdf()` |\n", 246 | "| sql | `pd.read_sql()` |`df.to_sql()` |\n", 247 | "| ... | ... | ... |" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": {}, 253 | "source": [ 254 | "\n", 255 | "# Basic Insight of Dataset\n", 256 | "After reading data into Pandas dataframe, it is time for us to explore the dataset.\n", 257 | "There are several ways to obtain essential insights of the data to help us better understand our dataset." 
258 | ] 259 | }, 260 | { 261 | "cell_type": "markdown", 262 | "metadata": {}, 263 | "source": [ 264 | "## Data Types\n", 265 | "Data has a variety of types.\n", 266 | "The main types stored in Pandas dataframes are `object`, `float`, `int`, `bool` and `datetime64`. In order to better learn about each attribute, it is always good for us to know the data type of each column. In Pandas:\n", 267 | "~~~~\n", 268 | "dataframe.dtypes\n", 269 | "~~~~\n", 270 | "returns a Series with the data type of each column." 271 | ] 272 | }, 273 | { 274 | "cell_type": "code", 275 | "execution_count": null, 276 | "metadata": { 277 | "collapsed": false, 278 | "scrolled": false 279 | }, 280 | "outputs": [], 281 | "source": [ 282 | "# check the data type of data frame \"df\" by .dtypes\n", 283 | "df.dtypes" 284 | ] 285 | }, 286 | { 287 | "cell_type": "markdown", 288 | "metadata": {}, 289 | "source": [ 290 | "As a result, as shown above, it is clear to see that the data type of \"symboling\" and \"curb-weight\" are `int64`, \"normalized-losses\" is `object`, and \"wheel-base\" is `float64`, etc.\n", 291 | "These data types can be changed; we will learn how to accomplish this in a later module. " 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "## Describe\n", 299 | "If we would like to get a statistical summary of each column, such as count, column mean value, column standard deviation, etc. We use the describe method:\n", 300 | "~~~~\n", 301 | "dataframe.describe()\n", 302 | "~~~~\n", 303 | "This method will provide various summary statistics, excluding `NaN` (Not a Number) values." 
304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "metadata": { 310 | "collapsed": false 311 | }, 312 | "outputs": [], 313 | "source": [ 314 | "df.describe()" 315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "metadata": {}, 320 | "source": [ 321 | "This shows the statistical summary of all numeric-typed (int, float) columns.\n", 322 | "For example, the attribute \"symboling\" has 205 counts, the mean value of this column is 0.83, the standard deviation is 1.25, the minimum value is -2, 25th percentile is 0, 50th percentile is 1, 75th percentile is 2, and the maximum value is 3.\n", 323 | "\n", 324 | "However, what if we would also like to check all the columns including those that are of type object.\n", 325 | "\n", 326 | "\n", 327 | "You can add an argument `include = \"all\"` inside the bracket. Let's try it again." 328 | ] 329 | }, 330 | { 331 | "cell_type": "code", 332 | "execution_count": null, 333 | "metadata": { 334 | "collapsed": false 335 | }, 336 | "outputs": [], 337 | "source": [ 338 | "# describe all the columns in \"df\" \n", 339 | "df.describe(include = \"all\")" 340 | ] 341 | }, 342 | { 343 | "cell_type": "markdown", 344 | "metadata": {}, 345 | "source": [ 346 | "Now, it provides the statistical summary of all the columns, including object-typed attributes.\n", 347 | "We can now see how many unique values, which is the top value and the frequency of top value in the object-typed columns.\n", 348 | "Some values in the table above show as \"NaN\", this is because those numbers are not available regarding a particular column type.\n" 349 | ] 350 | }, 351 | { 352 | "cell_type": "markdown", 353 | "metadata": {}, 354 | "source": [ 355 | "
    \n", 356 | "

    Question #3:

    \n", 357 | "\n", 358 | "

    You can select the columns of a data frame by indicating the name of each column, for example, you can select the three columns as follows:

    \n", 359 | "

    \n", 360 | "

    dataframe[['column 1', 'column 2', 'column 3']]

    \n", 361 | "

    \n", 362 | "

    where \"column\" is the name of the column, you can apply the method \".describe()\" to get the statistics of those columns as follows:

    \n", 363 | "

    \n", 364 | "

    dataframe[['column 1', 'column 2', 'column 3']].describe()

    \n", 365 | "

    \n", 366 | "\n", 367 | "Apply the method \".describe()\" to the columns 'length' and 'compression-ratio'.\n", 368 | "
    " 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": null, 374 | "metadata": { 375 | "collapsed": true 376 | }, 377 | "outputs": [], 378 | "source": [] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": {}, 383 | "source": [ 384 | "## Info\n", 385 | "Another method you can use to check your dataset is:\n", 386 | "~~~~\n", 387 | "dataframe.info\n", 388 | "~~~~\n", 389 | "It provide a concise summary of your DataFrame." 390 | ] 391 | }, 392 | { 393 | "cell_type": "code", 394 | "execution_count": null, 395 | "metadata": { 396 | "collapsed": false 397 | }, 398 | "outputs": [], 399 | "source": [ 400 | "# look at the info of \"df\"\n", 401 | "df.info" 402 | ] 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "metadata": {}, 407 | "source": [ 408 | "Here we are able to see the information of our dataframe, with the top 30 rows and the bottom 30 rows.\n", 409 | "\n", 410 | "And, it also shows us the whole data frame has 205 rows and 26 columns in total." 411 | ] 412 | }, 413 | { 414 | "cell_type": "markdown", 415 | "metadata": {}, 416 | "source": [ 417 | "# Excellent! You have just completed the Introduction Notebook! " 418 | ] 419 | }, 420 | { 421 | "cell_type": "markdown", 422 | "metadata": {}, 423 | "source": [ 424 | " \n", 425 | "### About the Authors: \n", 426 | "\n", 427 | "This notebook was written by [Mahdi Noorian PhD](https://www.linkedin.com/in/mahdi-noorian-58219234/) ,[Joseph Santarcangelo PhD]( https://www.linkedin.com/in/joseph-s-50398b136/), Bahare Talayian, Eric Xiao, Steven Dong, Parizad , Hima Vsudevan and [Fiorella Wenver](https://www.linkedin.com/in/fiorellawever/).\n", 428 | "\n" 429 | ] 430 | }, 431 | { 432 | "cell_type": "markdown", 433 | "metadata": {}, 434 | "source": [ 435 | " Copyright © 2017 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). 
This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).\n" 436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": null, 441 | "metadata": { 442 | "collapsed": true 443 | }, 444 | "outputs": [], 445 | "source": [] 446 | } 447 | ], 448 | "metadata": { 449 | "anaconda-cloud": {}, 450 | "kernelspec": { 451 | "display_name": "Python [default]", 452 | "language": "python", 453 | "name": "python3" 454 | }, 455 | "language_info": { 456 | "codemirror_mode": { 457 | "name": "ipython", 458 | "version": 3 459 | }, 460 | "file_extension": ".py", 461 | "mimetype": "text/x-python", 462 | "name": "python", 463 | "nbconvert_exporter": "python", 464 | "pygments_lexer": "ipython3", 465 | "version": "3.5.2" 466 | } 467 | }, 468 | "nbformat": 4, 469 | "nbformat_minor": 2 470 | } 471 | -------------------------------------------------------------------------------- /02-print.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "#
    Print
    " 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "
    \n", 22 | "The\n", 23 | " **print( )** \n", 24 | "function can be called in several ways; a few example calls are listed below:\n", 25 | "
    \n", 26 | "\n", 27 | " - print(\"Hello World\")\n", 28 | " - print(\"Hello\", <string variable>)\n", 29 | " - print(\"Hello\" + <string variable>)\n", 30 | " - print(\"Hello %s\" % <string variable>)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 1, 36 | "metadata": { 37 | "collapsed": false 38 | }, 39 | "outputs": [ 40 | { 41 | "name": "stdout", 42 | "output_type": "stream", 43 | "text": [ 44 | "Hello World\n" 45 | ] 46 | } 47 | ], 48 | "source": [ 49 | "print(\"Hello World\")" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": {}, 55 | "source": [ 56 | "
    \n", 57 | "In Python, ***'***, ***\"*** and \"\"\" are used to denote a string.\n", 58 | "If the string spans multiple lines, ***\"\"\"*** is used.\n", 59 | "
    " 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": 2, 65 | "metadata": { 66 | "collapsed": false 67 | }, 68 | "outputs": [ 69 | { 70 | "name": "stdout", 71 | "output_type": "stream", 72 | "text": [ 73 | "Foo\n", 74 | "Bar\n" 75 | ] 76 | } 77 | ], 78 | "source": [ 79 | "print('Foo')\n", 80 | "print(\"Bar\")" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": 3, 86 | "metadata": { 87 | "collapsed": false 88 | }, 89 | "outputs": [ 90 | { 91 | "name": "stdout", 92 | "output_type": "stream", 93 | "text": [ 94 | "Hello dear.\n", 95 | "I love Python.\n", 96 | "What about you?\n", 97 | "\n" 98 | ] 99 | } 100 | ], 101 | "source": [ 102 | "print(\"\"\"Hello dear.\n", 103 | "I love Python.\n", 104 | "What about you?\n", 105 | "\"\"\")" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "metadata": {}, 111 | "source": [ 112 | "
    \n", 113 | "Strings can be assigned to a variable and then used in print.\n", 114 | "
    " 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": 4, 120 | "metadata": { 121 | "collapsed": false, 122 | "scrolled": true 123 | }, 124 | "outputs": [ 125 | { 126 | "name": "stdout", 127 | "output_type": "stream", 128 | "text": [ 129 | "Hello World\n", 130 | "Hello World !\n" 131 | ] 132 | } 133 | ], 134 | "source": [ 135 | "string1 = 'World'\n", 136 | "print('Hello', string1)\n", 137 | "\n", 138 | "string2 = '!'\n", 139 | "print('Hello', string1, string2)" 140 | ] 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "metadata": {}, 145 | "source": [ 146 | "
    \n", 147 | "To concatenate two or more strings, we use the ***+*** operator.\n", 148 | "
    \n", 149 | "Note that this operator does not add a space between the two strings.\n", 150 | "
    " 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": 5, 156 | "metadata": { 157 | "collapsed": false 158 | }, 159 | "outputs": [ 160 | { 161 | "name": "stdout", 162 | "output_type": "stream", 163 | "text": [ 164 | "HelloWorld!\n" 165 | ] 166 | } 167 | ], 168 | "source": [ 169 | "print('Hello' + string1 + string2)" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "
    \n", 177 | "**%s**  \n", 178 | "is used to refer to a variable of type string.\n", 179 | "
    " 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 6, 185 | "metadata": { 186 | "collapsed": false 187 | }, 188 | "outputs": [ 189 | { 190 | "name": "stdout", 191 | "output_type": "stream", 192 | "text": [ 193 | "Hello World\n" 194 | ] 195 | } 196 | ], 197 | "source": [ 198 | "print(\"Hello %s\" % string1)" 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "metadata": {}, 204 | "source": [ 205 | "
    \n", 206 | "Similarly, for other data types:\n", 207 | "
    \n", 208 | "\n", 209 | " - %s -> string\n", 210 | " - %d -> Integer\n", 211 | " - %f -> Float\n", 212 | " - %o -> Octal\n", 213 | " - %x -> Hexadecimal\n", 214 | " - %e -> exponential\n", 215 | " \n", 216 | "
    \n", 217 | "These specifiers can be used to convert a variable inside the print function.\n", 218 | "
    " 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": 7, 224 | "metadata": { 225 | "collapsed": false 226 | }, 227 | "outputs": [ 228 | { 229 | "name": "stdout", 230 | "output_type": "stream", 231 | "text": [ 232 | "Actual Number = 18\n", 233 | "Float of the number = 18.000000\n", 234 | "Octal equivalent of the number = 22\n", 235 | "Hexadecimal equivalent of the number = 12\n", 236 | "Exponential equivalent of the number = 1.800000e+01\n" 237 | ] 238 | } 239 | ], 240 | "source": [ 241 | "print(\"Actual Number = %d\" %18)\n", 242 | "print(\"Float of the number = %f\" %18)\n", 243 | "print(\"Octal equivalent of the number = %o\" %18)\n", 244 | "print(\"Hexadecimal equivalent of the number = %x\" %18)\n", 245 | "print(\"Exponential equivalent of the number = %e\" %18)" 246 | ] 247 | }, 248 | { 249 | "cell_type": "markdown", 250 | "metadata": {}, 251 | "source": [ 252 | "
    \n", 253 | "When referring to multiple variables, parentheses are used.\n", 254 | "
    " 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "execution_count": 8, 260 | "metadata": { 261 | "collapsed": false 262 | }, 263 | "outputs": [ 264 | { 265 | "name": "stdout", 266 | "output_type": "stream", 267 | "text": [ 268 | "Hello World !\n" 269 | ] 270 | } 271 | ], 272 | "source": [ 273 | "print(\"Hello %s %s\" %(string1,string2))" 274 | ] 275 | }, 276 | { 277 | "cell_type": "markdown", 278 | "metadata": {}, 279 | "source": [ 280 | "##
    More examples
    " 281 | ] 282 | }, 283 | { 284 | "cell_type": "markdown", 285 | "metadata": {}, 286 | "source": [ 287 | "The following are other different ways the print statement can be put to use." 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 9, 293 | "metadata": { 294 | "collapsed": false 295 | }, 296 | "outputs": [ 297 | { 298 | "name": "stdout", 299 | "output_type": "stream", 300 | "text": [ 301 | "I want %d to be printed here\n" 302 | ] 303 | } 304 | ], 305 | "source": [ 306 | "print(\"I want %%d to be printed %s\" %'here')" 307 | ] 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": {}, 312 | "source": [ 313 | "
    Multiplying a string by an integer
    " 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": 10, 319 | "metadata": { 320 | "collapsed": false 321 | }, 322 | "outputs": [ 323 | { 324 | "name": "stdout", 325 | "output_type": "stream", 326 | "text": [ 327 | "_SRTTU_SRTTU_SRTTU_SRTTU_SRTTU_SRTTU_SRTTU_SRTTU_SRTTU_SRTTU\n" 328 | ] 329 | } 330 | ], 331 | "source": [ 332 | "print('_SRTTU'*10)" 333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": 11, 338 | "metadata": { 339 | "collapsed": false 340 | }, 341 | "outputs": [ 342 | { 343 | "name": "stdout", 344 | "output_type": "stream", 345 | "text": [ 346 | "Jan\n", 347 | "Feb\n", 348 | "Mar\n", 349 | "Apr\n", 350 | "May\n", 351 | "Jun\n", 352 | "Jul\n", 353 | "Aug\n" 354 | ] 355 | } 356 | ], 357 | "source": [ 358 | "print(\"Jan\\nFeb\\nMar\\nApr\\nMay\\nJun\\nJul\\nAug\")" 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": 12, 364 | "metadata": { 365 | "collapsed": false 366 | }, 367 | "outputs": [ 368 | { 369 | "name": "stdout", 370 | "output_type": "stream", 371 | "text": [ 372 | "I want \\n to be printed.\n" 373 | ] 374 | } 375 | ], 376 | "source": [ 377 | "print(\"I want \\\\n to be printed.\")" 378 | ] 379 | }, 380 | { 381 | "cell_type": "code", 382 | "execution_count": 13, 383 | "metadata": { 384 | "collapsed": false 385 | }, 386 | "outputs": [ 387 | { 388 | "name": "stdout", 389 | "output_type": "stream", 390 | "text": [ 391 | "\n", 392 | "Routine:\n", 393 | "\t- Eat\n", 394 | "\t- Sleep\n", 395 | "\t- Repeat\n", 396 | "\n" 397 | ] 398 | } 399 | ], 400 | "source": [ 401 | "print(\"\"\"\n", 402 | "Routine:\n", 403 | "\\t- Eat\n", 404 | "\\t- Sleep\\n\\t- Repeat\n", 405 | "\"\"\")" 406 | ] 407 | }, 408 | { 409 | "cell_type": "markdown", 410 | "metadata": {}, 411 | "source": [ 412 | "# PrecisionWidth and FieldWidth " 413 | ] 414 | }, 415 | { 416 | "cell_type": "markdown", 417 | "metadata": {}, 418 | "source": [ 419 | "
    \n", 420 | "Fieldwidth - sets the width of the output string; when the specified width is larger than the actual length of the value, the extra space can be padded with zeros or spaces, and when it is smaller, this option is ignored.\n", 421 | "
    \n", 422 | "Precision Width - for floating-point numbers, sets the precision, i.e. the number of digits after the decimal point (default precision: 6). When the specified number is smaller than the actual number of digits after the decimal point, the number is rounded.\n", 423 | "
    \n", 424 | "'%(fieldwidth).(precisionwidth)f'" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": {}, 430 | "source": [ 431 | "
    \n", 432 | "No precision is specified, so 6 decimal places are printed.\n", 433 | "
    " 434 | ] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "execution_count": 14, 439 | "metadata": { 440 | "collapsed": false 441 | }, 442 | "outputs": [ 443 | { 444 | "data": { 445 | "text/plain": [ 446 | "'8.121312'" 447 | ] 448 | }, 449 | "execution_count": 14, 450 | "metadata": {}, 451 | "output_type": "execute_result" 452 | } 453 | ], 454 | "source": [ 455 | "\"%f\" % 8.121312312312" 456 | ] 457 | }, 458 | { 459 | "cell_type": "markdown", 460 | "metadata": {}, 461 | "source": [ 462 | "
    \n", 463 | "Here the precision, i.e. the number of digits after the decimal point, is set to 3.\n", 464 | "
    " 465 | ] 466 | }, 467 | { 468 | "cell_type": "code", 469 | "execution_count": 15, 470 | "metadata": { 471 | "collapsed": false 472 | }, 473 | "outputs": [ 474 | { 475 | "data": { 476 | "text/plain": [ 477 | "'8.121'" 478 | ] 479 | }, 480 | "execution_count": 15, 481 | "metadata": {}, 482 | "output_type": "execute_result" 483 | } 484 | ], 485 | "source": [ 486 | "\"%.3f\" % 8.121312312312" 487 | ] 488 | }, 489 | { 490 | "cell_type": "markdown", 491 | "metadata": {}, 492 | "source": [ 493 | "
    \n", 494 | "If the field width exceeds the length of the output string, spaces are placed before the number so that the output string matches the specified width.\n", 495 | "
    " 496 | ] 497 | }, 498 | { 499 | "cell_type": "code", 500 | "execution_count": 16, 501 | "metadata": { 502 | "collapsed": false 503 | }, 504 | "outputs": [ 505 | { 506 | "data": { 507 | "text/plain": [ 508 | "' 8.12131'" 509 | ] 510 | }, 511 | "execution_count": 16, 512 | "metadata": {}, 513 | "output_type": "execute_result" 514 | } 515 | ], 516 | "source": [ 517 | "\"%9.5f\" % 8.121312312312" 518 | ] 519 | }, 520 | { 521 | "cell_type": "markdown", 522 | "metadata": {}, 523 | "source": [ 524 | "
    \n", 525 | "Zero padding  \n", 526 | " is done by adding a 0 at the start of the fieldwidth  \n", 527 | "(field width).\n", 528 | "
    " 529 | ] 530 | }, 531 | { 532 | "cell_type": "code", 533 | "execution_count": 17, 534 | "metadata": { 535 | "collapsed": false 536 | }, 537 | "outputs": [ 538 | { 539 | "data": { 540 | "text/plain": [ 541 | "'008.12131'" 542 | ] 543 | }, 544 | "execution_count": 17, 545 | "metadata": {}, 546 | "output_type": "execute_result" 547 | } 548 | ], 549 | "source": [ 550 | "\"%09.5f\" % 8.121312312312" 551 | ] 552 | }, 553 | { 554 | "cell_type": "markdown", 555 | "metadata": {}, 556 | "source": [ 557 | "
    \n", 558 | "For proper alignment, a space can be placed in the fieldwidth  \n", 559 | "so that numbers stay properly aligned when a negative number is used.\n", 560 | "
    " 561 | ] 562 | }, 563 | { 564 | "cell_type": "code", 565 | "execution_count": 18, 566 | "metadata": { 567 | "collapsed": false 568 | }, 569 | "outputs": [ 570 | { 571 | "name": "stdout", 572 | "output_type": "stream", 573 | "text": [ 574 | " 8.121312\n", 575 | "-8.121312\n" 576 | ] 577 | } 578 | ], 579 | "source": [ 580 | "print(\"% 9f\" % 8.121312312312)\n", 581 | "print(\"% 9f\" % -8.121312312312)" 582 | ] 583 | }, 584 | { 585 | "cell_type": "markdown", 586 | "metadata": {}, 587 | "source": [ 588 | "
    \n", 589 | "A '+' sign can be shown in front of a positive number by adding a + sign at the start of the fieldwidth.\n", 590 | "
    " 591 | ] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "execution_count": 19, 596 | "metadata": { 597 | "collapsed": false 598 | }, 599 | "outputs": [ 600 | { 601 | "name": "stdout", 602 | "output_type": "stream", 603 | "text": [ 604 | "+8.121312\n", 605 | "-8.121312\n" 606 | ] 607 | } 608 | ], 609 | "source": [ 610 | "print(\"%+9f\" % 8.121312312312)\n", 611 | "print(\"% 9f\" % -8.121312312312)" 612 | ] 613 | }, 614 | { 615 | "cell_type": "markdown", 616 | "metadata": {}, 617 | "source": [ 618 | "
    \n", 619 | "As mentioned above, when the specified field width, or \n", 620 | "fieldwidth,  \n", 621 | "is larger than the length of the string, the data is right-aligned and spaces are placed before the number. Left alignment, however, can be obtained by specifying a minus sign in the\n", 622 | "fieldwidth,  \n", 623 | "as in the cell that follows.\n", 624 | "
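
To see why these alignment flags matter, here is a small sketch that combines a left-aligned text field with a right-aligned number field to print a simple table. The rows come from salaries.csv in this repository; the helper name `format_row` and the chosen widths (8 and 10) are illustrative assumptions, not part of the original notebook:

```python
# Rows taken from salaries.csv in this repository (name, salary).
rows = [('John', 50000), ('Sally', 120000), ('Alyssa', 80000)]

def format_row(name, salary):
    # %-8s  : left-align the name in a field of width 8
    # %10.2f: right-align the salary in a field of width 10, 2 decimals
    return '%-8s|%10.2f|' % (name, salary)

for name, salary in rows:
    print(format_row(name, salary))
```

Because every field has a fixed width, the separators line up regardless of how long each name or number is.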
    " 625 | ] 626 | }, 627 | { 628 | "cell_type": "code", 629 | "execution_count": 20, 630 | "metadata": { 631 | "collapsed": false 632 | }, 633 | "outputs": [ 634 | { 635 | "data": { 636 | "text/plain": [ 637 | "'3.121 '" 638 | ] 639 | }, 640 | "execution_count": 20, 641 | "metadata": {}, 642 | "output_type": "execute_result" 643 | } 644 | ], 645 | "source": [ 646 | "\"%-9.3f\" % 3.121312312312" 647 | ] 648 | }, 649 | { 650 | "cell_type": "markdown", 651 | "metadata": { 652 | "collapsed": true 653 | }, 654 | "source": [ 655 | "
    \n", 656 | "
    Shahid Rajaee Teacher Training University
    Python Workshop
    Alireza Akhavan Pour
    96-97
    \n", 657 | "
    \n", 658 | "SRTTU.edu - PyLab.AkhavanPour.ir - AkhavanPour.ir\n", 659 | "
    " 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": null, 665 | "metadata": { 666 | "collapsed": true 667 | }, 668 | "outputs": [], 669 | "source": [] 670 | } 671 | ], 672 | "metadata": { 673 | "anaconda-cloud": {}, 674 | "kernelspec": { 675 | "display_name": "Python [conda env:virtual_platform]", 676 | "language": "python", 677 | "name": "conda-env-virtual_platform-py" 678 | }, 679 | "language_info": { 680 | "codemirror_mode": { 681 | "name": "ipython", 682 | "version": 3 683 | }, 684 | "file_extension": ".py", 685 | "mimetype": "text/x-python", 686 | "name": "python", 687 | "nbconvert_exporter": "python", 688 | "pygments_lexer": "ipython3", 689 | "version": "3.5.4" 690 | } 691 | }, 692 | "nbformat": 4, 693 | "nbformat_minor": 0 694 | } 695 | -------------------------------------------------------------------------------- /7-3_Data_Wrangling.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Table of content\n", 15 | "\n", 16 | "
    \n", 17 | "
  • Identify and handle missing values\n", 18 | "
  • \n", 21 | "

    \n", 22 | "
  • Data standardization
  • \n", 23 | "
  • Data Normalization (centring/scaling)
  • \n", 24 | "
  • Binning
  • \n", 25 | "
  • Indicator variable
  • \n", 26 | "

    \n", 27 | "Estimated Time Needed: 30 min\n", 28 | "
    \n", 29 | " \n", 30 | "
    " 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "## What is the purpose of Data Wrangling?" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "Data Wrangling is the process of converting data from the initial format to a format that may be better for analysis." 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "### What is the fuel consumption (L/100k) rate for the disel car?" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "### Import data\n", 59 | "\n", 60 | "You can find the \"Automobile Data Set\" from the following link: https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data. We will be using this data set throughout this course.\n" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "#### Import pandas " 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": { 74 | "collapsed": true 75 | }, 76 | "outputs": [], 77 | "source": [ 78 | "import pandas as pd" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "metadata": {}, 84 | "source": [ 85 | "## Reading the data set from the URL and adding the related headers." 
86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": [ 92 | "#### URL of dataset" 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": { 99 | "collapsed": true 100 | }, 101 | "outputs": [], 102 | "source": [ 103 | "filename = 'https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data'" 104 | ] 105 | }, 106 | { 107 | "cell_type": "markdown", 108 | "metadata": {}, 109 | "source": [ 110 | " Python list \"headers\" containing name of headers " 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": null, 116 | "metadata": { 117 | "collapsed": true 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "headers = [\"symboling\",\"normalized-losses\",\"make\",\"fuel-type\",\"aspiration\", \"num-of-doors\",\"body-style\",\n", 122 | " \"drive-wheels\",\"engine-location\",\"wheel-base\", \"length\",\"width\",\"height\",\"curb-weight\",\"engine-type\",\n", 123 | " \"num-of-cylinders\", \"engine-size\",\"fuel-system\",\"bore\",\"stroke\",\"compression-ratio\",\"horsepower\",\n", 124 | " \"peak-rpm\",\"city-mpg\",\"highway-mpg\",\"price\"]" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": {}, 130 | "source": [ 131 | "Use the Pandas method **read_csv()** to load the data from the web address. Set the parameter \"names\" equal to the Python list \"headers\"." 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": null, 137 | "metadata": { 138 | "collapsed": false 139 | }, 140 | "outputs": [], 141 | "source": [ 142 | "df = pd.read_csv(filename, names = headers)\n", 143 | "print(\"Done\")" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | " Use the method **head()** to display the first five rows of the dataframe. 
" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": null, 156 | "metadata": { 157 | "collapsed": false 158 | }, 159 | "outputs": [], 160 | "source": [ 161 | "# To see what the data set looks like, we'll use the head() method.\n", 162 | "df.head()" 163 | ] 164 | }, 165 | { 166 | "cell_type": "markdown", 167 | "metadata": {}, 168 | "source": [ 169 | "As we can see, several question marks appeared in the dataframe; those are missing values which may hinder our further analysis. \n", 170 | "
    So, how do we identify all those missing values and deal with them?
    \n", 171 | "\n", 172 | "\n", 173 | "**How to work with missing data?**\n", 174 | "\n", 175 | "Steps for working with missing data:\n", 176 | "1. identify missing data\n", 177 | "2. deal with missing data\n", 178 | "3. correct data format" 179 | ] 180 | }, 181 | { 182 | "cell_type": "markdown", 183 | "metadata": {}, 184 | "source": [ 185 | "\n", 186 | "# 1. Identify and handle missing values\n", 187 | "\n", 188 | "\n", 189 | "\n", 190 | "### Convert \"?\" to NaN\n", 191 | "In the car dataset, missing data comes with the question mark \"?\".\n", 192 | "We replace \"?\" with NaN (Not a Number), which is Python's default missing value marker, for reasons of computational speed and convenience. Here we use the function: \n", 193 | "
    .replace(A, B, inplace = True) 
    \n", 194 | "to replace A with B" 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": null, 200 | "metadata": { 201 | "collapsed": false 202 | }, 203 | "outputs": [], 204 | "source": [ 205 | "import numpy as np\n", 206 | "\n", 207 | "# replace \"?\" with NaN\n", 208 | "df.replace(\"?\", np.nan, inplace = True)\n", 209 | "df.head(5)" 210 | ] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "metadata": {}, 215 | "source": [ 216 | "### Evaluating for Missing Data\n", 217 | "\n", 218 | "The missing values are now stored as NaN. We use pandas' built-in methods to identify them. There are two methods to detect missing data:\n", 219 | "1. **.isnull()**\n", 220 | "2. **.notnull()**\n", 221 | "\n", 222 | "The output is a data frame of boolean values indicating whether each entry is in fact missing data." 223 | ] 224 | }, 225 | { 226 | "cell_type": "code", 227 | "execution_count": null, 228 | "metadata": { 229 | "collapsed": false 230 | }, 231 | "outputs": [], 232 | "source": [ 233 | "missing_data = df.isnull()\n", 234 | "missing_data.head(5)" 235 | ] 236 | }, 237 | { 238 | "cell_type": "markdown", 239 | "metadata": {}, 240 | "source": [ 241 | "\"True\" stands for a missing value, while \"False\" stands for a value that is present." 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "metadata": {}, 247 | "source": [ 248 | "### Count missing values in each column\n", 249 | "Using a for loop in Python, we can quickly figure out the number of missing values in each column. As mentioned above, \"True\" represents a missing value, \"False\" means the value is present in the dataset. In the body of the for loop the method \".value_counts()\" counts how many \"True\" and \"False\" values each column contains. 
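
As a compact alternative to the loop described here (not the notebook's own method), summing the boolean mask returned by `.isnull()` gives the per-column count of missing values directly, since each `True` counts as 1. A minimal sketch on a made-up toy frame; the name `toy` and its values are assumptions for illustration only:

```python
import numpy as np
import pandas as pd

# Toy frame with a few missing entries (values are made up).
toy = pd.DataFrame({'bore': [3.47, np.nan, 2.68],
                    'price': [13495.0, np.nan, np.nan]})

# isnull() marks missing entries as True; sum() adds them up per column.
missing_counts = toy.isnull().sum()
print(missing_counts)
```

The loop with `.value_counts()` shows both the `True` and `False` counts per column, while this form reports only the number of missing entries.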
" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "metadata": { 256 | "collapsed": false 257 | }, 258 | "outputs": [], 259 | "source": [ 260 | "for column in missing_data.columns.values.tolist():\n", 261 | " print(column)\n", 262 | " print (missing_data[column].value_counts())\n", 263 | " print(\"\") " 264 | ] 265 | }, 266 | { 267 | "cell_type": "markdown", 268 | "metadata": {}, 269 | "source": [ 270 | "Based on the summary above, each column has 205 rows of data, seven columns containing missing data:\n", 271 | "\n", 272 | "1. \"normalized-losses\": 41 missing data\n", 273 | "2. \"num-of-doors\": 2 missing data\n", 274 | "3. \"bore\": 4 missing data\n", 275 | "4. \"stroke\" : 4 missing data\n", 276 | "5. \"horsepower\": 2 missing data\n", 277 | "6. \"peak-rpm\": 2 missing data\n", 278 | "7. \"price\": 4 missing data" 279 | ] 280 | }, 281 | { 282 | "cell_type": "markdown", 283 | "metadata": {}, 284 | "source": [ 285 | "\n", 286 | "## Deal with missing data\n", 287 | "**How to deal with missing data?**\n", 288 | "\n", 289 | " \n", 290 | " 1. drop data \n", 291 | " a. drop the whole row\n", 292 | " b. drop the whole column\n", 293 | " 2. replace data\n", 294 | " a. replace it by mean\n", 295 | " b. replace it by frequency\n", 296 | " c. replace it based on other functions" 297 | ] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "metadata": {}, 302 | "source": [ 303 | "Whole columns should be dropped only if most entries in the column are empty. In our dataset, none of the columns are empty enough to drop entirely.\n", 304 | "We have some freedom in choosing which method to replace data; however, some methods may seem more reasonable than others. 
    We will apply each method to several different columns:\n", 305 | "\n", 306 | "**Replace by mean:**\n", 307 | "\n", 308 | " \"normalized-losses\": 41 missing values, replace them with the mean\n", 309 | " \"stroke\": 4 missing values, replace them with the mean\n", 310 | " \"bore\": 4 missing values, replace them with the mean\n", 311 | " \"horsepower\": 2 missing values, replace them with the mean\n", 312 | " \"peak-rpm\": 2 missing values, replace them with the mean\n", 313 | " \n", 314 | "**Replace by frequency:**\n", 315 | "\n", 316 | " \"num-of-doors\": 2 missing values, replace them with \"four\". \n", 317 | " * Reason: 84% of sedans have four doors. Since four doors is the most frequent value, it is the most likely one\n", 318 | " \n", 319 | "\n", 320 | "**Drop the whole row:**\n", 321 | "\n", 322 | " \"price\": 4 missing values, simply delete the whole row\n", 323 | " * Reason: price is what we want to predict. Any data entry without price data cannot be used for prediction; therefore such rows are not useful to us" 324 | ] 325 | }, 326 | { 327 | "cell_type": "markdown", 328 | "metadata": {}, 329 | "source": [ 330 | "#### Calculate the average of the column " 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "execution_count": null, 336 | "metadata": { 337 | "collapsed": false 338 | }, 339 | "outputs": [], 340 | "source": [ 341 | "avg_1 = df[\"normalized-losses\"].astype(\"float\").mean(axis = 0)" 342 | ] 343 | }, 344 | { 345 | "cell_type": "markdown", 346 | "metadata": {}, 347 | "source": [ 348 | "#### Replace \"NaN\" by mean value in \"normalized-losses\" column" 349 | ] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "execution_count": null, 354 | "metadata": { 355 | "collapsed": true 356 | }, 357 | "outputs": [], 358 | "source": [ 359 | "df[\"normalized-losses\"] = df[\"normalized-losses\"].replace(np.nan, avg_1)" 360 | ] 361 | }, 362 | { 363 | "cell_type": "markdown", 364 | "metadata": {}, 365 | "source": [ 366 | "#### Calculate the mean value for 'bore' column" 367 | ] 368 | }, 369 | { 370 | "cell_type": "code", 
    371 | "execution_count": null, 372 | "metadata": { 373 | "collapsed": true 374 | }, 375 | "outputs": [], 376 | "source": [ 377 | "avg_2 = df['bore'].astype('float').mean(axis=0)" 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": {}, 383 | "source": [ 384 | "#### Replace NaN by mean value" 385 | ] 386 | }, 387 | { 388 | "cell_type": "code", 389 | "execution_count": null, 390 | "metadata": { 391 | "collapsed": true 392 | }, 393 | "outputs": [], 394 | "source": [ 395 | "df['bore'] = df['bore'].replace(np.nan, avg_2)" 396 | ] 397 | }, 398 | { 399 | "cell_type": "markdown", 400 | "metadata": {}, 401 | "source": [ 402 | "
    \n", 403 | "

    Question #1:

    \n", 404 | "\n", 405 | "Following the example above, replace NaN in the \"stroke\" column with the mean.\n", 406 | "
    " 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": null, 412 | "metadata": { 413 | "collapsed": false 414 | }, 415 | "outputs": [], 416 | "source": [] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "metadata": {}, 421 | "source": [ 422 | "#### Calculate the mean value for the 'horsepower' column:" 423 | ] 424 | }, 425 | { 426 | "cell_type": "code", 427 | "execution_count": null, 428 | "metadata": { 429 | "collapsed": true 430 | }, 431 | "outputs": [], 432 | "source": [ 433 | "avg_4 = df['horsepower'].astype('float').mean(axis=0)" 434 | ] 435 | }, 436 | { 437 | "cell_type": "markdown", 438 | "metadata": {}, 439 | "source": [ 440 | "#### Replace \"NaN\" by mean value:" 441 | ] 442 | }, 443 | { 444 | "cell_type": "code", 445 | "execution_count": null, 446 | "metadata": { 447 | "collapsed": true 448 | }, 449 | "outputs": [], 450 | "source": [ 451 | "df['horsepower'] = df['horsepower'].replace(np.nan, avg_4)" 452 | ] 453 | }, 454 | { 455 | "cell_type": "markdown", 456 | "metadata": {}, 457 | "source": [ 458 | "#### Calculate the mean value for 'peak-rpm' column:" 459 | ] 460 | }, 461 | { 462 | "cell_type": "code", 463 | "execution_count": null, 464 | "metadata": { 465 | "collapsed": true 466 | }, 467 | "outputs": [], 468 | "source": [ 469 | "avg_5 = df['peak-rpm'].astype('float').mean(axis=0)" 470 | ] 471 | }, 472 | { 473 | "cell_type": "markdown", 474 | "metadata": {}, 475 | "source": [ 476 | "#### Replace NaN by mean value:" 477 | ] 478 | }, 479 | { 480 | "cell_type": "code", 481 | "execution_count": null, 482 | "metadata": { 483 | "collapsed": true 484 | }, 485 | "outputs": [], 486 | "source": [ 487 | "df['peak-rpm'] = df['peak-rpm'].replace(np.nan, avg_5)" 488 | ] 489 | }, 490 | { 491 | "cell_type": "markdown", 492 | "metadata": {}, 493 | "source": [ 494 | "To see which values are present in a particular column, we can use the \".value_counts()\" method:" 495 | ] 496 | }, 497 | { 498 | "cell_type": "code", 499 | "execution_count": null, 
    500 | "metadata": { 501 | "collapsed": false 502 | }, 503 | "outputs": [], 504 | "source": [ 505 | "df['num-of-doors'].value_counts()" 506 | ] 507 | }, 508 | { 509 | "cell_type": "markdown", 510 | "metadata": {}, 511 | "source": [ 512 | "We can see that four doors is the most common type. We can also use the \".idxmax()\" method to find the most common type automatically:" 513 | ] 514 | }, 515 | { 516 | "cell_type": "code", 517 | "execution_count": null, 518 | "metadata": { 519 | "collapsed": false 520 | }, 521 | "outputs": [], 522 | "source": [ 523 | "df['num-of-doors'].value_counts().idxmax()" 524 | ] 525 | }, 526 | { 527 | "cell_type": "markdown", 528 | "metadata": {}, 529 | "source": [ 530 | "The replacement procedure is very similar to what we have seen previously:" 531 | ] 532 | }, 533 | { 534 | "cell_type": "code", 535 | "execution_count": null, 536 | "metadata": { 537 | "collapsed": false 538 | }, 539 | "outputs": [], 540 | "source": [ 541 | "# replace the missing 'num-of-doors' values by the most frequent value\n", 542 | "df[\"num-of-doors\"] = df[\"num-of-doors\"].replace(np.nan, \"four\")" 543 | ] 544 | }, 545 | { 546 | "cell_type": "markdown", 547 | "metadata": {}, 548 | "source": [ 549 | "Finally, let's drop all rows that do not have price data:" 550 | ] 551 | }, 552 | { 553 | "cell_type": "code", 554 | "execution_count": null, 555 | "metadata": { 556 | "collapsed": true 557 | }, 558 | "outputs": [], 559 | "source": [ 560 | "# simply drop whole row with NaN in \"price\" column\n", 561 | "df.dropna(subset=[\"price\"], axis=0, inplace = True)\n", 562 | "\n", 563 | "# reset index, because we dropped rows\n", 564 | "df.reset_index(drop = True, inplace = True)" 565 | ] 566 | }, 567 | { 568 | "cell_type": "code", 569 | "execution_count": null, 570 | "metadata": { 571 | "collapsed": false 572 | }, 573 | "outputs": [], 574 | "source": [ 575 | "df.head()" 576 | ] 577 | }, 578 | { 579 | "cell_type": "markdown", 580 | "metadata": {}, 581 | "source": [ 582 | 
    "**Good!** We now have a dataset with no missing values." 583 | ] 584 | }, 585 | { 586 | "cell_type": "markdown", 587 | "metadata": {}, 588 | "source": [ 589 | "\n", 590 | "## Correct data format\n", 591 | "**We are almost there!**\n", 592 | "
    The last step in data cleaning is checking and making sure that all data is in the correct format (int, float, text or other).
    \n", 593 | "\n", 594 | "In Pandas, we use \n", 595 | "
    **.dtypes** to check the data type
    \n", 596 | "
    **.astype()** to change the data type
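A minimal sketch of both operations on a small made-up DataFrame (the column names and values here are illustrative only, not taken from the car dataset). Note that `dtypes` is an attribute, so it is written without parentheses, while `astype()` is a method that returns a converted copy.

```python
import pandas as pd

# Hypothetical toy data: "bore" holds numbers stored as strings
toy = pd.DataFrame({"bore": ["3.47", "2.68"], "doors": ["four", "two"]})

# dtypes is an attribute (no parentheses); "bore" is not numeric yet
print(toy.dtypes)

# astype() returns a converted copy, so assign it back to keep the change
toy["bore"] = toy["bore"].astype("float")
print(toy.dtypes)
```

Assigning the result back to the column is what makes the conversion stick; calling `astype()` alone leaves the original column unchanged.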
    " 597 | ] 598 | }, 599 | { 600 | "cell_type": "markdown", 601 | "metadata": {}, 602 | "source": [ 603 | "#### Lets list the data types for each column" 604 | ] 605 | }, 606 | { 607 | "cell_type": "code", 608 | "execution_count": null, 609 | "metadata": { 610 | "collapsed": false 611 | }, 612 | "outputs": [], 613 | "source": [ 614 | "df.dtypes" 615 | ] 616 | }, 617 | { 618 | "cell_type": "markdown", 619 | "metadata": {}, 620 | "source": [ 621 | "As we can see above, some columns are not of the correct data type. Numerical variables should have type 'float' or 'int', and variables with strings such as categories should have type 'object'. For example, 'bore' and 'stroke' variables are numerical values that describe the engines, so we should expect them to be of the type 'float' or 'int'; however, they are shown as type 'object'. We have to convert data types into a proper format for each column using the \"astype()\" method. " 622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": {}, 627 | "source": [ 628 | "#### Convert data types to proper format" 629 | ] 630 | }, 631 | { 632 | "cell_type": "code", 633 | "execution_count": null, 634 | "metadata": { 635 | "collapsed": false 636 | }, 637 | "outputs": [], 638 | "source": [ 639 | "\n", 640 | "df[[\"bore\", \"stroke\"]] = df[[\"bore\", \"stroke\"]].astype(\"float\")\n", 641 | "df[[\"normalized-losses\"]] = df[[\"normalized-losses\"]].astype(\"int\")\n", 642 | "df[[\"price\"]] = df[[\"price\"]].astype(\"float\")\n", 643 | "df[[\"peak-rpm\"]] = df[[\"peak-rpm\"]].astype(\"float\")\n", 644 | "print(\"Done\")" 645 | ] 646 | }, 647 | { 648 | "cell_type": "markdown", 649 | "metadata": {}, 650 | "source": [ 651 | "#### Let us list the columns after the conversion " 652 | ] 653 | }, 654 | { 655 | "cell_type": "code", 656 | "execution_count": null, 657 | "metadata": { 658 | "collapsed": false 659 | }, 660 | "outputs": [], 661 | "source": [ 662 | "df.dtypes" 663 | ] 664 | }, 665 | { 666 | "cell_type": 
"markdown", 667 | "metadata": {}, 668 | "source": [ 669 | "**Wonderful!**\n", 670 | "\n", 671 | "Now, we finally obtain the cleaned dataset with no missing values and all data in its proper format." 672 | ] 673 | }, 674 | { 675 | "cell_type": "markdown", 676 | "metadata": {}, 677 | "source": [ 678 | "\n", 679 | "# Data Standardization\n", 680 | "Data is usually collected from different agencies with different formats.\n", 681 | "(Data Standardization is also a term for a particular type of data normalization, where we subtract the mean and divide by the standard deviation)\n", 682 | "\n", 683 | "**What is Standardization?**\n", 684 | "
    Standardization is the process of transforming data into a common format that allows the researcher to make meaningful comparisons.\n", 685 | "
    \n", 686 | "\n", 687 | "**Example**\n", 688 | "
    Transform mpg to L/100km:
    \n", 689 | "
    In our dataset, the fuel consumption columns \"city-mpg\" and \"highway-mpg\" are expressed in mpg (miles per gallon). Assume we are developing an application for a country that reports fuel consumption in L/100km.
    \n", 690 | "
    We will need to apply **data transformation** to transform mpg into L/100km.
    \n" 691 | ] 692 | }, 693 | { 694 | "cell_type": "markdown", 695 | "metadata": {}, 696 | "source": [ 697 | "The formula for unit conversion is\n", 698 | "L/100km = 235 / mpg\n", 699 | "
    We can do many mathematical operations directly in Pandas.
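The constant 235 in the formula is a rounded conversion factor. A short sketch of where it comes from, assuming US gallons (1 US gallon = 3.785411784 L) and international miles (1 mile = 1.609344 km); the function name `mpg_to_l_per_100km` is only illustrative:

```python
LITERS_PER_US_GALLON = 3.785411784
KM_PER_MILE = 1.609344

# liters per 100 km for a car that travels exactly 1 mile per gallon
FACTOR = 100 * LITERS_PER_US_GALLON / KM_PER_MILE

def mpg_to_l_per_100km(mpg):
    """Convert miles per US gallon to liters per 100 km."""
    return FACTOR / mpg

print(round(FACTOR, 2))  # about 235.21, which the formula rounds to 235
print(mpg_to_l_per_100km(21))
```

Because the relationship is a reciprocal, a higher mpg always gives a lower L/100km figure.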
    " 700 | ] 701 | }, 702 | { 703 | "cell_type": "code", 704 | "execution_count": null, 705 | "metadata": { 706 | "collapsed": false 707 | }, 708 | "outputs": [], 709 | "source": [ 710 | "df.head()" 711 | ] 712 | }, 713 | { 714 | "cell_type": "code", 715 | "execution_count": null, 716 | "metadata": { 717 | "collapsed": false 718 | }, 719 | "outputs": [], 720 | "source": [ 721 | "# transform mpg to L/100km by mathematical operation (235 divided by mpg)\n", 722 | "df['city-L/100km'] = 235/df[\"city-mpg\"]\n", 723 | "\n", 724 | "# check your transformed data \n", 725 | "df.head()" 726 | ] 727 | }, 728 | { 729 | "cell_type": "markdown", 730 | "metadata": {}, 731 | "source": [ 732 | "
    \n", 733 | "

    Question #2:

    \n", 734 | "\n", 735 | "Following the example above, convert the \"highway-mpg\" column from mpg to L/100km and rename the column to \"highway-L/100km\".\n", 736 | "
    " 737 | ] 738 | }, 739 | { 740 | "cell_type": "code", 741 | "execution_count": null, 742 | "metadata": { 743 | "collapsed": false 744 | }, 745 | "outputs": [], 746 | "source": [] 747 | }, 748 | { 749 | "cell_type": "markdown", 750 | "metadata": {}, 751 | "source": [ 752 | "\n", 753 | "# Data Normalization \n", 754 | "\n", 755 | "**Why normalization?**\n", 756 | "
    Normalization is the process of transforming the values of several variables into a similar range. Typical normalizations include scaling the variable so its average is 0, scaling the variable so its variance is 1, or scaling the variable so its values range from 0 to 1.\n", 757 | "
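These scalings can be sketched on a made-up column (the name `length` and the values are illustrative, not taken from the car dataset). Dividing by the maximum keeps positive values in (0, 1], min-max scaling maps the smallest value to 0 and the largest to 1, and the z-score centers the column at 0 with unit standard deviation.

```python
import pandas as pd

toy = pd.DataFrame({"length": [150.0, 175.0, 200.0]})
col = toy["length"]

# simple feature scaling: original / maximum
scaled = col / col.max()

# min-max scaling: (original - min) / (max - min)
min_max = (col - col.min()) / (col.max() - col.min())

# z-score: (original - mean) / standard deviation
# (pandas std() uses the sample standard deviation, ddof=1, by default)
z_score = (col - col.mean()) / col.std()

print(scaled.tolist())   # [0.75, 0.875, 1.0]
print(min_max.tolist())  # [0.0, 0.5, 1.0]
print(z_score.tolist())  # [-1.0, 0.0, 1.0]
```

Which scaling fits best depends on the data: dividing by the maximum only lands in [0, 1] when the values are non-negative.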
    \n", 758 | "\n", 759 | "**Example**\n", 760 | "
    To demonstrate normalization, let's say we want to scale the columns \"length\", \"width\" and \"height\"
    \n", 761 | "
    **Target:** we would like to normalize those variables so their values range from 0 to 1.
    \n", 762 | "
    **Approach:** replace the original value by (original value)/(maximum value)
    " 763 | ] 764 | }, 765 | { 766 | "cell_type": "code", 767 | "execution_count": null, 768 | "metadata": { 769 | "collapsed": false 770 | }, 771 | "outputs": [], 772 | "source": [ 773 | "# replace (original value) by (original value)/(maximum value)\n", 774 | "df['length'] = df['length']/df['length'].max()\n", 775 | "df['width'] = df['width']/df['width'].max()" 776 | ] 777 | }, 778 | { 779 | "cell_type": "markdown", 780 | "metadata": {}, 781 | "source": [ 782 | "
    \n", 783 | "

    Question #3:

    \n", 784 | "\n", 785 | "Following the example above, normalize the \"height\" column.\n", 786 | "
    " 787 | ] 788 | }, 789 | { 790 | "cell_type": "code", 791 | "execution_count": null, 792 | "metadata": { 793 | "collapsed": false 794 | }, 795 | "outputs": [], 796 | "source": [ 797 | "\n" 798 | ] 799 | }, 800 | { 801 | "cell_type": "markdown", 802 | "metadata": {}, 803 | "source": [ 804 | "Here we can see that \"length\", \"width\" and \"height\" have been normalized to the range [0, 1]." 805 | ] 806 | }, 807 | { 808 | "cell_type": "markdown", 809 | "metadata": {}, 810 | "source": [ 811 | "# About the Authors: \n", 812 | "\n", 813 | "This notebook was written by [Mahdi Noorian PhD](https://www.linkedin.com/in/mahdi-noorian-58219234/), [Joseph Santarcangelo PhD]( https://www.linkedin.com/in/joseph-s-50398b136/), Bahare Talayian, Eric Xiao, Steven Dong, Parizad, Hima Vsudevan and [Fiorella Wenver](https://www.linkedin.com/in/fiorellawever/).\n", 814 | "Copyright © 2017 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/)." 
815 | ] 816 | } 817 | ], 818 | "metadata": { 819 | "anaconda-cloud": {}, 820 | "kernelspec": { 821 | "display_name": "Python [default]", 822 | "language": "python", 823 | "name": "python3" 824 | }, 825 | "language_info": { 826 | "codemirror_mode": { 827 | "name": "ipython", 828 | "version": 3 829 | }, 830 | "file_extension": ".py", 831 | "mimetype": "text/x-python", 832 | "name": "python", 833 | "nbconvert_exporter": "python", 834 | "pygments_lexer": "ipython3", 835 | "version": "3.5.2" 836 | } 837 | }, 838 | "nbformat": 4, 839 | "nbformat_minor": 2 840 | } 841 | -------------------------------------------------------------------------------- /01-Intro.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "#
    Variables
    " 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "
    \n", 22 | "\n", 23 | "A name used to refer to a thing or a value is called a variable. In Python, variables are declared and assigned values as follows:\n", 24 | "
    " 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": 1, 30 | "metadata": { 31 | "collapsed": true 32 | }, 33 | "outputs": [], 34 | "source": [ 35 | "a = 10\n", 36 | "b = 56666\n", 37 | "c = 'Salam'" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 2, 43 | "metadata": { 44 | "collapsed": false 45 | }, 46 | "outputs": [ 47 | { 48 | "name": "stdout", 49 | "output_type": "stream", 50 | "text": [ 51 | "56676 Salam\n" 52 | ] 53 | } 54 | ], 55 | "source": [ 56 | "print (a + b, c)" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": {}, 62 | "source": [ 63 | "
    \n", 64 | "Several variables can be assigned the same value at once.\n", 65 | "
    " 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": null, 71 | "metadata": { 72 | "collapsed": true 73 | }, 74 | "outputs": [], 75 | "source": [ 76 | "x = y = 1" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "metadata": { 83 | "collapsed": false 84 | }, 85 | "outputs": [], 86 | "source": [ 87 | "print (x,y)" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "#
    Comments
    \n" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "metadata": { 101 | "collapsed": false 102 | }, 103 | "outputs": [], 104 | "source": [ 105 | "print('Hello Python') #this my comment\n", 106 | "#print('Hi')" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "metadata": { 113 | "collapsed": false 114 | }, 115 | "outputs": [], 116 | "source": [ 117 | "\"\"\"\n", 118 | "This is My multi-line comment,\n", 119 | "It's an example!\n", 120 | "print('1')\n", 121 | "\"\"\"\n", 122 | "print(\"2\")" 123 | ] 124 | }, 125 | { 126 | "cell_type": "markdown", 127 | "metadata": {}, 128 | "source": [ 129 | "#
    Data types in Python
    \n" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "\"Data" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": null, 142 | "metadata": { 143 | "collapsed": false 144 | }, 145 | "outputs": [], 146 | "source": [ 147 | "81 #integer" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": null, 153 | "metadata": { 154 | "collapsed": false 155 | }, 156 | "outputs": [], 157 | "source": [ 158 | "5.9 #float" 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": { 165 | "collapsed": false 166 | }, 167 | "outputs": [], 168 | "source": [ 169 | "\"Hello Python\" #character string" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "
    \n", 177 | "In Python, the type of a value can be inspected with the \n", 178 | "**`type()`**\n", 179 | "function.\n", 180 | "
    " 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "metadata": { 187 | "collapsed": false 188 | }, 189 | "outputs": [], 190 | "source": [ 191 | "type(12)" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "metadata": { 198 | "collapsed": false 199 | }, 200 | "outputs": [], 201 | "source": [ 202 | "type(3.14)" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": { 209 | "collapsed": false 210 | }, 211 | "outputs": [], 212 | "source": [ 213 | "type('salam')" 214 | ] 215 | }, 216 | { 217 | "cell_type": "code", 218 | "execution_count": null, 219 | "metadata": { 220 | "collapsed": false 221 | }, 222 | "outputs": [], 223 | "source": [ 224 | "a = 3 + 4.5\n", 225 | "type(a)" 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "execution_count": null, 231 | "metadata": { 232 | "collapsed": false 233 | }, 234 | "outputs": [], 235 | "source": [ 236 | "type(True)" 237 | ] 238 | }, 239 | { 240 | "cell_type": "code", 241 | "execution_count": null, 242 | "metadata": { 243 | "collapsed": false 244 | }, 245 | "outputs": [], 246 | "source": [ 247 | "type(False)" 248 | ] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "execution_count": null, 253 | "metadata": { 254 | "collapsed": false 255 | }, 256 | "outputs": [], 257 | "source": [ 258 | "type(\"False\")" 259 | ] 260 | }, 261 | { 262 | "cell_type": "markdown", 263 | "metadata": {}, 264 | "source": [ 265 | "#
    Operators (عملگرها)
    \n" 266 | ] 267 | }, 268 | { 269 | "cell_type": "markdown", 270 | "metadata": {}, 271 | "source": [ 272 | "##
    Mathematical operators
    \n", 273 | " Arithmetic Operators" 274 | ] 275 | }, 276 | { 277 | "cell_type": "markdown", 278 | "metadata": {}, 279 | "source": [ 280 | "| علامت | وظیفه |\n", 281 | "|----|---|\n", 282 | "| + | جمع |\n", 283 | "| - | تفریق |\n", 284 | "| / | تقسیم |\n", 285 | "| % | باقیمانده |\n", 286 | "| * | ضرب |\n", 287 | "| // |تقسیم صحیح |\n", 288 | "| ** |توان |\n" 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "execution_count": null, 294 | "metadata": { 295 | "collapsed": false 296 | }, 297 | "outputs": [], 298 | "source": [ 299 | "3 + 2" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": null, 305 | "metadata": { 306 | "collapsed": false 307 | }, 308 | "outputs": [], 309 | "source": [ 310 | "5 - 1" 311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": null, 316 | "metadata": { 317 | "collapsed": false 318 | }, 319 | "outputs": [], 320 | "source": [ 321 | "2 * 5" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": null, 327 | "metadata": { 328 | "collapsed": false 329 | }, 330 | "outputs": [], 331 | "source": [ 332 | "1 / 2" 333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": null, 338 | "metadata": { 339 | "collapsed": false 340 | }, 341 | "outputs": [], 342 | "source": [ 343 | "a = 1\n", 344 | "b = 2\n", 345 | "a / b" 346 | ] 347 | }, 348 | { 349 | "cell_type": "code", 350 | "execution_count": null, 351 | "metadata": { 352 | "collapsed": false 353 | }, 354 | "outputs": [], 355 | "source": [ 356 | "4 / 2" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": null, 362 | "metadata": { 363 | "collapsed": false 364 | }, 365 | "outputs": [], 366 | "source": [ 367 | "15 % 10" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": {}, 373 | "source": [ 374 | "
    \n", 375 | "Integer (floor) division rounds down and discards the fractional part of the quotient.
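One case worth checking: "rounds down" means toward negative infinity, not toward zero, so for negative numbers the result differs from truncating with `int()`. A short sketch:

```python
print(5 // 2)        # 2
print(-5 // 2)       # -3  (floor of -2.5)
print(int(-5 / 2))   # -2  (int() truncates toward zero)
print(3.999 // 2)    # 1.0 (the result is a float when an operand is a float)
```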
    " 376 | ] 377 | }, 378 | { 379 | "cell_type": "code", 380 | "execution_count": null, 381 | "metadata": { 382 | "collapsed": false 383 | }, 384 | "outputs": [], 385 | "source": [ 386 | "5 // 2" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": null, 392 | "metadata": { 393 | "collapsed": false 394 | }, 395 | "outputs": [], 396 | "source": [ 397 | "3.999 // 2" 398 | ] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "execution_count": null, 403 | "metadata": { 404 | "collapsed": false 405 | }, 406 | "outputs": [], 407 | "source": [ 408 | "43 + 60 + 16 + 41" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": {}, 414 | "source": [ 415 | "#
    Comparison operators
    " 416 | ] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "metadata": {}, 421 | "source": [ 422 | "| Operation | Meaning |\n", 423 | "|----|---|\n", 424 | "| < | strictly less than |\n", 425 | "| <= | less than or equal |\n", 426 | "| > | strictly greater than |\n", 427 | "| >= | greater than or equal |\n", 428 | "| == | equal |\n", 429 | "| != | not equal |\n", 430 | "| is | object identity |\n", 431 | "| is not | negated object identity |" 432 | ] 433 | }, 434 | { 435 | "cell_type": "code", 436 | "execution_count": null, 437 | "metadata": { 438 | "collapsed": true 439 | }, 440 | "outputs": [], 441 | "source": [ 442 | "z = 1" 443 | ] 444 | }, 445 | { 446 | "cell_type": "code", 447 | "execution_count": null, 448 | "metadata": { 449 | "collapsed": false 450 | }, 451 | "outputs": [], 452 | "source": [ 453 | "z == 1" 454 | ] 455 | }, 456 | { 457 | "cell_type": "code", 458 | "execution_count": null, 459 | "metadata": { 460 | "collapsed": false 461 | }, 462 | "outputs": [], 463 | "source": [ 464 | "z > 1" 465 | ] 466 | }, 467 | { 468 | "cell_type": "markdown", 469 | "metadata": {}, 470 | "source": [ 471 | "#
    Bitwise operators
    " 472 | ] 473 | }, 474 | { 475 | "cell_type": "markdown", 476 | "metadata": {}, 477 | "source": [ 478 | "| Symbol | Task Performed |\n", 479 | "|----|---|\n", 480 | "| & | Logical And |\n", 481 | "| l | Logical OR |\n", 482 | "| ^ | XOR |\n", 483 | "| ~ | Negate |\n", 484 | "| >> | Right shift |\n", 485 | "| << | Left shift |" 486 | ] 487 | }, 488 | { 489 | "cell_type": "code", 490 | "execution_count": null, 491 | "metadata": { 492 | "collapsed": true 493 | }, 494 | "outputs": [], 495 | "source": [ 496 | "a = 2 #10\n", 497 | "b = 3 #11" 498 | ] 499 | }, 500 | { 501 | "cell_type": "code", 502 | "execution_count": null, 503 | "metadata": { 504 | "collapsed": false 505 | }, 506 | "outputs": [], 507 | "source": [ 508 | "print(a & b)\n", 509 | "bin(a & b)" 510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": null, 515 | "metadata": { 516 | "collapsed": false 517 | }, 518 | "outputs": [], 519 | "source": [ 520 | "5 >> 1" 521 | ] 522 | }, 523 | { 524 | "cell_type": "markdown", 525 | "metadata": {}, 526 | "source": [ 527 | "0000 0101 -> 5 \n", 528 | "\n", 529 | "Shifting the digits by 1 to the right and zero padding\n", 530 | "\n", 531 | "0000 0010 -> 2" 532 | ] 533 | }, 534 | { 535 | "cell_type": "code", 536 | "execution_count": null, 537 | "metadata": { 538 | "collapsed": false 539 | }, 540 | "outputs": [], 541 | "source": [ 542 | "5 << 1" 543 | ] 544 | }, 545 | { 546 | "cell_type": "markdown", 547 | "metadata": {}, 548 | "source": [ 549 | "0000 0101 -> 5 \n", 550 | "\n", 551 | "Shifting the digits by 1 to the left and zero padding\n", 552 | "\n", 553 | "0000 1010 -> 10" 554 | ] 555 | }, 556 | { 557 | "cell_type": "markdown", 558 | "metadata": {}, 559 | "source": [ 560 | "#
    Converting from one type to another
    " 561 | ] 562 | }, 563 | { 564 | "cell_type": "markdown", 565 | "metadata": {}, 566 | "source": [ 567 | "
    \n", 568 | "You can change the type of a value in Python; this is called typecasting.\n", 569 | "For example, you can convert an integer to a float, such as converting 2 to 2.0.\n", 570 | "
    " 571 | ] 572 | }, 573 | { 574 | "cell_type": "code", 575 | "execution_count": null, 576 | "metadata": { 577 | "collapsed": false 578 | }, 579 | "outputs": [], 580 | "source": [ 581 | "type(2) #verify that this is an integer" 582 | ] 583 | }, 584 | { 585 | "cell_type": "markdown", 586 | "metadata": {}, 587 | "source": [ 588 | "####
    Converting to float
    " 589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": null, 594 | "metadata": { 595 | "collapsed": false 596 | }, 597 | "outputs": [], 598 | "source": [ 599 | "float(2)" 600 | ] 601 | }, 602 | { 603 | "cell_type": "code", 604 | "execution_count": null, 605 | "metadata": { 606 | "collapsed": false 607 | }, 608 | "outputs": [], 609 | "source": [ 610 | "type(float(2))" 611 | ] 612 | }, 613 | { 614 | "cell_type": "markdown", 615 | "metadata": {}, 616 | "source": [ 617 | "####
    Converting to int
    " 618 | ] 619 | }, 620 | { 621 | "cell_type": "code", 622 | "execution_count": null, 623 | "metadata": { 624 | "collapsed": false 625 | }, 626 | "outputs": [], 627 | "source": [ 628 | "int(5.1)" 629 | ] 630 | }, 631 | { 632 | "cell_type": "markdown", 633 | "metadata": {}, 634 | "source": [ 635 | "####
    Converting from a string to float/int
    " 636 | ] 637 | }, 638 | { 639 | "cell_type": "code", 640 | "execution_count": null, 641 | "metadata": { 642 | "collapsed": false 643 | }, 644 | "outputs": [], 645 | "source": [ 646 | "int('1')" 647 | ] 648 | }, 649 | { 650 | "cell_type": "code", 651 | "execution_count": null, 652 | "metadata": { 653 | "collapsed": false 654 | }, 655 | "outputs": [], 656 | "source": [ 657 | "float('1.2')" 658 | ] 659 | }, 660 | { 661 | "cell_type": "markdown", 662 | "metadata": {}, 663 | "source": [ 664 | "
    \n", 665 | "If the string cannot be converted to a number, the Python interpreter raises an error.\n", 666 | "
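A minimal sketch of handling that error: `int()` raises a `ValueError` for text that is not a valid integer, and the error can be caught with `try`/`except`. The helper name `to_int_or_none` is only illustrative.

```python
def to_int_or_none(text):
    """Return int(text), or None when text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int_or_none("1"))    # 1
print(to_int_or_none("A"))    # None
print(to_int_or_none("1.2"))  # None: int() does not parse decimal strings
```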
    " 667 | ] 668 | }, 669 | { 670 | "cell_type": "code", 671 | "execution_count": null, 672 | "metadata": { 673 | "collapsed": false 674 | }, 675 | "outputs": [], 676 | "source": [ 677 | "#int(\"A\") # Error!" 678 | ] 679 | }, 680 | { 681 | "cell_type": "markdown", 682 | "metadata": {}, 683 | "source": [ 684 | "####
    Converting to str
    " 685 | ] 686 | }, 687 | { 688 | "cell_type": "code", 689 | "execution_count": null, 690 | "metadata": { 691 | "collapsed": false 692 | }, 693 | "outputs": [], 694 | "source": [ 695 | "str(4)" 696 | ] 697 | }, 698 | { 699 | "cell_type": "code", 700 | "execution_count": null, 701 | "metadata": { 702 | "collapsed": false 703 | }, 704 | "outputs": [], 705 | "source": [ 706 | "str(3.81)" 707 | ] 708 | }, 709 | { 710 | "cell_type": "markdown", 711 | "metadata": {}, 712 | "source": [ 713 | "####
    Converting bool
    " 714 | ] 715 | }, 716 | { 717 | "cell_type": "markdown", 718 | "metadata": {}, 719 | "source": [ 720 | "
    \n", 721 | "Boolean is another important type in Python; a Boolean can take one of two values: \n", 722 | "true, written ***True***, and false, written ***False***.\n", 723 | "Just remember that the first letter is capitalized and the value is not enclosed in ' or \"\".\n", 724 | "
    " 725 | ] 726 | }, 727 | { 728 | "cell_type": "code", 729 | "execution_count": null, 730 | "metadata": { 731 | "collapsed": false 732 | }, 733 | "outputs": [], 734 | "source": [ 735 | "True" 736 | ] 737 | }, 738 | { 739 | "cell_type": "code", 740 | "execution_count": null, 741 | "metadata": { 742 | "collapsed": false 743 | }, 744 | "outputs": [], 745 | "source": [ 746 | "False" 747 | ] 748 | }, 749 | { 750 | "cell_type": "code", 751 | "execution_count": null, 752 | "metadata": { 753 | "collapsed": false 754 | }, 755 | "outputs": [], 756 | "source": [ 757 | "type(True)" 758 | ] 759 | }, 760 | { 761 | "cell_type": "code", 762 | "execution_count": null, 763 | "metadata": { 764 | "collapsed": false 765 | }, 766 | "outputs": [], 767 | "source": [ 768 | "type(False)" 769 | ] 770 | }, 771 | { 772 | "cell_type": "markdown", 773 | "metadata": {}, 774 | "source": [ 775 | "
    \n", 776 | "If we convert the Boolean True to int or float we get 1 (or 1.0), and converting False gives 0 (or 0.0).\n", 777 | "Likewise, converting 1 to bool gives True and converting 0 to bool gives False.\n", 778 | "
    " 779 | ] 780 | }, 781 | { 782 | "cell_type": "code", 783 | "execution_count": null, 784 | "metadata": { 785 | "collapsed": false 786 | }, 787 | "outputs": [], 788 | "source": [ 789 | "int(True)" 790 | ] 791 | }, 792 | { 793 | "cell_type": "code", 794 | "execution_count": null, 795 | "metadata": { 796 | "collapsed": false 797 | }, 798 | "outputs": [], 799 | "source": [ 800 | "int(False)" 801 | ] 802 | }, 803 | { 804 | "cell_type": "code", 805 | "execution_count": null, 806 | "metadata": { 807 | "collapsed": false 808 | }, 809 | "outputs": [], 810 | "source": [ 811 | "bool(1)" 812 | ] 813 | }, 814 | { 815 | "cell_type": "code", 816 | "execution_count": null, 817 | "metadata": { 818 | "collapsed": false 819 | }, 820 | "outputs": [], 821 | "source": [ 822 | "bool(0)" 823 | ] 824 | }, 825 | { 826 | "cell_type": "code", 827 | "execution_count": null, 828 | "metadata": { 829 | "collapsed": false 830 | }, 831 | "outputs": [], 832 | "source": [ 833 | "float(False)" 834 | ] 835 | }, 836 | { 837 | "cell_type": "markdown", 838 | "metadata": {}, 839 | "source": [ 840 | "#
    Converting between number systems
    " 841 | ] 842 | }, 843 | { 844 | "cell_type": "markdown", 845 | "metadata": {}, 846 | "source": [ 847 | "
    \n", 848 | "\n", 849 | "A hexadecimal (base-16) value is written with the prefix **0x** and evaluates to the corresponding integer, which is displayed in base 10. The reverse conversion is done with the function \n", 850 | " \n", 851 | "**hex()**\n", 852 | " \n", 853 | "which returns a string.\n", 854 | "Likewise, an octal (base-8) value is written with the prefix **0o**, and the reverse conversion is done with the function \n", 855 | " \n", 856 | "**oct()**\n", 857 | " \n", 858 | "which also returns a string.\n", 859 | "
    " 860 | ] 861 | }, 862 | { 863 | "cell_type": "code", 864 | "execution_count": null, 865 | "metadata": { 866 | "collapsed": false 867 | }, 868 | "outputs": [], 869 | "source": [ 870 | "hex(170)" 871 | ] 872 | }, 873 | { 874 | "cell_type": "code", 875 | "execution_count": null, 876 | "metadata": { 877 | "collapsed": false 878 | }, 879 | "outputs": [], 880 | "source": [ 881 | "0xAA" 882 | ] 883 | }, 884 | { 885 | "cell_type": "code", 886 | "execution_count": null, 887 | "metadata": { 888 | "collapsed": false 889 | }, 890 | "outputs": [], 891 | "source": [ 892 | "oct(8)" 893 | ] 894 | }, 895 | { 896 | "cell_type": "code", 897 | "execution_count": null, 898 | "metadata": { 899 | "collapsed": false 900 | }, 901 | "outputs": [], 902 | "source": [ 903 | "0o10" 904 | ] 905 | }, 906 | { 907 | "cell_type": "markdown", 908 | "metadata": {}, 909 | "source": [ 910 | "
    \n", 911 | "The function **int( )** also accepts two arguments and can then be used for base conversion.\n", 912 | "The first argument is a value written in another number system and the second is the base of that system. \n", 913 | "Note that the first argument (the value in the other number system) must be passed to the function as a string.\n", 914 | "
    " 915 | ] 916 | }, 917 | { 918 | "cell_type": "code", 919 | "execution_count": 1, 920 | "metadata": { 921 | "collapsed": false 922 | }, 923 | "outputs": [ 924 | { 925 | "name": "stdout", 926 | "output_type": "stream", 927 | "text": [ 928 | "8\n", 929 | "170\n", 930 | "10\n" 931 | ] 932 | } 933 | ], 934 | "source": [ 935 | "print(int('010',8))\n", 936 | "print(int('0xaa',16))\n", 937 | "print(int('1010',2))" 938 | ] 939 | }, 940 | { 941 | "cell_type": "markdown", 942 | "metadata": {}, 943 | "source": [ 944 | "
    \n", 945 | "Also note that the function \n", 946 | " \n", 947 | "**bin()**\n", 948 | " \n", 949 | "gives the binary representation of an integer, and \n", 950 | " \n", 951 | "**float()**\n", 952 | " \n", 953 | "converts a value to a floating-point number. \n", 954 | " \n", 955 | "**chr()** \n", 956 | " \n", 957 | "converts an ASCII code (more generally, a Unicode code point) to the corresponding character, and \n", 958 | " \n", 959 | "**ord()**\n", 960 | " \n", 961 | "returns the code of a character.\n", 962 | "
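The conversions described here can be checked with a small round trip. The sketch below is only an illustration; the variable names are made up for the example.

```python
# Round-trip an integer through its string representations.
n = 170
as_bin = bin(n)   # '0b10101010'
as_hex = hex(n)   # '0xaa'
as_oct = oct(n)   # '0o252'

# int() with an explicit base accepts the prefixed strings.
back = [int(as_bin, 2), int(as_hex, 16), int(as_oct, 8)]
print(back)  # [170, 170, 170]

# ord() and chr() are inverses of each other.
code = ord('b')
print(code, chr(code))  # 98 b
```

Note that bin(), hex() and oct() all return strings, while int() returns a number.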
    " 963 | ] 964 | }, 965 | { 966 | "cell_type": "code", 967 | "execution_count": 3, 968 | "metadata": { 969 | "collapsed": false 970 | }, 971 | "outputs": [ 972 | { 973 | "name": "stdout", 974 | "output_type": "stream", 975 | "text": [ 976 | "4\n", 977 | "5\n" 978 | ] 979 | } 980 | ], 981 | "source": [ 982 | "print(int(4.3))\n", 983 | "print(int('5'))\n" 984 | ] 985 | }, 986 | { 987 | "cell_type": "code", 988 | "execution_count": null, 989 | "metadata": { 990 | "collapsed": false 991 | }, 992 | "outputs": [], 993 | "source": [ 994 | "chr(98)" 995 | ] 996 | }, 997 | { 998 | "cell_type": "code", 999 | "execution_count": null, 1000 | "metadata": { 1001 | "collapsed": false 1002 | }, 1003 | "outputs": [], 1004 | "source": [ 1005 | "ord('b')" 1006 | ] 1007 | }, 1008 | { 1009 | "cell_type": "markdown", 1010 | "metadata": {}, 1011 | "source": [ 1012 | "##
    Simplifying mathematical operations
    " 1013 | ] 1014 | }, 1015 | { 1016 | "cell_type": "markdown", 1017 | "metadata": {}, 1018 | "source": [ 1019 | "
    \n", 1020 | "The function \n", 1021 | " \n", 1022 | "**round()**\n", 1023 | " \n", 1024 | "called with a single argument, rounds the number to the nearest integer (in Python 3, exact ties such as 2.5 go to the nearest even integer). Called with two arguments, the second argument gives the number of decimal places to round to.\n", 1025 | "
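A short sketch of how round() behaves, including the tie case that often surprises newcomers. It assumes Python 3; the list name is illustrative.

```python
# One argument: round to the nearest integer (the result is an int).
print(round(5.1234))  # 5
print(round(5.9876))  # 6

# Exact ties go to the nearest even integer in Python 3.
ties = [round(0.5), round(1.5), round(2.5)]
print(ties)  # [0, 2, 2]

# Two arguments: keep the given number of decimal places.
print(round(2.5581, 2))  # 2.56
```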
    " 1026 | ] 1027 | }, 1028 | { 1029 | "cell_type": "code", 1030 | "execution_count": null, 1031 | "metadata": { 1032 | "collapsed": false, 1033 | "scrolled": false 1034 | }, 1035 | "outputs": [], 1036 | "source": [ 1037 | "print(round(5.1234))\n", 1038 | "print(round(5.9876))\n", 1039 | "print(round(2.5581, 2))" 1040 | ] 1041 | }, 1042 | { 1043 | "cell_type": "markdown", 1044 | "metadata": {}, 1045 | "source": [ 1046 | "
    \n", 1047 | "The function \n", 1048 | " \n", 1049 | "**complex( )**\n", 1050 | "  \n", 1051 | "defines a complex number, and the function \n", 1052 | "\n", 1053 | "**abs( )**\n", 1054 | "  \n", 1055 | "returns its magnitude.\n", 1056 | "
    " 1057 | ] 1058 | }, 1059 | { 1060 | "cell_type": "code", 1061 | "execution_count": null, 1062 | "metadata": { 1063 | "collapsed": false 1064 | }, 1065 | "outputs": [], 1066 | "source": [ 1067 | "z = complex('5+2j')\n", 1068 | "print(z)\n", 1069 | "print(abs(z))" 1070 | ] 1071 | }, 1072 | { 1073 | "cell_type": "markdown", 1074 | "metadata": {}, 1075 | "source": [ 1076 | "
    \n", 1077 | "**divmod(x, y)**  \n", 1078 | "returns the quotient and the remainder of the division as a tuple \n", 1079 | "(a data structure covered in a later chapter), in the format \n", 1080 | "  \n", 1081 | "(quotient, remainder).\n", 1082 | "For example, divmod(9, 2) gives (4, 1).\n", 1083 | "
    Output format: (quotient, remainder)" 1084 | ] 1085 | }, 1086 | { 1087 | "cell_type": "code", 1088 | "execution_count": null, 1089 | "metadata": { 1090 | "collapsed": false 1091 | }, 1092 | "outputs": [], 1093 | "source": [ 1094 | "divmod(9,2)" 1095 | ] 1096 | }, 1097 | { 1098 | "cell_type": "markdown", 1099 | "metadata": {}, 1100 | "source": [ 1101 | "
    \n", 1102 | "**isinstance( )** \n", 1103 | "returns True if its first argument is an instance of the class given as its second argument, and False otherwise.\n", 1104 | "An object can also be checked against several classes at once by passing a tuple of classes.\n", 1105 | "
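One consequence worth seeing alongside the boolean conversions earlier in this notebook: in Python, bool is a subclass of int, so isinstance() reports a boolean as an int as well. A minimal sketch (the variable name is illustrative):

```python
# bool is a subclass of int, so a boolean also counts as an int.
print(isinstance(True, bool))  # True
print(isinstance(True, int))   # True
print(issubclass(bool, int))   # True

# That is why booleans can take part in arithmetic.
total = True + True + False
print(total)  # 2
```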
    " 1106 | ] 1107 | }, 1108 | { 1109 | "cell_type": "code", 1110 | "execution_count": null, 1111 | "metadata": { 1112 | "collapsed": false 1113 | }, 1114 | "outputs": [], 1115 | "source": [ 1116 | "print (isinstance(1, int))\n", 1117 | "print (isinstance(1.0,int))\n", 1118 | "print (isinstance(1.0,(int,float)))" 1119 | ] 1120 | }, 1121 | { 1122 | "cell_type": "code", 1123 | "execution_count": null, 1124 | "metadata": { 1125 | "collapsed": false 1126 | }, 1127 | "outputs": [], 1128 | "source": [ 1129 | "print (isinstance('salam',(int,float)))\n", 1130 | "print (isinstance('salam',(int,float,str)))" 1131 | ] 1132 | }, 1133 | { 1134 | "cell_type": "markdown", 1135 | "metadata": {}, 1136 | "source": [ 1137 | "
    \n", 1138 | "**pow(x,y)** \n", 1139 | "is used to compute\n", 1140 | "$x^y$. \n", 1141 | " \n", 1142 | "\n", 1143 | "The function can also be called with three arguments: \n", 1144 | "**pow(x,y,z)** \n", 1145 | "returns the remainder of\n", 1146 | "$x^y$ \n", 1147 | "divided by the third argument z, that is:\n", 1148 | " ($x^y$ % z) \n", 1149 | "
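The equivalence between the three-argument form and the two-step computation can be checked directly. This is a small sketch; the names x, y, z and big are chosen only for the example.

```python
x, y, z = 3, 3, 5

# Two arguments: plain exponentiation, same as x ** y.
print(pow(x, y))      # 27

# Three arguments: remainder of x ** y divided by z.
print(pow(x, y, z))   # 2
print((x ** y) % z)   # 2, the same value computed in two steps

# With a large exponent, the three-argument form does not need to
# build the huge intermediate power before taking the remainder.
big = pow(7, 10**6, 13)
print(big)
```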
    \n" 1150 | ] 1151 | }, 1152 | { 1153 | "cell_type": "code", 1154 | "execution_count": null, 1155 | "metadata": { 1156 | "collapsed": false 1157 | }, 1158 | "outputs": [], 1159 | "source": [ 1160 | "print(pow(3,3))\n", 1161 | "print(pow(3,3,5))" 1162 | ] 1163 | }, 1164 | { 1165 | "cell_type": "markdown", 1166 | "metadata": {}, 1167 | "source": [ 1168 | "##
    User input
    " 1169 | ] 1170 | }, 1171 | { 1172 | "cell_type": "markdown", 1173 | "metadata": {}, 1174 | "source": [ 1175 | "
    \n", 1176 | "The function\n", 1177 | " **input( )** \n", 1178 | " always returns a string. To do arithmetic, convert the result to the numeric type you need (int or float).\n", 1179 | "
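Since input() always returns a string, the conversion step can fail when the user types something that is not a number. The sketch below uses a hypothetical helper, to_number, which is not part of the original notebook; plain strings stand in for what a user might type so the example runs without a prompt.

```python
def to_number(text):
    """Convert a string (such as one returned by input()) to int or float.

    Hypothetical helper for illustration; returns None if text is not numeric.
    """
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return None

# The strings below stand in for text typed at an input() prompt.
print(to_number('12') * 3)    # 36
print(to_number('2.5') * 3)   # 7.5
print(to_number('salam'))     # None
print('12' * 3)               # 121212 (string repetition, not arithmetic)
```

In an interactive session the call would look like to_number(input('Type something:')).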
    " 1180 | ] 1181 | }, 1182 | { 1183 | "cell_type": "code", 1184 | "execution_count": null, 1185 | "metadata": { 1186 | "collapsed": false 1187 | }, 1188 | "outputs": [], 1189 | "source": [ 1190 | "name = input(\"What is your name? \\t\")\n", 1191 | "print (\"Hello dear \", name )" 1192 | ] 1193 | }, 1194 | { 1195 | "cell_type": "code", 1196 | "execution_count": null, 1197 | "metadata": { 1198 | "collapsed": false 1199 | }, 1200 | "outputs": [], 1201 | "source": [ 1202 | "in1 = input(\"Type something:\")\n", 1203 | "print(in1 * 3)" 1204 | ] 1205 | }, 1206 | { 1207 | "cell_type": "code", 1208 | "execution_count": null, 1209 | "metadata": { 1210 | "collapsed": false 1211 | }, 1212 | "outputs": [], 1213 | "source": [ 1214 | "in2 = input(\"Type something:\")\n", 1215 | "in2 = int(in2)\n", 1216 | "print(in2 * 3)" 1217 | ] 1218 | }, 1219 | { 1220 | "cell_type": "markdown", 1221 | "metadata": {}, 1222 | "source": [ 1223 | "
    \n", 1224 | "
    Shahid Rajaee Teacher Training University
    Python Workshop
    Alireza Akhavanpour
    96-97
    \n", 1225 | "
    \n", 1226 | "SRTTU.edu - PyLab.AkhavanPour.ir - AkhavanPour.ir\n", 1227 | "
    " 1228 | ] 1229 | }, 1230 | { 1231 | "cell_type": "code", 1232 | "execution_count": null, 1233 | "metadata": { 1234 | "collapsed": true 1235 | }, 1236 | "outputs": [], 1237 | "source": [] 1238 | } 1239 | ], 1240 | "metadata": { 1241 | "anaconda-cloud": {}, 1242 | "kernelspec": { 1243 | "display_name": "Python [conda env:virtual_platform]", 1244 | "language": "python", 1245 | "name": "conda-env-virtual_platform-py" 1246 | }, 1247 | "language_info": { 1248 | "codemirror_mode": { 1249 | "name": "ipython", 1250 | "version": 3 1251 | }, 1252 | "file_extension": ".py", 1253 | "mimetype": "text/x-python", 1254 | "name": "python", 1255 | "nbconvert_exporter": "python", 1256 | "pygments_lexer": "ipython3", 1257 | "version": "3.5.4" 1258 | } 1259 | }, 1260 | "nbformat": 4, 1261 | "nbformat_minor": 0 1262 | } 1263 | -------------------------------------------------------------------------------- /03-String operations.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "
    " 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "#
    Table of contents:
    \n", 22 | "\n", 23 | "
    \n", 24 | "
  • Strings
  • \n", 25 | "
  • String indexing
  • \n", 26 | "
  • Escape sequences in strings
  • \n", 27 | "
  • String operations
  • \n", 28 | "
  • Quiz
  • \n", 29 | "
    \n", 30 | "

    \n", 31 | "Estimated time: 15 minutes\n", 32 | "
    \n", 33 | "\n", 34 | "
    " 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": {}, 40 | "source": [ 41 | "\n", 42 | "#
    Strings
    \n", 43 | "
    \n", 44 | "The string data type in Python:\n", 45 | "\n", 49 | "
    " 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": {}, 55 | "source": [ 56 | "
    \n", 57 | "A string can be defined in double quotes (\")\n", 58 | "
    " 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": 1, 64 | "metadata": { 65 | "collapsed": false 66 | }, 67 | "outputs": [ 68 | { 69 | "data": { 70 | "text/plain": [ 71 | "'SRTTU University'" 72 | ] 73 | }, 74 | "execution_count": 1, 75 | "metadata": {}, 76 | "output_type": "execute_result" 77 | } 78 | ], 79 | "source": [ 80 | "\"SRTTU University\"" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": {}, 86 | "source": [ 87 | "
    \n", 88 | "A string can also be defined in single quotes (')\n", 89 | "
    " 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": 2, 95 | "metadata": { 96 | "collapsed": false 97 | }, 98 | "outputs": [ 99 | { 100 | "data": { 101 | "text/plain": [ 102 | "'SRTTU University'" 103 | ] 104 | }, 105 | "execution_count": 2, 106 | "metadata": {}, 107 | "output_type": "execute_result" 108 | } 109 | ], 110 | "source": [ 111 | "'SRTTU University'" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": {}, 117 | "source": [ 118 | "###
    Some string examples
    \n", 119 | "A string can contain spaces and digits:" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": 3, 125 | "metadata": { 126 | "collapsed": false 127 | }, 128 | "outputs": [ 129 | { 130 | "data": { 131 | "text/plain": [ 132 | "'1 2 3 4 5 6 '" 133 | ] 134 | }, 135 | "execution_count": 3, 136 | "metadata": {}, 137 | "output_type": "execute_result" 138 | } 139 | ], 140 | "source": [ 141 | "'1 2 3 4 5 6 '" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": {}, 147 | "source": [ 148 | "A string can also contain special characters: " 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": 4, 154 | "metadata": { 155 | "collapsed": false 156 | }, 157 | "outputs": [ 158 | { 159 | "data": { 160 | "text/plain": [ 161 | "'@#2_#]&*^%$'" 162 | ] 163 | }, 164 | "execution_count": 4, 165 | "metadata": {}, 166 | "output_type": "execute_result" 167 | } 168 | ], 169 | "source": [ 170 | "'@#2_#]&*^%$'" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "We can print a string using the print function:" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 5, 183 | "metadata": { 184 | "collapsed": false 185 | }, 186 | "outputs": [ 187 | { 188 | "name": "stdout", 189 | "output_type": "stream", 190 | "text": [ 191 | "Hello!\n" 192 | ] 193 | } 194 | ], 195 | "source": [ 196 | "print(\"Hello!\")" 197 | ] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "We can bind (assign) a string to a variable: \n" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": 6, 209 | "metadata": { 210 | "collapsed": false 211 | }, 212 | "outputs": [ 213 | { 214 | "data": { 215 | "text/plain": [ 216 | "'SRTTU'" 217 | ] 218 | }, 219 | "execution_count": 6, 220 | "metadata": {}, 221 | "output_type": "execute_result" 222 | } 223 | ], 224 | "source": [ 225 | "Name= \"SRTTU\"\n", 226 | "Name" 227 | ] 
228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": {}, 232 | "source": [ 233 | " \n", 234 | "#
    String indexing
    " 235 | ] 236 | }, 237 | { 238 | "cell_type": "markdown", 239 | "metadata": {}, 240 | "source": [ 241 | "
    \n", 242 | " \n", 246 | "
    " 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "\"string" 254 | ] 255 | }, 256 | { 257 | "cell_type": "markdown", 258 | "metadata": {}, 259 | "source": [ 260 | "
    \n", 261 | "Some examples:
    " 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": 7, 267 | "metadata": { 268 | "collapsed": false 269 | }, 270 | "outputs": [ 271 | { 272 | "name": "stdout", 273 | "output_type": "stream", 274 | "text": [ 275 | "S\n" 276 | ] 277 | } 278 | ], 279 | "source": [ 280 | "print(Name[0])" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": 8, 286 | "metadata": { 287 | "collapsed": false 288 | }, 289 | "outputs": [ 290 | { 291 | "name": "stdout", 292 | "output_type": "stream", 293 | "text": [ 294 | "U\n" 295 | ] 296 | } 297 | ], 298 | "source": [ 299 | "print(Name[4])" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": 9, 305 | "metadata": { 306 | "collapsed": false 307 | }, 308 | "outputs": [ 309 | { 310 | "name": "stdout", 311 | "output_type": "stream", 312 | "text": [ 313 | "R\n" 314 | ] 315 | } 316 | ], 317 | "source": [ 318 | "print(Name[1])" 319 | ] 320 | }, 321 | { 322 | "cell_type": "markdown", 323 | "metadata": {}, 324 | "source": [ 325 | "
    \n", 326 | "In Python we can also use negative indices.
    " 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": {}, 332 | "source": [ 333 | "\"negative" 334 | ] 335 | }, 336 | { 337 | "cell_type": "markdown", 338 | "metadata": {}, 339 | "source": [ 340 | "
    \n", 341 | "The last character of a string is accessible at index -1.\n", 342 | "
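The relation between negative and non-negative indices can be stated as: index -k refers to the same character as index len(s) - k. A minimal sketch, using a lowercase variable name chosen for the example:

```python
name = 'SRTTU'

# A negative index -k refers to the same character as len(name) - k.
for k in range(1, len(name) + 1):
    assert name[-k] == name[len(name) - k]

print(name[-1], name[len(name) - 1])  # U U
print(name[-5], name[0])              # S S
```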
    " 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": 10, 348 | "metadata": { 349 | "collapsed": false 350 | }, 351 | "outputs": [ 352 | { 353 | "name": "stdout", 354 | "output_type": "stream", 355 | "text": [ 356 | "U\n" 357 | ] 358 | } 359 | ], 360 | "source": [ 361 | "print(Name[-1])" 362 | ] 363 | }, 364 | { 365 | "cell_type": "markdown", 366 | "metadata": {}, 367 | "source": [ 368 | "
    \n", 369 | "The first character in this example, reached with a negative index:
    " 370 | ] 371 | }, 372 | { 373 | "cell_type": "code", 374 | "execution_count": 11, 375 | "metadata": { 376 | "collapsed": false 377 | }, 378 | "outputs": [ 379 | { 380 | "name": "stdout", 381 | "output_type": "stream", 382 | "text": [ 383 | "S\n" 384 | ] 385 | } 386 | ], 387 | "source": [ 388 | "print(Name[-5])" 389 | ] 390 | }, 391 | { 392 | "cell_type": "markdown", 393 | "metadata": {}, 394 | "source": [ 395 | "###
    String length
    " 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": 12, 401 | "metadata": { 402 | "collapsed": false 403 | }, 404 | "outputs": [ 405 | { 406 | "data": { 407 | "text/plain": [ 408 | "5" 409 | ] 410 | }, 411 | "execution_count": 12, 412 | "metadata": {}, 413 | "output_type": "execute_result" 414 | } 415 | ], 416 | "source": [ 417 | "len('SRTTU')" 418 | ] 419 | }, 420 | { 421 | "cell_type": "markdown", 422 | "metadata": {}, 423 | "source": [ 424 | "###
    Slicing a substring
    \n", 425 | "'SRTTU'[1:3]\n", 426 | "
    \n", 427 | " \n", 433 | "
    " 434 | ] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "execution_count": 13, 439 | "metadata": { 440 | "collapsed": false 441 | }, 442 | "outputs": [ 443 | { 444 | "data": { 445 | "text/plain": [ 446 | "'SRTT'" 447 | ] 448 | }, 449 | "execution_count": 13, 450 | "metadata": {}, 451 | "output_type": "execute_result" 452 | } 453 | ], 454 | "source": [ 455 | "Name[0:4]" 456 | ] 457 | }, 458 | { 459 | "cell_type": "code", 460 | "execution_count": 14, 461 | "metadata": { 462 | "collapsed": false 463 | }, 464 | "outputs": [ 465 | { 466 | "data": { 467 | "text/plain": [ 468 | "'SRTT'" 469 | ] 470 | }, 471 | "execution_count": 14, 472 | "metadata": {}, 473 | "output_type": "execute_result" 474 | } 475 | ], 476 | "source": [ 477 | "Name[:4]" 478 | ] 479 | }, 480 | { 481 | "cell_type": "code", 482 | "execution_count": 15, 483 | "metadata": { 484 | "collapsed": false 485 | }, 486 | "outputs": [ 487 | { 488 | "data": { 489 | "text/plain": [ 490 | "'TTU'" 491 | ] 492 | }, 493 | "execution_count": 15, 494 | "metadata": {}, 495 | "output_type": "execute_result" 496 | } 497 | ], 498 | "source": [ 499 | "Name[2:5]" 500 | ] 501 | }, 502 | { 503 | "cell_type": "code", 504 | "execution_count": 16, 505 | "metadata": { 506 | "collapsed": false 507 | }, 508 | "outputs": [ 509 | { 510 | "data": { 511 | "text/plain": [ 512 | "'TTU'" 513 | ] 514 | }, 515 | "execution_count": 16, 516 | "metadata": {}, 517 | "output_type": "execute_result" 518 | } 519 | ], 520 | "source": [ 521 | "Name[2:]" 522 | ] 523 | }, 524 | { 525 | "cell_type": "code", 526 | "execution_count": 17, 527 | "metadata": { 528 | "collapsed": false 529 | }, 530 | "outputs": [ 531 | { 532 | "data": { 533 | "text/plain": [ 534 | "'SRTTU'" 535 | ] 536 | }, 537 | "execution_count": 17, 538 | "metadata": {}, 539 | "output_type": "execute_result" 540 | } 541 | ], 542 | "source": [ 543 | "Name[:]" 544 | ] 545 | }, 546 | { 547 | "cell_type": "markdown", 548 | "metadata": {}, 549 | "source": [ 550 | "###
    Stride in string slicing
    " 551 | ] 552 | }, 553 | { 554 | "cell_type": "markdown", 555 | "metadata": {}, 556 | "source": [ 557 | "\"stride" 558 | ] 559 | }, 560 | { 561 | "cell_type": "code", 562 | "execution_count": 18, 563 | "metadata": { 564 | "collapsed": false 565 | }, 566 | "outputs": [ 567 | { 568 | "data": { 569 | "text/plain": [ 570 | "'STU'" 571 | ] 572 | }, 573 | "execution_count": 18, 574 | "metadata": {}, 575 | "output_type": "execute_result" 576 | } 577 | ], 578 | "source": [ 579 | "Name[::2]" 580 | ] 581 | }, 582 | { 583 | "cell_type": "code", 584 | "execution_count": 19, 585 | "metadata": { 586 | "collapsed": false 587 | }, 588 | "outputs": [ 589 | { 590 | "data": { 591 | "text/plain": [ 592 | "'ST'" 593 | ] 594 | }, 595 | "execution_count": 19, 596 | "metadata": {}, 597 | "output_type": "execute_result" 598 | } 599 | ], 600 | "source": [ 601 | "Name[0:3:2]" 602 | ] 603 | }, 604 | { 605 | "cell_type": "markdown", 606 | "metadata": {}, 607 | "source": [ 608 | "
    \n", 609 | "Strings can be joined with the + operator; this operation is called concatenation.\n", 610 | "
    " 611 | ] 612 | }, 613 | { 614 | "cell_type": "code", 615 | "execution_count": 20, 616 | "metadata": { 617 | "collapsed": false 618 | }, 619 | "outputs": [ 620 | { 621 | "data": { 622 | "text/plain": [ 623 | "'SRTTU is the best'" 624 | ] 625 | }, 626 | "execution_count": 20, 627 | "metadata": {}, 628 | "output_type": "execute_result" 629 | } 630 | ], 631 | "source": [ 632 | "Statement = Name + \" is the best\"\n", 633 | "Statement" 634 | ] 635 | }, 636 | { 637 | "cell_type": "markdown", 638 | "metadata": {}, 639 | "source": [ 640 | "
    \n", 641 | "Note the space before is.\n", 642 | "
    " 643 | ] 644 | }, 645 | { 646 | "cell_type": "markdown", 647 | "metadata": {}, 648 | "source": [ 649 | "###
    Repeating a string n times
    " 650 | ] 651 | }, 652 | { 653 | "cell_type": "code", 654 | "execution_count": 21, 655 | "metadata": { 656 | "collapsed": false 657 | }, 658 | "outputs": [ 659 | { 660 | "data": { 661 | "text/plain": [ 662 | "'SRTTU SRTTU SRTTU '" 663 | ] 664 | }, 665 | "execution_count": 21, 666 | "metadata": {}, 667 | "output_type": "execute_result" 668 | } 669 | ], 670 | "source": [ 671 | "3 * \"SRTTU \"" 672 | ] 673 | }, 674 | { 675 | "cell_type": "code", 676 | "execution_count": 22, 677 | "metadata": { 678 | "collapsed": false 679 | }, 680 | "outputs": [ 681 | { 682 | "data": { 683 | "text/plain": [ 684 | "'SRTTU is the best'" 685 | ] 686 | }, 687 | "execution_count": 22, 688 | "metadata": {}, 689 | "output_type": "execute_result" 690 | } 691 | ], 692 | "source": [ 693 | "Name = \"SRTTU\"\n", 694 | "Name = Name + \" is the best\"\n", 695 | "Name\n" 696 | ] 697 | }, 698 | { 699 | "cell_type": "markdown", 700 | "metadata": {}, 701 | "source": [ 702 | " \n", 703 | "#
    Escape sequences in strings
    " 704 | ] 705 | }, 706 | { 707 | "cell_type": "markdown", 708 | "metadata": {}, 709 | "source": [ 710 | "###
    New line
    " 711 | ] 712 | }, 713 | { 714 | "cell_type": "code", 715 | "execution_count": 23, 716 | "metadata": { 717 | "collapsed": false 718 | }, 719 | "outputs": [ 720 | { 721 | "name": "stdout", 722 | "output_type": "stream", 723 | "text": [ 724 | " SRTTU \n", 725 | " is the best\n" 726 | ] 727 | } 728 | ], 729 | "source": [ 730 | "print(\" SRTTU \\n is the best\" )" 731 | ] 732 | }, 733 | { 734 | "cell_type": "markdown", 735 | "metadata": {}, 736 | "source": [ 737 | "###
    Tab
    " 738 | ] 739 | }, 740 | { 741 | "cell_type": "code", 742 | "execution_count": 24, 743 | "metadata": { 744 | "collapsed": false 745 | }, 746 | "outputs": [ 747 | { 748 | "name": "stdout", 749 | "output_type": "stream", 750 | "text": [ 751 | " SRTTU \t is the best\n" 752 | ] 753 | } 754 | ], 755 | "source": [ 756 | "print(\" SRTTU \\t is the best\" )" 757 | ] 758 | }, 759 | { 760 | "cell_type": "markdown", 761 | "metadata": {}, 762 | "source": [ 763 | "###
    Printing a \\ inside a string
    " 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": 25, 769 | "metadata": { 770 | "collapsed": false 771 | }, 772 | "outputs": [ 773 | { 774 | "name": "stdout", 775 | "output_type": "stream", 776 | "text": [ 777 | " SRTTU \\ is the best\n" 778 | ] 779 | } 780 | ], 781 | "source": [ 782 | "print(\" SRTTU \\\\ is the best\" )" 783 | ] 784 | }, 785 | { 786 | "cell_type": "markdown", 787 | "metadata": {}, 788 | "source": [ 789 | "
    Alternatively, prefix the string with r (a raw string) so that \\ is printed literally
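To see what the r prefix changes, it helps to compare the contents of a plain and a raw string rather than only their printed form. The sketch below is illustrative; the variable names are made up for the example.

```python
plain = ' SRTTU \n is the best'
raw = r' SRTTU \n is the best'

# In the plain string, \n is a single newline character.
# In the raw string it stays as two characters: a backslash and the letter n.
print(len(raw) - len(plain))        # 1
print('\n' in plain, '\n' in raw)   # True False
print('\\n' in raw)                 # True
```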
    " 790 | ] 791 | }, 792 | { 793 | "cell_type": "code", 794 | "execution_count": 26, 795 | "metadata": { 796 | "collapsed": false 797 | }, 798 | "outputs": [ 799 | { 800 | "name": "stdout", 801 | "output_type": "stream", 802 | "text": [ 803 | " SRTTU \\ is the best\n", 804 | " SRTTU \\n is the best\n", 805 | " SRTTU \\t is the best\n" 806 | ] 807 | } 808 | ], 809 | "source": [ 810 | "print(r\" SRTTU \\ is the best\" )\n", 811 | "print(r\" SRTTU \\n is the best\" )\n", 812 | "print(r\" SRTTU \\t is the best\" )" 813 | ] 814 | }, 815 | { 816 | "cell_type": "markdown", 817 | "metadata": {}, 818 | "source": [ 819 | "\n", 820 | "#
    String operations
    " 821 | ] 822 | }, 823 | { 824 | "cell_type": "markdown", 825 | "metadata": {}, 826 | "source": [ 827 | "
    \n", 828 | "Some string methods
    " 829 | ] 830 | }, 831 | { 832 | "cell_type": "markdown", 833 | "metadata": {}, 834 | "source": [ 835 | "###
    Converting to uppercase
    " 836 | ] 837 | }, 838 | { 839 | "cell_type": "code", 840 | "execution_count": 27, 841 | "metadata": { 842 | "collapsed": false 843 | }, 844 | "outputs": [ 845 | { 846 | "name": "stdout", 847 | "output_type": "stream", 848 | "text": [ 849 | "Befor upper: I Love Python\n", 850 | "After upper: I LOVE PYTHON\n" 851 | ] 852 | } 853 | ], 854 | "source": [ 855 | "A=\"I Love Python\"\n", 856 | "print(\"Befor upper:\",A)\n", 857 | "B=A.upper()\n", 858 | "print(\"After upper:\",B)" 859 | ] 860 | }, 861 | { 862 | "cell_type": "markdown", 863 | "metadata": {}, 864 | "source": [ 865 | "
    " 866 | ] 867 | }, 868 | { 869 | "cell_type": "markdown", 870 | "metadata": {}, 871 | "source": [ 872 | "###
    Replacing a substring (replace)
    " 873 | ] 874 | }, 875 | { 876 | "cell_type": "code", 877 | "execution_count": 28, 878 | "metadata": { 879 | "collapsed": false 880 | }, 881 | "outputs": [ 882 | { 883 | "data": { 884 | "text/plain": [ 885 | "'I love Python'" 886 | ] 887 | }, 888 | "execution_count": 28, 889 | "metadata": {}, 890 | "output_type": "execute_result" 891 | } 892 | ], 893 | "source": [ 894 | "S1=\"I love C++\"\n", 895 | "S2=S1.replace('C++', 'Python')\n", 896 | "S2" 897 | ] 898 | }, 899 | { 900 | "cell_type": "markdown", 901 | "metadata": {}, 902 | "source": [ 903 | "###
    Finding a substring (find)
    " 904 | ] 905 | }, 906 | { 907 | "cell_type": "code", 908 | "execution_count": 29, 909 | "metadata": { 910 | "collapsed": false 911 | }, 912 | "outputs": [ 913 | { 914 | "data": { 915 | "text/plain": [ 916 | "3" 917 | ] 918 | }, 919 | "execution_count": 29, 920 | "metadata": {}, 921 | "output_type": "execute_result" 922 | } 923 | ], 924 | "source": [ 925 | "Name=\"SRTTU\"\n", 926 | "Name.find('TU')" 927 | ] 928 | }, 929 | { 930 | "cell_type": "code", 931 | "execution_count": 30, 932 | "metadata": { 933 | "collapsed": false 934 | }, 935 | "outputs": [ 936 | { 937 | "data": { 938 | "text/plain": [ 939 | "2" 940 | ] 941 | }, 942 | "execution_count": 30, 943 | "metadata": {}, 944 | "output_type": "execute_result" 945 | } 946 | ], 947 | "source": [ 948 | "Name.find('T')" 949 | ] 950 | }, 951 | { 952 | "cell_type": "markdown", 953 | "metadata": {}, 954 | "source": [ 955 | "
    \n", 956 | "If the substring is not found, find returns -1.\n", 957 | "
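The behaviour of find() can be compared with two related tools, the in operator and the index() method. This is a small sketch with an illustrative lowercase variable name.

```python
name = 'SRTTU'

# find() returns the index of the first match, or -1 if there is none.
print(name.find('T'))    # 2
print(name.find('TU'))   # 3
print(name.find('xyz'))  # -1

# The in operator answers the yes/no question directly.
print('TU' in name)      # True

# index() behaves like find() but raises ValueError when nothing is found.
try:
    name.index('xyz')
except ValueError:
    print('not found')
```

Because -1 is also a valid index (the last character), it is safer to test the result of find() explicitly before using it to index the string.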
    " 958 | ] 959 | }, 960 | { 961 | "cell_type": "code", 962 | "execution_count": 31, 963 | "metadata": { 964 | "collapsed": false 965 | }, 966 | "outputs": [ 967 | { 968 | "data": { 969 | "text/plain": [ 970 | "-1" 971 | ] 972 | }, 973 | "execution_count": 31, 974 | "metadata": {}, 975 | "output_type": "execute_result" 976 | } 977 | ], 978 | "source": [ 979 | "Name.find('Jasdfasdasdf')" 980 | ] 981 | }, 982 | { 983 | "cell_type": "markdown", 984 | "metadata": {}, 985 | "source": [ 986 | "
    \n", 987 | " \n", 988 | "#
    Quiz
    " 989 | ] 990 | }, 991 | { 992 | "cell_type": "markdown", 993 | "metadata": {}, 994 | "source": [ 995 | "####
    Use slicing to print the first three characters of this string.\n", 996 | "
    " 997 | ] 998 | }, 999 | { 1000 | "cell_type": "code", 1001 | "execution_count": 32, 1002 | "metadata": { 1003 | "collapsed": true 1004 | }, 1005 | "outputs": [], 1006 | "source": [ 1007 | "D=\"ABCDEFG\"" 1008 | ] 1009 | }, 1010 | { 1011 | "cell_type": "code", 1012 | "execution_count": null, 1013 | "metadata": { 1014 | "collapsed": false 1015 | }, 1016 | "outputs": [], 1017 | "source": [] 1018 | }, 1019 | { 1020 | "cell_type": "markdown", 1021 | "metadata": {}, 1022 | "source": [ 1023 | "
    \n", 1024 | "Click here for the solution\n", 1025 | "\n", 1026 | "
    \n", 1027 | "
    \n", 1028 | "```\n", 1029 | "print(D[:3])  # or print(D[0:3])\n", 1030 | "```\n", 1031 | "
    " 1032 | ] 1033 | }, 1034 | { 1035 | "cell_type": "markdown", 1036 | "metadata": {}, 1037 | "source": [ 1038 | "####
    Use string slicing to print every second character of the string below.
    " 1039 | ] 1040 | }, 1041 | { 1042 | "cell_type": "code", 1043 | "execution_count": 33, 1044 | "metadata": { 1045 | "collapsed": true 1046 | }, 1047 | "outputs": [], 1048 | "source": [ 1049 | "E='clocrkr1e1c1t'" 1050 | ] 1051 | }, 1052 | { 1053 | "cell_type": "code", 1054 | "execution_count": 34, 1055 | "metadata": { 1056 | "collapsed": false 1057 | }, 1058 | "outputs": [ 1059 | { 1060 | "name": "stdout", 1061 | "output_type": "stream", 1062 | "text": [ 1063 | "correct\n" 1064 | ] 1065 | } 1066 | ], 1067 | "source": [ 1068 | "print(E[::2])" 1069 | ] 1070 | }, 1071 | { 1072 | "cell_type": "markdown", 1073 | "metadata": {}, 1074 | "source": [ 1075 | "
    \n", 1076 | "Click here for the solution\n", 1077 | "\n", 1078 | "
    \n", 1079 | "
    \n", 1080 | "```\n", 1081 | "print(E[::2])\n", 1082 | "```\n", 1083 | "
    " 1084 | ] 1085 | }, 1086 | { 1087 | "cell_type": "markdown", 1088 | "metadata": {}, 1089 | "source": [ 1090 | "####
    Print a single \\
    " 1091 | ] 1092 | }, 1093 | { 1094 | "cell_type": "code", 1095 | "execution_count": null, 1096 | "metadata": { 1097 | "collapsed": false 1098 | }, 1099 | "outputs": [], 1100 | "source": [] 1101 | }, 1102 | { 1103 | "cell_type": "markdown", 1104 | "metadata": {}, 1105 | "source": [ 1106 | "
    \n", 1107 | "Click here for the solution\n", 1108 | "\n", 1109 | "
    \n", 1110 | "
    \n", 1111 | "```\n", 1112 | "print(\" \\\\\" )\n", 1113 | "# or\n", 1114 | "print(r\" \\ \" )\n", 1115 | "```\n", 1116 | "
    " 1117 | ] 1118 | }, 1119 | { 1120 | "cell_type": "markdown", 1121 | "metadata": {}, 1122 | "source": [ 1123 | "####
    Convert the string below to uppercase
    " 1124 | ] 1125 | }, 1126 | { 1127 | "cell_type": "code", 1128 | "execution_count": 35, 1129 | "metadata": { 1130 | "collapsed": true 1131 | }, 1132 | "outputs": [], 1133 | "source": [ 1134 | "F=\"You are wrong\"" 1135 | ] 1136 | }, 1137 | { 1138 | "cell_type": "code", 1139 | "execution_count": null, 1140 | "metadata": { 1141 | "collapsed": false 1142 | }, 1143 | "outputs": [], 1144 | "source": [] 1145 | }, 1146 | { 1147 | "cell_type": "markdown", 1148 | "metadata": {}, 1149 | "source": [ 1150 | "
    \n", 1151 | "Click here for the solution\n", 1152 | "\n", 1153 | "
    \n", 1154 | "
    \n", 1155 | "```\n", 1156 | "F.upper()\n", 1157 | "```\n", 1158 | "
    " 1159 | ] 1160 | }, 1161 | { 1162 | "cell_type": "markdown", 1163 | "metadata": {}, 1164 | "source": [ 1165 | "####
    In the string below, print the index of the first occurrence of the word snow.
    " 1166 | ] 1167 | }, 1168 | { 1169 | "cell_type": "code", 1170 | "execution_count": 36, 1171 | "metadata": { 1172 | "collapsed": true 1173 | }, 1174 | "outputs": [], 1175 | "source": [ 1176 | "G=\"Mary had a little lamb Little lamb, little lamb Mary had a little lamb \\\n", 1177 | "Its fleece was white as snow And everywhere that Mary went Mary went, Mary went \\\n", 1178 | "Everywhere that Mary went The lamb was sure to go\"" 1179 | ] 1180 | }, 1181 | { 1182 | "cell_type": "code", 1183 | "execution_count": null, 1184 | "metadata": { 1185 | "collapsed": false 1186 | }, 1187 | "outputs": [], 1188 | "source": [] 1189 | }, 1190 | { 1191 | "cell_type": "markdown", 1192 | "metadata": {}, 1193 | "source": [ 1194 | "
    \n", 1195 | "Click here for the solution\n", 1196 | "\n", 1197 | "
    \n", 1198 | "
    \n", 1199 | "```\n", 1200 | "G.find('snow')\n", 1201 | "```\n", 1202 | "
    " 1203 | ] 1204 | }, 1205 | { 1206 | "cell_type": "markdown", 1207 | "metadata": {}, 1208 | "source": [ 1209 | "####
    In the variable G from the previous question, replace the name Mary with Bob.
    " 1210 | ] 1211 | }, 1212 | { 1213 | "cell_type": "code", 1214 | "execution_count": null, 1215 | "metadata": { 1216 | "collapsed": false 1217 | }, 1218 | "outputs": [], 1219 | "source": [] 1220 | }, 1221 | { 1222 | "cell_type": "markdown", 1223 | "metadata": {}, 1224 | "source": [ 1225 | "
    \n", 1226 | "Click here for the solution\n", 1227 | "\n", 1228 | "
    \n", 1229 | "
    \n", 1230 | "```\n", 1231 | "G.replace(\"Mary\",\"Bob\")\n", 1232 | "```\n", 1233 | "
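A small sketch of `str.replace`, using a short sample string (`sample`, chosen here for illustration rather than the full `G`). By default every occurrence is replaced and a new string is returned; the optional third argument limits the number of replacements.

```python
# Short sample text (illustrative only)
sample = "Mary went, Mary went"

all_replaced = sample.replace("Mary", "Bob")       # replaces every occurrence
first_only = sample.replace("Mary", "Bob", 1)      # replaces only the first one

print(all_replaced)
print(first_only)
print(sample)  # the original string is unchanged
```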
    " 1234 | ] 1235 | }, 1236 | { 1237 | "cell_type": "markdown", 1238 | "metadata": {}, 1239 | "source": [ 1240 | "#
    Source
    \n", 1241 | "\n", 1242 | "https://cognitiveclass.ai/courses/python-for-data-science/" 1243 | ] 1244 | }, 1245 | { 1246 | "cell_type": "markdown", 1247 | "metadata": { 1248 | "collapsed": true 1249 | }, 1250 | "source": [ 1251 | "
    \n", 1252 | "
    Shahid Rajaee Teacher Training University
    Python Workshop
    Alireza Akhavanpour
    96-97
    \n", 1253 | "
    \n", 1254 | "SRTTU.edu - PyLab.AkhavanPour.ir - AkhavanPour.ir\n", 1255 | "
    " 1256 | ] 1257 | } 1258 | ], 1259 | "metadata": { 1260 | "anaconda-cloud": {}, 1261 | "kernelspec": { 1262 | "display_name": "Python [default]", 1263 | "language": "python", 1264 | "name": "python3" 1265 | }, 1266 | "language_info": { 1267 | "codemirror_mode": { 1268 | "name": "ipython", 1269 | "version": 3 1270 | }, 1271 | "file_extension": ".py", 1272 | "mimetype": "text/x-python", 1273 | "name": "python", 1274 | "nbconvert_exporter": "python", 1275 | "pygments_lexer": "ipython3", 1276 | "version": "3.5.2" 1277 | } 1278 | }, 1279 | "nbformat": 4, 1280 | "nbformat_minor": 0 1281 | } 1282 | -------------------------------------------------------------------------------- /6-1_Numpy_1D.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"SRTTU\"" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": null, 13 | "metadata": { 14 | "collapsed": true 15 | }, 16 | "outputs": [], 17 | "source": [ 18 | "import time \n", 19 | "import sys\n", 20 | "import numpy as np \n", 21 | "\n", 22 | "import matplotlib.pyplot as plt\n", 23 | "%matplotlib inline " 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "
    \n", 31 | "Don't worry about the cell below. You don't need to read or understand it.\n", 32 | "Just run it; it is used later in the notebook for plotting.\n", 33 | "
    " 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": { 40 | "collapsed": true 41 | }, 42 | "outputs": [], 43 | "source": [ 44 | "def Plotvec1(u,z,v):\n", 45 | " #this function is used in code \n", 46 | " ax = plt.axes()\n", 47 | " ax.arrow(0, 0, *u, head_width=0.05,color ='r', head_length=0.1)\n", 48 | " plt.text(*(u+0.1), 'u')\n", 49 | " \n", 50 | " ax.arrow(0, 0, *v, head_width=0.05,color ='b', head_length=0.1)\n", 51 | " plt.text(*(v+0.1), 'v')\n", 52 | " ax.arrow(0, 0, *z, head_width=0.05, head_length=0.1)\n", 53 | " plt.text(*(z+0.1), 'z')\n", 54 | " plt.ylim(-2,2)\n", 55 | " plt.xlim(-2,2)\n", 56 | "\n", 57 | "\n", 58 | "\n", 59 | "def Plotvec2(a,b):\n", 60 | " #this function is used in code \n", 61 | " ax = plt.axes()\n", 62 | " ax.arrow(0, 0, *a, head_width=0.05,color ='r', head_length=0.1)\n", 63 | " plt.text(*(a+0.1), 'a')\n", 64 | " ax.arrow(0, 0, *b, head_width=0.05,color ='b', head_length=0.1)\n", 65 | " plt.text(*(b+0.1), 'b')\n", 66 | "\n", 67 | " plt.ylim(-2,2)\n", 68 | " plt.xlim(-2,2)" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": {}, 74 | "source": [ 75 | "
    \n", 76 | "Defining a list in Python.
    " 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "metadata": { 83 | "collapsed": false 84 | }, 85 | "outputs": [], 86 | "source": [ 87 | "a=[\"0\",1,\"two\",\"3\",4]" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "We can access the data via an index:" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "metadata": {}, 100 | "source": [ 101 | "" 102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "metadata": {}, 107 | "source": [ 108 | "We can access each element using a square bracket as follows: " 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": null, 114 | "metadata": { 115 | "collapsed": false 116 | }, 117 | "outputs": [], 118 | "source": [ 119 | "print(\"a[0]:\",a[0])\n", 120 | "print(\"a[1]:\",a[1])\n", 121 | "print(\"a[2]:\",a[2])\n", 122 | "print(\"a[3]:\",a[3])\n", 123 | "print(\"a[4]:\",a[4])" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": {}, 129 | "source": [ 130 | "A numpy array is similar to a list, it's usually fixed in size and each element is of the same type. 
We can cast a list to a numpy array by first importing numpy: " 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "execution_count": null, 136 | "metadata": { 137 | "collapsed": false 138 | }, 139 | "outputs": [], 140 | "source": [ 141 | "import numpy as np " 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": {}, 147 | "source": [ 148 | " We then cast the list as follows:" 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": null, 154 | "metadata": { 155 | "collapsed": false 156 | }, 157 | "outputs": [], 158 | "source": [ 159 | "a=np.array([0,1,2,3, 4])\n", 160 | "a" 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": {}, 166 | "source": [ 167 | "Each element is of the same type, in this case integers: " 168 | ] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "metadata": {}, 173 | "source": [ 174 | "" 175 | ] 176 | }, 177 | { 178 | "cell_type": "markdown", 179 | "metadata": {}, 180 | "source": [ 181 | " As with lists, we can access each element via a square bracket:" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": null, 187 | "metadata": { 188 | "collapsed": false 189 | }, 190 | "outputs": [], 191 | "source": [ 192 | "print(\"a[0]:\",a[0])\n", 193 | "print(\"a[1]:\",a[1])\n", 194 | "print(\"a[2]:\",a[2])\n", 195 | "print(\"a[3]:\",a[3])\n", 196 | "print(\"a[4]:\",a[4])" 197 | ] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "The value of “a” is stored as follows: " 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": { 210 | "collapsed": false 211 | }, 212 | "outputs": [], 213 | "source": [ 214 | "a" 215 | ] 216 | }, 217 | { 218 | "cell_type": "markdown", 219 | "metadata": {}, 220 | "source": [ 221 | "If we check the type of the array we get \"numpy.ndarray\":" 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": null, 227 | "metadata": { 
228 | "collapsed": false 229 | }, 230 | "outputs": [], 231 | "source": [ 232 | "type(a)" 233 | ] 234 | }, 235 | { 236 | "cell_type": "markdown", 237 | "metadata": {}, 238 | "source": [ 239 | "As numpy arrays contain data of the same type, we can use the attribute \"dtype\" to obtain the Data-type of the array’s elements. In this case a 64-bit integer: \n" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": { 246 | "collapsed": false 247 | }, 248 | "outputs": [], 249 | "source": [ 250 | "a.dtype" 251 | ] 252 | }, 253 | { 254 | "cell_type": "markdown", 255 | "metadata": {}, 256 | "source": [ 257 | "We can create a numpy array with real numbers:" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": null, 263 | "metadata": { 264 | "collapsed": false 265 | }, 266 | "outputs": [], 267 | "source": [ 268 | "b=np.array([3.1,11.02,6.2, 213.2,5.2])" 269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "metadata": {}, 274 | "source": [ 275 | "When we check the type of the array we get \"numpy.ndarray\":" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "metadata": { 282 | "collapsed": false 283 | }, 284 | "outputs": [], 285 | "source": [ 286 | "type(b)" 287 | ] 288 | }, 289 | { 290 | "cell_type": "markdown", 291 | "metadata": {}, 292 | "source": [ 293 | "If we examine the attribute \"dtype\" we see float 64, as the elements are not integers: " 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": null, 299 | "metadata": { 300 | "collapsed": false 301 | }, 302 | "outputs": [], 303 | "source": [ 304 | "b.dtype" 305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "We can change the value of the array, consider the array \"c\":" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": null, 317 | "metadata": { 318 | "collapsed": false 319 | }, 320 | "outputs": [], 321 | 
"source": [ 322 | "c=np.array([20,1,2,3,4])\n", 323 | "c" 324 | ] 325 | }, 326 | { 327 | "cell_type": "markdown", 328 | "metadata": {}, 329 | "source": [ 330 | "We can change the first element of the array to 100 as follows:" 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "execution_count": null, 336 | "metadata": { 337 | "collapsed": false 338 | }, 339 | "outputs": [], 340 | "source": [ 341 | "c[0]=100\n", 342 | "c" 343 | ] 344 | }, 345 | { 346 | "cell_type": "markdown", 347 | "metadata": {}, 348 | "source": [ 349 | "We can change the 5th element of the array as follows:" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": null, 355 | "metadata": { 356 | "collapsed": false 357 | }, 358 | "outputs": [], 359 | "source": [ 360 | "c[4]=0\n", 361 | "c" 362 | ] 363 | }, 364 | { 365 | "cell_type": "markdown", 366 | "metadata": {}, 367 | "source": [ 368 | "\n", 369 | "Like lists, we can slice the numpy array, and we can select the elements from 1 to 3 and assign it to a new numpy array 'd' as follows:\n" 370 | ] 371 | }, 372 | { 373 | "cell_type": "code", 374 | "execution_count": null, 375 | "metadata": { 376 | "collapsed": false 377 | }, 378 | "outputs": [], 379 | "source": [ 380 | "d=c[1:4]\n", 381 | "d" 382 | ] 383 | }, 384 | { 385 | "cell_type": "markdown", 386 | "metadata": {}, 387 | "source": [ 388 | "We can assign the corresponding indexes to new values as follows: " 389 | ] 390 | }, 391 | { 392 | "cell_type": "code", 393 | "execution_count": null, 394 | "metadata": { 395 | "collapsed": false 396 | }, 397 | "outputs": [], 398 | "source": [ 399 | "c[3:5]=300,400\n", 400 | "c" 401 | ] 402 | }, 403 | { 404 | "cell_type": "markdown", 405 | "metadata": {}, 406 | "source": [ 407 | "Similarly, we can use a list to select a specific index.\n", 408 | "The list ' select ' contains several values:\n" 409 | ] 410 | }, 411 | { 412 | "cell_type": "code", 413 | "execution_count": null, 414 | "metadata": { 415 | "collapsed": true 416 | }, 417 | 
"outputs": [], 418 | "source": [ 419 | "select=[0,2,3]" 420 | ] 421 | }, 422 | { 423 | "cell_type": "markdown", 424 | "metadata": {}, 425 | "source": [ 426 | "We can use the list as an argument in the brackets. The output is the elements corresponding to the particular index:" 427 | ] 428 | }, 429 | { 430 | "cell_type": "code", 431 | "execution_count": null, 432 | "metadata": { 433 | "collapsed": false 434 | }, 435 | "outputs": [], 436 | "source": [ 437 | "d=c[select]\n", 438 | "d" 439 | ] 440 | }, 441 | { 442 | "cell_type": "markdown", 443 | "metadata": {}, 444 | "source": [ 445 | "We can assign the specified elements to a new value. For example, we can assign the values to 100 000 as follows:" 446 | ] 447 | }, 448 | { 449 | "cell_type": "code", 450 | "execution_count": null, 451 | "metadata": { 452 | "collapsed": false 453 | }, 454 | "outputs": [], 455 | "source": [ 456 | "c[select]=100000\n", 457 | "c" 458 | ] 459 | }, 460 | { 461 | "cell_type": "markdown", 462 | "metadata": {}, 463 | "source": [ 464 | "Let's review some basic array attributes using the array ‘a’:" 465 | ] 466 | }, 467 | { 468 | "cell_type": "code", 469 | "execution_count": null, 470 | "metadata": { 471 | "collapsed": false 472 | }, 473 | "outputs": [], 474 | "source": [ 475 | "a=np.array([0,1,2,3, 4])\n", 476 | "a" 477 | ] 478 | }, 479 | { 480 | "cell_type": "markdown", 481 | "metadata": {}, 482 | "source": [ 483 | "The attribute size is the Number of elements in the array:" 484 | ] 485 | }, 486 | { 487 | "cell_type": "code", 488 | "execution_count": null, 489 | "metadata": { 490 | "collapsed": false 491 | }, 492 | "outputs": [], 493 | "source": [ 494 | "a.size" 495 | ] 496 | }, 497 | { 498 | "cell_type": "markdown", 499 | "metadata": {}, 500 | "source": [ 501 | "The next two attributes will make more sense when we get to higher dimensions but let's review them. 
The attribute “ndim” represents the number of array dimensions, or the rank of the array, in this case one:" 502 | ] 503 | }, 504 | { 505 | "cell_type": "code", 506 | "execution_count": null, 507 | "metadata": { 508 | "collapsed": false 509 | }, 510 | "outputs": [], 511 | "source": [ 512 | "a.ndim" 513 | ] 514 | }, 515 | { 516 | "cell_type": "markdown", 517 | "metadata": {}, 518 | "source": [ 519 | "The attribute “shape” is a tuple of integers indicating the size of the array in each dimension:" 520 | ] 521 | }, 522 | { 523 | "cell_type": "code", 524 | "execution_count": null, 525 | "metadata": { 526 | "collapsed": false 527 | }, 528 | "outputs": [], 529 | "source": [ 530 | "a.shape" 531 | ] 532 | }, 533 | { 534 | "cell_type": "code", 535 | "execution_count": null, 536 | "metadata": { 537 | "collapsed": true 538 | }, 539 | "outputs": [], 540 | "source": [ 541 | "a=np.array([1,-1,1,-1])" 542 | ] 543 | }, 544 | { 545 | "cell_type": "code", 546 | "execution_count": null, 547 | "metadata": { 548 | "collapsed": false 549 | }, 550 | "outputs": [], 551 | "source": [ 552 | "mean=a.mean()\n", 553 | "mean" 554 | ] 555 | }, 556 | { 557 | "cell_type": "code", 558 | "execution_count": null, 559 | "metadata": { 560 | "collapsed": false 561 | }, 562 | "outputs": [], 563 | "source": [ 564 | "standard_deviation=a.std()\n", 565 | "standard_deviation" 566 | ] 567 | }, 568 | { 569 | "cell_type": "code", 570 | "execution_count": null, 571 | "metadata": { 572 | "collapsed": false 573 | }, 574 | "outputs": [], 575 | "source": [ 576 | "b=np.array([1,2,3,4,5])\n", 577 | "b" 578 | ] 579 | }, 580 | { 581 | "cell_type": "code", 582 | "execution_count": null, 583 | "metadata": { 584 | "collapsed": false 585 | }, 586 | "outputs": [], 587 | "source": [ 588 | "max_b=b.max()" 589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": null, 594 | "metadata": { 595 | "collapsed": false 596 | }, 597 | "outputs": [], 598 | "source": [ 599 | "min_b=b.min()" 600 | ] 601 | }, 602 | { 
603 | "cell_type": "markdown", 604 | "metadata": {}, 605 | "source": [ 606 | "## Array Addition " 607 | ] 608 | }, 609 | { 610 | "cell_type": "markdown", 611 | "metadata": {}, 612 | "source": [ 613 | "Consider the numpy array 'u':" 614 | ] 615 | }, 616 | { 617 | "cell_type": "code", 618 | "execution_count": null, 619 | "metadata": { 620 | "collapsed": false 621 | }, 622 | "outputs": [], 623 | "source": [ 624 | "u=np.array([1,0])\n", 625 | "u" 626 | ] 627 | }, 628 | { 629 | "cell_type": "markdown", 630 | "metadata": {}, 631 | "source": [ 632 | "Consider the numpy array 'v':" 633 | ] 634 | }, 635 | { 636 | "cell_type": "code", 637 | "execution_count": null, 638 | "metadata": { 639 | "collapsed": false 640 | }, 641 | "outputs": [], 642 | "source": [ 643 | "v=np.array([0,1])\n", 644 | "v" 645 | ] 646 | }, 647 | { 648 | "cell_type": "markdown", 649 | "metadata": {}, 650 | "source": [ 651 | "We can add the two arrays and assign it to z:" 652 | ] 653 | }, 654 | { 655 | "cell_type": "code", 656 | "execution_count": null, 657 | "metadata": { 658 | "collapsed": false 659 | }, 660 | "outputs": [], 661 | "source": [ 662 | "z=u+v\n", 663 | "z" 664 | ] 665 | }, 666 | { 667 | "cell_type": "markdown", 668 | "metadata": {}, 669 | "source": [ 670 | " The operation is equivalent to vector addition:" 671 | ] 672 | }, 673 | { 674 | "cell_type": "code", 675 | "execution_count": null, 676 | "metadata": { 677 | "collapsed": false 678 | }, 679 | "outputs": [], 680 | "source": [ 681 | "Plotvec1(u,z,v)" 682 | ] 683 | }, 684 | { 685 | "cell_type": "markdown", 686 | "metadata": {}, 687 | "source": [ 688 | "#### Implement the following vector subtraction in numpy: u-v " 689 | ] 690 | }, 691 | { 692 | "cell_type": "code", 693 | "execution_count": null, 694 | "metadata": { 695 | "collapsed": false 696 | }, 697 | "outputs": [], 698 | "source": [] 699 | }, 700 | { 701 | "cell_type": "markdown", 702 | "metadata": {}, 703 | "source": [ 704 | "
    \n", 705 | "Click here for the solution\n", 706 | "\n", 707 | "
    \n", 708 | "
    \n", 709 | "```\n", 710 | "u-v\n", 711 | "```\n", 712 | "
    " 713 | ] 714 | }, 715 | { 716 | "cell_type": "markdown", 717 | "metadata": {}, 718 | "source": [ 719 | "Consider the numpy array 'y':" 720 | ] 721 | }, 722 | { 723 | "cell_type": "code", 724 | "execution_count": null, 725 | "metadata": { 726 | "collapsed": false 727 | }, 728 | "outputs": [], 729 | "source": [ 730 | "y=np.array([1,2])\n", 731 | "y" 732 | ] 733 | }, 734 | { 735 | "cell_type": "markdown", 736 | "metadata": {}, 737 | "source": [ 738 | "We can multiply every element in the array by 2:" 739 | ] 740 | }, 741 | { 742 | "cell_type": "code", 743 | "execution_count": null, 744 | "metadata": { 745 | "collapsed": false 746 | }, 747 | "outputs": [], 748 | "source": [ 749 | "z=2*y\n", 750 | "z" 751 | ] 752 | }, 753 | { 754 | "cell_type": "markdown", 755 | "metadata": {}, 756 | "source": [ 757 | " This is equivalent to multiplying a vector by a scalar: " 758 | ] 759 | }, 760 | { 761 | "cell_type": "markdown", 762 | "metadata": {}, 763 | "source": [ 764 | "#### Multiply the numpy array z by -2:" 765 | ] 766 | }, 767 | { 768 | "cell_type": "code", 769 | "execution_count": null, 770 | "metadata": { 771 | "collapsed": false 772 | }, 773 | "outputs": [], 774 | "source": [] 775 | }, 776 | { 777 | "cell_type": "markdown", 778 | "metadata": {}, 779 | "source": [ 780 | "
    \n", 781 | "Click here for the solution\n", 782 | "\n", 783 | "
    \n", 784 | "
    \n", 785 | "```\n", 786 | "-2*z\n", 787 | "```\n", 788 | "
    " 789 | ] 790 | }, 791 | { 792 | "cell_type": "markdown", 793 | "metadata": {}, 794 | "source": [ 795 | "## Product of two numpy arrays " 796 | ] 797 | }, 798 | { 799 | "cell_type": "markdown", 800 | "metadata": {}, 801 | "source": [ 802 | " Consider the following array 'u':" 803 | ] 804 | }, 805 | { 806 | "cell_type": "code", 807 | "execution_count": null, 808 | "metadata": { 809 | "collapsed": false 810 | }, 811 | "outputs": [], 812 | "source": [ 813 | "u=np.array([1,2])\n", 814 | "u" 815 | ] 816 | }, 817 | { 818 | "cell_type": "markdown", 819 | "metadata": {}, 820 | "source": [ 821 | " Consider the following array 'v':" 822 | ] 823 | }, 824 | { 825 | "cell_type": "code", 826 | "execution_count": null, 827 | "metadata": { 828 | "collapsed": false 829 | }, 830 | "outputs": [], 831 | "source": [ 832 | "v=np.array([3,2])\n", 833 | "v" 834 | ] 835 | }, 836 | { 837 | "cell_type": "markdown", 838 | "metadata": {}, 839 | "source": [ 840 | " The product of the two numpy arrays 'u' and 'v' is given by:" 841 | ] 842 | }, 843 | { 844 | "cell_type": "code", 845 | "execution_count": null, 846 | "metadata": { 847 | "collapsed": false 848 | }, 849 | "outputs": [], 850 | "source": [ 851 | "z=u*v\n", 852 | "z" 853 | ] 854 | }, 855 | { 856 | "cell_type": "markdown", 857 | "metadata": {}, 858 | "source": [ 859 | "#### Consider the list [1,2,3,4,5] and [1,0,1,0,1], and cast both lists to a numpy array then multiply them together:" 860 | ] 861 | }, 862 | { 863 | "cell_type": "code", 864 | "execution_count": null, 865 | "metadata": { 866 | "collapsed": false 867 | }, 868 | "outputs": [], 869 | "source": [] 870 | }, 871 | { 872 | "cell_type": "markdown", 873 | "metadata": {}, 874 | "source": [ 875 | "
    \n", 876 | "Click here for the solution\n", 877 | "\n", 878 | "
    \n", 879 | "
    \n", 880 | "```\n", 881 | "a=np.array([1,2,3,4,5])\n", 882 | "b=np.array([1,0,1,0,1])\n", 883 | "a*b\n", 884 | "```\n", 885 | "
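As a sketch of what `a*b` computes for numpy arrays: the product is element-wise, so position `i` of the result is `a[i] * b[i]`. The names `product` and `manual` are illustrative and not from the notebook; `manual` rebuilds the same result with an explicit loop for comparison.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 0, 1, 0, 1])

# Element-wise product: product[i] == a[i] * b[i]
product = a * b

# The same result built with an explicit Python loop, for comparison
manual = np.array([x * y for x, y in zip(a, b)])

print(product)
print(np.array_equal(product, manual))
```

Here `b` acts like a mask: entries of `a` that line up with a 0 in `b` become 0.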
    " 886 | ] 887 | }, 888 | { 889 | "cell_type": "markdown", 890 | "metadata": {}, 891 | "source": [ 892 | "#### Dot Product" 893 | ] 894 | }, 895 | { 896 | "cell_type": "markdown", 897 | "metadata": {}, 898 | "source": [ 899 | " The dot product of the two numpy arrays 'u' and 'v' is given by:" 900 | ] 901 | }, 902 | { 903 | "cell_type": "code", 904 | "execution_count": null, 905 | "metadata": { 906 | "collapsed": false 907 | }, 908 | "outputs": [], 909 | "source": [ 910 | "np.dot(u,v)" 911 | ] 912 | }, 913 | { 914 | "cell_type": "markdown", 915 | "metadata": {}, 916 | "source": [ 917 | "#### Convert the list [-1,1] and [1,1] to numpy arrays 'a' and 'b'. Then, plot the arrays as vectors using the fuction Plotvec2 and find the dot product:" 918 | ] 919 | }, 920 | { 921 | "cell_type": "code", 922 | "execution_count": null, 923 | "metadata": { 924 | "collapsed": false 925 | }, 926 | "outputs": [], 927 | "source": [] 928 | }, 929 | { 930 | "cell_type": "markdown", 931 | "metadata": {}, 932 | "source": [ 933 | "
    \n", 934 | "Click here for the solution\n", 935 | "\n", 936 | "
    \n", 937 | "
    \n", 938 | "```\n", 939 | "a=np.array([-1,1])\n", 940 | "b=np.array([1,1])\n", 941 | "Plotvec2(a,b)\n", 942 | "print(\"the dot product is\",np.dot(a,b) )\n", 943 | "```\n", 944 | "
    \n" 945 | ] 946 | }, 947 | { 948 | "cell_type": "markdown", 949 | "metadata": {}, 950 | "source": [ 951 | "#### Convert the list [1,0] and [0,1] to numpy arrays 'a' and 'b'. Then, plot the arrays as vectors using the function Plotvec2 and find the dot product:\n" 952 | ] 953 | }, 954 | { 955 | "cell_type": "code", 956 | "execution_count": null, 957 | "metadata": { 958 | "collapsed": false 959 | }, 960 | "outputs": [], 961 | "source": [] 962 | }, 963 | { 964 | "cell_type": "markdown", 965 | "metadata": {}, 966 | "source": [ 967 | "
    \n", 968 | "Click here for the solution\n", 969 | "\n", 970 | "
    \n", 971 | "
    \n", 972 | "```\n", 973 | "a=np.array([1,0])\n", 974 | "b=np.array([0,1])\n", 975 | "Plotvec2(a,b)\n", 976 | "print(\"the dot product is\",np.dot(a,b) )\n", 977 | "```\n", 978 | "
    " 979 | ] 980 | }, 981 | { 982 | "cell_type": "markdown", 983 | "metadata": {}, 984 | "source": [ 985 | "#### Convert the list [1,1] and [0,1] to numpy arrays 'a' and 'b'. Then plot the arrays as vectors using the fuction Plotvec2 and find the dot product:\n", 986 | "\n", 987 | "\n" 988 | ] 989 | }, 990 | { 991 | "cell_type": "code", 992 | "execution_count": null, 993 | "metadata": { 994 | "collapsed": false 995 | }, 996 | "outputs": [], 997 | "source": [] 998 | }, 999 | { 1000 | "cell_type": "markdown", 1001 | "metadata": {}, 1002 | "source": [ 1003 | "
    \n", 1004 | "Click here for the solution\n", 1005 | "\n", 1006 | "
    \n", 1007 | "
    \n", 1008 | "```\n", 1009 | "a=np.array([1,1])\n", 1010 | "b=np.array([0,1])\n", 1011 | "Plotvec2(a,b)\n", 1012 | "print(\"the dot product is\",np.dot(a,b) )\n", 1014 | "```\n", 1015 | "
    " 1016 | ] 1017 | }, 1018 | { 1019 | "cell_type": "markdown", 1020 | "metadata": {}, 1021 | "source": [ 1022 | "#### Why is the result of the dot product for question 4 and 5 zero, but not zero for question 6? Hint: study the corresponding figures, pay attention to the direction the arrows are pointing to. " 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "code", 1027 | "execution_count": null, 1028 | "metadata": { 1029 | "collapsed": true 1030 | }, 1031 | "outputs": [], 1032 | "source": [] 1033 | }, 1034 | { 1035 | "cell_type": "markdown", 1036 | "metadata": {}, 1037 | "source": [ 1038 | "
    \n", 1039 | "Click here for the solution\n", 1040 | "\n", 1041 | "
    \n", 1042 | "
    \n", 1043 | "```\n", 1044 | "The vectors used for questions 4 and 5 are perpendicular, so their dot product is zero. The vectors in question 6 are not perpendicular, so their dot product is nonzero.\n", 1045 | "```\n", 1046 | "
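The answer can be checked numerically. The sketch below assumes the three vector pairs from the exercises above and uses the identity cos θ = (a·b) / (|a||b|); `angle_degrees` and `pairs` are hypothetical names introduced here, not part of the notebook.

```python
import numpy as np

def angle_degrees(a, b):
    """Angle between two vectors in degrees (hypothetical helper, not from the notebook)."""
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against tiny floating-point overshoot outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Vector pairs taken from the three dot-product exercises
pairs = {
    "question 4": (np.array([-1, 1]), np.array([1, 1])),
    "question 5": (np.array([1, 0]), np.array([0, 1])),
    "question 6": (np.array([1, 1]), np.array([0, 1])),
}

for name, (a, b) in pairs.items():
    print(name, "dot:", np.dot(a, b), "angle:", angle_degrees(a, b))
```

A dot product of zero corresponds to a 90 degree angle; any other angle between nonzero vectors gives a nonzero dot product.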
    " 1047 | ] 1048 | }, 1049 | { 1050 | "cell_type": "markdown", 1051 | "metadata": {}, 1052 | "source": [ 1053 | "### Adding Constant to a numpy Array " 1054 | ] 1055 | }, 1056 | { 1057 | "cell_type": "markdown", 1058 | "metadata": {}, 1059 | "source": [ 1060 | "Consider the following array: " 1061 | ] 1062 | }, 1063 | { 1064 | "cell_type": "code", 1065 | "execution_count": null, 1066 | "metadata": { 1067 | "collapsed": false 1068 | }, 1069 | "outputs": [], 1070 | "source": [ 1071 | "u=np.array([1,2,3,-1]) \n", 1072 | "u" 1073 | ] 1074 | }, 1075 | { 1076 | "cell_type": "markdown", 1077 | "metadata": {}, 1078 | "source": [ 1079 | " Adding the constant 1 to the array adds 1 to each element in the array:" 1080 | ] 1081 | }, 1082 | { 1083 | "cell_type": "code", 1084 | "execution_count": null, 1085 | "metadata": { 1086 | "collapsed": false 1087 | }, 1088 | "outputs": [], 1089 | "source": [ 1090 | "u+1" 1091 | ] 1092 | }, 1093 | { 1094 | "cell_type": "markdown", 1095 | "metadata": {}, 1096 | "source": [ 1097 | " The process is summarised in the following animation:" 1098 | ] 1099 | }, 1100 | { 1101 | "cell_type": "markdown", 1102 | "metadata": {}, 1103 | "source": [ 1104 | "" 1105 | ] 1106 | }, 1107 | { 1108 | "cell_type": "markdown", 1109 | "metadata": {}, 1110 | "source": [ 1111 | " This part of Broadcasting check out the link for more detail. 
    " 1112 | ] 1113 | }, 1114 | { 1115 | "cell_type": "markdown", 1116 | "metadata": {}, 1117 | "source": [ 1118 | "### Mathematical Functions " 1119 | ] 1120 | }, 1121 | { 1122 | "cell_type": "markdown", 1123 | "metadata": {}, 1124 | "source": [ 1125 | " We can access the value of pi in numpy as follows:" 1126 | ] 1127 | }, 1128 | { 1129 | "cell_type": "code", 1130 | "execution_count": null, 1131 | "metadata": { 1132 | "collapsed": false 1133 | }, 1134 | "outputs": [], 1135 | "source": [ 1136 | "np.pi" 1137 | ] 1138 | }, 1139 | { 1140 | "cell_type": "markdown", 1141 | "metadata": {}, 1142 | "source": [ 1143 | " We can create the following numpy array in radians:" 1144 | ] 1145 | }, 1146 | { 1147 | "cell_type": "code", 1148 | "execution_count": null, 1149 | "metadata": { 1150 | "collapsed": false 1151 | }, 1152 | "outputs": [], 1153 | "source": [ 1154 | "x=np.array([0,np.pi/2 , np.pi] )" 1155 | ] 1156 | }, 1157 | { 1158 | "cell_type": "markdown", 1159 | "metadata": {}, 1160 | "source": [ 1161 | " We can apply the function \"sine\" to the array 'x' and assign the values to the array 'y'; this applies the sine function to each element in the array: " 1162 | ] 1163 | }, 1164 | { 1165 | "cell_type": "code", 1166 | "execution_count": null, 1167 | "metadata": { 1168 | "collapsed": false 1169 | }, 1170 | "outputs": [], 1171 | "source": [ 1172 | "y=np.sin(x)\n", 1173 | "y" 1174 | ] 1175 | }, 1176 | { 1177 | "cell_type": "markdown", 1178 | "metadata": {}, 1179 | "source": [ 1180 | "#### Linspace" 1181 | ] 1182 | }, 1183 | { 1184 | "cell_type": "markdown", 1185 | "metadata": {}, 1186 | "source": [ 1187 | " A useful function for plotting mathematical functions is \"linspace\". Linspace returns evenly spaced numbers over a specified interval. We specify the starting point of the sequence and the ending point of the sequence. 
The parameter \"num\" indicates the number of samples to generate, in this case 5:" 1188 | ] 1189 | }, 1190 | { 1191 | "cell_type": "code", 1192 | "execution_count": null, 1193 | "metadata": { 1194 | "collapsed": false 1195 | }, 1196 | "outputs": [], 1197 | "source": [ 1198 | "np.linspace(-2,2,num=5)" 1199 | ] 1200 | }, 1201 | { 1202 | "cell_type": "markdown", 1203 | "metadata": {}, 1204 | "source": [ 1205 | " If we change the parameter **num** to 9, we get 9 evenly spaced numbers over the interval from -2 to 2: " 1206 | ] 1207 | }, 1208 | { 1209 | "cell_type": "code", 1210 | "execution_count": null, 1211 | "metadata": { 1212 | "collapsed": false 1213 | }, 1214 | "outputs": [], 1215 | "source": [ 1216 | "np.linspace(-2,2,num=9)" 1217 | ] 1218 | }, 1219 | { 1220 | "cell_type": "markdown", 1221 | "metadata": {}, 1222 | "source": [ 1223 | "We can use the function \"linspace\" to generate 100 evenly spaced samples over the interval from 0 to 2 pi: " 1224 | ] 1225 | }, 1226 | { 1227 | "cell_type": "code", 1228 | "execution_count": null, 1229 | "metadata": { 1230 | "collapsed": false 1231 | }, 1232 | "outputs": [], 1233 | "source": [ 1234 | "x=np.linspace(0,2*np.pi,num=100)\n" 1235 | ] 1236 | }, 1237 | { 1238 | "cell_type": "markdown", 1239 | "metadata": {}, 1240 | "source": [ 1241 | "We can apply the sine function to each element in the array 'x' and assign it to the array 'y': " 1242 | ] 1243 | }, 1244 | { 1245 | "cell_type": "code", 1246 | "execution_count": null, 1247 | "metadata": { 1248 | "collapsed": true 1249 | }, 1250 | "outputs": [], 1251 | "source": [ 1252 | "y=np.sin(x)" 1253 | ] 1254 | }, 1255 | { 1256 | "cell_type": "code", 1257 | "execution_count": null, 1258 | "metadata": { 1259 | "collapsed": false 1260 | }, 1261 | "outputs": [], 1262 | "source": [ 1263 | "plt.plot(x,y)" 1264 | ] 1265 | }, 1266 | { 1267 | "cell_type": "markdown", 1268 | "metadata": {}, 1269 | "source": [ 1270 | "#### About the Authors: \n", 1271 | "\n", 1272 | " [Joseph Santarcangelo]( 
https://www.linkedin.com/in/joseph-s-50398b136/) has a PhD in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.\n" 1273 | ] 1274 | }, 1275 | { 1276 | "cell_type": "markdown", 1277 | "metadata": {}, 1278 | "source": [ 1279 | "Copyright © 2017 [cognitiveclass.ai](https://cognitiveclass.ai). This notebook and its source code are released under the terms of the [MIT License](cognitiveclass.ai)." 1280 | ] 1281 | }, 1282 | { 1283 | "cell_type": "code", 1284 | "execution_count": null, 1285 | "metadata": { 1286 | "collapsed": true 1287 | }, 1288 | "outputs": [], 1289 | "source": [] 1290 | } 1291 | ], 1292 | "metadata": { 1293 | "anaconda-cloud": {}, 1294 | "kernelspec": { 1295 | "display_name": "Python [default]", 1296 | "language": "python", 1297 | "name": "python3" 1298 | }, 1299 | "language_info": { 1300 | "codemirror_mode": { 1301 | "name": "ipython", 1302 | "version": 3 1303 | }, 1304 | "file_extension": ".py", 1305 | "mimetype": "text/x-python", 1306 | "name": "python", 1307 | "nbconvert_exporter": "python", 1308 | "pygments_lexer": "ipython3", 1309 | "version": "3.5.2" 1310 | } 1311 | }, 1312 | "nbformat": 4, 1313 | "nbformat_minor": 2 1314 | } 1315 | --------------------------------------------------------------------------------