├── .gitignore ├── Big-Data-and-Spark ├── RDD Transformations and Actions.ipynb ├── example.txt ├── example2.txt └── services.txt ├── Machine Learning Sections ├── Decision-Trees-and-Random-Forests │ ├── Decision Trees and Random Forest Project .ipynb │ ├── kyphosis.csv │ └── loan_data.csv ├── K-Means-Clustering │ └── K Means Clustering Project .ipynb ├── K-Nearest-Neighbors │ └── K Nearest Neighbors Project.ipynb ├── Linear-Regression │ ├── Linear Regression - Project Exercise .ipynb │ └── USA_Housing.csv ├── Logistic-Regression │ ├── Logistic Regression Project .ipynb │ ├── advertising.csv │ ├── titanic_test.csv │ └── titanic_train.csv ├── Natural-Language-Processing │ ├── NLP Project .ipynb │ └── yelp.csv ├── Principal-Component-Analysis │ └── Principal Component Analysis.ipynb ├── Recommender-Systems │ ├── Advanced Recommender Systems with Python.ipynb │ └── Recommender Systems with Python.ipynb └── Support-Vector-Machines │ └── Support Vector Machines Project .ipynb ├── Python-Crash-Course └── Python Crash Course Exercises .ipynb ├── Python-for-Data-Analysis ├── NumPy │ └── Numpy Exercise .ipynb └── Pandas │ ├── Excel_Sample.xlsx │ ├── Pandas Exercises │ ├── Ecommerce Purchases Exercise .ipynb │ ├── SF Salaries Exercise.ipynb │ └── Salaries.csv │ └── gal.csv ├── Python-for-Data-Visualization ├── Geographical Plotting │ └── Choropleth Maps Exercise .ipynb ├── Matplotlib │ └── Matplotlib Exercises .ipynb ├── Pandas Built-in Data Viz │ └── Pandas Data Visualization Exercise .ipynb └── Seaborn │ └── Seaborn Exercises .ipynb └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | * 2 | !.gitignore 3 | !*Exercise* 4 | !*Project* 5 | !*.csv 6 | !*.html 7 | !*.txt 8 | !*RDD* 9 | !Seaborn 10 | !NumPy 11 | !Pandas 12 | !*.xlsx 13 | !Big-Data-and-Spark 14 | !*Learning* 15 | !Python-* 16 | !Geographical Plotting 17 | !Matplotlib 18 | !Pandas Built-in Data Viz 19 | !Machine Learning Sections/*/ 20 | *Solution* 21 | *checkpoint* 22 | -------------------------------------------------------------------------------- /Big-Data-and-Spark/RDD Transformations and Actions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# RDD Transformations and Actions\n", 8 | "\n", 9 | "In this lecture we will begin to delve deeper into using Spark and Python. Please view the video lecture for a full explanation.\n", 10 | "\n", 11 | "## Important Terms\n", 12 | "\n", 13 | "Let's quickly go over some important terms:\n", 14 | "\n", 15 | "Term |Definition\n", 16 | "---- |-------\n", 17 | "RDD |Resilient Distributed Dataset\n", 18 | "Transformation |Spark operation that produces an RDD\n", 19 | "Action |Spark operation that produces a local object\n", 20 | "Spark Job |Sequence of transformations on data with a final action" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": [ 27 | "## Creating an RDD\n", 28 | "\n", 29 | "There are two common ways to create an RDD:\n", 30 | "\n", 31 | "Method |Result\n", 32 | "---------- |-------\n", 33 | "`sc.parallelize(array)` |Create RDD of elements of array (or list)\n", 34 | "`sc.textFile(path/to/file)` |Create RDD of lines from file" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": {}, 40 | "source": [ 41 | "## RDD Transformations\n", 42 | "\n", 43 | "We can use transformations to create a set of instructions we want to preform on the RDD (before we call an action and actually execute them).\n", 44 | "\n", 45 | "Transformation Example |Result\n", 46 | "---------- |-------\n", 47 | "`filter(lambda x: x % 2 == 0)` |Discard non-even elements\n", 48 | "`map(lambda x: x * 2)` |Multiply each RDD element by `2`\n", 49 | "`map(lambda x: x.split())` |Split each string into words\n", 50 | "`flatMap(lambda x: x.split())` |Split each string into words and flatten sequence\n", 51 | "`sample(withReplacement=True,0.25)` |Create sample of 25% of elements with replacement\n", 52 | "`union(rdd)` |Append `rdd` to existing RDD\n", 53 | "`distinct()` |Remove duplicates in RDD\n", 54 | "`sortBy(lambda x: x, ascending=False)` |Sort elements in descending order" 55 | ] 56 | }, 57 | { 58 | "cell_type": "markdown", 59 | "metadata": {}, 60 | "source": [ 61 | "## RDD Actions\n", 62 | "\n", 63 | "Once you have your 'recipe' of transformations ready, what you will do next is execute them by calling an action. Here are some common actions:\n", 64 | "\n", 65 | "Action |Result\n", 66 | "---------- |-------\n", 67 | "`collect()` |Convert RDD to in-memory list \n", 68 | "`take(3)` |First 3 elements of RDD \n", 69 | "`top(3)` |Top 3 elements of RDD\n", 70 | "`takeSample(withReplacement=True,3)` |Create sample of 3 elements with replacement\n", 71 | "`sum()` |Find element sum (assumes numeric elements)\n", 72 | "`mean()` |Find element mean (assumes numeric elements)\n", 73 | "`stdev()` |Find element deviation (assumes numeric elements)" 74 | ] 75 | }, 76 | { 77 | "cell_type": "markdown", 78 | "metadata": {}, 79 | "source": [ 80 | "____\n", 81 | "# Examples\n", 82 | "\n", 83 | "Now the best way to show all of this is by going through examples! We'll first review a bit by creating and working with a simple text file, then we will move on to more realistic data, such as customers and sales data.\n", 84 | "\n", 85 | "### Creating an RDD from a text file:\n", 86 | "\n", 87 | "** Creating the textfile **" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": 1, 93 | "metadata": {}, 94 | "outputs": [ 95 | { 96 | "name": "stdout", 97 | "output_type": "stream", 98 | "text": [ 99 | "Overwriting example2.txt\n" 100 | ] 101 | } 102 | ], 103 | "source": [ 104 | "%%writefile example2.txt\n", 105 | "first \n", 106 | "second line\n", 107 | "the third line\n", 108 | "then a fourth line" 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "metadata": {}, 114 | "source": [ 115 | "Now let's perform some transformations and actions on this text file:" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": 2, 121 | "metadata": { 122 | "collapsed": true 123 | }, 124 | "outputs": [], 125 | "source": [ 126 | "from pyspark import SparkContext" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": 3, 132 | "metadata": { 133 | "collapsed": true 134 | }, 135 | "outputs": [], 136 | "source": [ 137 | "sc = SparkContext()" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": 4, 143 | "metadata": {}, 144 | "outputs": [ 145 | { 146 | "data": { 147 | "text/plain": [ 148 | "MapPartitionsRDD[1] at textFile at NativeMethodAccessorImpl.java:-2" 149 | ] 150 | }, 151 | "execution_count": 4, 152 | "metadata": {}, 153 | "output_type": "execute_result" 154 | } 155 | ], 156 | "source": [ 157 | "# Show RDD\n", 158 | "sc.textFile('example2.txt')" 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": 5, 164 | "metadata": { 165 | "collapsed": true 166 | }, 167 | "outputs": [], 168 | "source": [ 169 | "# Save a reference to this RDD\n", 170 | "text_rdd = sc.textFile('example2.txt')" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": 7, 176 | "metadata": {}, 177 | "outputs": [ 178 | { 179 | "data": { 180 | "text/plain": [ 181 | "[['first'],\n", 182 | " ['second', 'line'],\n", 183 | " ['the', 'third', 'line'],\n", 184 | " ['then', 'a', 'fourth', 'line']]" 185 | ] 186 | }, 187 | "execution_count": 7, 188 | "metadata": {}, 189 | "output_type": "execute_result" 190 | } 191 | ], 192 | "source": [ 193 | "# Map a function (or lambda expression) to each line\n", 194 | "# Then collect the results.\n", 195 | "text_rdd.map(lambda line: line.split()).collect()" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": {}, 201 | "source": [ 202 | "## Map vs flatMap" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": 8, 208 | "metadata": {}, 209 | "outputs": [ 210 | { 211 | "data": { 212 | "text/plain": [ 213 | "['first',\n", 214 | " 'second',\n", 215 | " 'line',\n", 216 | " 'the',\n", 217 | " 'third',\n", 218 | " 'line',\n", 219 | " 'then',\n", 220 | " 'a',\n", 221 | " 'fourth',\n", 222 | " 'line']" 223 | ] 224 | }, 225 | "execution_count": 8, 226 | "metadata": {}, 227 | "output_type": "execute_result" 228 | } 229 | ], 230 | "source": [ 231 | "# Collect everything as a single flat map\n", 232 | "text_rdd.flatMap(lambda line: line.split()).collect()" 233 | ] 234 | }, 235 | { 236 | "cell_type": "markdown", 237 | "metadata": {}, 238 | "source": [ 239 | "# RDDs and Key Value Pairs\n", 240 | "\n", 241 | "Now that we've worked with RDDs and how to aggregate values with them, we can begin to look into working with Key Value Pairs. In order to do this, let's create some fake data as a new text file.\n", 242 | "\n", 243 | "This data represents some services sold to customers for some SAAS business." 244 | ] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "execution_count": 9, 249 | "metadata": {}, 250 | "outputs": [ 251 | { 252 | "name": "stdout", 253 | "output_type": "stream", 254 | "text": [ 255 | "Writing services.txt\n" 256 | ] 257 | } 258 | ], 259 | "source": [ 260 | "%%writefile services.txt\n", 261 | "#EventId Timestamp Customer State ServiceID Amount\n", 262 | "201 10/13/2017 100 NY 131 100.00\n", 263 | "204 10/18/2017 700 TX 129 450.00\n", 264 | "202 10/15/2017 203 CA 121 200.00\n", 265 | "206 10/19/2017 202 CA 131 500.00\n", 266 | "203 10/17/2017 101 NY 173 750.00\n", 267 | "205 10/19/2017 202 TX 121 200.00" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": 10, 273 | "metadata": { 274 | "collapsed": true 275 | }, 276 | "outputs": [], 277 | "source": [ 278 | "services = sc.textFile('services.txt')" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": 11, 284 | "metadata": {}, 285 | "outputs": [ 286 | { 287 | "data": { 288 | "text/plain": [ 289 | "['#EventId Timestamp Customer State ServiceID Amount',\n", 290 | " '201 10/13/2017 100 NY 131 100.00']" 291 | ] 292 | }, 293 | "execution_count": 11, 294 | "metadata": {}, 295 | "output_type": "execute_result" 296 | } 297 | ], 298 | "source": [ 299 | "services.take(2)" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": 12, 305 | "metadata": {}, 306 | "outputs": [ 307 | { 308 | "data": { 309 | "text/plain": [ 310 | "PythonRDD[10] at RDD at PythonRDD.scala:43" 311 | ] 312 | }, 313 | "execution_count": 12, 314 | "metadata": {}, 315 | "output_type": "execute_result" 316 | } 317 | ], 318 | "source": [ 319 | "services.map(lambda x: x.split())" 320 | ] 321 | }, 322 | { 323 | "cell_type": "code", 324 | "execution_count": 13, 325 | "metadata": {}, 326 | "outputs": [ 327 | { 328 | "data": { 329 | "text/plain": [ 330 | "[['#EventId', 'Timestamp', 'Customer', 'State', 'ServiceID', 'Amount'],\n", 331 | " ['201', '10/13/2017', '100', 'NY', '131', '100.00'],\n", 332 | " ['204', '10/18/2017', '700', 'TX', '129', '450.00']]" 333 | ] 334 | }, 335 | "execution_count": 13, 336 | "metadata": {}, 337 | "output_type": "execute_result" 338 | } 339 | ], 340 | "source": [ 341 | "services.map(lambda x: x.split()).take(3)" 342 | ] 343 | }, 344 | { 345 | "cell_type": "markdown", 346 | "metadata": {}, 347 | "source": [ 348 | "Let's remove that first hash-tag!" 349 | ] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "execution_count": 26, 354 | "metadata": {}, 355 | "outputs": [ 356 | { 357 | "data": { 358 | "text/plain": [ 359 | "['EventId Timestamp Customer State ServiceID Amount',\n", 360 | " '201 10/13/2017 100 NY 131 100.00',\n", 361 | " '204 10/18/2017 700 TX 129 450.00',\n", 362 | " '202 10/15/2017 203 CA 121 200.00',\n", 363 | " '206 10/19/2017 202 CA 131 500.00',\n", 364 | " '203 10/17/2017 101 NY 173 750.00',\n", 365 | " '205 10/19/2017 202 TX 121 200.00']" 366 | ] 367 | }, 368 | "execution_count": 26, 369 | "metadata": {}, 370 | "output_type": "execute_result" 371 | } 372 | ], 373 | "source": [ 374 | "services.map(lambda x: x[1:] if x[0]=='#' else x).collect()" 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": 27, 380 | "metadata": {}, 381 | "outputs": [ 382 | { 383 | "data": { 384 | "text/plain": [ 385 | "[['EventId', 'Timestamp', 'Customer', 'State', 'ServiceID', 'Amount'],\n", 386 | " ['201', '10/13/2017', '100', 'NY', '131', '100.00'],\n", 387 | " ['204', '10/18/2017', '700', 'TX', '129', '450.00'],\n", 388 | " ['202', '10/15/2017', '203', 'CA', '121', '200.00'],\n", 389 | " ['206', '10/19/2017', '202', 'CA', '131', '500.00'],\n", 390 | " ['203', '10/17/2017', '101', 'NY', '173', '750.00'],\n", 391 | " ['205', '10/19/2017', '202', 'TX', '121', '200.00']]" 392 | ] 393 | }, 394 | "execution_count": 27, 395 | "metadata": {}, 396 | "output_type": "execute_result" 397 | } 398 | ], 399 | "source": [ 400 | "services.map(lambda x: x[1:] if x[0]=='#' else x).map(lambda x: x.split()).collect()" 401 | ] 402 | }, 403 | { 404 | "cell_type": "markdown", 405 | "metadata": {}, 406 | "source": [ 407 | "## Using Key Value Pairs for Operations\n", 408 | "\n", 409 | "Let us now begin to use methods that combine lambda expressions that use a ByKey argument. These ByKey methods will assume that your data is in a Key,Value form. \n", 410 | "\n", 411 | "\n", 412 | "For example let's find out the total sales per state: " 413 | ] 414 | }, 415 | { 416 | "cell_type": "code", 417 | "execution_count": 28, 418 | "metadata": { 419 | "collapsed": true 420 | }, 421 | "outputs": [], 422 | "source": [ 423 | "# From Previous\n", 424 | "cleanServ = services.map(lambda x: x[1:] if x[0]=='#' else x).map(lambda x: x.split())" 425 | ] 426 | }, 427 | { 428 | "cell_type": "code", 429 | "execution_count": 29, 430 | "metadata": {}, 431 | "outputs": [ 432 | { 433 | "data": { 434 | "text/plain": [ 435 | "[['EventId', 'Timestamp', 'Customer', 'State', 'ServiceID', 'Amount'],\n", 436 | " ['201', '10/13/2017', '100', 'NY', '131', '100.00'],\n", 437 | " ['204', '10/18/2017', '700', 'TX', '129', '450.00'],\n", 438 | " ['202', '10/15/2017', '203', 'CA', '121', '200.00'],\n", 439 | " ['206', '10/19/2017', '202', 'CA', '131', '500.00'],\n", 440 | " ['203', '10/17/2017', '101', 'NY', '173', '750.00'],\n", 441 | " ['205', '10/19/2017', '202', 'TX', '121', '200.00']]" 442 | ] 443 | }, 444 | "execution_count": 29, 445 | "metadata": {}, 446 | "output_type": "execute_result" 447 | } 448 | ], 449 | "source": [ 450 | "cleanServ.collect()" 451 | ] 452 | }, 453 | { 454 | "cell_type": "code", 455 | "execution_count": 52, 456 | "metadata": {}, 457 | "outputs": [ 458 | { 459 | "data": { 460 | "text/plain": [ 461 | "[('State', 'Amount'),\n", 462 | " ('NY', '100.00'),\n", 463 | " ('TX', '450.00'),\n", 464 | " ('CA', '200.00'),\n", 465 | " ('CA', '500.00'),\n", 466 | " ('NY', '750.00'),\n", 467 | " ('TX', '200.00')]" 468 | ] 469 | }, 470 | "execution_count": 52, 471 | "metadata": {}, 472 | "output_type": "execute_result" 473 | } 474 | ], 475 | "source": [ 476 | "# Let's start by practicing grabbing fields\n", 477 | "cleanServ.map(lambda lst: (lst[3],lst[-1])).collect()" 478 | ] 479 | }, 480 | { 481 | "cell_type": "code", 482 | "execution_count": 43, 483 | "metadata": {}, 484 | "outputs": [ 485 | { 486 | "data": { 487 | "text/plain": [ 488 | "[('State', 'Amount'),\n", 489 | " ('NY', '100.00750.00'),\n", 490 | " ('TX', '450.00200.00'),\n", 491 | " ('CA', '200.00500.00')]" 492 | ] 493 | }, 494 | "execution_count": 43, 495 | "metadata": {}, 496 | "output_type": "execute_result" 497 | } 498 | ], 499 | "source": [ 500 | "# Continue with reduceByKey\n", 501 | "# Notice how it assumes that the first item is the key!\n", 502 | "cleanServ.map(lambda lst: (lst[3],lst[-1]))\\\n", 503 | " .reduceByKey(lambda amt1,amt2 : amt1+amt2)\\\n", 504 | " .collect()" 505 | ] 506 | }, 507 | { 508 | "cell_type": "markdown", 509 | "metadata": {}, 510 | "source": [ 511 | "Uh oh! Looks like we forgot that the amounts are still strings! Let's fix that:" 512 | ] 513 | }, 514 | { 515 | "cell_type": "code", 516 | "execution_count": 42, 517 | "metadata": {}, 518 | "outputs": [ 519 | { 520 | "data": { 521 | "text/plain": [ 522 | "[('State', 'Amount'), ('NY', 850.0), ('TX', 650.0), ('CA', 700.0)]" 523 | ] 524 | }, 525 | "execution_count": 42, 526 | "metadata": {}, 527 | "output_type": "execute_result" 528 | } 529 | ], 530 | "source": [ 531 | "# Continue with reduceByKey\n", 532 | "# Notice how it assumes that the first item is the key!\n", 533 | "cleanServ.map(lambda lst: (lst[3],lst[-1]))\\\n", 534 | " .reduceByKey(lambda amt1,amt2 : float(amt1)+float(amt2))\\\n", 535 | " .collect()" 536 | ] 537 | }, 538 | { 539 | "cell_type": "markdown", 540 | "metadata": {}, 541 | "source": [ 542 | "We can continue our analysis by sorting this output:" 543 | ] 544 | }, 545 | { 546 | "cell_type": "code", 547 | "execution_count": 69, 548 | "metadata": {}, 549 | "outputs": [ 550 | { 551 | "data": { 552 | "text/plain": [ 553 | "[('NY', 850.0), ('CA', 700.0), ('TX', 650.0)]" 554 | ] 555 | }, 556 | "execution_count": 69, 557 | "metadata": {}, 558 | "output_type": "execute_result" 559 | } 560 | ], 561 | "source": [ 562 | "# Grab state and amounts\n", 563 | "# Add them\n", 564 | "# Get rid of ('State','Amount')\n", 565 | "# Sort them by the amount value\n", 566 | "cleanServ.map(lambda lst: (lst[3],lst[-1]))\\\n", 567 | ".reduceByKey(lambda amt1,amt2 : float(amt1)+float(amt2))\\\n", 568 | ".filter(lambda x: not x[0]=='State')\\\n", 569 | ".sortBy(lambda stateAmount: stateAmount[1], ascending=False)\\\n", 570 | ".collect()" 571 | ] 572 | }, 573 | { 574 | "cell_type": "markdown", 575 | "metadata": {}, 576 | "source": [ 577 | "** Remember to try to use unpacking for readability. For example: **" 578 | ] 579 | }, 580 | { 581 | "cell_type": "code", 582 | "execution_count": 78, 583 | "metadata": { 584 | "collapsed": true 585 | }, 586 | "outputs": [], 587 | "source": [ 588 | "x = ['ID','State','Amount']" 589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": 79, 594 | "metadata": { 595 | "collapsed": true 596 | }, 597 | "outputs": [], 598 | "source": [ 599 | "def func1(lst):\n", 600 | " return lst[-1]" 601 | ] 602 | }, 603 | { 604 | "cell_type": "code", 605 | "execution_count": 83, 606 | "metadata": { 607 | "collapsed": true 608 | }, 609 | "outputs": [], 610 | "source": [ 611 | "def func2(id_st_amt):\n", 612 | " # Unpack Values\n", 613 | " (Id,st,amt) = id_st_amt\n", 614 | " return amt" 615 | ] 616 | }, 617 | { 618 | "cell_type": "code", 619 | "execution_count": 84, 620 | "metadata": {}, 621 | "outputs": [ 622 | { 623 | "data": { 624 | "text/plain": [ 625 | "'Amount'" 626 | ] 627 | }, 628 | "execution_count": 84, 629 | "metadata": {}, 630 | "output_type": "execute_result" 631 | } 632 | ], 633 | "source": [ 634 | "func1(x)" 635 | ] 636 | }, 637 | { 638 | "cell_type": "code", 639 | "execution_count": 85, 640 | "metadata": {}, 641 | "outputs": [ 642 | { 643 | "data": { 644 | "text/plain": [ 645 | "'Amount'" 646 | ] 647 | }, 648 | "execution_count": 85, 649 | "metadata": {}, 650 | "output_type": "execute_result" 651 | } 652 | ], 653 | "source": [ 654 | "func2(x)" 655 | ] 656 | }, 657 | { 658 | "cell_type": "markdown", 659 | "metadata": {}, 660 | "source": [ 661 | "# Great Job!" 662 | ] 663 | } 664 | ], 665 | "metadata": { 666 | "kernelspec": { 667 | "display_name": "Python 3", 668 | "language": "python", 669 | "name": "python3" 670 | }, 671 | "language_info": { 672 | "codemirror_mode": { 673 | "name": "ipython", 674 | "version": 3 675 | }, 676 | "file_extension": ".py", 677 | "mimetype": "text/x-python", 678 | "name": "python", 679 | "nbconvert_exporter": "python", 680 | "pygments_lexer": "ipython3", 681 | "version": "3.6.1" 682 | } 683 | }, 684 | "nbformat": 4, 685 | "nbformat_minor": 1 686 | } 687 | -------------------------------------------------------------------------------- /Big-Data-and-Spark/example.txt: -------------------------------------------------------------------------------- 1 | first line 2 | second line 3 | third line 4 | fourth line -------------------------------------------------------------------------------- /Big-Data-and-Spark/example2.txt: -------------------------------------------------------------------------------- 1 | first 2 | second line 3 | the third line 4 | then a fourth line -------------------------------------------------------------------------------- /Big-Data-and-Spark/services.txt: -------------------------------------------------------------------------------- 1 | #EventId Timestamp Customer State ServiceID Amount 2 | 201 10/13/2017 100 NY 131 100.00 3 | 204 10/18/2017 700 TX 129 450.00 4 | 202 10/15/2017 203 CA 121 200.00 5 | 206 10/19/2017 202 CA 131 500.00 6 | 203 10/17/2017 101 NY 173 750.00 7 | 205 10/19/2017 202 TX 121 200.00 -------------------------------------------------------------------------------- /Machine Learning Sections/Decision-Trees-and-Random-Forests/kyphosis.csv: -------------------------------------------------------------------------------- 1 | "Kyphosis","Age","Number","Start" 2 | "absent",71,3,5 3 | "absent",158,3,14 4 | "present",128,4,5 5 | "absent",2,5,1 6 | "absent",1,4,15 7 | "absent",1,2,16 8 | "absent",61,2,17 9 | "absent",37,3,16 10 | "absent",113,2,16 11 | "present",59,6,12 12 | "present",82,5,14 13 | "absent",148,3,16 14 | "absent",18,5,2 15 | "absent",1,4,12 16 | "absent",168,3,18 17 | "absent",1,3,16 18 | "absent",78,6,15 19 | "absent",175,5,13 20 | "absent",80,5,16 21 | "absent",27,4,9 22 | "absent",22,2,16 23 | "present",105,6,5 24 | "present",96,3,12 25 | "absent",131,2,3 26 | "present",15,7,2 27 | "absent",9,5,13 28 | "absent",8,3,6 29 | "absent",100,3,14 30 | "absent",4,3,16 31 | "absent",151,2,16 32 | "absent",31,3,16 33 | "absent",125,2,11 34 | "absent",130,5,13 35 | "absent",112,3,16 36 | "absent",140,5,11 37 | "absent",93,3,16 38 | "absent",1,3,9 39 | "present",52,5,6 40 | "absent",20,6,9 41 | "present",91,5,12 42 | "present",73,5,1 43 | "absent",35,3,13 44 | "absent",143,9,3 45 | "absent",61,4,1 46 | "absent",97,3,16 47 | "present",139,3,10 48 | "absent",136,4,15 49 | "absent",131,5,13 50 | "present",121,3,3 51 | "absent",177,2,14 52 | "absent",68,5,10 53 | "absent",9,2,17 54 | "present",139,10,6 55 | "absent",2,2,17 56 | "absent",140,4,15 57 | "absent",72,5,15 58 | "absent",2,3,13 59 | "present",120,5,8 60 | "absent",51,7,9 61 | "absent",102,3,13 62 | "present",130,4,1 63 | "present",114,7,8 64 | "absent",81,4,1 65 | "absent",118,3,16 66 | "absent",118,4,16 67 | "absent",17,4,10 68 | "absent",195,2,17 69 | "absent",159,4,13 70 | "absent",18,4,11 71 | "absent",15,5,16 72 | "absent",158,5,14 73 | "absent",127,4,12 74 | "absent",87,4,16 75 | "absent",206,4,10 76 | "absent",11,3,15 77 | "absent",178,4,15 78 | "present",157,3,13 79 | "absent",26,7,13 80 | "absent",120,2,13 81 | "present",42,7,6 82 | "absent",36,4,13 83 | -------------------------------------------------------------------------------- /Machine Learning Sections/Logistic-Regression/titanic_test.csv: -------------------------------------------------------------------------------- 1 | PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked 2 | 892,3,"Kelly, Mr. James",male,34.5,0,0,330911,7.8292,,Q 3 | 893,3,"Wilkes, Mrs. James (Ellen Needs)",female,47,1,0,363272,7,,S 4 | 894,2,"Myles, Mr. Thomas Francis",male,62,0,0,240276,9.6875,,Q 5 | 895,3,"Wirz, Mr. Albert",male,27,0,0,315154,8.6625,,S 6 | 896,3,"Hirvonen, Mrs. Alexander (Helga E Lindqvist)",female,22,1,1,3101298,12.2875,,S 7 | 897,3,"Svensson, Mr. Johan Cervin",male,14,0,0,7538,9.225,,S 8 | 898,3,"Connolly, Miss. Kate",female,30,0,0,330972,7.6292,,Q 9 | 899,2,"Caldwell, Mr. Albert Francis",male,26,1,1,248738,29,,S 10 | 900,3,"Abrahim, Mrs. Joseph (Sophie Halaut Easu)",female,18,0,0,2657,7.2292,,C 11 | 901,3,"Davies, Mr. John Samuel",male,21,2,0,A/4 48871,24.15,,S 12 | 902,3,"Ilieff, Mr. Ylio",male,,0,0,349220,7.8958,,S 13 | 903,1,"Jones, Mr. Charles Cresson",male,46,0,0,694,26,,S 14 | 904,1,"Snyder, Mrs. John Pillsbury (Nelle Stevenson)",female,23,1,0,21228,82.2667,B45,S 15 | 905,2,"Howard, Mr. Benjamin",male,63,1,0,24065,26,,S 16 | 906,1,"Chaffee, Mrs. Herbert Fuller (Carrie Constance Toogood)",female,47,1,0,W.E.P. 5734,61.175,E31,S 17 | 907,2,"del Carlo, Mrs. Sebastiano (Argenia Genovesi)",female,24,1,0,SC/PARIS 2167,27.7208,,C 18 | 908,2,"Keane, Mr. Daniel",male,35,0,0,233734,12.35,,Q 19 | 909,3,"Assaf, Mr. Gerios",male,21,0,0,2692,7.225,,C 20 | 910,3,"Ilmakangas, Miss. Ida Livija",female,27,1,0,STON/O2. 3101270,7.925,,S 21 | 911,3,"Assaf Khalil, Mrs. Mariana (Miriam"")""",female,45,0,0,2696,7.225,,C 22 | 912,1,"Rothschild, Mr. Martin",male,55,1,0,PC 17603,59.4,,C 23 | 913,3,"Olsen, Master. Artur Karl",male,9,0,1,C 17368,3.1708,,S 24 | 914,1,"Flegenheim, Mrs. Alfred (Antoinette)",female,,0,0,PC 17598,31.6833,,S 25 | 915,1,"Williams, Mr. Richard Norris II",male,21,0,1,PC 17597,61.3792,,C 26 | 916,1,"Ryerson, Mrs. Arthur Larned (Emily Maria Borie)",female,48,1,3,PC 17608,262.375,B57 B59 B63 B66,C 27 | 917,3,"Robins, Mr. Alexander A",male,50,1,0,A/5. 3337,14.5,,S 28 | 918,1,"Ostby, Miss. Helene Ragnhild",female,22,0,1,113509,61.9792,B36,C 29 | 919,3,"Daher, Mr. Shedid",male,22.5,0,0,2698,7.225,,C 30 | 920,1,"Brady, Mr. John Bertram",male,41,0,0,113054,30.5,A21,S 31 | 921,3,"Samaan, Mr. Elias",male,,2,0,2662,21.6792,,C 32 | 922,2,"Louch, Mr. Charles Alexander",male,50,1,0,SC/AH 3085,26,,S 33 | 923,2,"Jefferys, Mr. Clifford Thomas",male,24,2,0,C.A. 31029,31.5,,S 34 | 924,3,"Dean, Mrs. Bertram (Eva Georgetta Light)",female,33,1,2,C.A. 2315,20.575,,S 35 | 925,3,"Johnston, Mrs. Andrew G (Elizabeth Lily"" Watson)""",female,,1,2,W./C. 6607,23.45,,S 36 | 926,1,"Mock, Mr. Philipp Edmund",male,30,1,0,13236,57.75,C78,C 37 | 927,3,"Katavelas, Mr. Vassilios (Catavelas Vassilios"")""",male,18.5,0,0,2682,7.2292,,C 38 | 928,3,"Roth, Miss. Sarah A",female,,0,0,342712,8.05,,S 39 | 929,3,"Cacic, Miss. Manda",female,21,0,0,315087,8.6625,,S 40 | 930,3,"Sap, Mr. Julius",male,25,0,0,345768,9.5,,S 41 | 931,3,"Hee, Mr. Ling",male,,0,0,1601,56.4958,,S 42 | 932,3,"Karun, Mr. Franz",male,39,0,1,349256,13.4167,,C 43 | 933,1,"Franklin, Mr. Thomas Parham",male,,0,0,113778,26.55,D34,S 44 | 934,3,"Goldsmith, Mr. Nathan",male,41,0,0,SOTON/O.Q. 3101263,7.85,,S 45 | 935,2,"Corbett, Mrs. Walter H (Irene Colvin)",female,30,0,0,237249,13,,S 46 | 936,1,"Kimball, Mrs. Edwin Nelson Jr (Gertrude Parsons)",female,45,1,0,11753,52.5542,D19,S 47 | 937,3,"Peltomaki, Mr. Nikolai Johannes",male,25,0,0,STON/O 2. 3101291,7.925,,S 48 | 938,1,"Chevre, Mr. Paul Romaine",male,45,0,0,PC 17594,29.7,A9,C 49 | 939,3,"Shaughnessy, Mr. Patrick",male,,0,0,370374,7.75,,Q 50 | 940,1,"Bucknell, Mrs. William Robert (Emma Eliza Ward)",female,60,0,0,11813,76.2917,D15,C 51 | 941,3,"Coutts, Mrs. William (Winnie Minnie"" Treanor)""",female,36,0,2,C.A. 37671,15.9,,S 52 | 942,1,"Smith, Mr. Lucien Philip",male,24,1,0,13695,60,C31,S 53 | 943,2,"Pulbaum, Mr. Franz",male,27,0,0,SC/PARIS 2168,15.0333,,C 54 | 944,2,"Hocking, Miss. Ellen Nellie""""",female,20,2,1,29105,23,,S 55 | 945,1,"Fortune, Miss. Ethel Flora",female,28,3,2,19950,263,C23 C25 C27,S 56 | 946,2,"Mangiavacchi, Mr. Serafino Emilio",male,,0,0,SC/A.3 2861,15.5792,,C 57 | 947,3,"Rice, Master. Albert",male,10,4,1,382652,29.125,,Q 58 | 948,3,"Cor, Mr. Bartol",male,35,0,0,349230,7.8958,,S 59 | 949,3,"Abelseth, Mr. Olaus Jorgensen",male,25,0,0,348122,7.65,F G63,S 60 | 950,3,"Davison, Mr. Thomas Henry",male,,1,0,386525,16.1,,S 61 | 951,1,"Chaudanson, Miss. Victorine",female,36,0,0,PC 17608,262.375,B61,C 62 | 952,3,"Dika, Mr. Mirko",male,17,0,0,349232,7.8958,,S 63 | 953,2,"McCrae, Mr. Arthur Gordon",male,32,0,0,237216,13.5,,S 64 | 954,3,"Bjorklund, Mr. Ernst Herbert",male,18,0,0,347090,7.75,,S 65 | 955,3,"Bradley, Miss. Bridget Delia",female,22,0,0,334914,7.725,,Q 66 | 956,1,"Ryerson, Master. John Borie",male,13,2,2,PC 17608,262.375,B57 B59 B63 B66,C 67 | 957,2,"Corey, Mrs. Percy C (Mary Phyllis Elizabeth Miller)",female,,0,0,F.C.C. 13534,21,,S 68 | 958,3,"Burns, Miss. Mary Delia",female,18,0,0,330963,7.8792,,Q 69 | 959,1,"Moore, Mr. Clarence Bloomfield",male,47,0,0,113796,42.4,,S 70 | 960,1,"Tucker, Mr. Gilbert Milligan Jr",male,31,0,0,2543,28.5375,C53,C 71 | 961,1,"Fortune, Mrs. Mark (Mary McDougald)",female,60,1,4,19950,263,C23 C25 C27,S 72 | 962,3,"Mulvihill, Miss. Bertha E",female,24,0,0,382653,7.75,,Q 73 | 963,3,"Minkoff, Mr. Lazar",male,21,0,0,349211,7.8958,,S 74 | 964,3,"Nieminen, Miss. Manta Josefina",female,29,0,0,3101297,7.925,,S 75 | 965,1,"Ovies y Rodriguez, Mr. Servando",male,28.5,0,0,PC 17562,27.7208,D43,C 76 | 966,1,"Geiger, Miss. Amalie",female,35,0,0,113503,211.5,C130,C 77 | 967,1,"Keeping, Mr. Edwin",male,32.5,0,0,113503,211.5,C132,C 78 | 968,3,"Miles, Mr. Frank",male,,0,0,359306,8.05,,S 79 | 969,1,"Cornell, Mrs. Robert Clifford (Malvina Helen Lamson)",female,55,2,0,11770,25.7,C101,S 80 | 970,2,"Aldworth, Mr. Charles Augustus",male,30,0,0,248744,13,,S 81 | 971,3,"Doyle, Miss. Elizabeth",female,24,0,0,368702,7.75,,Q 82 | 972,3,"Boulos, Master. Akar",male,6,1,1,2678,15.2458,,C 83 | 973,1,"Straus, Mr. Isidor",male,67,1,0,PC 17483,221.7792,C55 C57,S 84 | 974,1,"Case, Mr. Howard Brown",male,49,0,0,19924,26,,S 85 | 975,3,"Demetri, Mr. Marinko",male,,0,0,349238,7.8958,,S 86 | 976,2,"Lamb, Mr. John Joseph",male,,0,0,240261,10.7083,,Q 87 | 977,3,"Khalil, Mr. Betros",male,,1,0,2660,14.4542,,C 88 | 978,3,"Barry, Miss. Julia",female,27,0,0,330844,7.8792,,Q 89 | 979,3,"Badman, Miss. Emily Louisa",female,18,0,0,A/4 31416,8.05,,S 90 | 980,3,"O'Donoghue, Ms. Bridget",female,,0,0,364856,7.75,,Q 91 | 981,2,"Wells, Master. Ralph Lester",male,2,1,1,29103,23,,S 92 | 982,3,"Dyker, Mrs. Adolf Fredrik (Anna Elisabeth Judith Andersson)",female,22,1,0,347072,13.9,,S 93 | 983,3,"Pedersen, Mr. Olaf",male,,0,0,345498,7.775,,S 94 | 984,1,"Davidson, Mrs. Thornton (Orian Hays)",female,27,1,2,F.C. 12750,52,B71,S 95 | 985,3,"Guest, Mr. Robert",male,,0,0,376563,8.05,,S 96 | 986,1,"Birnbaum, Mr. Jakob",male,25,0,0,13905,26,,C 97 | 987,3,"Tenglin, Mr. Gunnar Isidor",male,25,0,0,350033,7.7958,,S 98 | 988,1,"Cavendish, Mrs. Tyrell William (Julia Florence Siegel)",female,76,1,0,19877,78.85,C46,S 99 | 989,3,"Makinen, Mr. Kalle Edvard",male,29,0,0,STON/O 2. 3101268,7.925,,S 100 | 990,3,"Braf, Miss. Elin Ester Maria",female,20,0,0,347471,7.8542,,S 101 | 991,3,"Nancarrow, Mr. William Henry",male,33,0,0,A./5. 3338,8.05,,S 102 | 992,1,"Stengel, Mrs. Charles Emil Henry (Annie May Morris)",female,43,1,0,11778,55.4417,C116,C 103 | 993,2,"Weisz, Mr. Leopold",male,27,1,0,228414,26,,S 104 | 994,3,"Foley, Mr. William",male,,0,0,365235,7.75,,Q 105 | 995,3,"Johansson Palmquist, Mr. Oskar Leander",male,26,0,0,347070,7.775,,S 106 | 996,3,"Thomas, Mrs. Alexander (Thamine Thelma"")""",female,16,1,1,2625,8.5167,,C 107 | 997,3,"Holthen, Mr. Johan Martin",male,28,0,0,C 4001,22.525,,S 108 | 998,3,"Buckley, Mr. Daniel",male,21,0,0,330920,7.8208,,Q 109 | 999,3,"Ryan, Mr. Edward",male,,0,0,383162,7.75,,Q 110 | 1000,3,"Willer, Mr. Aaron (Abi Weller"")""",male,,0,0,3410,8.7125,,S 111 | 1001,2,"Swane, Mr. George",male,18.5,0,0,248734,13,F,S 112 | 1002,2,"Stanton, Mr. Samuel Ward",male,41,0,0,237734,15.0458,,C 113 | 1003,3,"Shine, Miss. Ellen Natalia",female,,0,0,330968,7.7792,,Q 114 | 1004,1,"Evans, Miss. Edith Corse",female,36,0,0,PC 17531,31.6792,A29,C 115 | 1005,3,"Buckley, Miss. Katherine",female,18.5,0,0,329944,7.2833,,Q 116 | 1006,1,"Straus, Mrs. Isidor (Rosalie Ida Blun)",female,63,1,0,PC 17483,221.7792,C55 C57,S 117 | 1007,3,"Chronopoulos, Mr. Demetrios",male,18,1,0,2680,14.4542,,C 118 | 1008,3,"Thomas, Mr. John",male,,0,0,2681,6.4375,,C 119 | 1009,3,"Sandstrom, Miss. Beatrice Irene",female,1,1,1,PP 9549,16.7,G6,S 120 | 1010,1,"Beattie, Mr. Thomson",male,36,0,0,13050,75.2417,C6,C 121 | 1011,2,"Chapman, Mrs. John Henry (Sara Elizabeth Lawry)",female,29,1,0,SC/AH 29037,26,,S 122 | 1012,2,"Watt, Miss. Bertha J",female,12,0,0,C.A. 33595,15.75,,S 123 | 1013,3,"Kiernan, Mr. John",male,,1,0,367227,7.75,,Q 124 | 1014,1,"Schabert, Mrs. Paul (Emma Mock)",female,35,1,0,13236,57.75,C28,C 125 | 1015,3,"Carver, Mr. Alfred John",male,28,0,0,392095,7.25,,S 126 | 1016,3,"Kennedy, Mr. John",male,,0,0,368783,7.75,,Q 127 | 1017,3,"Cribb, Miss. Laura Alice",female,17,0,1,371362,16.1,,S 128 | 1018,3,"Brobeck, Mr. Karl Rudolf",male,22,0,0,350045,7.7958,,S 129 | 1019,3,"McCoy, Miss. Alicia",female,,2,0,367226,23.25,,Q 130 | 1020,2,"Bowenur, Mr. Solomon",male,42,0,0,211535,13,,S 131 | 1021,3,"Petersen, Mr. Marius",male,24,0,0,342441,8.05,,S 132 | 1022,3,"Spinner, Mr. Henry John",male,32,0,0,STON/OQ. 369943,8.05,,S 133 | 1023,1,"Gracie, Col. Archibald IV",male,53,0,0,113780,28.5,C51,C 134 | 1024,3,"Lefebre, Mrs. Frank (Frances)",female,,0,4,4133,25.4667,,S 135 | 1025,3,"Thomas, Mr. Charles P",male,,1,0,2621,6.4375,,C 136 | 1026,3,"Dintcheff, Mr. Valtcho",male,43,0,0,349226,7.8958,,S 137 | 1027,3,"Carlsson, Mr. Carl Robert",male,24,0,0,350409,7.8542,,S 138 | 1028,3,"Zakarian, Mr. Mapriededer",male,26.5,0,0,2656,7.225,,C 139 | 1029,2,"Schmidt, Mr. August",male,26,0,0,248659,13,,S 140 | 1030,3,"Drapkin, Miss. Jennie",female,23,0,0,SOTON/OQ 392083,8.05,,S 141 | 1031,3,"Goodwin, Mr. Charles Frederick",male,40,1,6,CA 2144,46.9,,S 142 | 1032,3,"Goodwin, Miss. Jessie Allis",female,10,5,2,CA 2144,46.9,,S 143 | 1033,1,"Daniels, Miss. Sarah",female,33,0,0,113781,151.55,,S 144 | 1034,1,"Ryerson, Mr. Arthur Larned",male,61,1,3,PC 17608,262.375,B57 B59 B63 B66,C 145 | 1035,2,"Beauchamp, Mr. Henry James",male,28,0,0,244358,26,,S 146 | 1036,1,"Lindeberg-Lind, Mr. Erik Gustaf (Mr Edward Lingrey"")""",male,42,0,0,17475,26.55,,S 147 | 1037,3,"Vander Planke, Mr. Julius",male,31,3,0,345763,18,,S 148 | 1038,1,"Hilliard, Mr. Herbert Henry",male,,0,0,17463,51.8625,E46,S 149 | 1039,3,"Davies, Mr. Evan",male,22,0,0,SC/A4 23568,8.05,,S 150 | 1040,1,"Crafton, Mr. John Bertram",male,,0,0,113791,26.55,,S 151 | 1041,2,"Lahtinen, Rev. William",male,30,1,1,250651,26,,S 152 | 1042,1,"Earnshaw, Mrs. Boulton (Olive Potter)",female,23,0,1,11767,83.1583,C54,C 153 | 1043,3,"Matinoff, Mr. Nicola",male,,0,0,349255,7.8958,,C 154 | 1044,3,"Storey, Mr. Thomas",male,60.5,0,0,3701,,,S 155 | 1045,3,"Klasen, Mrs. (Hulda Kristina Eugenia Lofqvist)",female,36,0,2,350405,12.1833,,S 156 | 1046,3,"Asplund, Master. Filip Oscar",male,13,4,2,347077,31.3875,,S 157 | 1047,3,"Duquemin, Mr. Joseph",male,24,0,0,S.O./P.P. 752,7.55,,S 158 | 1048,1,"Bird, Miss. Ellen",female,29,0,0,PC 17483,221.7792,C97,S 159 | 1049,3,"Lundin, Miss. Olga Elida",female,23,0,0,347469,7.8542,,S 160 | 1050,1,"Borebank, Mr. John James",male,42,0,0,110489,26.55,D22,S 161 | 1051,3,"Peacock, Mrs. Benjamin (Edith Nile)",female,26,0,2,SOTON/O.Q. 3101315,13.775,,S 162 | 1052,3,"Smyth, Miss. Julia",female,,0,0,335432,7.7333,,Q 163 | 1053,3,"Touma, Master. Georges Youssef",male,7,1,1,2650,15.2458,,C 164 | 1054,2,"Wright, Miss. Marion",female,26,0,0,220844,13.5,,S 165 | 1055,3,"Pearce, Mr. Ernest",male,,0,0,343271,7,,S 166 | 1056,2,"Peruschitz, Rev. Joseph Maria",male,41,0,0,237393,13,,S 167 | 1057,3,"Kink-Heilmann, Mrs. Anton (Luise Heilmann)",female,26,1,1,315153,22.025,,S 168 | 1058,1,"Brandeis, Mr. Emil",male,48,0,0,PC 17591,50.4958,B10,C 169 | 1059,3,"Ford, Mr. Edward Watson",male,18,2,2,W./C. 6608,34.375,,S 170 | 1060,1,"Cassebeer, Mrs. Henry Arthur Jr (Eleanor Genevieve Fosdick)",female,,0,0,17770,27.7208,,C 171 | 1061,3,"Hellstrom, Miss. Hilda Maria",female,22,0,0,7548,8.9625,,S 172 | 1062,3,"Lithman, Mr. Simon",male,,0,0,S.O./P.P. 251,7.55,,S 173 | 1063,3,"Zakarian, Mr. Ortin",male,27,0,0,2670,7.225,,C 174 | 1064,3,"Dyker, Mr. Adolf Fredrik",male,23,1,0,347072,13.9,,S 175 | 1065,3,"Torfa, Mr. Assad",male,,0,0,2673,7.2292,,C 176 | 1066,3,"Asplund, Mr. Carl Oscar Vilhelm Gustafsson",male,40,1,5,347077,31.3875,,S 177 | 1067,2,"Brown, Miss. Edith Eileen",female,15,0,2,29750,39,,S 178 | 1068,2,"Sincock, Miss. Maude",female,20,0,0,C.A. 33112,36.75,,S 179 | 1069,1,"Stengel, Mr. Charles Emil Henry",male,54,1,0,11778,55.4417,C116,C 180 | 1070,2,"Becker, Mrs. Allen Oliver (Nellie E Baumgardner)",female,36,0,3,230136,39,F4,S 181 | 1071,1,"Compton, Mrs. Alexander Taylor (Mary Eliza Ingersoll)",female,64,0,2,PC 17756,83.1583,E45,C 182 | 1072,2,"McCrie, Mr. James Matthew",male,30,0,0,233478,13,,S 183 | 1073,1,"Compton, Mr. Alexander Taylor Jr",male,37,1,1,PC 17756,83.1583,E52,C 184 | 1074,1,"Marvin, Mrs. Daniel Warner (Mary Graham Carmichael Farquarson)",female,18,1,0,113773,53.1,D30,S 185 | 1075,3,"Lane, Mr. Patrick",male,,0,0,7935,7.75,,Q 186 | 1076,1,"Douglas, Mrs. Frederick Charles (Mary Helene Baxter)",female,27,1,1,PC 17558,247.5208,B58 B60,C 187 | 1077,2,"Maybery, Mr. Frank Hubert",male,40,0,0,239059,16,,S 188 | 1078,2,"Phillips, Miss. Alice Frances Louisa",female,21,0,1,S.O./P.P. 2,21,,S 189 | 1079,3,"Davies, Mr. Joseph",male,17,2,0,A/4 48873,8.05,,S 190 | 1080,3,"Sage, Miss. Ada",female,,8,2,CA. 2343,69.55,,S 191 | 1081,2,"Veal, Mr. James",male,40,0,0,28221,13,,S 192 | 1082,2,"Angle, Mr. William A",male,34,1,0,226875,26,,S 193 | 1083,1,"Salomon, Mr. Abraham L",male,,0,0,111163,26,,S 194 | 1084,3,"van Billiard, Master. Walter John",male,11.5,1,1,A/5. 851,14.5,,S 195 | 1085,2,"Lingane, Mr. John",male,61,0,0,235509,12.35,,Q 196 | 1086,2,"Drew, Master. Marshall Brines",male,8,0,2,28220,32.5,,S 197 | 1087,3,"Karlsson, Mr. Julius Konrad Eugen",male,33,0,0,347465,7.8542,,S 198 | 1088,1,"Spedden, Master. Robert Douglas",male,6,0,2,16966,134.5,E34,C 199 | 1089,3,"Nilsson, Miss. Berta Olivia",female,18,0,0,347066,7.775,,S 200 | 1090,2,"Baimbrigge, Mr. Charles Robert",male,23,0,0,C.A. 31030,10.5,,S 201 | 1091,3,"Rasmussen, Mrs. (Lena Jacobsen Solvang)",female,,0,0,65305,8.1125,,S 202 | 1092,3,"Murphy, Miss. Nora",female,,0,0,36568,15.5,,Q 203 | 1093,3,"Danbom, Master. Gilbert Sigvard Emanuel",male,0.33,0,2,347080,14.4,,S 204 | 1094,1,"Astor, Col. John Jacob",male,47,1,0,PC 17757,227.525,C62 C64,C 205 | 1095,2,"Quick, Miss. Winifred Vera",female,8,1,1,26360,26,,S 206 | 1096,2,"Andrew, Mr. Frank Thomas",male,25,0,0,C.A. 34050,10.5,,S 207 | 1097,1,"Omont, Mr. Alfred Fernand",male,,0,0,F.C. 12998,25.7417,,C 208 | 1098,3,"McGowan, Miss. Katherine",female,35,0,0,9232,7.75,,Q 209 | 1099,2,"Collett, Mr. Sidney C Stuart",male,24,0,0,28034,10.5,,S 210 | 1100,1,"Rosenbaum, Miss. Edith Louise",female,33,0,0,PC 17613,27.7208,A11,C 211 | 1101,3,"Delalic, Mr. Redjo",male,25,0,0,349250,7.8958,,S 212 | 1102,3,"Andersen, Mr. Albert Karvin",male,32,0,0,C 4001,22.525,,S 213 | 1103,3,"Finoli, Mr. Luigi",male,,0,0,SOTON/O.Q. 3101308,7.05,,S 214 | 1104,2,"Deacon, Mr. Percy William",male,17,0,0,S.O.C. 14879,73.5,,S 215 | 1105,2,"Howard, Mrs. Benjamin (Ellen Truelove Arman)",female,60,1,0,24065,26,,S 216 | 1106,3,"Andersson, Miss. Ida Augusta Margareta",female,38,4,2,347091,7.775,,S 217 | 1107,1,"Head, Mr. Christopher",male,42,0,0,113038,42.5,B11,S 218 | 1108,3,"Mahon, Miss. Bridget Delia",female,,0,0,330924,7.8792,,Q 219 | 1109,1,"Wick, Mr. George Dennick",male,57,1,1,36928,164.8667,,S 220 | 1110,1,"Widener, Mrs. George Dunton (Eleanor Elkins)",female,50,1,1,113503,211.5,C80,C 221 | 1111,3,"Thomson, Mr. Alexander Morrison",male,,0,0,32302,8.05,,S 222 | 1112,2,"Duran y More, Miss. Florentina",female,30,1,0,SC/PARIS 2148,13.8583,,C 223 | 1113,3,"Reynolds, Mr. Harold J",male,21,0,0,342684,8.05,,S 224 | 1114,2,"Cook, Mrs. (Selena Rogers)",female,22,0,0,W./C. 14266,10.5,F33,S 225 | 1115,3,"Karlsson, Mr. Einar Gervasius",male,21,0,0,350053,7.7958,,S 226 | 1116,1,"Candee, Mrs. Edward (Helen Churchill Hungerford)",female,53,0,0,PC 17606,27.4458,,C 227 | 1117,3,"Moubarek, Mrs. George (Omine Amenia"" Alexander)""",female,,0,2,2661,15.2458,,C 228 | 1118,3,"Asplund, Mr. Johan Charles",male,23,0,0,350054,7.7958,,S 229 | 1119,3,"McNeill, Miss. Bridget",female,,0,0,370368,7.75,,Q 230 | 1120,3,"Everett, Mr. Thomas James",male,40.5,0,0,C.A. 6212,15.1,,S 231 | 1121,2,"Hocking, Mr. Samuel James Metcalfe",male,36,0,0,242963,13,,S 232 | 1122,2,"Sweet, Mr. George Frederick",male,14,0,0,220845,65,,S 233 | 1123,1,"Willard, Miss. Constance",female,21,0,0,113795,26.55,,S 234 | 1124,3,"Wiklund, Mr. Karl Johan",male,21,1,0,3101266,6.4958,,S 235 | 1125,3,"Linehan, Mr. Michael",male,,0,0,330971,7.8792,,Q 236 | 1126,1,"Cumings, Mr. John Bradley",male,39,1,0,PC 17599,71.2833,C85,C 237 | 1127,3,"Vendel, Mr. Olof Edvin",male,20,0,0,350416,7.8542,,S 238 | 1128,1,"Warren, Mr. Frank Manley",male,64,1,0,110813,75.25,D37,C 239 | 1129,3,"Baccos, Mr. Raffull",male,20,0,0,2679,7.225,,C 240 | 1130,2,"Hiltunen, Miss. Marta",female,18,1,1,250650,13,,S 241 | 1131,1,"Douglas, Mrs. Walter Donald (Mahala Dutton)",female,48,1,0,PC 17761,106.425,C86,C 242 | 1132,1,"Lindstrom, Mrs. Carl Johan (Sigrid Posse)",female,55,0,0,112377,27.7208,,C 243 | 1133,2,"Christy, Mrs. (Alice Frances)",female,45,0,2,237789,30,,S 244 | 1134,1,"Spedden, Mr. Frederic Oakley",male,45,1,1,16966,134.5,E34,C 245 | 1135,3,"Hyman, Mr. Abraham",male,,0,0,3470,7.8875,,S 246 | 1136,3,"Johnston, Master. William Arthur Willie""""",male,,1,2,W./C. 6607,23.45,,S 247 | 1137,1,"Kenyon, Mr. Frederick R",male,41,1,0,17464,51.8625,D21,S 248 | 1138,2,"Karnes, Mrs. J Frank (Claire Bennett)",female,22,0,0,F.C.C. 13534,21,,S 249 | 1139,2,"Drew, Mr. James Vivian",male,42,1,1,28220,32.5,,S 250 | 1140,2,"Hold, Mrs. Stephen (Annie Margaret Hill)",female,29,1,0,26707,26,,S 251 | 1141,3,"Khalil, Mrs. Betros (Zahie Maria"" Elias)""",female,,1,0,2660,14.4542,,C 252 | 1142,2,"West, Miss. Barbara J",female,0.92,1,2,C.A. 34651,27.75,,S 253 | 1143,3,"Abrahamsson, Mr. Abraham August Johannes",male,20,0,0,SOTON/O2 3101284,7.925,,S 254 | 1144,1,"Clark, Mr. Walter Miller",male,27,1,0,13508,136.7792,C89,C 255 | 1145,3,"Salander, Mr. Karl Johan",male,24,0,0,7266,9.325,,S 256 | 1146,3,"Wenzel, Mr. Linhart",male,32.5,0,0,345775,9.5,,S 257 | 1147,3,"MacKay, Mr. George William",male,,0,0,C.A. 42795,7.55,,S 258 | 1148,3,"Mahon, Mr. John",male,,0,0,AQ/4 3130,7.75,,Q 259 | 1149,3,"Niklasson, Mr. Samuel",male,28,0,0,363611,8.05,,S 260 | 1150,2,"Bentham, Miss. Lilian W",female,19,0,0,28404,13,,S 261 | 1151,3,"Midtsjo, Mr. Karl Albert",male,21,0,0,345501,7.775,,S 262 | 1152,3,"de Messemaeker, Mr. Guillaume Joseph",male,36.5,1,0,345572,17.4,,S 263 | 1153,3,"Nilsson, Mr. August Ferdinand",male,21,0,0,350410,7.8542,,S 264 | 1154,2,"Wells, Mrs. Arthur Henry (Addie"" Dart Trevaskis)""",female,29,0,2,29103,23,,S 265 | 1155,3,"Klasen, Miss. Gertrud Emilia",female,1,1,1,350405,12.1833,,S 266 | 1156,2,"Portaluppi, Mr. Emilio Ilario Giuseppe",male,30,0,0,C.A. 34644,12.7375,,C 267 | 1157,3,"Lyntakoff, Mr. Stanko",male,,0,0,349235,7.8958,,S 268 | 1158,1,"Chisholm, Mr. Roderick Robert Crispin",male,,0,0,112051,0,,S 269 | 1159,3,"Warren, Mr. Charles William",male,,0,0,C.A. 49867,7.55,,S 270 | 1160,3,"Howard, Miss. May Elizabeth",female,,0,0,A. 2. 39186,8.05,,S 271 | 1161,3,"Pokrnic, Mr. Mate",male,17,0,0,315095,8.6625,,S 272 | 1162,1,"McCaffry, Mr. Thomas Francis",male,46,0,0,13050,75.2417,C6,C 273 | 1163,3,"Fox, Mr. Patrick",male,,0,0,368573,7.75,,Q 274 | 1164,1,"Clark, Mrs. Walter Miller (Virginia McDowell)",female,26,1,0,13508,136.7792,C89,C 275 | 1165,3,"Lennon, Miss. Mary",female,,1,0,370371,15.5,,Q 276 | 1166,3,"Saade, Mr. Jean Nassr",male,,0,0,2676,7.225,,C 277 | 1167,2,"Bryhl, Miss. Dagmar Jenny Ingeborg ",female,20,1,0,236853,26,,S 278 | 1168,2,"Parker, Mr. Clifford Richard",male,28,0,0,SC 14888,10.5,,S 279 | 1169,2,"Faunthorpe, Mr. Harry",male,40,1,0,2926,26,,S 280 | 1170,2,"Ware, Mr. John James",male,30,1,0,CA 31352,21,,S 281 | 1171,2,"Oxenham, Mr. Percy Thomas",male,22,0,0,W./C. 14260,10.5,,S 282 | 1172,3,"Oreskovic, Miss. Jelka",female,23,0,0,315085,8.6625,,S 283 | 1173,3,"Peacock, Master. Alfred Edward",male,0.75,1,1,SOTON/O.Q. 3101315,13.775,,S 284 | 1174,3,"Fleming, Miss. Honora",female,,0,0,364859,7.75,,Q 285 | 1175,3,"Touma, Miss. Maria Youssef",female,9,1,1,2650,15.2458,,C 286 | 1176,3,"Rosblom, Miss. Salli Helena",female,2,1,1,370129,20.2125,,S 287 | 1177,3,"Dennis, Mr. William",male,36,0,0,A/5 21175,7.25,,S 288 | 1178,3,"Franklin, Mr. Charles (Charles Fardon)",male,,0,0,SOTON/O.Q. 3101314,7.25,,S 289 | 1179,1,"Snyder, Mr. John Pillsbury",male,24,1,0,21228,82.2667,B45,S 290 | 1180,3,"Mardirosian, Mr. Sarkis",male,,0,0,2655,7.2292,F E46,C 291 | 1181,3,"Ford, Mr. Arthur",male,,0,0,A/5 1478,8.05,,S 292 | 1182,1,"Rheims, Mr. George Alexander Lucien",male,,0,0,PC 17607,39.6,,S 293 | 1183,3,"Daly, Miss. Margaret Marcella Maggie""""",female,30,0,0,382650,6.95,,Q 294 | 1184,3,"Nasr, Mr. Mustafa",male,,0,0,2652,7.2292,,C 295 | 1185,1,"Dodge, Dr. Washington",male,53,1,1,33638,81.8583,A34,S 296 | 1186,3,"Wittevrongel, Mr. Camille",male,36,0,0,345771,9.5,,S 297 | 1187,3,"Angheloff, Mr. Minko",male,26,0,0,349202,7.8958,,S 298 | 1188,2,"Laroche, Miss. Louise",female,1,1,2,SC/Paris 2123,41.5792,,C 299 | 1189,3,"Samaan, Mr. Hanna",male,,2,0,2662,21.6792,,C 300 | 1190,1,"Loring, Mr. Joseph Holland",male,30,0,0,113801,45.5,,S 301 | 1191,3,"Johansson, Mr. Nils",male,29,0,0,347467,7.8542,,S 302 | 1192,3,"Olsson, Mr. Oscar Wilhelm",male,32,0,0,347079,7.775,,S 303 | 1193,2,"Malachard, Mr. Noel",male,,0,0,237735,15.0458,D,C 304 | 1194,2,"Phillips, Mr. Escott Robert",male,43,0,1,S.O./P.P. 2,21,,S 305 | 1195,3,"Pokrnic, Mr. Tome",male,24,0,0,315092,8.6625,,S 306 | 1196,3,"McCarthy, Miss. Catherine Katie""""",female,,0,0,383123,7.75,,Q 307 | 1197,1,"Crosby, Mrs. Edward Gifford (Catherine Elizabeth Halstead)",female,64,1,1,112901,26.55,B26,S 308 | 1198,1,"Allison, Mr. Hudson Joshua Creighton",male,30,1,2,113781,151.55,C22 C26,S 309 | 1199,3,"Aks, Master. Philip Frank",male,0.83,0,1,392091,9.35,,S 310 | 1200,1,"Hays, Mr. Charles Melville",male,55,1,1,12749,93.5,B69,S 311 | 1201,3,"Hansen, Mrs. Claus Peter (Jennie L Howard)",female,45,1,0,350026,14.1083,,S 312 | 1202,3,"Cacic, Mr. Jego Grga",male,18,0,0,315091,8.6625,,S 313 | 1203,3,"Vartanian, Mr. David",male,22,0,0,2658,7.225,,C 314 | 1204,3,"Sadowitz, Mr. Harry",male,,0,0,LP 1588,7.575,,S 315 | 1205,3,"Carr, Miss. Jeannie",female,37,0,0,368364,7.75,,Q 316 | 1206,1,"White, Mrs. John Stuart (Ella Holmes)",female,55,0,0,PC 17760,135.6333,C32,C 317 | 1207,3,"Hagardon, Miss. Kate",female,17,0,0,AQ/3. 30631,7.7333,,Q 318 | 1208,1,"Spencer, Mr. William Augustus",male,57,1,0,PC 17569,146.5208,B78,C 319 | 1209,2,"Rogers, Mr. Reginald Harry",male,19,0,0,28004,10.5,,S 320 | 1210,3,"Jonsson, Mr. Nils Hilding",male,27,0,0,350408,7.8542,,S 321 | 1211,2,"Jefferys, Mr. Ernest Wilfred",male,22,2,0,C.A. 31029,31.5,,S 322 | 1212,3,"Andersson, Mr. Johan Samuel",male,26,0,0,347075,7.775,,S 323 | 1213,3,"Krekorian, Mr. Neshan",male,25,0,0,2654,7.2292,F E57,C 324 | 1214,2,"Nesson, Mr. Israel",male,26,0,0,244368,13,F2,S 325 | 1215,1,"Rowe, Mr. Alfred G",male,33,0,0,113790,26.55,,S 326 | 1216,1,"Kreuchen, Miss. Emilie",female,39,0,0,24160,211.3375,,S 327 | 1217,3,"Assam, Mr. Ali",male,23,0,0,SOTON/O.Q. 3101309,7.05,,S 328 | 1218,2,"Becker, Miss. Ruth Elizabeth",female,12,2,1,230136,39,F4,S 329 | 1219,1,"Rosenshine, Mr. George (Mr George Thorne"")""",male,46,0,0,PC 17585,79.2,,C 330 | 1220,2,"Clarke, Mr. Charles Valentine",male,29,1,0,2003,26,,S 331 | 1221,2,"Enander, Mr. Ingvar",male,21,0,0,236854,13,,S 332 | 1222,2,"Davies, Mrs. John Morgan (Elizabeth Agnes Mary White) ",female,48,0,2,C.A. 33112,36.75,,S 333 | 1223,1,"Dulles, Mr. William Crothers",male,39,0,0,PC 17580,29.7,A18,C 334 | 1224,3,"Thomas, Mr. Tannous",male,,0,0,2684,7.225,,C 335 | 1225,3,"Nakid, Mrs. Said (Waika Mary"" Mowad)""",female,19,1,1,2653,15.7417,,C 336 | 1226,3,"Cor, Mr. Ivan",male,27,0,0,349229,7.8958,,S 337 | 1227,1,"Maguire, Mr. John Edward",male,30,0,0,110469,26,C106,S 338 | 1228,2,"de Brito, Mr. Jose Joaquim",male,32,0,0,244360,13,,S 339 | 1229,3,"Elias, Mr. Joseph",male,39,0,2,2675,7.2292,,C 340 | 1230,2,"Denbury, Mr. Herbert",male,25,0,0,C.A. 31029,31.5,,S 341 | 1231,3,"Betros, Master. Seman",male,,0,0,2622,7.2292,,C 342 | 1232,2,"Fillbrook, Mr. Joseph Charles",male,18,0,0,C.A. 15185,10.5,,S 343 | 1233,3,"Lundstrom, Mr. Thure Edvin",male,32,0,0,350403,7.5792,,S 344 | 1234,3,"Sage, Mr. John George",male,,1,9,CA. 2343,69.55,,S 345 | 1235,1,"Cardeza, Mrs. James Warburton Martinez (Charlotte Wardle Drake)",female,58,0,1,PC 17755,512.3292,B51 B53 B55,C 346 | 1236,3,"van Billiard, Master. James William",male,,1,1,A/5. 851,14.5,,S 347 | 1237,3,"Abelseth, Miss. Karen Marie",female,16,0,0,348125,7.65,,S 348 | 1238,2,"Botsford, Mr. William Hull",male,26,0,0,237670,13,,S 349 | 1239,3,"Whabee, Mrs. George Joseph (Shawneene Abi-Saab)",female,38,0,0,2688,7.2292,,C 350 | 1240,2,"Giles, Mr. Ralph",male,24,0,0,248726,13.5,,S 351 | 1241,2,"Walcroft, Miss. Nellie",female,31,0,0,F.C.C. 13528,21,,S 352 | 1242,1,"Greenfield, Mrs. Leo David (Blanche Strouse)",female,45,0,1,PC 17759,63.3583,D10 D12,C 353 | 1243,2,"Stokes, Mr. Philip Joseph",male,25,0,0,F.C.C. 13540,10.5,,S 354 | 1244,2,"Dibden, Mr. William",male,18,0,0,S.O.C. 14879,73.5,,S 355 | 1245,2,"Herman, Mr. Samuel",male,49,1,2,220845,65,,S 356 | 1246,3,"Dean, Miss. Elizabeth Gladys Millvina""""",female,0.17,1,2,C.A. 2315,20.575,,S 357 | 1247,1,"Julian, Mr. Henry Forbes",male,50,0,0,113044,26,E60,S 358 | 1248,1,"Brown, Mrs. John Murray (Caroline Lane Lamson)",female,59,2,0,11769,51.4792,C101,S 359 | 1249,3,"Lockyer, Mr. Edward",male,,0,0,1222,7.8792,,S 360 | 1250,3,"O'Keefe, Mr. Patrick",male,,0,0,368402,7.75,,Q 361 | 1251,3,"Lindell, Mrs. Edvard Bengtsson (Elin Gerda Persson)",female,30,1,0,349910,15.55,,S 362 | 1252,3,"Sage, Master. William Henry",male,14.5,8,2,CA. 2343,69.55,,S 363 | 1253,2,"Mallet, Mrs. Albert (Antoinette Magnin)",female,24,1,1,S.C./PARIS 2079,37.0042,,C 364 | 1254,2,"Ware, Mrs. John James (Florence Louise Long)",female,31,0,0,CA 31352,21,,S 365 | 1255,3,"Strilic, Mr. Ivan",male,27,0,0,315083,8.6625,,S 366 | 1256,1,"Harder, Mrs. George Achilles (Dorothy Annan)",female,25,1,0,11765,55.4417,E50,C 367 | 1257,3,"Sage, Mrs. John (Annie Bullen)",female,,1,9,CA. 2343,69.55,,S 368 | 1258,3,"Caram, Mr. Joseph",male,,1,0,2689,14.4583,,C 369 | 1259,3,"Riihivouri, Miss. Susanna Juhantytar Sanni""""",female,22,0,0,3101295,39.6875,,S 370 | 1260,1,"Gibson, Mrs. Leonard (Pauline C Boeson)",female,45,0,1,112378,59.4,,C 371 | 1261,2,"Pallas y Castello, Mr. Emilio",male,29,0,0,SC/PARIS 2147,13.8583,,C 372 | 1262,2,"Giles, Mr. Edgar",male,21,1,0,28133,11.5,,S 373 | 1263,1,"Wilson, Miss. Helen Alice",female,31,0,0,16966,134.5,E39 E41,C 374 | 1264,1,"Ismay, Mr. Joseph Bruce",male,49,0,0,112058,0,B52 B54 B56,S 375 | 1265,2,"Harbeck, Mr. William H",male,44,0,0,248746,13,,S 376 | 1266,1,"Dodge, Mrs. Washington (Ruth Vidaver)",female,54,1,1,33638,81.8583,A34,S 377 | 1267,1,"Bowen, Miss. Grace Scott",female,45,0,0,PC 17608,262.375,,C 378 | 1268,3,"Kink, Miss. Maria",female,22,2,0,315152,8.6625,,S 379 | 1269,2,"Cotterill, Mr. Henry Harry""""",male,21,0,0,29107,11.5,,S 380 | 1270,1,"Hipkins, Mr. William Edward",male,55,0,0,680,50,C39,S 381 | 1271,3,"Asplund, Master. Carl Edgar",male,5,4,2,347077,31.3875,,S 382 | 1272,3,"O'Connor, Mr. Patrick",male,,0,0,366713,7.75,,Q 383 | 1273,3,"Foley, Mr. Joseph",male,26,0,0,330910,7.8792,,Q 384 | 1274,3,"Risien, Mrs. Samuel (Emma)",female,,0,0,364498,14.5,,S 385 | 1275,3,"McNamee, Mrs. Neal (Eileen O'Leary)",female,19,1,0,376566,16.1,,S 386 | 1276,2,"Wheeler, Mr. Edwin Frederick""""",male,,0,0,SC/PARIS 2159,12.875,,S 387 | 1277,2,"Herman, Miss. Kate",female,24,1,2,220845,65,,S 388 | 1278,3,"Aronsson, Mr. Ernst Axel Algot",male,24,0,0,349911,7.775,,S 389 | 1279,2,"Ashby, Mr. John",male,57,0,0,244346,13,,S 390 | 1280,3,"Canavan, Mr. Patrick",male,21,0,0,364858,7.75,,Q 391 | 1281,3,"Palsson, Master. Paul Folke",male,6,3,1,349909,21.075,,S 392 | 1282,1,"Payne, Mr. Vivian Ponsonby",male,23,0,0,12749,93.5,B24,S 393 | 1283,1,"Lines, Mrs. Ernest H (Elizabeth Lindsey James)",female,51,0,1,PC 17592,39.4,D28,S 394 | 1284,3,"Abbott, Master. Eugene Joseph",male,13,0,2,C.A. 2673,20.25,,S 395 | 1285,2,"Gilbert, Mr. William",male,47,0,0,C.A. 30769,10.5,,S 396 | 1286,3,"Kink-Heilmann, Mr. Anton",male,29,3,1,315153,22.025,,S 397 | 1287,1,"Smith, Mrs. Lucien Philip (Mary Eloise Hughes)",female,18,1,0,13695,60,C31,S 398 | 1288,3,"Colbert, Mr. Patrick",male,24,0,0,371109,7.25,,Q 399 | 1289,1,"Frolicher-Stehli, Mrs. Maxmillian (Margaretha Emerentia Stehli)",female,48,1,1,13567,79.2,B41,C 400 | 1290,3,"Larsson-Rondberg, Mr. Edvard A",male,22,0,0,347065,7.775,,S 401 | 1291,3,"Conlon, Mr. Thomas Henry",male,31,0,0,21332,7.7333,,Q 402 | 1292,1,"Bonnell, Miss. Caroline",female,30,0,0,36928,164.8667,C7,S 403 | 1293,2,"Gale, Mr. Harry",male,38,1,0,28664,21,,S 404 | 1294,1,"Gibson, Miss. Dorothy Winifred",female,22,0,1,112378,59.4,,C 405 | 1295,1,"Carrau, Mr. Jose Pedro",male,17,0,0,113059,47.1,,S 406 | 1296,1,"Frauenthal, Mr. Isaac Gerald",male,43,1,0,17765,27.7208,D40,C 407 | 1297,2,"Nourney, Mr. Alfred (Baron von Drachstedt"")""",male,20,0,0,SC/PARIS 2166,13.8625,D38,C 408 | 1298,2,"Ware, Mr. William Jeffery",male,23,1,0,28666,10.5,,S 409 | 1299,1,"Widener, Mr. George Dunton",male,50,1,1,113503,211.5,C80,C 410 | 1300,3,"Riordan, Miss. Johanna Hannah""""",female,,0,0,334915,7.7208,,Q 411 | 1301,3,"Peacock, Miss. Treasteall",female,3,1,1,SOTON/O.Q. 3101315,13.775,,S 412 | 1302,3,"Naughton, Miss. Hannah",female,,0,0,365237,7.75,,Q 413 | 1303,1,"Minahan, Mrs. William Edward (Lillian E Thorpe)",female,37,1,0,19928,90,C78,Q 414 | 1304,3,"Henriksson, Miss. Jenny Lovisa",female,28,0,0,347086,7.775,,S 415 | 1305,3,"Spector, Mr. Woolf",male,,0,0,A.5. 3236,8.05,,S 416 | 1306,1,"Oliva y Ocana, Dona. Fermina",female,39,0,0,PC 17758,108.9,C105,C 417 | 1307,3,"Saether, Mr. Simon Sivertsen",male,38.5,0,0,SOTON/O.Q. 3101262,7.25,,S 418 | 1308,3,"Ware, Mr. Frederick",male,,0,0,359309,8.05,,S 419 | 1309,3,"Peter, Master. Michael J",male,,1,1,2668,22.3583,,C 420 | -------------------------------------------------------------------------------- /Machine Learning Sections/Logistic-Regression/titanic_train.csv: -------------------------------------------------------------------------------- 1 | PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked 2 | 1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S 3 | 2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C 4 | 3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S 5 | 4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S 6 | 5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S 7 | 6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q 8 | 7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S 9 | 8,0,3,"Palsson, Master. Gosta Leonard",male,2,3,1,349909,21.075,,S 10 | 9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27,0,2,347742,11.1333,,S 11 | 10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14,1,0,237736,30.0708,,C 12 | 11,1,3,"Sandstrom, Miss. Marguerite Rut",female,4,1,1,PP 9549,16.7,G6,S 13 | 12,1,1,"Bonnell, Miss. Elizabeth",female,58,0,0,113783,26.55,C103,S 14 | 13,0,3,"Saundercock, Mr. William Henry",male,20,0,0,A/5. 2151,8.05,,S 15 | 14,0,3,"Andersson, Mr. Anders Johan",male,39,1,5,347082,31.275,,S 16 | 15,0,3,"Vestrom, Miss. Hulda Amanda Adolfina",female,14,0,0,350406,7.8542,,S 17 | 16,1,2,"Hewlett, Mrs. (Mary D Kingcome) ",female,55,0,0,248706,16,,S 18 | 17,0,3,"Rice, Master. Eugene",male,2,4,1,382652,29.125,,Q 19 | 18,1,2,"Williams, Mr. Charles Eugene",male,,0,0,244373,13,,S 20 | 19,0,3,"Vander Planke, Mrs. Julius (Emelia Maria Vandemoortele)",female,31,1,0,345763,18,,S 21 | 20,1,3,"Masselmani, Mrs. Fatima",female,,0,0,2649,7.225,,C 22 | 21,0,2,"Fynney, Mr. Joseph J",male,35,0,0,239865,26,,S 23 | 22,1,2,"Beesley, Mr. Lawrence",male,34,0,0,248698,13,D56,S 24 | 23,1,3,"McGowan, Miss. Anna ""Annie""",female,15,0,0,330923,8.0292,,Q 25 | 24,1,1,"Sloper, Mr. William Thompson",male,28,0,0,113788,35.5,A6,S 26 | 25,0,3,"Palsson, Miss. Torborg Danira",female,8,3,1,349909,21.075,,S 27 | 26,1,3,"Asplund, Mrs. Carl Oscar (Selma Augusta Emilia Johansson)",female,38,1,5,347077,31.3875,,S 28 | 27,0,3,"Emir, Mr. Farred Chehab",male,,0,0,2631,7.225,,C 29 | 28,0,1,"Fortune, Mr. Charles Alexander",male,19,3,2,19950,263,C23 C25 C27,S 30 | 29,1,3,"O'Dwyer, Miss. Ellen ""Nellie""",female,,0,0,330959,7.8792,,Q 31 | 30,0,3,"Todoroff, Mr. Lalio",male,,0,0,349216,7.8958,,S 32 | 31,0,1,"Uruchurtu, Don. Manuel E",male,40,0,0,PC 17601,27.7208,,C 33 | 32,1,1,"Spencer, Mrs. William Augustus (Marie Eugenie)",female,,1,0,PC 17569,146.5208,B78,C 34 | 33,1,3,"Glynn, Miss. Mary Agatha",female,,0,0,335677,7.75,,Q 35 | 34,0,2,"Wheadon, Mr. Edward H",male,66,0,0,C.A. 24579,10.5,,S 36 | 35,0,1,"Meyer, Mr. Edgar Joseph",male,28,1,0,PC 17604,82.1708,,C 37 | 36,0,1,"Holverson, Mr. Alexander Oskar",male,42,1,0,113789,52,,S 38 | 37,1,3,"Mamee, Mr. Hanna",male,,0,0,2677,7.2292,,C 39 | 38,0,3,"Cann, Mr. Ernest Charles",male,21,0,0,A./5. 2152,8.05,,S 40 | 39,0,3,"Vander Planke, Miss. Augusta Maria",female,18,2,0,345764,18,,S 41 | 40,1,3,"Nicola-Yarred, Miss. Jamila",female,14,1,0,2651,11.2417,,C 42 | 41,0,3,"Ahlin, Mrs. Johan (Johanna Persdotter Larsson)",female,40,1,0,7546,9.475,,S 43 | 42,0,2,"Turpin, Mrs. William John Robert (Dorothy Ann Wonnacott)",female,27,1,0,11668,21,,S 44 | 43,0,3,"Kraeff, Mr. Theodor",male,,0,0,349253,7.8958,,C 45 | 44,1,2,"Laroche, Miss. Simonne Marie Anne Andree",female,3,1,2,SC/Paris 2123,41.5792,,C 46 | 45,1,3,"Devaney, Miss. Margaret Delia",female,19,0,0,330958,7.8792,,Q 47 | 46,0,3,"Rogers, Mr. William John",male,,0,0,S.C./A.4. 23567,8.05,,S 48 | 47,0,3,"Lennon, Mr. Denis",male,,1,0,370371,15.5,,Q 49 | 48,1,3,"O'Driscoll, Miss. Bridget",female,,0,0,14311,7.75,,Q 50 | 49,0,3,"Samaan, Mr. Youssef",male,,2,0,2662,21.6792,,C 51 | 50,0,3,"Arnold-Franchi, Mrs. Josef (Josefine Franchi)",female,18,1,0,349237,17.8,,S 52 | 51,0,3,"Panula, Master. Juha Niilo",male,7,4,1,3101295,39.6875,,S 53 | 52,0,3,"Nosworthy, Mr. Richard Cater",male,21,0,0,A/4. 39886,7.8,,S 54 | 53,1,1,"Harper, Mrs. Henry Sleeper (Myna Haxtun)",female,49,1,0,PC 17572,76.7292,D33,C 55 | 54,1,2,"Faunthorpe, Mrs. Lizzie (Elizabeth Anne Wilkinson)",female,29,1,0,2926,26,,S 56 | 55,0,1,"Ostby, Mr. Engelhart Cornelius",male,65,0,1,113509,61.9792,B30,C 57 | 56,1,1,"Woolner, Mr. Hugh",male,,0,0,19947,35.5,C52,S 58 | 57,1,2,"Rugg, Miss. Emily",female,21,0,0,C.A. 31026,10.5,,S 59 | 58,0,3,"Novel, Mr. Mansouer",male,28.5,0,0,2697,7.2292,,C 60 | 59,1,2,"West, Miss. Constance Mirium",female,5,1,2,C.A. 34651,27.75,,S 61 | 60,0,3,"Goodwin, Master. William Frederick",male,11,5,2,CA 2144,46.9,,S 62 | 61,0,3,"Sirayanian, Mr. Orsen",male,22,0,0,2669,7.2292,,C 63 | 62,1,1,"Icard, Miss. Amelie",female,38,0,0,113572,80,B28, 64 | 63,0,1,"Harris, Mr. Henry Birkhardt",male,45,1,0,36973,83.475,C83,S 65 | 64,0,3,"Skoog, Master. Harald",male,4,3,2,347088,27.9,,S 66 | 65,0,1,"Stewart, Mr. Albert A",male,,0,0,PC 17605,27.7208,,C 67 | 66,1,3,"Moubarek, Master. Gerios",male,,1,1,2661,15.2458,,C 68 | 67,1,2,"Nye, Mrs. (Elizabeth Ramell)",female,29,0,0,C.A. 29395,10.5,F33,S 69 | 68,0,3,"Crease, Mr. Ernest James",male,19,0,0,S.P. 3464,8.1583,,S 70 | 69,1,3,"Andersson, Miss. Erna Alexandra",female,17,4,2,3101281,7.925,,S 71 | 70,0,3,"Kink, Mr. Vincenz",male,26,2,0,315151,8.6625,,S 72 | 71,0,2,"Jenkin, Mr. Stephen Curnow",male,32,0,0,C.A. 33111,10.5,,S 73 | 72,0,3,"Goodwin, Miss. Lillian Amy",female,16,5,2,CA 2144,46.9,,S 74 | 73,0,2,"Hood, Mr. Ambrose Jr",male,21,0,0,S.O.C. 14879,73.5,,S 75 | 74,0,3,"Chronopoulos, Mr. Apostolos",male,26,1,0,2680,14.4542,,C 76 | 75,1,3,"Bing, Mr. Lee",male,32,0,0,1601,56.4958,,S 77 | 76,0,3,"Moen, Mr. Sigurd Hansen",male,25,0,0,348123,7.65,F G73,S 78 | 77,0,3,"Staneff, Mr. Ivan",male,,0,0,349208,7.8958,,S 79 | 78,0,3,"Moutal, Mr. Rahamin Haim",male,,0,0,374746,8.05,,S 80 | 79,1,2,"Caldwell, Master. Alden Gates",male,0.83,0,2,248738,29,,S 81 | 80,1,3,"Dowdell, Miss. Elizabeth",female,30,0,0,364516,12.475,,S 82 | 81,0,3,"Waelens, Mr. Achille",male,22,0,0,345767,9,,S 83 | 82,1,3,"Sheerlinck, Mr. Jan Baptist",male,29,0,0,345779,9.5,,S 84 | 83,1,3,"McDermott, Miss. Brigdet Delia",female,,0,0,330932,7.7875,,Q 85 | 84,0,1,"Carrau, Mr. Francisco M",male,28,0,0,113059,47.1,,S 86 | 85,1,2,"Ilett, Miss. Bertha",female,17,0,0,SO/C 14885,10.5,,S 87 | 86,1,3,"Backstrom, Mrs. Karl Alfred (Maria Mathilda Gustafsson)",female,33,3,0,3101278,15.85,,S 88 | 87,0,3,"Ford, Mr. William Neal",male,16,1,3,W./C. 6608,34.375,,S 89 | 88,0,3,"Slocovski, Mr. Selman Francis",male,,0,0,SOTON/OQ 392086,8.05,,S 90 | 89,1,1,"Fortune, Miss. Mabel Helen",female,23,3,2,19950,263,C23 C25 C27,S 91 | 90,0,3,"Celotti, Mr. Francesco",male,24,0,0,343275,8.05,,S 92 | 91,0,3,"Christmann, Mr. Emil",male,29,0,0,343276,8.05,,S 93 | 92,0,3,"Andreasson, Mr. Paul Edvin",male,20,0,0,347466,7.8542,,S 94 | 93,0,1,"Chaffee, Mr. Herbert Fuller",male,46,1,0,W.E.P. 5734,61.175,E31,S 95 | 94,0,3,"Dean, Mr. Bertram Frank",male,26,1,2,C.A. 2315,20.575,,S 96 | 95,0,3,"Coxon, Mr. Daniel",male,59,0,0,364500,7.25,,S 97 | 96,0,3,"Shorney, Mr. Charles Joseph",male,,0,0,374910,8.05,,S 98 | 97,0,1,"Goldschmidt, Mr. George B",male,71,0,0,PC 17754,34.6542,A5,C 99 | 98,1,1,"Greenfield, Mr. William Bertram",male,23,0,1,PC 17759,63.3583,D10 D12,C 100 | 99,1,2,"Doling, Mrs. John T (Ada Julia Bone)",female,34,0,1,231919,23,,S 101 | 100,0,2,"Kantor, Mr. Sinai",male,34,1,0,244367,26,,S 102 | 101,0,3,"Petranec, Miss. Matilda",female,28,0,0,349245,7.8958,,S 103 | 102,0,3,"Petroff, Mr. Pastcho (""Pentcho"")",male,,0,0,349215,7.8958,,S 104 | 103,0,1,"White, Mr. Richard Frasar",male,21,0,1,35281,77.2875,D26,S 105 | 104,0,3,"Johansson, Mr. Gustaf Joel",male,33,0,0,7540,8.6542,,S 106 | 105,0,3,"Gustafsson, Mr. Anders Vilhelm",male,37,2,0,3101276,7.925,,S 107 | 106,0,3,"Mionoff, Mr. Stoytcho",male,28,0,0,349207,7.8958,,S 108 | 107,1,3,"Salkjelsvik, Miss. Anna Kristine",female,21,0,0,343120,7.65,,S 109 | 108,1,3,"Moss, Mr. Albert Johan",male,,0,0,312991,7.775,,S 110 | 109,0,3,"Rekic, Mr. Tido",male,38,0,0,349249,7.8958,,S 111 | 110,1,3,"Moran, Miss. Bertha",female,,1,0,371110,24.15,,Q 112 | 111,0,1,"Porter, Mr. Walter Chamberlain",male,47,0,0,110465,52,C110,S 113 | 112,0,3,"Zabour, Miss. Hileni",female,14.5,1,0,2665,14.4542,,C 114 | 113,0,3,"Barton, Mr. David John",male,22,0,0,324669,8.05,,S 115 | 114,0,3,"Jussila, Miss. Katriina",female,20,1,0,4136,9.825,,S 116 | 115,0,3,"Attalah, Miss. Malake",female,17,0,0,2627,14.4583,,C 117 | 116,0,3,"Pekoniemi, Mr. Edvard",male,21,0,0,STON/O 2. 3101294,7.925,,S 118 | 117,0,3,"Connors, Mr. Patrick",male,70.5,0,0,370369,7.75,,Q 119 | 118,0,2,"Turpin, Mr. William John Robert",male,29,1,0,11668,21,,S 120 | 119,0,1,"Baxter, Mr. Quigg Edmond",male,24,0,1,PC 17558,247.5208,B58 B60,C 121 | 120,0,3,"Andersson, Miss. Ellis Anna Maria",female,2,4,2,347082,31.275,,S 122 | 121,0,2,"Hickman, Mr. Stanley George",male,21,2,0,S.O.C. 14879,73.5,,S 123 | 122,0,3,"Moore, Mr. Leonard Charles",male,,0,0,A4. 54510,8.05,,S 124 | 123,0,2,"Nasser, Mr. Nicholas",male,32.5,1,0,237736,30.0708,,C 125 | 124,1,2,"Webber, Miss. Susan",female,32.5,0,0,27267,13,E101,S 126 | 125,0,1,"White, Mr. Percival Wayland",male,54,0,1,35281,77.2875,D26,S 127 | 126,1,3,"Nicola-Yarred, Master. Elias",male,12,1,0,2651,11.2417,,C 128 | 127,0,3,"McMahon, Mr. Martin",male,,0,0,370372,7.75,,Q 129 | 128,1,3,"Madsen, Mr. Fridtjof Arne",male,24,0,0,C 17369,7.1417,,S 130 | 129,1,3,"Peter, Miss. Anna",female,,1,1,2668,22.3583,F E69,C 131 | 130,0,3,"Ekstrom, Mr. Johan",male,45,0,0,347061,6.975,,S 132 | 131,0,3,"Drazenoic, Mr. Jozef",male,33,0,0,349241,7.8958,,C 133 | 132,0,3,"Coelho, Mr. Domingos Fernandeo",male,20,0,0,SOTON/O.Q. 3101307,7.05,,S 134 | 133,0,3,"Robins, Mrs. Alexander A (Grace Charity Laury)",female,47,1,0,A/5. 3337,14.5,,S 135 | 134,1,2,"Weisz, Mrs. Leopold (Mathilde Francoise Pede)",female,29,1,0,228414,26,,S 136 | 135,0,2,"Sobey, Mr. Samuel James Hayden",male,25,0,0,C.A. 29178,13,,S 137 | 136,0,2,"Richard, Mr. Emile",male,23,0,0,SC/PARIS 2133,15.0458,,C 138 | 137,1,1,"Newsom, Miss. Helen Monypeny",female,19,0,2,11752,26.2833,D47,S 139 | 138,0,1,"Futrelle, Mr. Jacques Heath",male,37,1,0,113803,53.1,C123,S 140 | 139,0,3,"Osen, Mr. Olaf Elon",male,16,0,0,7534,9.2167,,S 141 | 140,0,1,"Giglio, Mr. Victor",male,24,0,0,PC 17593,79.2,B86,C 142 | 141,0,3,"Boulos, Mrs. Joseph (Sultana)",female,,0,2,2678,15.2458,,C 143 | 142,1,3,"Nysten, Miss. Anna Sofia",female,22,0,0,347081,7.75,,S 144 | 143,1,3,"Hakkarainen, Mrs. Pekka Pietari (Elin Matilda Dolck)",female,24,1,0,STON/O2. 3101279,15.85,,S 145 | 144,0,3,"Burke, Mr. Jeremiah",male,19,0,0,365222,6.75,,Q 146 | 145,0,2,"Andrew, Mr. Edgardo Samuel",male,18,0,0,231945,11.5,,S 147 | 146,0,2,"Nicholls, Mr. Joseph Charles",male,19,1,1,C.A. 33112,36.75,,S 148 | 147,1,3,"Andersson, Mr. August Edvard (""Wennerstrom"")",male,27,0,0,350043,7.7958,,S 149 | 148,0,3,"Ford, Miss. Robina Maggie ""Ruby""",female,9,2,2,W./C. 6608,34.375,,S 150 | 149,0,2,"Navratil, Mr. Michel (""Louis M Hoffman"")",male,36.5,0,2,230080,26,F2,S 151 | 150,0,2,"Byles, Rev. Thomas Roussel Davids",male,42,0,0,244310,13,,S 152 | 151,0,2,"Bateman, Rev. Robert James",male,51,0,0,S.O.P. 1166,12.525,,S 153 | 152,1,1,"Pears, Mrs. Thomas (Edith Wearne)",female,22,1,0,113776,66.6,C2,S 154 | 153,0,3,"Meo, Mr. Alfonzo",male,55.5,0,0,A.5. 11206,8.05,,S 155 | 154,0,3,"van Billiard, Mr. Austin Blyler",male,40.5,0,2,A/5. 851,14.5,,S 156 | 155,0,3,"Olsen, Mr. Ole Martin",male,,0,0,Fa 265302,7.3125,,S 157 | 156,0,1,"Williams, Mr. Charles Duane",male,51,0,1,PC 17597,61.3792,,C 158 | 157,1,3,"Gilnagh, Miss. Katherine ""Katie""",female,16,0,0,35851,7.7333,,Q 159 | 158,0,3,"Corn, Mr. Harry",male,30,0,0,SOTON/OQ 392090,8.05,,S 160 | 159,0,3,"Smiljanic, Mr. Mile",male,,0,0,315037,8.6625,,S 161 | 160,0,3,"Sage, Master. Thomas Henry",male,,8,2,CA. 2343,69.55,,S 162 | 161,0,3,"Cribb, Mr. John Hatfield",male,44,0,1,371362,16.1,,S 163 | 162,1,2,"Watt, Mrs. James (Elizabeth ""Bessie"" Inglis Milne)",female,40,0,0,C.A. 33595,15.75,,S 164 | 163,0,3,"Bengtsson, Mr. John Viktor",male,26,0,0,347068,7.775,,S 165 | 164,0,3,"Calic, Mr. Jovo",male,17,0,0,315093,8.6625,,S 166 | 165,0,3,"Panula, Master. Eino Viljami",male,1,4,1,3101295,39.6875,,S 167 | 166,1,3,"Goldsmith, Master. Frank John William ""Frankie""",male,9,0,2,363291,20.525,,S 168 | 167,1,1,"Chibnall, Mrs. (Edith Martha Bowerman)",female,,0,1,113505,55,E33,S 169 | 168,0,3,"Skoog, Mrs. William (Anna Bernhardina Karlsson)",female,45,1,4,347088,27.9,,S 170 | 169,0,1,"Baumann, Mr. John D",male,,0,0,PC 17318,25.925,,S 171 | 170,0,3,"Ling, Mr. Lee",male,28,0,0,1601,56.4958,,S 172 | 171,0,1,"Van der hoef, Mr. Wyckoff",male,61,0,0,111240,33.5,B19,S 173 | 172,0,3,"Rice, Master. Arthur",male,4,4,1,382652,29.125,,Q 174 | 173,1,3,"Johnson, Miss. Eleanor Ileen",female,1,1,1,347742,11.1333,,S 175 | 174,0,3,"Sivola, Mr. Antti Wilhelm",male,21,0,0,STON/O 2. 3101280,7.925,,S 176 | 175,0,1,"Smith, Mr. James Clinch",male,56,0,0,17764,30.6958,A7,C 177 | 176,0,3,"Klasen, Mr. Klas Albin",male,18,1,1,350404,7.8542,,S 178 | 177,0,3,"Lefebre, Master. Henry Forbes",male,,3,1,4133,25.4667,,S 179 | 178,0,1,"Isham, Miss. Ann Elizabeth",female,50,0,0,PC 17595,28.7125,C49,C 180 | 179,0,2,"Hale, Mr. Reginald",male,30,0,0,250653,13,,S 181 | 180,0,3,"Leonard, Mr. Lionel",male,36,0,0,LINE,0,,S 182 | 181,0,3,"Sage, Miss. Constance Gladys",female,,8,2,CA. 2343,69.55,,S 183 | 182,0,2,"Pernot, Mr. Rene",male,,0,0,SC/PARIS 2131,15.05,,C 184 | 183,0,3,"Asplund, Master. Clarence Gustaf Hugo",male,9,4,2,347077,31.3875,,S 185 | 184,1,2,"Becker, Master. Richard F",male,1,2,1,230136,39,F4,S 186 | 185,1,3,"Kink-Heilmann, Miss. Luise Gretchen",female,4,0,2,315153,22.025,,S 187 | 186,0,1,"Rood, Mr. Hugh Roscoe",male,,0,0,113767,50,A32,S 188 | 187,1,3,"O'Brien, Mrs. Thomas (Johanna ""Hannah"" Godfrey)",female,,1,0,370365,15.5,,Q 189 | 188,1,1,"Romaine, Mr. Charles Hallace (""Mr C Rolmane"")",male,45,0,0,111428,26.55,,S 190 | 189,0,3,"Bourke, Mr. John",male,40,1,1,364849,15.5,,Q 191 | 190,0,3,"Turcin, Mr. Stjepan",male,36,0,0,349247,7.8958,,S 192 | 191,1,2,"Pinsky, Mrs. (Rosa)",female,32,0,0,234604,13,,S 193 | 192,0,2,"Carbines, Mr. William",male,19,0,0,28424,13,,S 194 | 193,1,3,"Andersen-Jensen, Miss. Carla Christine Nielsine",female,19,1,0,350046,7.8542,,S 195 | 194,1,2,"Navratil, Master. Michel M",male,3,1,1,230080,26,F2,S 196 | 195,1,1,"Brown, Mrs. James Joseph (Margaret Tobin)",female,44,0,0,PC 17610,27.7208,B4,C 197 | 196,1,1,"Lurette, Miss. Elise",female,58,0,0,PC 17569,146.5208,B80,C 198 | 197,0,3,"Mernagh, Mr. Robert",male,,0,0,368703,7.75,,Q 199 | 198,0,3,"Olsen, Mr. Karl Siegwart Andreas",male,42,0,1,4579,8.4042,,S 200 | 199,1,3,"Madigan, Miss. Margaret ""Maggie""",female,,0,0,370370,7.75,,Q 201 | 200,0,2,"Yrois, Miss. Henriette (""Mrs Harbeck"")",female,24,0,0,248747,13,,S 202 | 201,0,3,"Vande Walle, Mr. Nestor Cyriel",male,28,0,0,345770,9.5,,S 203 | 202,0,3,"Sage, Mr. Frederick",male,,8,2,CA. 2343,69.55,,S 204 | 203,0,3,"Johanson, Mr. Jakob Alfred",male,34,0,0,3101264,6.4958,,S 205 | 204,0,3,"Youseff, Mr. Gerious",male,45.5,0,0,2628,7.225,,C 206 | 205,1,3,"Cohen, Mr. Gurshon ""Gus""",male,18,0,0,A/5 3540,8.05,,S 207 | 206,0,3,"Strom, Miss. Telma Matilda",female,2,0,1,347054,10.4625,G6,S 208 | 207,0,3,"Backstrom, Mr. Karl Alfred",male,32,1,0,3101278,15.85,,S 209 | 208,1,3,"Albimona, Mr. Nassef Cassem",male,26,0,0,2699,18.7875,,C 210 | 209,1,3,"Carr, Miss. Helen ""Ellen""",female,16,0,0,367231,7.75,,Q 211 | 210,1,1,"Blank, Mr. Henry",male,40,0,0,112277,31,A31,C 212 | 211,0,3,"Ali, Mr. Ahmed",male,24,0,0,SOTON/O.Q. 3101311,7.05,,S 213 | 212,1,2,"Cameron, Miss. Clear Annie",female,35,0,0,F.C.C. 13528,21,,S 214 | 213,0,3,"Perkin, Mr. John Henry",male,22,0,0,A/5 21174,7.25,,S 215 | 214,0,2,"Givard, Mr. Hans Kristensen",male,30,0,0,250646,13,,S 216 | 215,0,3,"Kiernan, Mr. Philip",male,,1,0,367229,7.75,,Q 217 | 216,1,1,"Newell, Miss. Madeleine",female,31,1,0,35273,113.275,D36,C 218 | 217,1,3,"Honkanen, Miss. Eliina",female,27,0,0,STON/O2. 3101283,7.925,,S 219 | 218,0,2,"Jacobsohn, Mr. Sidney Samuel",male,42,1,0,243847,27,,S 220 | 219,1,1,"Bazzani, Miss. Albina",female,32,0,0,11813,76.2917,D15,C 221 | 220,0,2,"Harris, Mr. Walter",male,30,0,0,W/C 14208,10.5,,S 222 | 221,1,3,"Sunderland, Mr. Victor Francis",male,16,0,0,SOTON/OQ 392089,8.05,,S 223 | 222,0,2,"Bracken, Mr. James H",male,27,0,0,220367,13,,S 224 | 223,0,3,"Green, Mr. George Henry",male,51,0,0,21440,8.05,,S 225 | 224,0,3,"Nenkoff, Mr. Christo",male,,0,0,349234,7.8958,,S 226 | 225,1,1,"Hoyt, Mr. Frederick Maxfield",male,38,1,0,19943,90,C93,S 227 | 226,0,3,"Berglund, Mr. Karl Ivar Sven",male,22,0,0,PP 4348,9.35,,S 228 | 227,1,2,"Mellors, Mr. William John",male,19,0,0,SW/PP 751,10.5,,S 229 | 228,0,3,"Lovell, Mr. John Hall (""Henry"")",male,20.5,0,0,A/5 21173,7.25,,S 230 | 229,0,2,"Fahlstrom, Mr. Arne Jonas",male,18,0,0,236171,13,,S 231 | 230,0,3,"Lefebre, Miss. Mathilde",female,,3,1,4133,25.4667,,S 232 | 231,1,1,"Harris, Mrs. Henry Birkhardt (Irene Wallach)",female,35,1,0,36973,83.475,C83,S 233 | 232,0,3,"Larsson, Mr. Bengt Edvin",male,29,0,0,347067,7.775,,S 234 | 233,0,2,"Sjostedt, Mr. Ernst Adolf",male,59,0,0,237442,13.5,,S 235 | 234,1,3,"Asplund, Miss. Lillian Gertrud",female,5,4,2,347077,31.3875,,S 236 | 235,0,2,"Leyson, Mr. Robert William Norman",male,24,0,0,C.A. 29566,10.5,,S 237 | 236,0,3,"Harknett, Miss. Alice Phoebe",female,,0,0,W./C. 6609,7.55,,S 238 | 237,0,2,"Hold, Mr. Stephen",male,44,1,0,26707,26,,S 239 | 238,1,2,"Collyer, Miss. Marjorie ""Lottie""",female,8,0,2,C.A. 31921,26.25,,S 240 | 239,0,2,"Pengelly, Mr. Frederick William",male,19,0,0,28665,10.5,,S 241 | 240,0,2,"Hunt, Mr. George Henry",male,33,0,0,SCO/W 1585,12.275,,S 242 | 241,0,3,"Zabour, Miss. Thamine",female,,1,0,2665,14.4542,,C 243 | 242,1,3,"Murphy, Miss. Katherine ""Kate""",female,,1,0,367230,15.5,,Q 244 | 243,0,2,"Coleridge, Mr. Reginald Charles",male,29,0,0,W./C. 14263,10.5,,S 245 | 244,0,3,"Maenpaa, Mr. Matti Alexanteri",male,22,0,0,STON/O 2. 3101275,7.125,,S 246 | 245,0,3,"Attalah, Mr. Sleiman",male,30,0,0,2694,7.225,,C 247 | 246,0,1,"Minahan, Dr. William Edward",male,44,2,0,19928,90,C78,Q 248 | 247,0,3,"Lindahl, Miss. Agda Thorilda Viktoria",female,25,0,0,347071,7.775,,S 249 | 248,1,2,"Hamalainen, Mrs. William (Anna)",female,24,0,2,250649,14.5,,S 250 | 249,1,1,"Beckwith, Mr. Richard Leonard",male,37,1,1,11751,52.5542,D35,S 251 | 250,0,2,"Carter, Rev. Ernest Courtenay",male,54,1,0,244252,26,,S 252 | 251,0,3,"Reed, Mr. James George",male,,0,0,362316,7.25,,S 253 | 252,0,3,"Strom, Mrs. Wilhelm (Elna Matilda Persson)",female,29,1,1,347054,10.4625,G6,S 254 | 253,0,1,"Stead, Mr. William Thomas",male,62,0,0,113514,26.55,C87,S 255 | 254,0,3,"Lobb, Mr. William Arthur",male,30,1,0,A/5. 3336,16.1,,S 256 | 255,0,3,"Rosblom, Mrs. Viktor (Helena Wilhelmina)",female,41,0,2,370129,20.2125,,S 257 | 256,1,3,"Touma, Mrs. Darwis (Hanne Youssef Razi)",female,29,0,2,2650,15.2458,,C 258 | 257,1,1,"Thorne, Mrs. Gertrude Maybelle",female,,0,0,PC 17585,79.2,,C 259 | 258,1,1,"Cherry, Miss. Gladys",female,30,0,0,110152,86.5,B77,S 260 | 259,1,1,"Ward, Miss. Anna",female,35,0,0,PC 17755,512.3292,,C 261 | 260,1,2,"Parrish, Mrs. (Lutie Davis)",female,50,0,1,230433,26,,S 262 | 261,0,3,"Smith, Mr. Thomas",male,,0,0,384461,7.75,,Q 263 | 262,1,3,"Asplund, Master. Edvin Rojj Felix",male,3,4,2,347077,31.3875,,S 264 | 263,0,1,"Taussig, Mr. Emil",male,52,1,1,110413,79.65,E67,S 265 | 264,0,1,"Harrison, Mr. William",male,40,0,0,112059,0,B94,S 266 | 265,0,3,"Henry, Miss. Delia",female,,0,0,382649,7.75,,Q 267 | 266,0,2,"Reeves, Mr. David",male,36,0,0,C.A. 17248,10.5,,S 268 | 267,0,3,"Panula, Mr. Ernesti Arvid",male,16,4,1,3101295,39.6875,,S 269 | 268,1,3,"Persson, Mr. Ernst Ulrik",male,25,1,0,347083,7.775,,S 270 | 269,1,1,"Graham, Mrs. William Thompson (Edith Junkins)",female,58,0,1,PC 17582,153.4625,C125,S 271 | 270,1,1,"Bissette, Miss. Amelia",female,35,0,0,PC 17760,135.6333,C99,S 272 | 271,0,1,"Cairns, Mr. Alexander",male,,0,0,113798,31,,S 273 | 272,1,3,"Tornquist, Mr. William Henry",male,25,0,0,LINE,0,,S 274 | 273,1,2,"Mellinger, Mrs. (Elizabeth Anne Maidment)",female,41,0,1,250644,19.5,,S 275 | 274,0,1,"Natsch, Mr. Charles H",male,37,0,1,PC 17596,29.7,C118,C 276 | 275,1,3,"Healy, Miss. Hanora ""Nora""",female,,0,0,370375,7.75,,Q 277 | 276,1,1,"Andrews, Miss. Kornelia Theodosia",female,63,1,0,13502,77.9583,D7,S 278 | 277,0,3,"Lindblom, Miss. Augusta Charlotta",female,45,0,0,347073,7.75,,S 279 | 278,0,2,"Parkes, Mr. Francis ""Frank""",male,,0,0,239853,0,,S 280 | 279,0,3,"Rice, Master. Eric",male,7,4,1,382652,29.125,,Q 281 | 280,1,3,"Abbott, Mrs. Stanton (Rosa Hunt)",female,35,1,1,C.A. 2673,20.25,,S 282 | 281,0,3,"Duane, Mr. Frank",male,65,0,0,336439,7.75,,Q 283 | 282,0,3,"Olsson, Mr. Nils Johan Goransson",male,28,0,0,347464,7.8542,,S 284 | 283,0,3,"de Pelsmaeker, Mr. Alfons",male,16,0,0,345778,9.5,,S 285 | 284,1,3,"Dorking, Mr. Edward Arthur",male,19,0,0,A/5. 10482,8.05,,S 286 | 285,0,1,"Smith, Mr. Richard William",male,,0,0,113056,26,A19,S 287 | 286,0,3,"Stankovic, Mr. Ivan",male,33,0,0,349239,8.6625,,C 288 | 287,1,3,"de Mulder, Mr. Theodore",male,30,0,0,345774,9.5,,S 289 | 288,0,3,"Naidenoff, Mr. Penko",male,22,0,0,349206,7.8958,,S 290 | 289,1,2,"Hosono, Mr. Masabumi",male,42,0,0,237798,13,,S 291 | 290,1,3,"Connolly, Miss. Kate",female,22,0,0,370373,7.75,,Q 292 | 291,1,1,"Barber, Miss. Ellen ""Nellie""",female,26,0,0,19877,78.85,,S 293 | 292,1,1,"Bishop, Mrs. Dickinson H (Helen Walton)",female,19,1,0,11967,91.0792,B49,C 294 | 293,0,2,"Levy, Mr. Rene Jacques",male,36,0,0,SC/Paris 2163,12.875,D,C 295 | 294,0,3,"Haas, Miss. Aloisia",female,24,0,0,349236,8.85,,S 296 | 295,0,3,"Mineff, Mr. Ivan",male,24,0,0,349233,7.8958,,S 297 | 296,0,1,"Lewy, Mr. Ervin G",male,,0,0,PC 17612,27.7208,,C 298 | 297,0,3,"Hanna, Mr. Mansour",male,23.5,0,0,2693,7.2292,,C 299 | 298,0,1,"Allison, Miss. Helen Loraine",female,2,1,2,113781,151.55,C22 C26,S 300 | 299,1,1,"Saalfeld, Mr. Adolphe",male,,0,0,19988,30.5,C106,S 301 | 300,1,1,"Baxter, Mrs. James (Helene DeLaudeniere Chaput)",female,50,0,1,PC 17558,247.5208,B58 B60,C 302 | 301,1,3,"Kelly, Miss. Anna Katherine ""Annie Kate""",female,,0,0,9234,7.75,,Q 303 | 302,1,3,"McCoy, Mr. Bernard",male,,2,0,367226,23.25,,Q 304 | 303,0,3,"Johnson, Mr. William Cahoone Jr",male,19,0,0,LINE,0,,S 305 | 304,1,2,"Keane, Miss. Nora A",female,,0,0,226593,12.35,E101,Q 306 | 305,0,3,"Williams, Mr. Howard Hugh ""Harry""",male,,0,0,A/5 2466,8.05,,S 307 | 306,1,1,"Allison, Master. Hudson Trevor",male,0.92,1,2,113781,151.55,C22 C26,S 308 | 307,1,1,"Fleming, Miss. Margaret",female,,0,0,17421,110.8833,,C 309 | 308,1,1,"Penasco y Castellana, Mrs. Victor de Satode (Maria Josefa Perez de Soto y Vallejo)",female,17,1,0,PC 17758,108.9,C65,C 310 | 309,0,2,"Abelson, Mr. Samuel",male,30,1,0,P/PP 3381,24,,C 311 | 310,1,1,"Francatelli, Miss. Laura Mabel",female,30,0,0,PC 17485,56.9292,E36,C 312 | 311,1,1,"Hays, Miss. Margaret Bechstein",female,24,0,0,11767,83.1583,C54,C 313 | 312,1,1,"Ryerson, Miss. Emily Borie",female,18,2,2,PC 17608,262.375,B57 B59 B63 B66,C 314 | 313,0,2,"Lahtinen, Mrs. William (Anna Sylfven)",female,26,1,1,250651,26,,S 315 | 314,0,3,"Hendekovic, Mr. Ignjac",male,28,0,0,349243,7.8958,,S 316 | 315,0,2,"Hart, Mr. Benjamin",male,43,1,1,F.C.C. 13529,26.25,,S 317 | 316,1,3,"Nilsson, Miss. Helmina Josefina",female,26,0,0,347470,7.8542,,S 318 | 317,1,2,"Kantor, Mrs. Sinai (Miriam Sternin)",female,24,1,0,244367,26,,S 319 | 318,0,2,"Moraweck, Dr. Ernest",male,54,0,0,29011,14,,S 320 | 319,1,1,"Wick, Miss. Mary Natalie",female,31,0,2,36928,164.8667,C7,S 321 | 320,1,1,"Spedden, Mrs. Frederic Oakley (Margaretta Corning Stone)",female,40,1,1,16966,134.5,E34,C 322 | 321,0,3,"Dennis, Mr. Samuel",male,22,0,0,A/5 21172,7.25,,S 323 | 322,0,3,"Danoff, Mr. Yoto",male,27,0,0,349219,7.8958,,S 324 | 323,1,2,"Slayter, Miss. Hilda Mary",female,30,0,0,234818,12.35,,Q 325 | 324,1,2,"Caldwell, Mrs. Albert Francis (Sylvia Mae Harbaugh)",female,22,1,1,248738,29,,S 326 | 325,0,3,"Sage, Mr. George John Jr",male,,8,2,CA. 2343,69.55,,S 327 | 326,1,1,"Young, Miss. Marie Grice",female,36,0,0,PC 17760,135.6333,C32,C 328 | 327,0,3,"Nysveen, Mr. Johan Hansen",male,61,0,0,345364,6.2375,,S 329 | 328,1,2,"Ball, Mrs. (Ada E Hall)",female,36,0,0,28551,13,D,S 330 | 329,1,3,"Goldsmith, Mrs. Frank John (Emily Alice Brown)",female,31,1,1,363291,20.525,,S 331 | 330,1,1,"Hippach, Miss. Jean Gertrude",female,16,0,1,111361,57.9792,B18,C 332 | 331,1,3,"McCoy, Miss. Agnes",female,,2,0,367226,23.25,,Q 333 | 332,0,1,"Partner, Mr. Austen",male,45.5,0,0,113043,28.5,C124,S 334 | 333,0,1,"Graham, Mr. George Edward",male,38,0,1,PC 17582,153.4625,C91,S 335 | 334,0,3,"Vander Planke, Mr. Leo Edmondus",male,16,2,0,345764,18,,S 336 | 335,1,1,"Frauenthal, Mrs. Henry William (Clara Heinsheimer)",female,,1,0,PC 17611,133.65,,S 337 | 336,0,3,"Denkoff, Mr. Mitto",male,,0,0,349225,7.8958,,S 338 | 337,0,1,"Pears, Mr. Thomas Clinton",male,29,1,0,113776,66.6,C2,S 339 | 338,1,1,"Burns, Miss. Elizabeth Margaret",female,41,0,0,16966,134.5,E40,C 340 | 339,1,3,"Dahl, Mr. Karl Edwart",male,45,0,0,7598,8.05,,S 341 | 340,0,1,"Blackwell, Mr. Stephen Weart",male,45,0,0,113784,35.5,T,S 342 | 341,1,2,"Navratil, Master. Edmond Roger",male,2,1,1,230080,26,F2,S 343 | 342,1,1,"Fortune, Miss. Alice Elizabeth",female,24,3,2,19950,263,C23 C25 C27,S 344 | 343,0,2,"Collander, Mr. Erik Gustaf",male,28,0,0,248740,13,,S 345 | 344,0,2,"Sedgwick, Mr. Charles Frederick Waddington",male,25,0,0,244361,13,,S 346 | 345,0,2,"Fox, Mr. Stanley Hubert",male,36,0,0,229236,13,,S 347 | 346,1,2,"Brown, Miss. Amelia ""Mildred""",female,24,0,0,248733,13,F33,S 348 | 347,1,2,"Smith, Miss. Marion Elsie",female,40,0,0,31418,13,,S 349 | 348,1,3,"Davison, Mrs. Thomas Henry (Mary E Finck)",female,,1,0,386525,16.1,,S 350 | 349,1,3,"Coutts, Master. William Loch ""William""",male,3,1,1,C.A. 37671,15.9,,S 351 | 350,0,3,"Dimic, Mr. Jovan",male,42,0,0,315088,8.6625,,S 352 | 351,0,3,"Odahl, Mr. Nils Martin",male,23,0,0,7267,9.225,,S 353 | 352,0,1,"Williams-Lambert, Mr. Fletcher Fellows",male,,0,0,113510,35,C128,S 354 | 353,0,3,"Elias, Mr. Tannous",male,15,1,1,2695,7.2292,,C 355 | 354,0,3,"Arnold-Franchi, Mr. Josef",male,25,1,0,349237,17.8,,S 356 | 355,0,3,"Yousif, Mr. Wazli",male,,0,0,2647,7.225,,C 357 | 356,0,3,"Vanden Steen, Mr. Leo Peter",male,28,0,0,345783,9.5,,S 358 | 357,1,1,"Bowerman, Miss. Elsie Edith",female,22,0,1,113505,55,E33,S 359 | 358,0,2,"Funk, Miss. Annie Clemmer",female,38,0,0,237671,13,,S 360 | 359,1,3,"McGovern, Miss. Mary",female,,0,0,330931,7.8792,,Q 361 | 360,1,3,"Mockler, Miss. Helen Mary ""Ellie""",female,,0,0,330980,7.8792,,Q 362 | 361,0,3,"Skoog, Mr. Wilhelm",male,40,1,4,347088,27.9,,S 363 | 362,0,2,"del Carlo, Mr. Sebastiano",male,29,1,0,SC/PARIS 2167,27.7208,,C 364 | 363,0,3,"Barbara, Mrs. (Catherine David)",female,45,0,1,2691,14.4542,,C 365 | 364,0,3,"Asim, Mr. Adola",male,35,0,0,SOTON/O.Q. 3101310,7.05,,S 366 | 365,0,3,"O'Brien, Mr. Thomas",male,,1,0,370365,15.5,,Q 367 | 366,0,3,"Adahl, Mr. Mauritz Nils Martin",male,30,0,0,C 7076,7.25,,S 368 | 367,1,1,"Warren, Mrs. Frank Manley (Anna Sophia Atkinson)",female,60,1,0,110813,75.25,D37,C 369 | 368,1,3,"Moussa, Mrs. (Mantoura Boulos)",female,,0,0,2626,7.2292,,C 370 | 369,1,3,"Jermyn, Miss. Annie",female,,0,0,14313,7.75,,Q 371 | 370,1,1,"Aubart, Mme. Leontine Pauline",female,24,0,0,PC 17477,69.3,B35,C 372 | 371,1,1,"Harder, Mr. George Achilles",male,25,1,0,11765,55.4417,E50,C 373 | 372,0,3,"Wiklund, Mr. Jakob Alfred",male,18,1,0,3101267,6.4958,,S 374 | 373,0,3,"Beavan, Mr. William Thomas",male,19,0,0,323951,8.05,,S 375 | 374,0,1,"Ringhini, Mr. Sante",male,22,0,0,PC 17760,135.6333,,C 376 | 375,0,3,"Palsson, Miss. Stina Viola",female,3,3,1,349909,21.075,,S 377 | 376,1,1,"Meyer, Mrs. Edgar Joseph (Leila Saks)",female,,1,0,PC 17604,82.1708,,C 378 | 377,1,3,"Landergren, Miss. Aurora Adelia",female,22,0,0,C 7077,7.25,,S 379 | 378,0,1,"Widener, Mr. Harry Elkins",male,27,0,2,113503,211.5,C82,C 380 | 379,0,3,"Betros, Mr. Tannous",male,20,0,0,2648,4.0125,,C 381 | 380,0,3,"Gustafsson, Mr. Karl Gideon",male,19,0,0,347069,7.775,,S 382 | 381,1,1,"Bidois, Miss. Rosalie",female,42,0,0,PC 17757,227.525,,C 383 | 382,1,3,"Nakid, Miss. Maria (""Mary"")",female,1,0,2,2653,15.7417,,C 384 | 383,0,3,"Tikkanen, Mr. Juho",male,32,0,0,STON/O 2. 3101293,7.925,,S 385 | 384,1,1,"Holverson, Mrs. Alexander Oskar (Mary Aline Towner)",female,35,1,0,113789,52,,S 386 | 385,0,3,"Plotcharsky, Mr. Vasil",male,,0,0,349227,7.8958,,S 387 | 386,0,2,"Davies, Mr. Charles Henry",male,18,0,0,S.O.C. 14879,73.5,,S 388 | 387,0,3,"Goodwin, Master. Sidney Leonard",male,1,5,2,CA 2144,46.9,,S 389 | 388,1,2,"Buss, Miss. Kate",female,36,0,0,27849,13,,S 390 | 389,0,3,"Sadlier, Mr. Matthew",male,,0,0,367655,7.7292,,Q 391 | 390,1,2,"Lehmann, Miss. Bertha",female,17,0,0,SC 1748,12,,C 392 | 391,1,1,"Carter, Mr. William Ernest",male,36,1,2,113760,120,B96 B98,S 393 | 392,1,3,"Jansson, Mr. Carl Olof",male,21,0,0,350034,7.7958,,S 394 | 393,0,3,"Gustafsson, Mr. Johan Birger",male,28,2,0,3101277,7.925,,S 395 | 394,1,1,"Newell, Miss. Marjorie",female,23,1,0,35273,113.275,D36,C 396 | 395,1,3,"Sandstrom, Mrs. Hjalmar (Agnes Charlotta Bengtsson)",female,24,0,2,PP 9549,16.7,G6,S 397 | 396,0,3,"Johansson, Mr. Erik",male,22,0,0,350052,7.7958,,S 398 | 397,0,3,"Olsson, Miss. Elina",female,31,0,0,350407,7.8542,,S 399 | 398,0,2,"McKane, Mr. Peter David",male,46,0,0,28403,26,,S 400 | 399,0,2,"Pain, Dr. Alfred",male,23,0,0,244278,10.5,,S 401 | 400,1,2,"Trout, Mrs. William H (Jessie L)",female,28,0,0,240929,12.65,,S 402 | 401,1,3,"Niskanen, Mr. Juha",male,39,0,0,STON/O 2. 3101289,7.925,,S 403 | 402,0,3,"Adams, Mr. John",male,26,0,0,341826,8.05,,S 404 | 403,0,3,"Jussila, Miss. Mari Aina",female,21,1,0,4137,9.825,,S 405 | 404,0,3,"Hakkarainen, Mr. Pekka Pietari",male,28,1,0,STON/O2. 3101279,15.85,,S 406 | 405,0,3,"Oreskovic, Miss. Marija",female,20,0,0,315096,8.6625,,S 407 | 406,0,2,"Gale, Mr. Shadrach",male,34,1,0,28664,21,,S 408 | 407,0,3,"Widegren, Mr. Carl/Charles Peter",male,51,0,0,347064,7.75,,S 409 | 408,1,2,"Richards, Master. William Rowe",male,3,1,1,29106,18.75,,S 410 | 409,0,3,"Birkeland, Mr. Hans Martin Monsen",male,21,0,0,312992,7.775,,S 411 | 410,0,3,"Lefebre, Miss. Ida",female,,3,1,4133,25.4667,,S 412 | 411,0,3,"Sdycoff, Mr. Todor",male,,0,0,349222,7.8958,,S 413 | 412,0,3,"Hart, Mr. Henry",male,,0,0,394140,6.8583,,Q 414 | 413,1,1,"Minahan, Miss. Daisy E",female,33,1,0,19928,90,C78,Q 415 | 414,0,2,"Cunningham, Mr. Alfred Fleming",male,,0,0,239853,0,,S 416 | 415,1,3,"Sundman, Mr. Johan Julian",male,44,0,0,STON/O 2. 3101269,7.925,,S 417 | 416,0,3,"Meek, Mrs. Thomas (Annie Louise Rowley)",female,,0,0,343095,8.05,,S 418 | 417,1,2,"Drew, Mrs. James Vivian (Lulu Thorne Christian)",female,34,1,1,28220,32.5,,S 419 | 418,1,2,"Silven, Miss. Lyyli Karoliina",female,18,0,2,250652,13,,S 420 | 419,0,2,"Matthews, Mr. William John",male,30,0,0,28228,13,,S 421 | 420,0,3,"Van Impe, Miss. Catharina",female,10,0,2,345773,24.15,,S 422 | 421,0,3,"Gheorgheff, Mr. Stanio",male,,0,0,349254,7.8958,,C 423 | 422,0,3,"Charters, Mr. David",male,21,0,0,A/5. 13032,7.7333,,Q 424 | 423,0,3,"Zimmerman, Mr. Leo",male,29,0,0,315082,7.875,,S 425 | 424,0,3,"Danbom, Mrs. Ernst Gilbert (Anna Sigrid Maria Brogren)",female,28,1,1,347080,14.4,,S 426 | 425,0,3,"Rosblom, Mr. Viktor Richard",male,18,1,1,370129,20.2125,,S 427 | 426,0,3,"Wiseman, Mr. Phillippe",male,,0,0,A/4. 34244,7.25,,S 428 | 427,1,2,"Clarke, Mrs. Charles V (Ada Maria Winfield)",female,28,1,0,2003,26,,S 429 | 428,1,2,"Phillips, Miss. Kate Florence (""Mrs Kate Louise Phillips Marshall"")",female,19,0,0,250655,26,,S 430 | 429,0,3,"Flynn, Mr. James",male,,0,0,364851,7.75,,Q 431 | 430,1,3,"Pickard, Mr. Berk (Berk Trembisky)",male,32,0,0,SOTON/O.Q. 392078,8.05,E10,S 432 | 431,1,1,"Bjornstrom-Steffansson, Mr. Mauritz Hakan",male,28,0,0,110564,26.55,C52,S 433 | 432,1,3,"Thorneycroft, Mrs. Percival (Florence Kate White)",female,,1,0,376564,16.1,,S 434 | 433,1,2,"Louch, Mrs. Charles Alexander (Alice Adelaide Slow)",female,42,1,0,SC/AH 3085,26,,S 435 | 434,0,3,"Kallio, Mr. Nikolai Erland",male,17,0,0,STON/O 2. 3101274,7.125,,S 436 | 435,0,1,"Silvey, Mr. William Baird",male,50,1,0,13507,55.9,E44,S 437 | 436,1,1,"Carter, Miss. Lucile Polk",female,14,1,2,113760,120,B96 B98,S 438 | 437,0,3,"Ford, Miss. Doolina Margaret ""Daisy""",female,21,2,2,W./C. 6608,34.375,,S 439 | 438,1,2,"Richards, Mrs. Sidney (Emily Hocking)",female,24,2,3,29106,18.75,,S 440 | 439,0,1,"Fortune, Mr. Mark",male,64,1,4,19950,263,C23 C25 C27,S 441 | 440,0,2,"Kvillner, Mr. Johan Henrik Johannesson",male,31,0,0,C.A. 18723,10.5,,S 442 | 441,1,2,"Hart, Mrs. Benjamin (Esther Ada Bloomfield)",female,45,1,1,F.C.C. 13529,26.25,,S 443 | 442,0,3,"Hampe, Mr. Leon",male,20,0,0,345769,9.5,,S 444 | 443,0,3,"Petterson, Mr. Johan Emil",male,25,1,0,347076,7.775,,S 445 | 444,1,2,"Reynaldo, Ms. Encarnacion",female,28,0,0,230434,13,,S 446 | 445,1,3,"Johannesen-Bratthammer, Mr. Bernt",male,,0,0,65306,8.1125,,S 447 | 446,1,1,"Dodge, Master. Washington",male,4,0,2,33638,81.8583,A34,S 448 | 447,1,2,"Mellinger, Miss. Madeleine Violet",female,13,0,1,250644,19.5,,S 449 | 448,1,1,"Seward, Mr. Frederic Kimber",male,34,0,0,113794,26.55,,S 450 | 449,1,3,"Baclini, Miss. Marie Catherine",female,5,2,1,2666,19.2583,,C 451 | 450,1,1,"Peuchen, Major. Arthur Godfrey",male,52,0,0,113786,30.5,C104,S 452 | 451,0,2,"West, Mr. Edwy Arthur",male,36,1,2,C.A. 34651,27.75,,S 453 | 452,0,3,"Hagland, Mr. Ingvald Olai Olsen",male,,1,0,65303,19.9667,,S 454 | 453,0,1,"Foreman, Mr. Benjamin Laventall",male,30,0,0,113051,27.75,C111,C 455 | 454,1,1,"Goldenberg, Mr. Samuel L",male,49,1,0,17453,89.1042,C92,C 456 | 455,0,3,"Peduzzi, Mr. Joseph",male,,0,0,A/5 2817,8.05,,S 457 | 456,1,3,"Jalsevac, Mr. Ivan",male,29,0,0,349240,7.8958,,C 458 | 457,0,1,"Millet, Mr. Francis Davis",male,65,0,0,13509,26.55,E38,S 459 | 458,1,1,"Kenyon, Mrs. Frederick R (Marion)",female,,1,0,17464,51.8625,D21,S 460 | 459,1,2,"Toomey, Miss. Ellen",female,50,0,0,F.C.C. 13531,10.5,,S 461 | 460,0,3,"O'Connor, Mr. Maurice",male,,0,0,371060,7.75,,Q 462 | 461,1,1,"Anderson, Mr. Harry",male,48,0,0,19952,26.55,E12,S 463 | 462,0,3,"Morley, Mr. William",male,34,0,0,364506,8.05,,S 464 | 463,0,1,"Gee, Mr. Arthur H",male,47,0,0,111320,38.5,E63,S 465 | 464,0,2,"Milling, Mr. Jacob Christian",male,48,0,0,234360,13,,S 466 | 465,0,3,"Maisner, Mr. Simon",male,,0,0,A/S 2816,8.05,,S 467 | 466,0,3,"Goncalves, Mr. Manuel Estanslas",male,38,0,0,SOTON/O.Q. 3101306,7.05,,S 468 | 467,0,2,"Campbell, Mr. William",male,,0,0,239853,0,,S 469 | 468,0,1,"Smart, Mr. John Montgomery",male,56,0,0,113792,26.55,,S 470 | 469,0,3,"Scanlan, Mr. James",male,,0,0,36209,7.725,,Q 471 | 470,1,3,"Baclini, Miss. Helene Barbara",female,0.75,2,1,2666,19.2583,,C 472 | 471,0,3,"Keefe, Mr. Arthur",male,,0,0,323592,7.25,,S 473 | 472,0,3,"Cacic, Mr. Luka",male,38,0,0,315089,8.6625,,S 474 | 473,1,2,"West, Mrs. Edwy Arthur (Ada Mary Worth)",female,33,1,2,C.A. 34651,27.75,,S 475 | 474,1,2,"Jerwan, Mrs. Amin S (Marie Marthe Thuillard)",female,23,0,0,SC/AH Basle 541,13.7917,D,C 476 | 475,0,3,"Strandberg, Miss. Ida Sofia",female,22,0,0,7553,9.8375,,S 477 | 476,0,1,"Clifford, Mr. George Quincy",male,,0,0,110465,52,A14,S 478 | 477,0,2,"Renouf, Mr. Peter Henry",male,34,1,0,31027,21,,S 479 | 478,0,3,"Braund, Mr. Lewis Richard",male,29,1,0,3460,7.0458,,S 480 | 479,0,3,"Karlsson, Mr. Nils August",male,22,0,0,350060,7.5208,,S 481 | 480,1,3,"Hirvonen, Miss. Hildur E",female,2,0,1,3101298,12.2875,,S 482 | 481,0,3,"Goodwin, Master. Harold Victor",male,9,5,2,CA 2144,46.9,,S 483 | 482,0,2,"Frost, Mr. Anthony Wood ""Archie""",male,,0,0,239854,0,,S 484 | 483,0,3,"Rouse, Mr. Richard Henry",male,50,0,0,A/5 3594,8.05,,S 485 | 484,1,3,"Turkula, Mrs. (Hedwig)",female,63,0,0,4134,9.5875,,S 486 | 485,1,1,"Bishop, Mr. Dickinson H",male,25,1,0,11967,91.0792,B49,C 487 | 486,0,3,"Lefebre, Miss. Jeannie",female,,3,1,4133,25.4667,,S 488 | 487,1,1,"Hoyt, Mrs. Frederick Maxfield (Jane Anne Forby)",female,35,1,0,19943,90,C93,S 489 | 488,0,1,"Kent, Mr. Edward Austin",male,58,0,0,11771,29.7,B37,C 490 | 489,0,3,"Somerton, Mr. Francis William",male,30,0,0,A.5. 18509,8.05,,S 491 | 490,1,3,"Coutts, Master. Eden Leslie ""Neville""",male,9,1,1,C.A. 37671,15.9,,S 492 | 491,0,3,"Hagland, Mr. Konrad Mathias Reiersen",male,,1,0,65304,19.9667,,S 493 | 492,0,3,"Windelov, Mr. Einar",male,21,0,0,SOTON/OQ 3101317,7.25,,S 494 | 493,0,1,"Molson, Mr. Harry Markland",male,55,0,0,113787,30.5,C30,S 495 | 494,0,1,"Artagaveytia, Mr. Ramon",male,71,0,0,PC 17609,49.5042,,C 496 | 495,0,3,"Stanley, Mr. Edward Roland",male,21,0,0,A/4 45380,8.05,,S 497 | 496,0,3,"Yousseff, Mr. Gerious",male,,0,0,2627,14.4583,,C 498 | 497,1,1,"Eustis, Miss. Elizabeth Mussey",female,54,1,0,36947,78.2667,D20,C 499 | 498,0,3,"Shellard, Mr. Frederick William",male,,0,0,C.A. 6212,15.1,,S 500 | 499,0,1,"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)",female,25,1,2,113781,151.55,C22 C26,S 501 | 500,0,3,"Svensson, Mr. Olof",male,24,0,0,350035,7.7958,,S 502 | 501,0,3,"Calic, Mr. Petar",male,17,0,0,315086,8.6625,,S 503 | 502,0,3,"Canavan, Miss. Mary",female,21,0,0,364846,7.75,,Q 504 | 503,0,3,"O'Sullivan, Miss. Bridget Mary",female,,0,0,330909,7.6292,,Q 505 | 504,0,3,"Laitinen, Miss. Kristina Sofia",female,37,0,0,4135,9.5875,,S 506 | 505,1,1,"Maioni, Miss. Roberta",female,16,0,0,110152,86.5,B79,S 507 | 506,0,1,"Penasco y Castellana, Mr. Victor de Satode",male,18,1,0,PC 17758,108.9,C65,C 508 | 507,1,2,"Quick, Mrs. Frederick Charles (Jane Richards)",female,33,0,2,26360,26,,S 509 | 508,1,1,"Bradley, Mr. George (""George Arthur Brayton"")",male,,0,0,111427,26.55,,S 510 | 509,0,3,"Olsen, Mr. Henry Margido",male,28,0,0,C 4001,22.525,,S 511 | 510,1,3,"Lang, Mr. Fang",male,26,0,0,1601,56.4958,,S 512 | 511,1,3,"Daly, Mr. Eugene Patrick",male,29,0,0,382651,7.75,,Q 513 | 512,0,3,"Webber, Mr. James",male,,0,0,SOTON/OQ 3101316,8.05,,S 514 | 513,1,1,"McGough, Mr. James Robert",male,36,0,0,PC 17473,26.2875,E25,S 515 | 514,1,1,"Rothschild, Mrs. Martin (Elizabeth L. Barrett)",female,54,1,0,PC 17603,59.4,,C 516 | 515,0,3,"Coleff, Mr. Satio",male,24,0,0,349209,7.4958,,S 517 | 516,0,1,"Walker, Mr. William Anderson",male,47,0,0,36967,34.0208,D46,S 518 | 517,1,2,"Lemore, Mrs. (Amelia Milley)",female,34,0,0,C.A. 34260,10.5,F33,S 519 | 518,0,3,"Ryan, Mr. Patrick",male,,0,0,371110,24.15,,Q 520 | 519,1,2,"Angle, Mrs. William A (Florence ""Mary"" Agnes Hughes)",female,36,1,0,226875,26,,S 521 | 520,0,3,"Pavlovic, Mr. Stefo",male,32,0,0,349242,7.8958,,S 522 | 521,1,1,"Perreault, Miss. Anne",female,30,0,0,12749,93.5,B73,S 523 | 522,0,3,"Vovk, Mr. Janko",male,22,0,0,349252,7.8958,,S 524 | 523,0,3,"Lahoud, Mr. Sarkis",male,,0,0,2624,7.225,,C 525 | 524,1,1,"Hippach, Mrs. Louis Albert (Ida Sophia Fischer)",female,44,0,1,111361,57.9792,B18,C 526 | 525,0,3,"Kassem, Mr. Fared",male,,0,0,2700,7.2292,,C 527 | 526,0,3,"Farrell, Mr. James",male,40.5,0,0,367232,7.75,,Q 528 | 527,1,2,"Ridsdale, Miss. Lucy",female,50,0,0,W./C. 14258,10.5,,S 529 | 528,0,1,"Farthing, Mr. John",male,,0,0,PC 17483,221.7792,C95,S 530 | 529,0,3,"Salonen, Mr. Johan Werner",male,39,0,0,3101296,7.925,,S 531 | 530,0,2,"Hocking, Mr. Richard George",male,23,2,1,29104,11.5,,S 532 | 531,1,2,"Quick, Miss. Phyllis May",female,2,1,1,26360,26,,S 533 | 532,0,3,"Toufik, Mr. Nakli",male,,0,0,2641,7.2292,,C 534 | 533,0,3,"Elias, Mr. Joseph Jr",male,17,1,1,2690,7.2292,,C 535 | 534,1,3,"Peter, Mrs. Catherine (Catherine Rizk)",female,,0,2,2668,22.3583,,C 536 | 535,0,3,"Cacic, Miss. Marija",female,30,0,0,315084,8.6625,,S 537 | 536,1,2,"Hart, Miss. Eva Miriam",female,7,0,2,F.C.C. 13529,26.25,,S 538 | 537,0,1,"Butt, Major. Archibald Willingham",male,45,0,0,113050,26.55,B38,S 539 | 538,1,1,"LeRoy, Miss. Bertha",female,30,0,0,PC 17761,106.425,,C 540 | 539,0,3,"Risien, Mr. Samuel Beard",male,,0,0,364498,14.5,,S 541 | 540,1,1,"Frolicher, Miss. Hedwig Margaritha",female,22,0,2,13568,49.5,B39,C 542 | 541,1,1,"Crosby, Miss. Harriet R",female,36,0,2,WE/P 5735,71,B22,S 543 | 542,0,3,"Andersson, Miss. Ingeborg Constanzia",female,9,4,2,347082,31.275,,S 544 | 543,0,3,"Andersson, Miss. Sigrid Elisabeth",female,11,4,2,347082,31.275,,S 545 | 544,1,2,"Beane, Mr. Edward",male,32,1,0,2908,26,,S 546 | 545,0,1,"Douglas, Mr. Walter Donald",male,50,1,0,PC 17761,106.425,C86,C 547 | 546,0,1,"Nicholson, Mr. Arthur Ernest",male,64,0,0,693,26,,S 548 | 547,1,2,"Beane, Mrs. Edward (Ethel Clarke)",female,19,1,0,2908,26,,S 549 | 548,1,2,"Padro y Manent, Mr. Julian",male,,0,0,SC/PARIS 2146,13.8625,,C 550 | 549,0,3,"Goldsmith, Mr. Frank John",male,33,1,1,363291,20.525,,S 551 | 550,1,2,"Davies, Master. John Morgan Jr",male,8,1,1,C.A. 33112,36.75,,S 552 | 551,1,1,"Thayer, Mr. John Borland Jr",male,17,0,2,17421,110.8833,C70,C 553 | 552,0,2,"Sharp, Mr. Percival James R",male,27,0,0,244358,26,,S 554 | 553,0,3,"O'Brien, Mr. Timothy",male,,0,0,330979,7.8292,,Q 555 | 554,1,3,"Leeni, Mr. Fahim (""Philip Zenni"")",male,22,0,0,2620,7.225,,C 556 | 555,1,3,"Ohman, Miss. Velin",female,22,0,0,347085,7.775,,S 557 | 556,0,1,"Wright, Mr. George",male,62,0,0,113807,26.55,,S 558 | 557,1,1,"Duff Gordon, Lady. (Lucille Christiana Sutherland) (""Mrs Morgan"")",female,48,1,0,11755,39.6,A16,C 559 | 558,0,1,"Robbins, Mr. Victor",male,,0,0,PC 17757,227.525,,C 560 | 559,1,1,"Taussig, Mrs. Emil (Tillie Mandelbaum)",female,39,1,1,110413,79.65,E67,S 561 | 560,1,3,"de Messemaeker, Mrs. Guillaume Joseph (Emma)",female,36,1,0,345572,17.4,,S 562 | 561,0,3,"Morrow, Mr. Thomas Rowan",male,,0,0,372622,7.75,,Q 563 | 562,0,3,"Sivic, Mr. Husein",male,40,0,0,349251,7.8958,,S 564 | 563,0,2,"Norman, Mr. Robert Douglas",male,28,0,0,218629,13.5,,S 565 | 564,0,3,"Simmons, Mr. John",male,,0,0,SOTON/OQ 392082,8.05,,S 566 | 565,0,3,"Meanwell, Miss. (Marion Ogden)",female,,0,0,SOTON/O.Q. 392087,8.05,,S 567 | 566,0,3,"Davies, Mr. Alfred J",male,24,2,0,A/4 48871,24.15,,S 568 | 567,0,3,"Stoytcheff, Mr. Ilia",male,19,0,0,349205,7.8958,,S 569 | 568,0,3,"Palsson, Mrs. Nils (Alma Cornelia Berglund)",female,29,0,4,349909,21.075,,S 570 | 569,0,3,"Doharr, Mr. Tannous",male,,0,0,2686,7.2292,,C 571 | 570,1,3,"Jonsson, Mr. Carl",male,32,0,0,350417,7.8542,,S 572 | 571,1,2,"Harris, Mr. George",male,62,0,0,S.W./PP 752,10.5,,S 573 | 572,1,1,"Appleton, Mrs. Edward Dale (Charlotte Lamson)",female,53,2,0,11769,51.4792,C101,S 574 | 573,1,1,"Flynn, Mr. John Irwin (""Irving"")",male,36,0,0,PC 17474,26.3875,E25,S 575 | 574,1,3,"Kelly, Miss. Mary",female,,0,0,14312,7.75,,Q 576 | 575,0,3,"Rush, Mr. Alfred George John",male,16,0,0,A/4. 20589,8.05,,S 577 | 576,0,3,"Patchett, Mr. George",male,19,0,0,358585,14.5,,S 578 | 577,1,2,"Garside, Miss. Ethel",female,34,0,0,243880,13,,S 579 | 578,1,1,"Silvey, Mrs. William Baird (Alice Munger)",female,39,1,0,13507,55.9,E44,S 580 | 579,0,3,"Caram, Mrs. Joseph (Maria Elias)",female,,1,0,2689,14.4583,,C 581 | 580,1,3,"Jussila, Mr. Eiriik",male,32,0,0,STON/O 2. 3101286,7.925,,S 582 | 581,1,2,"Christy, Miss. Julie Rachel",female,25,1,1,237789,30,,S 583 | 582,1,1,"Thayer, Mrs. John Borland (Marian Longstreth Morris)",female,39,1,1,17421,110.8833,C68,C 584 | 583,0,2,"Downton, Mr. William James",male,54,0,0,28403,26,,S 585 | 584,0,1,"Ross, Mr. John Hugo",male,36,0,0,13049,40.125,A10,C 586 | 585,0,3,"Paulner, Mr. Uscher",male,,0,0,3411,8.7125,,C 587 | 586,1,1,"Taussig, Miss. Ruth",female,18,0,2,110413,79.65,E68,S 588 | 587,0,2,"Jarvis, Mr. John Denzil",male,47,0,0,237565,15,,S 589 | 588,1,1,"Frolicher-Stehli, Mr. Maxmillian",male,60,1,1,13567,79.2,B41,C 590 | 589,0,3,"Gilinski, Mr. Eliezer",male,22,0,0,14973,8.05,,S 591 | 590,0,3,"Murdlin, Mr. Joseph",male,,0,0,A./5. 3235,8.05,,S 592 | 591,0,3,"Rintamaki, Mr. Matti",male,35,0,0,STON/O 2. 3101273,7.125,,S 593 | 592,1,1,"Stephenson, Mrs. Walter Bertram (Martha Eustis)",female,52,1,0,36947,78.2667,D20,C 594 | 593,0,3,"Elsbury, Mr. William James",male,47,0,0,A/5 3902,7.25,,S 595 | 594,0,3,"Bourke, Miss. Mary",female,,0,2,364848,7.75,,Q 596 | 595,0,2,"Chapman, Mr. John Henry",male,37,1,0,SC/AH 29037,26,,S 597 | 596,0,3,"Van Impe, Mr. Jean Baptiste",male,36,1,1,345773,24.15,,S 598 | 597,1,2,"Leitch, Miss. Jessie Wills",female,,0,0,248727,33,,S 599 | 598,0,3,"Johnson, Mr. Alfred",male,49,0,0,LINE,0,,S 600 | 599,0,3,"Boulos, Mr. Hanna",male,,0,0,2664,7.225,,C 601 | 600,1,1,"Duff Gordon, Sir. Cosmo Edmund (""Mr Morgan"")",male,49,1,0,PC 17485,56.9292,A20,C 602 | 601,1,2,"Jacobsohn, Mrs. Sidney Samuel (Amy Frances Christy)",female,24,2,1,243847,27,,S 603 | 602,0,3,"Slabenoff, Mr. Petco",male,,0,0,349214,7.8958,,S 604 | 603,0,1,"Harrington, Mr. Charles H",male,,0,0,113796,42.4,,S 605 | 604,0,3,"Torber, Mr. Ernst William",male,44,0,0,364511,8.05,,S 606 | 605,1,1,"Homer, Mr. Harry (""Mr E Haven"")",male,35,0,0,111426,26.55,,C 607 | 606,0,3,"Lindell, Mr. Edvard Bengtsson",male,36,1,0,349910,15.55,,S 608 | 607,0,3,"Karaic, Mr. Milan",male,30,0,0,349246,7.8958,,S 609 | 608,1,1,"Daniel, Mr. Robert Williams",male,27,0,0,113804,30.5,,S 610 | 609,1,2,"Laroche, Mrs. Joseph (Juliette Marie Louise Lafargue)",female,22,1,2,SC/Paris 2123,41.5792,,C 611 | 610,1,1,"Shutes, Miss. Elizabeth W",female,40,0,0,PC 17582,153.4625,C125,S 612 | 611,0,3,"Andersson, Mrs. Anders Johan (Alfrida Konstantia Brogren)",female,39,1,5,347082,31.275,,S 613 | 612,0,3,"Jardin, Mr. Jose Neto",male,,0,0,SOTON/O.Q. 3101305,7.05,,S 614 | 613,1,3,"Murphy, Miss. Margaret Jane",female,,1,0,367230,15.5,,Q 615 | 614,0,3,"Horgan, Mr. John",male,,0,0,370377,7.75,,Q 616 | 615,0,3,"Brocklebank, Mr. William Alfred",male,35,0,0,364512,8.05,,S 617 | 616,1,2,"Herman, Miss. Alice",female,24,1,2,220845,65,,S 618 | 617,0,3,"Danbom, Mr. Ernst Gilbert",male,34,1,1,347080,14.4,,S 619 | 618,0,3,"Lobb, Mrs. William Arthur (Cordelia K Stanlick)",female,26,1,0,A/5. 3336,16.1,,S 620 | 619,1,2,"Becker, Miss. Marion Louise",female,4,2,1,230136,39,F4,S 621 | 620,0,2,"Gavey, Mr. Lawrence",male,26,0,0,31028,10.5,,S 622 | 621,0,3,"Yasbeck, Mr. Antoni",male,27,1,0,2659,14.4542,,C 623 | 622,1,1,"Kimball, Mr. Edwin Nelson Jr",male,42,1,0,11753,52.5542,D19,S 624 | 623,1,3,"Nakid, Mr. Sahid",male,20,1,1,2653,15.7417,,C 625 | 624,0,3,"Hansen, Mr. Henry Damsgaard",male,21,0,0,350029,7.8542,,S 626 | 625,0,3,"Bowen, Mr. David John ""Dai""",male,21,0,0,54636,16.1,,S 627 | 626,0,1,"Sutton, Mr. Frederick",male,61,0,0,36963,32.3208,D50,S 628 | 627,0,2,"Kirkland, Rev. Charles Leonard",male,57,0,0,219533,12.35,,Q 629 | 628,1,1,"Longley, Miss. Gretchen Fiske",female,21,0,0,13502,77.9583,D9,S 630 | 629,0,3,"Bostandyeff, Mr. Guentcho",male,26,0,0,349224,7.8958,,S 631 | 630,0,3,"O'Connell, Mr. Patrick D",male,,0,0,334912,7.7333,,Q 632 | 631,1,1,"Barkworth, Mr. Algernon Henry Wilson",male,80,0,0,27042,30,A23,S 633 | 632,0,3,"Lundahl, Mr. Johan Svensson",male,51,0,0,347743,7.0542,,S 634 | 633,1,1,"Stahelin-Maeglin, Dr. Max",male,32,0,0,13214,30.5,B50,C 635 | 634,0,1,"Parr, Mr. William Henry Marsh",male,,0,0,112052,0,,S 636 | 635,0,3,"Skoog, Miss. Mabel",female,9,3,2,347088,27.9,,S 637 | 636,1,2,"Davis, Miss. Mary",female,28,0,0,237668,13,,S 638 | 637,0,3,"Leinonen, Mr. Antti Gustaf",male,32,0,0,STON/O 2. 3101292,7.925,,S 639 | 638,0,2,"Collyer, Mr. Harvey",male,31,1,1,C.A. 31921,26.25,,S 640 | 639,0,3,"Panula, Mrs. Juha (Maria Emilia Ojala)",female,41,0,5,3101295,39.6875,,S 641 | 640,0,3,"Thorneycroft, Mr. Percival",male,,1,0,376564,16.1,,S 642 | 641,0,3,"Jensen, Mr. Hans Peder",male,20,0,0,350050,7.8542,,S 643 | 642,1,1,"Sagesser, Mlle. Emma",female,24,0,0,PC 17477,69.3,B35,C 644 | 643,0,3,"Skoog, Miss. Margit Elizabeth",female,2,3,2,347088,27.9,,S 645 | 644,1,3,"Foo, Mr. Choong",male,,0,0,1601,56.4958,,S 646 | 645,1,3,"Baclini, Miss. Eugenie",female,0.75,2,1,2666,19.2583,,C 647 | 646,1,1,"Harper, Mr. Henry Sleeper",male,48,1,0,PC 17572,76.7292,D33,C 648 | 647,0,3,"Cor, Mr. Liudevit",male,19,0,0,349231,7.8958,,S 649 | 648,1,1,"Simonius-Blumer, Col. Oberst Alfons",male,56,0,0,13213,35.5,A26,C 650 | 649,0,3,"Willey, Mr. Edward",male,,0,0,S.O./P.P. 751,7.55,,S 651 | 650,1,3,"Stanley, Miss. Amy Zillah Elsie",female,23,0,0,CA. 2314,7.55,,S 652 | 651,0,3,"Mitkoff, Mr. Mito",male,,0,0,349221,7.8958,,S 653 | 652,1,2,"Doling, Miss. Elsie",female,18,0,1,231919,23,,S 654 | 653,0,3,"Kalvik, Mr. Johannes Halvorsen",male,21,0,0,8475,8.4333,,S 655 | 654,1,3,"O'Leary, Miss. Hanora ""Norah""",female,,0,0,330919,7.8292,,Q 656 | 655,0,3,"Hegarty, Miss. Hanora ""Nora""",female,18,0,0,365226,6.75,,Q 657 | 656,0,2,"Hickman, Mr. Leonard Mark",male,24,2,0,S.O.C. 14879,73.5,,S 658 | 657,0,3,"Radeff, Mr. Alexander",male,,0,0,349223,7.8958,,S 659 | 658,0,3,"Bourke, Mrs. John (Catherine)",female,32,1,1,364849,15.5,,Q 660 | 659,0,2,"Eitemiller, Mr. George Floyd",male,23,0,0,29751,13,,S 661 | 660,0,1,"Newell, Mr. Arthur Webster",male,58,0,2,35273,113.275,D48,C 662 | 661,1,1,"Frauenthal, Dr. Henry William",male,50,2,0,PC 17611,133.65,,S 663 | 662,0,3,"Badt, Mr. Mohamed",male,40,0,0,2623,7.225,,C 664 | 663,0,1,"Colley, Mr. Edward Pomeroy",male,47,0,0,5727,25.5875,E58,S 665 | 664,0,3,"Coleff, Mr. Peju",male,36,0,0,349210,7.4958,,S 666 | 665,1,3,"Lindqvist, Mr. Eino William",male,20,1,0,STON/O 2. 3101285,7.925,,S 667 | 666,0,2,"Hickman, Mr. Lewis",male,32,2,0,S.O.C. 14879,73.5,,S 668 | 667,0,2,"Butler, Mr. Reginald Fenton",male,25,0,0,234686,13,,S 669 | 668,0,3,"Rommetvedt, Mr. Knud Paust",male,,0,0,312993,7.775,,S 670 | 669,0,3,"Cook, Mr. Jacob",male,43,0,0,A/5 3536,8.05,,S 671 | 670,1,1,"Taylor, Mrs. Elmer Zebley (Juliet Cummins Wright)",female,,1,0,19996,52,C126,S 672 | 671,1,2,"Brown, Mrs. Thomas William Solomon (Elizabeth Catherine Ford)",female,40,1,1,29750,39,,S 673 | 672,0,1,"Davidson, Mr. Thornton",male,31,1,0,F.C. 12750,52,B71,S 674 | 673,0,2,"Mitchell, Mr. Henry Michael",male,70,0,0,C.A. 24580,10.5,,S 675 | 674,1,2,"Wilhelms, Mr. Charles",male,31,0,0,244270,13,,S 676 | 675,0,2,"Watson, Mr. Ennis Hastings",male,,0,0,239856,0,,S 677 | 676,0,3,"Edvardsson, Mr. Gustaf Hjalmar",male,18,0,0,349912,7.775,,S 678 | 677,0,3,"Sawyer, Mr. Frederick Charles",male,24.5,0,0,342826,8.05,,S 679 | 678,1,3,"Turja, Miss. Anna Sofia",female,18,0,0,4138,9.8417,,S 680 | 679,0,3,"Goodwin, Mrs. Frederick (Augusta Tyler)",female,43,1,6,CA 2144,46.9,,S 681 | 680,1,1,"Cardeza, Mr. Thomas Drake Martinez",male,36,0,1,PC 17755,512.3292,B51 B53 B55,C 682 | 681,0,3,"Peters, Miss. Katie",female,,0,0,330935,8.1375,,Q 683 | 682,1,1,"Hassab, Mr. Hammad",male,27,0,0,PC 17572,76.7292,D49,C 684 | 683,0,3,"Olsvigen, Mr. Thor Anderson",male,20,0,0,6563,9.225,,S 685 | 684,0,3,"Goodwin, Mr. Charles Edward",male,14,5,2,CA 2144,46.9,,S 686 | 685,0,2,"Brown, Mr. Thomas William Solomon",male,60,1,1,29750,39,,S 687 | 686,0,2,"Laroche, Mr. Joseph Philippe Lemercier",male,25,1,2,SC/Paris 2123,41.5792,,C 688 | 687,0,3,"Panula, Mr. Jaako Arnold",male,14,4,1,3101295,39.6875,,S 689 | 688,0,3,"Dakic, Mr. Branko",male,19,0,0,349228,10.1708,,S 690 | 689,0,3,"Fischer, Mr. Eberhard Thelander",male,18,0,0,350036,7.7958,,S 691 | 690,1,1,"Madill, Miss. Georgette Alexandra",female,15,0,1,24160,211.3375,B5,S 692 | 691,1,1,"Dick, Mr. Albert Adrian",male,31,1,0,17474,57,B20,S 693 | 692,1,3,"Karun, Miss. Manca",female,4,0,1,349256,13.4167,,C 694 | 693,1,3,"Lam, Mr. Ali",male,,0,0,1601,56.4958,,S 695 | 694,0,3,"Saad, Mr. Khalil",male,25,0,0,2672,7.225,,C 696 | 695,0,1,"Weir, Col. John",male,60,0,0,113800,26.55,,S 697 | 696,0,2,"Chapman, Mr. Charles Henry",male,52,0,0,248731,13.5,,S 698 | 697,0,3,"Kelly, Mr. James",male,44,0,0,363592,8.05,,S 699 | 698,1,3,"Mullens, Miss. Katherine ""Katie""",female,,0,0,35852,7.7333,,Q 700 | 699,0,1,"Thayer, Mr. John Borland",male,49,1,1,17421,110.8833,C68,C 701 | 700,0,3,"Humblen, Mr. Adolf Mathias Nicolai Olsen",male,42,0,0,348121,7.65,F G63,S 702 | 701,1,1,"Astor, Mrs. John Jacob (Madeleine Talmadge Force)",female,18,1,0,PC 17757,227.525,C62 C64,C 703 | 702,1,1,"Silverthorne, Mr. Spencer Victor",male,35,0,0,PC 17475,26.2875,E24,S 704 | 703,0,3,"Barbara, Miss. Saiide",female,18,0,1,2691,14.4542,,C 705 | 704,0,3,"Gallagher, Mr. Martin",male,25,0,0,36864,7.7417,,Q 706 | 705,0,3,"Hansen, Mr. Henrik Juul",male,26,1,0,350025,7.8542,,S 707 | 706,0,2,"Morley, Mr. Henry Samuel (""Mr Henry Marshall"")",male,39,0,0,250655,26,,S 708 | 707,1,2,"Kelly, Mrs. Florence ""Fannie""",female,45,0,0,223596,13.5,,S 709 | 708,1,1,"Calderhead, Mr. Edward Pennington",male,42,0,0,PC 17476,26.2875,E24,S 710 | 709,1,1,"Cleaver, Miss. Alice",female,22,0,0,113781,151.55,,S 711 | 710,1,3,"Moubarek, Master. Halim Gonios (""William George"")",male,,1,1,2661,15.2458,,C 712 | 711,1,1,"Mayne, Mlle. Berthe Antonine (""Mrs de Villiers"")",female,24,0,0,PC 17482,49.5042,C90,C 713 | 712,0,1,"Klaber, Mr. Herman",male,,0,0,113028,26.55,C124,S 714 | 713,1,1,"Taylor, Mr. Elmer Zebley",male,48,1,0,19996,52,C126,S 715 | 714,0,3,"Larsson, Mr. August Viktor",male,29,0,0,7545,9.4833,,S 716 | 715,0,2,"Greenberg, Mr. Samuel",male,52,0,0,250647,13,,S 717 | 716,0,3,"Soholt, Mr. Peter Andreas Lauritz Andersen",male,19,0,0,348124,7.65,F G73,S 718 | 717,1,1,"Endres, Miss. Caroline Louise",female,38,0,0,PC 17757,227.525,C45,C 719 | 718,1,2,"Troutt, Miss. Edwina Celia ""Winnie""",female,27,0,0,34218,10.5,E101,S 720 | 719,0,3,"McEvoy, Mr. Michael",male,,0,0,36568,15.5,,Q 721 | 720,0,3,"Johnson, Mr. Malkolm Joackim",male,33,0,0,347062,7.775,,S 722 | 721,1,2,"Harper, Miss. Annie Jessie ""Nina""",female,6,0,1,248727,33,,S 723 | 722,0,3,"Jensen, Mr. Svend Lauritz",male,17,1,0,350048,7.0542,,S 724 | 723,0,2,"Gillespie, Mr. William Henry",male,34,0,0,12233,13,,S 725 | 724,0,2,"Hodges, Mr. Henry Price",male,50,0,0,250643,13,,S 726 | 725,1,1,"Chambers, Mr. Norman Campbell",male,27,1,0,113806,53.1,E8,S 727 | 726,0,3,"Oreskovic, Mr. Luka",male,20,0,0,315094,8.6625,,S 728 | 727,1,2,"Renouf, Mrs. Peter Henry (Lillian Jefferys)",female,30,3,0,31027,21,,S 729 | 728,1,3,"Mannion, Miss. Margareth",female,,0,0,36866,7.7375,,Q 730 | 729,0,2,"Bryhl, Mr. Kurt Arnold Gottfrid",male,25,1,0,236853,26,,S 731 | 730,0,3,"Ilmakangas, Miss. Pieta Sofia",female,25,1,0,STON/O2. 3101271,7.925,,S 732 | 731,1,1,"Allen, Miss. Elisabeth Walton",female,29,0,0,24160,211.3375,B5,S 733 | 732,0,3,"Hassan, Mr. Houssein G N",male,11,0,0,2699,18.7875,,C 734 | 733,0,2,"Knight, Mr. Robert J",male,,0,0,239855,0,,S 735 | 734,0,2,"Berriman, Mr. William John",male,23,0,0,28425,13,,S 736 | 735,0,2,"Troupiansky, Mr. Moses Aaron",male,23,0,0,233639,13,,S 737 | 736,0,3,"Williams, Mr. Leslie",male,28.5,0,0,54636,16.1,,S 738 | 737,0,3,"Ford, Mrs. Edward (Margaret Ann Watson)",female,48,1,3,W./C. 6608,34.375,,S 739 | 738,1,1,"Lesurer, Mr. Gustave J",male,35,0,0,PC 17755,512.3292,B101,C 740 | 739,0,3,"Ivanoff, Mr. Kanio",male,,0,0,349201,7.8958,,S 741 | 740,0,3,"Nankoff, Mr. Minko",male,,0,0,349218,7.8958,,S 742 | 741,1,1,"Hawksford, Mr. Walter James",male,,0,0,16988,30,D45,S 743 | 742,0,1,"Cavendish, Mr. Tyrell William",male,36,1,0,19877,78.85,C46,S 744 | 743,1,1,"Ryerson, Miss. Susan Parker ""Suzette""",female,21,2,2,PC 17608,262.375,B57 B59 B63 B66,C 745 | 744,0,3,"McNamee, Mr. Neal",male,24,1,0,376566,16.1,,S 746 | 745,1,3,"Stranden, Mr. Juho",male,31,0,0,STON/O 2. 3101288,7.925,,S 747 | 746,0,1,"Crosby, Capt. Edward Gifford",male,70,1,1,WE/P 5735,71,B22,S 748 | 747,0,3,"Abbott, Mr. Rossmore Edward",male,16,1,1,C.A. 2673,20.25,,S 749 | 748,1,2,"Sinkkonen, Miss. Anna",female,30,0,0,250648,13,,S 750 | 749,0,1,"Marvin, Mr. Daniel Warner",male,19,1,0,113773,53.1,D30,S 751 | 750,0,3,"Connaghton, Mr. Michael",male,31,0,0,335097,7.75,,Q 752 | 751,1,2,"Wells, Miss. Joan",female,4,1,1,29103,23,,S 753 | 752,1,3,"Moor, Master. Meier",male,6,0,1,392096,12.475,E121,S 754 | 753,0,3,"Vande Velde, Mr. Johannes Joseph",male,33,0,0,345780,9.5,,S 755 | 754,0,3,"Jonkoff, Mr. Lalio",male,23,0,0,349204,7.8958,,S 756 | 755,1,2,"Herman, Mrs. Samuel (Jane Laver)",female,48,1,2,220845,65,,S 757 | 756,1,2,"Hamalainen, Master. Viljo",male,0.67,1,1,250649,14.5,,S 758 | 757,0,3,"Carlsson, Mr. August Sigfrid",male,28,0,0,350042,7.7958,,S 759 | 758,0,2,"Bailey, Mr. Percy Andrew",male,18,0,0,29108,11.5,,S 760 | 759,0,3,"Theobald, Mr. Thomas Leonard",male,34,0,0,363294,8.05,,S 761 | 760,1,1,"Rothes, the Countess. of (Lucy Noel Martha Dyer-Edwards)",female,33,0,0,110152,86.5,B77,S 762 | 761,0,3,"Garfirth, Mr. John",male,,0,0,358585,14.5,,S 763 | 762,0,3,"Nirva, Mr. Iisakki Antino Aijo",male,41,0,0,SOTON/O2 3101272,7.125,,S 764 | 763,1,3,"Barah, Mr. Hanna Assi",male,20,0,0,2663,7.2292,,C 765 | 764,1,1,"Carter, Mrs. William Ernest (Lucile Polk)",female,36,1,2,113760,120,B96 B98,S 766 | 765,0,3,"Eklund, Mr. Hans Linus",male,16,0,0,347074,7.775,,S 767 | 766,1,1,"Hogeboom, Mrs. John C (Anna Andrews)",female,51,1,0,13502,77.9583,D11,S 768 | 767,0,1,"Brewe, Dr. Arthur Jackson",male,,0,0,112379,39.6,,C 769 | 768,0,3,"Mangan, Miss. Mary",female,30.5,0,0,364850,7.75,,Q 770 | 769,0,3,"Moran, Mr. Daniel J",male,,1,0,371110,24.15,,Q 771 | 770,0,3,"Gronnestad, Mr. Daniel Danielsen",male,32,0,0,8471,8.3625,,S 772 | 771,0,3,"Lievens, Mr. Rene Aime",male,24,0,0,345781,9.5,,S 773 | 772,0,3,"Jensen, Mr. Niels Peder",male,48,0,0,350047,7.8542,,S 774 | 773,0,2,"Mack, Mrs. (Mary)",female,57,0,0,S.O./P.P. 3,10.5,E77,S 775 | 774,0,3,"Elias, Mr. Dibo",male,,0,0,2674,7.225,,C 776 | 775,1,2,"Hocking, Mrs. Elizabeth (Eliza Needs)",female,54,1,3,29105,23,,S 777 | 776,0,3,"Myhrman, Mr. Pehr Fabian Oliver Malkolm",male,18,0,0,347078,7.75,,S 778 | 777,0,3,"Tobin, Mr. Roger",male,,0,0,383121,7.75,F38,Q 779 | 778,1,3,"Emanuel, Miss. Virginia Ethel",female,5,0,0,364516,12.475,,S 780 | 779,0,3,"Kilgannon, Mr. Thomas J",male,,0,0,36865,7.7375,,Q 781 | 780,1,1,"Robert, Mrs. Edward Scott (Elisabeth Walton McMillan)",female,43,0,1,24160,211.3375,B3,S 782 | 781,1,3,"Ayoub, Miss. Banoura",female,13,0,0,2687,7.2292,,C 783 | 782,1,1,"Dick, Mrs. Albert Adrian (Vera Gillespie)",female,17,1,0,17474,57,B20,S 784 | 783,0,1,"Long, Mr. Milton Clyde",male,29,0,0,113501,30,D6,S 785 | 784,0,3,"Johnston, Mr. Andrew G",male,,1,2,W./C. 6607,23.45,,S 786 | 785,0,3,"Ali, Mr. William",male,25,0,0,SOTON/O.Q. 3101312,7.05,,S 787 | 786,0,3,"Harmer, Mr. Abraham (David Lishin)",male,25,0,0,374887,7.25,,S 788 | 787,1,3,"Sjoblom, Miss. Anna Sofia",female,18,0,0,3101265,7.4958,,S 789 | 788,0,3,"Rice, Master. George Hugh",male,8,4,1,382652,29.125,,Q 790 | 789,1,3,"Dean, Master. Bertram Vere",male,1,1,2,C.A. 2315,20.575,,S 791 | 790,0,1,"Guggenheim, Mr. Benjamin",male,46,0,0,PC 17593,79.2,B82 B84,C 792 | 791,0,3,"Keane, Mr. Andrew ""Andy""",male,,0,0,12460,7.75,,Q 793 | 792,0,2,"Gaskell, Mr. Alfred",male,16,0,0,239865,26,,S 794 | 793,0,3,"Sage, Miss. Stella Anna",female,,8,2,CA. 2343,69.55,,S 795 | 794,0,1,"Hoyt, Mr. William Fisher",male,,0,0,PC 17600,30.6958,,C 796 | 795,0,3,"Dantcheff, Mr. Ristiu",male,25,0,0,349203,7.8958,,S 797 | 796,0,2,"Otter, Mr. Richard",male,39,0,0,28213,13,,S 798 | 797,1,1,"Leader, Dr. Alice (Farnham)",female,49,0,0,17465,25.9292,D17,S 799 | 798,1,3,"Osman, Mrs. Mara",female,31,0,0,349244,8.6833,,S 800 | 799,0,3,"Ibrahim Shawah, Mr. Yousseff",male,30,0,0,2685,7.2292,,C 801 | 800,0,3,"Van Impe, Mrs. Jean Baptiste (Rosalie Paula Govaert)",female,30,1,1,345773,24.15,,S 802 | 801,0,2,"Ponesell, Mr. Martin",male,34,0,0,250647,13,,S 803 | 802,1,2,"Collyer, Mrs. Harvey (Charlotte Annie Tate)",female,31,1,1,C.A. 31921,26.25,,S 804 | 803,1,1,"Carter, Master. William Thornton II",male,11,1,2,113760,120,B96 B98,S 805 | 804,1,3,"Thomas, Master. Assad Alexander",male,0.42,0,1,2625,8.5167,,C 806 | 805,1,3,"Hedman, Mr. Oskar Arvid",male,27,0,0,347089,6.975,,S 807 | 806,0,3,"Johansson, Mr. Karl Johan",male,31,0,0,347063,7.775,,S 808 | 807,0,1,"Andrews, Mr. Thomas Jr",male,39,0,0,112050,0,A36,S 809 | 808,0,3,"Pettersson, Miss. Ellen Natalia",female,18,0,0,347087,7.775,,S 810 | 809,0,2,"Meyer, Mr. August",male,39,0,0,248723,13,,S 811 | 810,1,1,"Chambers, Mrs. Norman Campbell (Bertha Griggs)",female,33,1,0,113806,53.1,E8,S 812 | 811,0,3,"Alexander, Mr. William",male,26,0,0,3474,7.8875,,S 813 | 812,0,3,"Lester, Mr. James",male,39,0,0,A/4 48871,24.15,,S 814 | 813,0,2,"Slemen, Mr. Richard James",male,35,0,0,28206,10.5,,S 815 | 814,0,3,"Andersson, Miss. Ebba Iris Alfrida",female,6,4,2,347082,31.275,,S 816 | 815,0,3,"Tomlin, Mr. Ernest Portage",male,30.5,0,0,364499,8.05,,S 817 | 816,0,1,"Fry, Mr. Richard",male,,0,0,112058,0,B102,S 818 | 817,0,3,"Heininen, Miss. Wendla Maria",female,23,0,0,STON/O2. 3101290,7.925,,S 819 | 818,0,2,"Mallet, Mr. Albert",male,31,1,1,S.C./PARIS 2079,37.0042,,C 820 | 819,0,3,"Holm, Mr. John Fredrik Alexander",male,43,0,0,C 7075,6.45,,S 821 | 820,0,3,"Skoog, Master. Karl Thorsten",male,10,3,2,347088,27.9,,S 822 | 821,1,1,"Hays, Mrs. Charles Melville (Clara Jennings Gregg)",female,52,1,1,12749,93.5,B69,S 823 | 822,1,3,"Lulic, Mr. Nikola",male,27,0,0,315098,8.6625,,S 824 | 823,0,1,"Reuchlin, Jonkheer. John George",male,38,0,0,19972,0,,S 825 | 824,1,3,"Moor, Mrs. (Beila)",female,27,0,1,392096,12.475,E121,S 826 | 825,0,3,"Panula, Master. Urho Abraham",male,2,4,1,3101295,39.6875,,S 827 | 826,0,3,"Flynn, Mr. John",male,,0,0,368323,6.95,,Q 828 | 827,0,3,"Lam, Mr. Len",male,,0,0,1601,56.4958,,S 829 | 828,1,2,"Mallet, Master. Andre",male,1,0,2,S.C./PARIS 2079,37.0042,,C 830 | 829,1,3,"McCormack, Mr. Thomas Joseph",male,,0,0,367228,7.75,,Q 831 | 830,1,1,"Stone, Mrs. George Nelson (Martha Evelyn)",female,62,0,0,113572,80,B28, 832 | 831,1,3,"Yasbeck, Mrs. Antoni (Selini Alexander)",female,15,1,0,2659,14.4542,,C 833 | 832,1,2,"Richards, Master. George Sibley",male,0.83,1,1,29106,18.75,,S 834 | 833,0,3,"Saad, Mr. Amin",male,,0,0,2671,7.2292,,C 835 | 834,0,3,"Augustsson, Mr. Albert",male,23,0,0,347468,7.8542,,S 836 | 835,0,3,"Allum, Mr. Owen George",male,18,0,0,2223,8.3,,S 837 | 836,1,1,"Compton, Miss. Sara Rebecca",female,39,1,1,PC 17756,83.1583,E49,C 838 | 837,0,3,"Pasic, Mr. Jakob",male,21,0,0,315097,8.6625,,S 839 | 838,0,3,"Sirota, Mr. Maurice",male,,0,0,392092,8.05,,S 840 | 839,1,3,"Chip, Mr. Chang",male,32,0,0,1601,56.4958,,S 841 | 840,1,1,"Marechal, Mr. Pierre",male,,0,0,11774,29.7,C47,C 842 | 841,0,3,"Alhomaki, Mr. Ilmari Rudolf",male,20,0,0,SOTON/O2 3101287,7.925,,S 843 | 842,0,2,"Mudd, Mr. Thomas Charles",male,16,0,0,S.O./P.P. 3,10.5,,S 844 | 843,1,1,"Serepeca, Miss. Augusta",female,30,0,0,113798,31,,C 845 | 844,0,3,"Lemberopolous, Mr. Peter L",male,34.5,0,0,2683,6.4375,,C 846 | 845,0,3,"Culumovic, Mr. Jeso",male,17,0,0,315090,8.6625,,S 847 | 846,0,3,"Abbing, Mr. Anthony",male,42,0,0,C.A. 5547,7.55,,S 848 | 847,0,3,"Sage, Mr. Douglas Bullen",male,,8,2,CA. 2343,69.55,,S 849 | 848,0,3,"Markoff, Mr. Marin",male,35,0,0,349213,7.8958,,C 850 | 849,0,2,"Harper, Rev. John",male,28,0,1,248727,33,,S 851 | 850,1,1,"Goldenberg, Mrs. Samuel L (Edwiga Grabowska)",female,,1,0,17453,89.1042,C92,C 852 | 851,0,3,"Andersson, Master. Sigvard Harald Elias",male,4,4,2,347082,31.275,,S 853 | 852,0,3,"Svensson, Mr. Johan",male,74,0,0,347060,7.775,,S 854 | 853,0,3,"Boulos, Miss. Nourelain",female,9,1,1,2678,15.2458,,C 855 | 854,1,1,"Lines, Miss. Mary Conover",female,16,0,1,PC 17592,39.4,D28,S 856 | 855,0,2,"Carter, Mrs. Ernest Courtenay (Lilian Hughes)",female,44,1,0,244252,26,,S 857 | 856,1,3,"Aks, Mrs. Sam (Leah Rosen)",female,18,0,1,392091,9.35,,S 858 | 857,1,1,"Wick, Mrs. George Dennick (Mary Hitchcock)",female,45,1,1,36928,164.8667,,S 859 | 858,1,1,"Daly, Mr. Peter Denis ",male,51,0,0,113055,26.55,E17,S 860 | 859,1,3,"Baclini, Mrs. Solomon (Latifa Qurban)",female,24,0,3,2666,19.2583,,C 861 | 860,0,3,"Razi, Mr. Raihed",male,,0,0,2629,7.2292,,C 862 | 861,0,3,"Hansen, Mr. Claus Peter",male,41,2,0,350026,14.1083,,S 863 | 862,0,2,"Giles, Mr. Frederick Edward",male,21,1,0,28134,11.5,,S 864 | 863,1,1,"Swift, Mrs. Frederick Joel (Margaret Welles Barron)",female,48,0,0,17466,25.9292,D17,S 865 | 864,0,3,"Sage, Miss. Dorothy Edith ""Dolly""",female,,8,2,CA. 2343,69.55,,S 866 | 865,0,2,"Gill, Mr. John William",male,24,0,0,233866,13,,S 867 | 866,1,2,"Bystrom, Mrs. (Karolina)",female,42,0,0,236852,13,,S 868 | 867,1,2,"Duran y More, Miss. Asuncion",female,27,1,0,SC/PARIS 2149,13.8583,,C 869 | 868,0,1,"Roebling, Mr. Washington Augustus II",male,31,0,0,PC 17590,50.4958,A24,S 870 | 869,0,3,"van Melkebeke, Mr. Philemon",male,,0,0,345777,9.5,,S 871 | 870,1,3,"Johnson, Master. Harold Theodor",male,4,1,1,347742,11.1333,,S 872 | 871,0,3,"Balkic, Mr. Cerin",male,26,0,0,349248,7.8958,,S 873 | 872,1,1,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",female,47,1,1,11751,52.5542,D35,S 874 | 873,0,1,"Carlsson, Mr. Frans Olof",male,33,0,0,695,5,B51 B53 B55,S 875 | 874,0,3,"Vander Cruyssen, Mr. Victor",male,47,0,0,345765,9,,S 876 | 875,1,2,"Abelson, Mrs. Samuel (Hannah Wizosky)",female,28,1,0,P/PP 3381,24,,C 877 | 876,1,3,"Najib, Miss. Adele Kiamie ""Jane""",female,15,0,0,2667,7.225,,C 878 | 877,0,3,"Gustafsson, Mr. Alfred Ossian",male,20,0,0,7534,9.8458,,S 879 | 878,0,3,"Petroff, Mr. Nedelio",male,19,0,0,349212,7.8958,,S 880 | 879,0,3,"Laleff, Mr. Kristo",male,,0,0,349217,7.8958,,S 881 | 880,1,1,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",female,56,0,1,11767,83.1583,C50,C 882 | 881,1,2,"Shelley, Mrs. William (Imanita Parrish Hall)",female,25,0,1,230433,26,,S 883 | 882,0,3,"Markun, Mr. Johann",male,33,0,0,349257,7.8958,,S 884 | 883,0,3,"Dahlberg, Miss. Gerda Ulrika",female,22,0,0,7552,10.5167,,S 885 | 884,0,2,"Banfield, Mr. Frederick James",male,28,0,0,C.A./SOTON 34068,10.5,,S 886 | 885,0,3,"Sutehall, Mr. Henry Jr",male,25,0,0,SOTON/OQ 392076,7.05,,S 887 | 886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39,0,5,382652,29.125,,Q 888 | 887,0,2,"Montvila, Rev. Juozas",male,27,0,0,211536,13,,S 889 | 888,1,1,"Graham, Miss. Margaret Edith",female,19,0,0,112053,30,B42,S 890 | 889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.45,,S 891 | 890,1,1,"Behr, Mr. Karl Howell",male,26,0,0,111369,30,C148,C 892 | 891,0,3,"Dooley, Mr. Patrick",male,32,0,0,370376,7.75,,Q 893 | -------------------------------------------------------------------------------- /Machine Learning Sections/Recommender-Systems/Advanced Recommender Systems with Python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\n", 8 | "# Advanced Recommender Systems with Python\n", 9 | "\n", 10 | "Welcome to the code notebook for creating Advanced Recommender Systems with Python. This is an optional lecture notebook for you to check out. Currently there is no video for this lecture because of the level of mathematics used and the heavy use of SciPy here.\n", 11 | "\n", 12 | "Recommendation Systems usually rely on larger data sets and specifically need to be organized in a particular fashion. Because of this, we won't have a project to go along with this topic, instead we will have a more intensive walkthrough process on creating a recommendation system with Python with the same Movie Lens Data Set.\n", 13 | "\n", 14 | "*Note: The actual mathematics behind recommender systems is pretty heavy in Linear Algebra.*\n", 15 | "___" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "## Methods Used\n", 23 | "\n", 24 | "Two most common types of recommender systems are **Content-Based** and **Collaborative Filtering (CF)**. \n", 25 | "\n", 26 | "* Collaborative filtering produces recommendations based on the knowledge of users’ attitude to items, that is it uses the \"wisdom of the crowd\" to recommend items. \n", 27 | "* Content-based recommender systems focus on the attributes of the items and give you recommendations based on the similarity between them.\n", 28 | "\n", 29 | "## Collaborative Filtering\n", 30 | "\n", 31 | "In general, Collaborative filtering (CF) is more commonly used than content-based systems because it usually gives better results and is relatively easy to understand (from an overall implementation perspective). The algorithm has the ability to do feature learning on its own, which means that it can start to learn for itself what features to use. \n", 32 | "\n", 33 | "CF can be divided into **Memory-Based Collaborative Filtering** and **Model-Based Collaborative filtering**. \n", 34 | "\n", 35 | "In this tutorial, we will implement Model-Based CF by using singular value decomposition (SVD) and Memory-Based CF by computing cosine similarity. \n", 36 | "\n", 37 | "## The Data\n", 38 | "\n", 39 | "We will use famous MovieLens dataset, which is one of the most common datasets used when implementing and testing recommender engines. It contains 100k movie ratings from 943 users and a selection of 1682 movies.\n", 40 | "\n", 41 | "You can download the dataset [here](http://files.grouplens.org/datasets/movielens/ml-100k.zip) or just use the u.data file that is already included in this folder.\n", 42 | "\n", 43 | "____\n", 44 | "## Getting Started\n", 45 | "\n", 46 | "Let's import some libraries we will need:" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 1, 52 | "metadata": { 53 | "collapsed": true 54 | }, 55 | "outputs": [], 56 | "source": [ 57 | "import numpy as np\n", 58 | "import pandas as pd" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": [ 65 | "We can then read in the **u.data** file, which contains the full dataset. You can read a brief description of the dataset [here](http://files.grouplens.org/datasets/movielens/ml-100k-README.txt).\n", 66 | "\n", 67 | "Note how we specify the separator argument for a Tab separated file." 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 2, 73 | "metadata": { 74 | "collapsed": true 75 | }, 76 | "outputs": [], 77 | "source": [ 78 | "column_names = ['user_id', 'item_id', 'rating', 'timestamp']\n", 79 | "df = pd.read_csv('u.data', sep='\\t', names=column_names)" 80 | ] 81 | }, 82 | { 83 | "cell_type": "markdown", 84 | "metadata": {}, 85 | "source": [ 86 | "Let's take a quick look at the data." 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 4, 92 | "metadata": {}, 93 | "outputs": [ 94 | { 95 | "data": { 96 | "text/html": [ 97 | "
\n", 98 | "\n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | "
user_iditem_idratingtimestamp
00505881250949
101725881250949
201331881250949
31962423881250949
41863023891717742
\n", 146 | "
" 147 | ], 148 | "text/plain": [ 149 | " user_id item_id rating timestamp\n", 150 | "0 0 50 5 881250949\n", 151 | "1 0 172 5 881250949\n", 152 | "2 0 133 1 881250949\n", 153 | "3 196 242 3 881250949\n", 154 | "4 186 302 3 891717742" 155 | ] 156 | }, 157 | "execution_count": 4, 158 | "metadata": {}, 159 | "output_type": "execute_result" 160 | } 161 | ], 162 | "source": [ 163 | "df.head()" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": {}, 169 | "source": [ 170 | "Note how we only have the item_id, not the movie name. We can use the Movie_ID_Titles csv file to grab the movie names and merge it with this dataframe:" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": 15, 176 | "metadata": {}, 177 | "outputs": [ 178 | { 179 | "data": { 180 | "text/html": [ 181 | "
\n", 182 | "\n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 192 | " \n", 193 | " \n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | " \n", 199 | " \n", 200 | " \n", 201 | " \n", 202 | " \n", 203 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 217 | "
item_idtitle
01Toy Story (1995)
12GoldenEye (1995)
23Four Rooms (1995)
34Get Shorty (1995)
45Copycat (1995)
\n", 218 | "
" 219 | ], 220 | "text/plain": [ 221 | " item_id title\n", 222 | "0 1 Toy Story (1995)\n", 223 | "1 2 GoldenEye (1995)\n", 224 | "2 3 Four Rooms (1995)\n", 225 | "3 4 Get Shorty (1995)\n", 226 | "4 5 Copycat (1995)" 227 | ] 228 | }, 229 | "execution_count": 15, 230 | "metadata": {}, 231 | "output_type": "execute_result" 232 | } 233 | ], 234 | "source": [ 235 | "movie_titles = pd.read_csv(\"Movie_Id_Titles\")\n", 236 | "movie_titles.head()" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": {}, 242 | "source": [ 243 | "Then merge the dataframes:" 244 | ] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "execution_count": 16, 249 | "metadata": {}, 250 | "outputs": [ 251 | { 252 | "data": { 253 | "text/html": [ 254 | "
\n", 255 | "\n", 256 | " \n", 257 | " \n", 258 | " \n", 259 | " \n", 260 | " \n", 261 | " \n", 262 | " \n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 267 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 277 | " \n", 278 | " \n", 279 | " \n", 280 | " \n", 281 | " \n", 282 | " \n", 283 | " \n", 284 | " \n", 285 | " \n", 286 | " \n", 287 | " \n", 288 | " \n", 289 | " \n", 290 | " \n", 291 | " \n", 292 | " \n", 293 | " \n", 294 | " \n", 295 | " \n", 296 | " \n", 297 | " \n", 298 | " \n", 299 | " \n", 300 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | " \n", 307 | " \n", 308 | "
user_iditem_idratingtimestamptitle
00505881250949Star Wars (1977)
1290505880473582Star Wars (1977)
279504891271545Star Wars (1977)
32505888552084Star Wars (1977)
48505879362124Star Wars (1977)
\n", 309 | "
" 310 | ], 311 | "text/plain": [ 312 | " user_id item_id rating timestamp title\n", 313 | "0 0 50 5 881250949 Star Wars (1977)\n", 314 | "1 290 50 5 880473582 Star Wars (1977)\n", 315 | "2 79 50 4 891271545 Star Wars (1977)\n", 316 | "3 2 50 5 888552084 Star Wars (1977)\n", 317 | "4 8 50 5 879362124 Star Wars (1977)" 318 | ] 319 | }, 320 | "execution_count": 16, 321 | "metadata": {}, 322 | "output_type": "execute_result" 323 | } 324 | ], 325 | "source": [ 326 | "df = pd.merge(df,movie_titles,on='item_id')\n", 327 | "df.head()" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "Now let's take a quick look at the number of unique users and movies." 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": 26, 340 | "metadata": {}, 341 | "outputs": [ 342 | { 343 | "name": "stdout", 344 | "output_type": "stream", 345 | "text": [ 346 | "Num. of Users: 944\n", 347 | "Num of Movies: 1682\n" 348 | ] 349 | } 350 | ], 351 | "source": [ 352 | "n_users = df.user_id.nunique()\n", 353 | "n_items = df.item_id.nunique()\n", 354 | "\n", 355 | "print('Num. of Users: '+ str(n_users))\n", 356 | "print('Num of Movies: '+str(n_items))" 357 | ] 358 | }, 359 | { 360 | "cell_type": "markdown", 361 | "metadata": {}, 362 | "source": [ 363 | "## Train Test Split\n", 364 | "\n", 365 | "Recommendation Systems by their very nature are very difficult to evaluate, but we will still show you how to evaluate them in this tutorial. In order to do this, we'll split our data into two sets. However, we won't do our classic X_train,X_test,y_train,y_test split. Instead we can actually just segement the data into two sets of data:" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": 27, 371 | "metadata": { 372 | "collapsed": true 373 | }, 374 | "outputs": [], 375 | "source": [ 376 | "from sklearn.cross_validation import train_test_split\n", 377 | "train_data, test_data = train_test_split(df, test_size=0.25)" 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": {}, 383 | "source": [ 384 | "## Memory-Based Collaborative Filtering\n", 385 | "\n", 386 | "Memory-Based Collaborative Filtering approaches can be divided into two main sections: **user-item filtering** and **item-item filtering**. \n", 387 | "\n", 388 | "A *user-item filtering* will take a particular user, find users that are similar to that user based on similarity of ratings, and recommend items that those similar users liked. \n", 389 | "\n", 390 | "In contrast, *item-item filtering* will take an item, find users who liked that item, and find other items that those users or similar users also liked. It takes items and outputs other items as recommendations. \n", 391 | "\n", 392 | "* *Item-Item Collaborative Filtering*: “Users who liked this item also liked …”\n", 393 | "* *User-Item Collaborative Filtering*: “Users who are similar to you also liked …”" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "metadata": {}, 399 | "source": [ 400 | "In both cases, you create a user-item matrix which built from the entire dataset.\n", 401 | "\n", 402 | "Since we have split the data into testing and training we will need to create two ``[943 x 1682]`` matrices (all users by all movies). \n", 403 | "\n", 404 | "The training matrix contains 75% of the ratings and the testing matrix contains 25% of the ratings. " 405 | ] 406 | }, 407 | { 408 | "cell_type": "markdown", 409 | "metadata": {}, 410 | "source": [ 411 | "Example of user-item matrix:\n", 412 | "\"blog8\"/" 413 | ] 414 | }, 415 | { 416 | "cell_type": "markdown", 417 | "metadata": {}, 418 | "source": [ 419 | "After you have built the user-item matrix you calculate the similarity and create a similarity matrix. \n", 420 | "\n", 421 | "The similarity values between items in *Item-Item Collaborative Filtering* are measured by observing all the users who have rated both items. \n", 422 | "\n", 423 | "" 424 | ] 425 | }, 426 | { 427 | "cell_type": "markdown", 428 | "metadata": {}, 429 | "source": [ 430 | "For *User-Item Collaborative Filtering* the similarity values between users are measured by observing all the items that are rated by both users.\n", 431 | "\n", 432 | "" 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": {}, 438 | "source": [ 439 | "A distance metric commonly used in recommender systems is *cosine similarity*, where the ratings are seen as vectors in ``n``-dimensional space and the similarity is calculated based on the angle between these vectors. \n", 440 | "Cosine similiarity for users *a* and *m* can be calculated using the formula below, where you take dot product of the user vector *$u_k$* and the user vector *$u_a$* and divide it by multiplication of the Euclidean lengths of the vectors.\n", 441 | "\n", 442 | "\n", 443 | "To calculate similarity between items *m* and *b* you use the formula:\n", 444 | "\n", 445 | "\n", 447 | "\n", 448 | "Your first step will be to create the user-item matrix. Since you have both testing and training data you need to create two matrices. " 449 | ] 450 | }, 451 | { 452 | "cell_type": "code", 453 | "execution_count": 28, 454 | "metadata": { 455 | "collapsed": true 456 | }, 457 | "outputs": [], 458 | "source": [ 459 | "#Create two user-item matrices, one for training and another for testing\n", 460 | "train_data_matrix = np.zeros((n_users, n_items))\n", 461 | "for line in train_data.itertuples():\n", 462 | " train_data_matrix[line[1]-1, line[2]-1] = line[3] \n", 463 | "\n", 464 | "test_data_matrix = np.zeros((n_users, n_items))\n", 465 | "for line in test_data.itertuples():\n", 466 | " test_data_matrix[line[1]-1, line[2]-1] = line[3]" 467 | ] 468 | }, 469 | { 470 | "cell_type": "markdown", 471 | "metadata": {}, 472 | "source": [ 473 | "You can use the [pairwise_distances](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html) function from sklearn to calculate the cosine similarity. Note, the output will range from 0 to 1 since the ratings are all positive." 474 | ] 475 | }, 476 | { 477 | "cell_type": "code", 478 | "execution_count": 29, 479 | "metadata": { 480 | "collapsed": true 481 | }, 482 | "outputs": [], 483 | "source": [ 484 | "from sklearn.metrics.pairwise import pairwise_distances\n", 485 | "user_similarity = pairwise_distances(train_data_matrix, metric='cosine')\n", 486 | "item_similarity = pairwise_distances(train_data_matrix.T, metric='cosine')" 487 | ] 488 | }, 489 | { 490 | "cell_type": "markdown", 491 | "metadata": {}, 492 | "source": [ 493 | "Next step is to make predictions. You have already created similarity matrices: `user_similarity` and `item_similarity` and therefore you can make a prediction by applying following formula for user-based CF:\n", 494 | "\n", 495 | "\n", 496 | "\n", 497 | "You can look at the similarity between users *k* and *a* as weights that are multiplied by the ratings of a similar user *a* (corrected for the average rating of that user). You will need to normalize it so that the ratings stay between 1 and 5 and, as a final step, sum the average ratings for the user that you are trying to predict. \n", 498 | "\n", 499 | "The idea here is that some users may tend always to give high or low ratings to all movies. The relative difference in the ratings that these users give is more important than the absolute values. To give an example: suppose, user *k* gives 4 stars to his favourite movies and 3 stars to all other good movies. Suppose now that another user *t* rates movies that he/she likes with 5 stars, and the movies he/she fell asleep over with 3 stars. These two users could have a very similar taste but treat the rating system differently. \n", 500 | "\n", 501 | "When making a prediction for item-based CF you don't need to correct for users average rating since query user itself is used to do predictions.\n", 502 | "\n", 503 | "" 504 | ] 505 | }, 506 | { 507 | "cell_type": "code", 508 | "execution_count": 30, 509 | "metadata": { 510 | "collapsed": true 511 | }, 512 | "outputs": [], 513 | "source": [ 514 | "def predict(ratings, similarity, type='user'):\n", 515 | " if type == 'user':\n", 516 | " mean_user_rating = ratings.mean(axis=1)\n", 517 | " #You use np.newaxis so that mean_user_rating has same format as ratings\n", 518 | " ratings_diff = (ratings - mean_user_rating[:, np.newaxis]) \n", 519 | " pred = mean_user_rating[:, np.newaxis] + similarity.dot(ratings_diff) / np.array([np.abs(similarity).sum(axis=1)]).T\n", 520 | " elif type == 'item':\n", 521 | " pred = ratings.dot(similarity) / np.array([np.abs(similarity).sum(axis=1)]) \n", 522 | " return pred" 523 | ] 524 | }, 525 | { 526 | "cell_type": "code", 527 | "execution_count": 31, 528 | "metadata": { 529 | "collapsed": true 530 | }, 531 | "outputs": [], 532 | "source": [ 533 | "item_prediction = predict(train_data_matrix, item_similarity, type='item')\n", 534 | "user_prediction = predict(train_data_matrix, user_similarity, type='user')" 535 | ] 536 | }, 537 | { 538 | "cell_type": "markdown", 539 | "metadata": {}, 540 | "source": [ 541 | "### Evaluation\n", 542 | "There are many evaluation metrics but one of the most popular metric used to evaluate accuracy of predicted ratings is *Root Mean Squared Error (RMSE)*. \n", 543 | "\n", 544 | "\n", 545 | "You can use the [mean_square_error](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html) (MSE) function from `sklearn`, where the RMSE is just the square root of MSE. To read more about different evaluation metrics you can take a look at [this article](http://research.microsoft.com/pubs/115396/EvaluationMetrics.TR.pdf). " 546 | ] 547 | }, 548 | { 549 | "cell_type": "markdown", 550 | "metadata": {}, 551 | "source": [ 552 | "Since you only want to consider predicted ratings that are in the test dataset, you filter out all other elements in the prediction matrix with `prediction[ground_truth.nonzero()]`. " 553 | ] 554 | }, 555 | { 556 | "cell_type": "code", 557 | "execution_count": 32, 558 | "metadata": { 559 | "collapsed": true 560 | }, 561 | "outputs": [], 562 | "source": [ 563 | "from sklearn.metrics import mean_squared_error\n", 564 | "from math import sqrt\n", 565 | "def rmse(prediction, ground_truth):\n", 566 | " prediction = prediction[ground_truth.nonzero()].flatten() \n", 567 | " ground_truth = ground_truth[ground_truth.nonzero()].flatten()\n", 568 | " return sqrt(mean_squared_error(prediction, ground_truth))" 569 | ] 570 | }, 571 | { 572 | "cell_type": "code", 573 | "execution_count": 33, 574 | "metadata": { 575 | "scrolled": true 576 | }, 577 | "outputs": [ 578 | { 579 | "name": "stdout", 580 | "output_type": "stream", 581 | "text": [ 582 | "User-based CF RMSE: 3.135451660158989\n", 583 | "Item-based CF RMSE: 3.4593766647252515\n" 584 | ] 585 | } 586 | ], 587 | "source": [ 588 | "print('User-based CF RMSE: ' + str(rmse(user_prediction, test_data_matrix)))\n", 589 | "print('Item-based CF RMSE: ' + str(rmse(item_prediction, test_data_matrix)))" 590 | ] 591 | }, 592 | { 593 | "cell_type": "markdown", 594 | "metadata": { 595 | "collapsed": true 596 | }, 597 | "source": [ 598 | "Memory-based algorithms are easy to implement and produce reasonable prediction quality. \n", 599 | "The drawback of memory-based CF is that it doesn't scale to real-world scenarios and doesn't address the well-known cold-start problem, that is when new user or new item enters the system. Model-based CF methods are scalable and can deal with higher sparsity level than memory-based models, but also suffer when new users or items that don't have any ratings enter the system. I would like to thank Ethan Rosenthal for his [post](http://blog.ethanrosenthal.com/2015/11/02/intro-to-collaborative-filtering/) about Memory-Based Collaborative Filtering. " 600 | ] 601 | }, 602 | { 603 | "cell_type": "markdown", 604 | "metadata": {}, 605 | "source": [ 606 | "# Model-based Collaborative Filtering\n", 607 | "\n", 608 | "Model-based Collaborative Filtering is based on **matrix factorization (MF)** which has received greater exposure, mainly as an unsupervised learning method for latent variable decomposition and dimensionality reduction. Matrix factorization is widely used for recommender systems where it can deal better with scalability and sparsity than Memory-based CF. The goal of MF is to learn the latent preferences of users and the latent attributes of items from known ratings (learn features that describe the characteristics of ratings) to then predict the unknown ratings through the dot product of the latent features of users and items. \n", 609 | "When you have a very sparse matrix, with a lot of dimensions, by doing matrix factorization you can restructure the user-item matrix into low-rank structure, and you can represent the matrix by the multiplication of two low-rank matrices, where the rows contain the latent vector. You fit this matrix to approximate your original matrix, as closely as possible, by multiplying the low-rank matrices together, which fills in the entries missing in the original matrix.\n", 610 | "\n", 611 | "Let's calculate the sparsity level of MovieLens dataset:" 612 | ] 613 | }, 614 | { 615 | "cell_type": "code", 616 | "execution_count": 34, 617 | "metadata": {}, 618 | "outputs": [ 619 | { 620 | "name": "stdout", 621 | "output_type": "stream", 622 | "text": [ 623 | "The sparsity level of MovieLens100K is 93.7%\n" 624 | ] 625 | } 626 | ], 627 | "source": [ 628 | "sparsity=round(1.0-len(df)/float(n_users*n_items),3)\n", 629 | "print('The sparsity level of MovieLens100K is ' + str(sparsity*100) + '%')" 630 | ] 631 | }, 632 | { 633 | "cell_type": "markdown", 634 | "metadata": {}, 635 | "source": [ 636 | "To give an example of the learned latent preferences of the users and items: let's say for the MovieLens dataset you have the following information: _(user id, age, location, gender, movie id, director, actor, language, year, rating)_. By applying matrix factorization the model learns that important user features are _age group (under 10, 10-18, 18-30, 30-90)_, _location_ and _gender_, and for movie features it learns that _decade_, _director_ and _actor_ are most important. Now if you look into the information you have stored, there is no such feature as the _decade_, but the model can learn on its own. The important aspect is that the CF model only uses data (user_id, movie_id, rating) to learn the latent features. If there is little data available model-based CF model will predict poorly, since it will be more difficult to learn the latent features. \n", 637 | "\n", 638 | "Models that use both ratings and content features are called **Hybrid Recommender Systems** where both Collaborative Filtering and Content-based Models are combined. Hybrid recommender systems usually show higher accuracy than Collaborative Filtering or Content-based Models on their own: they are capable to address the cold-start problem better since if you don't have any ratings for a user or an item you could use the metadata from the user or item to make a prediction. " 639 | ] 640 | }, 641 | { 642 | "cell_type": "markdown", 643 | "metadata": {}, 644 | "source": [ 645 | "### SVD\n", 646 | "A well-known matrix factorization method is **Singular value decomposition (SVD)**. Collaborative Filtering can be formulated by approximating a matrix `X` by using singular value decomposition. The winning team at the Netflix Prize competition used SVD matrix factorization models to produce product recommendations, for more information I recommend to read articles: [Netflix Recommendations: Beyond the 5 stars](http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html) and [Netflix Prize and SVD](http://buzzard.ups.edu/courses/2014spring/420projects/math420-UPS-spring-2014-gower-netflix-SVD.pdf).\n", 647 | "The general equation can be expressed as follows:\n", 648 | "\n", 649 | "\n", 650 | "\n", 651 | "Given `m x n` matrix `X`:\n", 652 | "* *`U`* is an *`(m x r)`* orthogonal matrix\n", 653 | "* *`S`* is an *`(r x r)`* diagonal matrix with non-negative real numbers on the diagonal\n", 654 | "* *V^T* is an *`(r x n)`* orthogonal matrix\n", 655 | "\n", 656 | "Elements on the diagnoal in `S` are known as *singular values of `X`*. \n", 657 | "\n", 658 | "\n", 659 | "Matrix *`X`* can be factorized to *`U`*, *`S`* and *`V`*. The *`U`* matrix represents the feature vectors corresponding to the users in the hidden feature space and the *`V`* matrix represents the feature vectors corresponding to the items in the hidden feature space.\n", 660 | "\n", 661 | "\n", 662 | "Now you can make a prediction by taking dot product of *`U`*, *`S`* and *`V^T`*.\n", 663 | "\n", 664 | "" 665 | ] 666 | }, 667 | { 668 | "cell_type": "code", 669 | "execution_count": 35, 670 | "metadata": {}, 671 | "outputs": [ 672 | { 673 | "name": "stdout", 674 | "output_type": "stream", 675 | "text": [ 676 | "User-based CF MSE: 2.727093975231784\n" 677 | ] 678 | } 679 | ], 680 | "source": [ 681 | "import scipy.sparse as sp\n", 682 | "from scipy.sparse.linalg import svds\n", 683 | "\n", 684 | "#get SVD components from train matrix. Choose k.\n", 685 | "u, s, vt = svds(train_data_matrix, k = 20)\n", 686 | "s_diag_matrix=np.diag(s)\n", 687 | "X_pred = np.dot(np.dot(u, s_diag_matrix), vt)\n", 688 | "print('User-based CF MSE: ' + str(rmse(X_pred, test_data_matrix)))" 689 | ] 690 | }, 691 | { 692 | "cell_type": "markdown", 693 | "metadata": {}, 694 | "source": [ 695 | "Carelessly addressing only the relatively few known entries is highly prone to overfitting. SVD can be very slow and computationally expensive. More recent work minimizes the squared error by applying alternating least square or stochastic gradient descent and uses regularization terms to prevent overfitting. Alternating least square and stochastic gradient descent methods for CF will be covered in the next tutorials.\n" 696 | ] 697 | }, 698 | { 699 | "cell_type": "markdown", 700 | "metadata": {}, 701 | "source": [ 702 | "Review:\n", 703 | "\n", 704 | "* We have covered how to implement simple **Collaborative Filtering** methods, both memory-based CF and model-based CF.\n", 705 | "* **Memory-based models** are based on similarity between items or users, where we use cosine-similarity.\n", 706 | "* **Model-based CF** is based on matrix factorization where we use SVD to factorize the matrix.\n", 707 | "* Building recommender systems that perform well in cold-start scenarios (where little data is available on new users and items) remains a challenge. The standard collaborative filtering method performs poorly is such settings. " 708 | ] 709 | }, 710 | { 711 | "cell_type": "markdown", 712 | "metadata": {}, 713 | "source": [ 714 | "## Looking for more?\n", 715 | "\n", 716 | "If you want to tackle your own recommendation system analysis, check out these data sets. Note: The files are quite large in most cases, not all the links may stay up to host the data, but the majority of them still work. Or just Google for your own data set!\n", 717 | "\n", 718 | "**Movies Recommendation:**\n", 719 | "\n", 720 | "MovieLens - Movie Recommendation Data Sets http://www.grouplens.org/node/73\n", 721 | "\n", 722 | "Yahoo! - Movie, Music, and Images Ratings Data Sets http://webscope.sandbox.yahoo.com/catalog.php?datatype=r\n", 723 | "\n", 724 | "Jester - Movie Ratings Data Sets (Collaborative Filtering Dataset) http://www.ieor.berkeley.edu/~goldberg/jester-data/\n", 725 | "\n", 726 | "Cornell University - Movie-review data for use in sentiment-analysis experiments http://www.cs.cornell.edu/people/pabo/movie-review-data/\n", 727 | "\n", 728 | "**Music Recommendation:**\n", 729 | "\n", 730 | "Last.fm - Music Recommendation Data Sets http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/index.html\n", 731 | "\n", 732 | "Yahoo! - Movie, Music, and Images Ratings Data Sets http://webscope.sandbox.yahoo.com/catalog.php?datatype=r\n", 733 | "\n", 734 | "Audioscrobbler - Music Recommendation Data Sets http://www-etud.iro.umontreal.ca/~bergstrj/audioscrobbler_data.html\n", 735 | "\n", 736 | "Amazon - Audio CD recommendations http://131.193.40.52/data/\n", 737 | "\n", 738 | "**Books Recommendation:**\n", 739 | "\n", 740 | "Institut für Informatik, Universität Freiburg - Book Ratings Data Sets http://www.informatik.uni-freiburg.de/~cziegler/BX/\n", 741 | "Food Recommendation:\n", 742 | "\n", 743 | "Chicago Entree - Food Ratings Data Sets http://archive.ics.uci.edu/ml/datasets/Entree+Chicago+Recommendation+Data\n", 744 | "Merchandise Recommendation:\n", 745 | "\n", 746 | "**Healthcare Recommendation:**\n", 747 | "\n", 748 | "Nursing Home - Provider Ratings Data Set http://data.medicare.gov/dataset/Nursing-Home-Compare-Provider-Ratings/mufm-vy8d\n", 749 | "\n", 750 | "Hospital Ratings - Survey of Patients Hospital Experiences http://data.medicare.gov/dataset/Survey-of-Patients-Hospital-Experiences-HCAHPS-/rj76-22dk\n", 751 | "\n", 752 | "**Dating Recommendation:**\n", 753 | "\n", 754 | "www.libimseti.cz - Dating website recommendation (collaborative filtering) http://www.occamslab.com/petricek/data/\n", 755 | "Scholarly Paper Recommendation:\n", 756 | "\n", 757 | "National University of Singapore - Scholarly Paper Recommendation http://www.comp.nus.edu.sg/~sugiyama/SchPaperRecData.html\n", 758 | "\n", 759 | "# Great Job!" 760 | ] 761 | }, 762 | { 763 | "cell_type": "code", 764 | "execution_count": null, 765 | "metadata": { 766 | "collapsed": true 767 | }, 768 | "outputs": [], 769 | "source": [] 770 | } 771 | ], 772 | "metadata": { 773 | "kernelspec": { 774 | "display_name": "Python 3", 775 | "language": "python", 776 | "name": "python3" 777 | }, 778 | "language_info": { 779 | "codemirror_mode": { 780 | "name": "ipython", 781 | "version": 3 782 | }, 783 | "file_extension": ".py", 784 | "mimetype": "text/x-python", 785 | "name": "python", 786 | "nbconvert_exporter": "python", 787 | "pygments_lexer": "ipython3", 788 | "version": "3.6.1" 789 | } 790 | }, 791 | "nbformat": 4, 792 | "nbformat_minor": 1 793 | } 794 | -------------------------------------------------------------------------------- /Python-Crash-Course/Python Crash Course Exercises .ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Python Crash Course Exercises \n", 8 | "\n", 9 | "This is an optional exercise to test your understanding of Python Basics. If you find this extremely challenging, then you probably are not ready for the rest of this course yet and don't have enough programming experience to continue. I would suggest you take another course more geared towards complete beginners, such as [Complete Python Bootcamp](https://www.udemy.com/complete-python-bootcamp)" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "## Exercises\n", 17 | "\n", 18 | "Answer the questions or complete the tasks outlined in bold below, use the specific method described if applicable." 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "** What is 7 to the power of 4?**" 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": 1, 31 | "metadata": {}, 32 | "outputs": [ 33 | { 34 | "data": { 35 | "text/plain": [ 36 | "2401" 37 | ] 38 | }, 39 | "execution_count": 1, 40 | "metadata": {}, 41 | "output_type": "execute_result" 42 | } 43 | ], 44 | "source": [ 45 | "7**4" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": {}, 51 | "source": [ 52 | "** Split this string:**\n", 53 | "\n", 54 | " s = \"Hi there Sam!\"\n", 55 | " \n", 56 | "**into a list. **" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": 3, 62 | "metadata": { 63 | "collapsed": true 64 | }, 65 | "outputs": [], 66 | "source": [ 67 | "s = \"Hi there Sam!\"" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 4, 73 | "metadata": {}, 74 | "outputs": [ 75 | { 76 | "data": { 77 | "text/plain": [ 78 | "['Hi', 'there', 'Sam!']" 79 | ] 80 | }, 81 | "execution_count": 4, 82 | "metadata": {}, 83 | "output_type": "execute_result" 84 | } 85 | ], 86 | "source": [ 87 | "s.split(\" \")" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "** Given the variables:**\n", 95 | "\n", 96 | " planet = \"Earth\"\n", 97 | " diameter = 12742\n", 98 | "\n", 99 | "** Use .format() to print the following string: **\n", 100 | "\n", 101 | " The diameter of Earth is 12742 kilometers." 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": 6, 107 | "metadata": { 108 | "collapsed": true 109 | }, 110 | "outputs": [], 111 | "source": [ 112 | "planet = \"Earth\"\n", 113 | "diameter = 12742" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 7, 119 | "metadata": {}, 120 | "outputs": [ 121 | { 122 | "name": "stdout", 123 | "output_type": "stream", 124 | "text": [ 125 | "The diameter of the Earth is 12742 kilometers.\n" 126 | ] 127 | } 128 | ], 129 | "source": [ 130 | "print(\"The diameter of the {} is {} kilometers.\".format(planet, diameter))" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "** Given this nested list, use indexing to grab the word \"hello\" **" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": 13, 143 | "metadata": { 144 | "collapsed": true 145 | }, 146 | "outputs": [], 147 | "source": [ 148 | "lst = [1,2,[3,4],[5,[100,200,['hello']],23,11],1,7]" 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": 14, 154 | "metadata": {}, 155 | "outputs": [ 156 | { 157 | "data": { 158 | "text/plain": [ 159 | "'hello'" 160 | ] 161 | }, 162 | "execution_count": 14, 163 | "metadata": {}, 164 | "output_type": "execute_result" 165 | } 166 | ], 167 | "source": [ 168 | "lst[3][1][2][0]" 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "metadata": {}, 174 | "source": [ 175 | "** Given this nested dictionary grab the word \"hello\". Be prepared, this will be annoying/tricky **" 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": 16, 181 | "metadata": { 182 | "collapsed": true 183 | }, 184 | "outputs": [], 185 | "source": [ 186 | "d = {'k1':[1,2,3,{'tricky':['oh','man','inception',{'target':[1,2,3,'hello']}]}]}" 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": 17, 192 | "metadata": {}, 193 | "outputs": [ 194 | { 195 | "data": { 196 | "text/plain": [ 197 | "'hello'" 198 | ] 199 | }, 200 | "execution_count": 17, 201 | "metadata": {}, 202 | "output_type": "execute_result" 203 | } 204 | ], 205 | "source": [ 206 | "d['k1'][3]['tricky'][3]['target'][3]" 207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "metadata": {}, 212 | "source": [ 213 | "** What is the main difference between a tuple and a list? **" 214 | ] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "metadata": { 219 | "collapsed": true 220 | }, 221 | "source": [ 222 | "A tuple uses parentheses, and individual elements cannot be overwritten or changed as easily as with a list." 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": {}, 228 | "source": [ 229 | "** Create a function that grabs the email website domain from a string in the form: **\n", 230 | "\n", 231 | " user@domain.com\n", 232 | " \n", 233 | "**So for example, passing \"user@domain.com\" would return: domain.com**" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 23, 239 | "metadata": { 240 | "collapsed": true 241 | }, 242 | "outputs": [], 243 | "source": [ 244 | "def domainGet(email):\n", 245 | " return email[(email.find('@') + 1):]" 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 24, 251 | "metadata": {}, 252 | "outputs": [ 253 | { 254 | "data": { 255 | "text/plain": [ 256 | "'domain.com'" 257 | ] 258 | }, 259 | "execution_count": 24, 260 | "metadata": {}, 261 | "output_type": "execute_result" 262 | } 263 | ], 264 | "source": [ 265 | "domainGet('user@domain.com')" 266 | ] 267 | }, 268 | { 269 | "cell_type": "markdown", 270 | "metadata": {}, 271 | "source": [ 272 | "** Create a basic function that returns True if the word 'dog' is contained in the input string. Don't worry about edge cases like a punctuation being attached to the word dog, but do account for capitalization. **" 273 | ] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "execution_count": 27, 278 | "metadata": { 279 | "collapsed": true 280 | }, 281 | "outputs": [], 282 | "source": [ 283 | "def findDog(s):\n", 284 | " return 'dog' in s.lower()" 285 | ] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": 28, 290 | "metadata": {}, 291 | "outputs": [ 292 | { 293 | "data": { 294 | "text/plain": [ 295 | "True" 296 | ] 297 | }, 298 | "execution_count": 28, 299 | "metadata": {}, 300 | "output_type": "execute_result" 301 | } 302 | ], 303 | "source": [ 304 | "findDog('Is there a dog here?')" 305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "** Create a function that counts the number of times the word \"dog\" occurs in a string. Again ignore edge cases. **" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": 32, 317 | "metadata": { 318 | "collapsed": true 319 | }, 320 | "outputs": [], 321 | "source": [ 322 | "def countDog(s):\n", 323 | " return len([x for x in s.lower().split(\" \") if x == 'dog'])" 324 | ] 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": 33, 329 | "metadata": {}, 330 | "outputs": [ 331 | { 332 | "data": { 333 | "text/plain": [ 334 | "2" 335 | ] 336 | }, 337 | "execution_count": 33, 338 | "metadata": {}, 339 | "output_type": "execute_result" 340 | } 341 | ], 342 | "source": [ 343 | "countDog('This dog runs faster than the other dog dude!')" 344 | ] 345 | }, 346 | { 347 | "cell_type": "markdown", 348 | "metadata": {}, 349 | "source": [ 350 | "** Use lambda expressions and the filter() function to filter out words from a list that don't start with the letter 's'. For example:**\n", 351 | "\n", 352 | " seq = ['soup','dog','salad','cat','great']\n", 353 | "\n", 354 | "**should be filtered down to:**\n", 355 | "\n", 356 | " ['soup','salad']" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": 7, 362 | "metadata": { 363 | "collapsed": true 364 | }, 365 | "outputs": [], 366 | "source": [ 367 | "seq = ['soup','dog','salad','cat','great']" 368 | ] 369 | }, 370 | { 371 | "cell_type": "code", 372 | "execution_count": 9, 373 | "metadata": {}, 374 | "outputs": [ 375 | { 376 | "data": { 377 | "text/plain": [ 378 | "['soup', 'salad']" 379 | ] 380 | }, 381 | "execution_count": 9, 382 | "metadata": {}, 383 | "output_type": "execute_result" 384 | } 385 | ], 386 | "source": [ 387 | "list(filter(lambda x: x[0] == 's', seq))" 388 | ] 389 | }, 390 | { 391 | "cell_type": "markdown", 392 | "metadata": {}, 393 | "source": [ 394 | "### Final Problem\n", 395 | "**You are driving a little too fast, and a police officer stops you. Write a function\n", 396 | " to return one of 3 possible results: \"No ticket\", \"Small ticket\", or \"Big Ticket\". \n", 397 | " If your speed is 60 or less, the result is \"No Ticket\". If speed is between 61 \n", 398 | " and 80 inclusive, the result is \"Small Ticket\". If speed is 81 or more, the result is \"Big Ticket\". Unless it is your birthday (encoded as a boolean value in the parameters of the function) -- on your birthday, your speed can be 5 higher in all \n", 399 | " cases. **" 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": 10, 405 | "metadata": { 406 | "collapsed": true 407 | }, 408 | "outputs": [], 409 | "source": [ 410 | "def caught_speeding(speed, is_birthday):\n", 411 | " s = \"No ticket\"\n", 412 | " if is_birthday:\n", 413 | " speed -= 5\n", 414 | " if speed > 80:\n", 415 | " s = \"Big Ticket\"\n", 416 | " elif speed > 60:\n", 417 | " s = \"Small Ticket\"\n", 418 | " return s" 419 | ] 420 | }, 421 | { 422 | "cell_type": "code", 423 | "execution_count": 11, 424 | "metadata": {}, 425 | "outputs": [ 426 | { 427 | "data": { 428 | "text/plain": [ 429 | "'Small Ticket'" 430 | ] 431 | }, 432 | "execution_count": 11, 433 | "metadata": {}, 434 | "output_type": "execute_result" 435 | } 436 | ], 437 | "source": [ 438 | "caught_speeding(81, True)" 439 | ] 440 | }, 441 | { 442 | "cell_type": "code", 443 | "execution_count": 12, 444 | "metadata": {}, 445 | "outputs": [ 446 | { 447 | "data": { 448 | "text/plain": [ 449 | "'Big Ticket'" 450 | ] 451 | }, 452 | "execution_count": 12, 453 | "metadata": {}, 454 | "output_type": "execute_result" 455 | } 456 | ], 457 | "source": [ 458 | "caught_speeding(81, False)" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": {}, 464 | "source": [ 465 | "# Great job!" 466 | ] 467 | } 468 | ], 469 | "metadata": { 470 | "kernelspec": { 471 | "display_name": "Python 3", 472 | "language": "python", 473 | "name": "python3" 474 | }, 475 | "language_info": { 476 | "codemirror_mode": { 477 | "name": "ipython", 478 | "version": 3 479 | }, 480 | "file_extension": ".py", 481 | "mimetype": "text/x-python", 482 | "name": "python", 483 | "nbconvert_exporter": "python", 484 | "pygments_lexer": "ipython3", 485 | "version": "3.6.1" 486 | } 487 | }, 488 | "nbformat": 4, 489 | "nbformat_minor": 1 490 | } 491 | -------------------------------------------------------------------------------- /Python-for-Data-Analysis/NumPy/Numpy Exercise .ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "___\n", 8 | "\n", 9 | " \n", 10 | "___" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": [ 17 | "# NumPy Exercises \n", 18 | "\n", 19 | "Now that we've learned about NumPy let's test your knowledge. We'll start off with a few simple tasks, and then you'll be asked some more complicated questions." 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "#### Import NumPy as np" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 1, 32 | "metadata": { 33 | "collapsed": true 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "import numpy as np" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "#### Create an array of 10 zeros " 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 2, 50 | "metadata": {}, 51 | "outputs": [ 52 | { 53 | "data": { 54 | "text/plain": [ 55 | "array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])" 56 | ] 57 | }, 58 | "execution_count": 2, 59 | "metadata": {}, 60 | "output_type": "execute_result" 61 | } 62 | ], 63 | "source": [ 64 | "np.zeros(10)" 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "metadata": {}, 70 | "source": [ 71 | "#### Create an array of 10 ones" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 3, 77 | "metadata": {}, 78 | "outputs": [ 79 | { 80 | "data": { 81 | "text/plain": [ 82 | "array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])" 83 | ] 84 | }, 85 | "execution_count": 3, 86 | "metadata": {}, 87 | "output_type": "execute_result" 88 | } 89 | ], 90 | "source": [ 91 | "np.ones(10)" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": {}, 97 | "source": [ 98 | "#### Create an array of 10 fives" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": 4, 104 | "metadata": {}, 105 | "outputs": [ 106 | { 107 | "data": { 108 | "text/plain": [ 109 | "array([ 5., 5., 5., 5., 5., 5., 5., 5., 5., 5.])" 110 | ] 111 | }, 112 | "execution_count": 4, 113 | "metadata": {}, 114 | "output_type": "execute_result" 115 | } 116 | ], 117 | "source": [ 118 | "np.ones(10)*5" 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "metadata": {}, 124 | "source": [ 125 | "#### Create an array of the integers from 10 to 50" 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": 5, 131 | "metadata": {}, 132 | "outputs": [ 133 | { 134 | "data": { 135 | "text/plain": [ 136 | "array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,\n", 137 | " 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,\n", 138 | " 44, 45, 46, 47, 48, 49, 50])" 139 | ] 140 | }, 141 | "execution_count": 5, 142 | "metadata": {}, 143 | "output_type": "execute_result" 144 | } 145 | ], 146 | "source": [ 147 | "np.arange(10, 51)" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "#### Create an array of all the even integers from 10 to 50" 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": 6, 160 | "metadata": {}, 161 | "outputs": [ 162 | { 163 | "data": { 164 | "text/plain": [ 165 | "array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,\n", 166 | " 44, 46, 48, 50])" 167 | ] 168 | }, 169 | "execution_count": 6, 170 | "metadata": {}, 171 | "output_type": "execute_result" 172 | } 173 | ], 174 | "source": [ 175 | "np.arange(10, 51, 2)" 176 | ] 177 | }, 178 | { 179 | "cell_type": "markdown", 180 | "metadata": {}, 181 | "source": [ 182 | "#### Create a 3x3 matrix with values ranging from 0 to 8" 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": 7, 188 | "metadata": {}, 189 | "outputs": [ 190 | { 191 | "data": { 192 | "text/plain": [ 193 | "array([[0, 1, 2],\n", 194 | " [3, 4, 5],\n", 195 | " [6, 7, 8]])" 196 | ] 197 | }, 198 | "execution_count": 7, 199 | "metadata": {}, 200 | "output_type": "execute_result" 201 | } 202 | ], 203 | "source": [ 204 | "np.arange(0, 9).reshape(3, 3)" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": {}, 210 | "source": [ 211 | "#### Create a 3x3 identity matrix" 212 | ] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "execution_count": 8, 217 | "metadata": {}, 218 | "outputs": [ 219 | { 220 | "data": { 221 | "text/plain": [ 222 | "array([[ 1., 0., 0.],\n", 223 | " [ 0., 1., 0.],\n", 224 | " [ 0., 0., 1.]])" 225 | ] 226 | }, 227 | "execution_count": 8, 228 | "metadata": {}, 229 | "output_type": "execute_result" 230 | } 231 | ], 232 | "source": [ 233 | "np.eye(3)" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "#### Use NumPy to generate a random number between 0 and 1" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": 9, 246 | "metadata": {}, 247 | "outputs": [ 248 | { 249 | "data": { 250 | "text/plain": [ 251 | "0.4759215240570823" 252 | ] 253 | }, 254 | "execution_count": 9, 255 | "metadata": {}, 256 | "output_type": "execute_result" 257 | } 258 | ], 259 | "source": [ 260 | "np.random.rand()" 261 | ] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "metadata": {}, 266 | "source": [ 267 | "#### Use NumPy to generate an array of 25 random numbers sampled from a standard normal distribution" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": 10, 273 | "metadata": {}, 274 | "outputs": [ 275 | { 276 | "data": { 277 | "text/plain": [ 278 | "array([ 0.40127342, -0.39744424, 0.41517652, 0.51154175, 0.76030236,\n", 279 | " 0.89558154, 2.17951736, -0.55272178, 0.06671282, 1.19590839,\n", 280 | " -0.34447993, -0.75960322, -1.40236996, -0.51880969, 1.04033601,\n", 281 | " 0.38454209, -1.233228 , 1.11419685, -0.97172684, 0.29713533,\n", 282 | " -0.62494569, -1.6248781 , 0.76947529, -0.62297017, 0.18219321])" 283 | ] 284 | }, 285 | "execution_count": 10, 286 | "metadata": {}, 287 | "output_type": "execute_result" 288 | } 289 | ], 290 | "source": [ 291 | "np.random.randn(25)" 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "#### Create the following matrix:" 299 | ] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "execution_count": 11, 304 | "metadata": {}, 305 | "outputs": [ 306 | { 307 | "data": { 308 | "text/plain": [ 309 | "array([[ 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ],\n", 310 | " [ 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2 ],\n", 311 | " [ 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3 ],\n", 312 | " [ 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4 ],\n", 313 | " [ 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5 ],\n", 314 | " [ 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6 ],\n", 315 | " [ 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7 ],\n", 316 | " [ 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8 ],\n", 317 | " [ 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9 ],\n", 318 | " [ 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, 1. ]])" 319 | ] 320 | }, 321 | "execution_count": 11, 322 | "metadata": {}, 323 | "output_type": "execute_result" 324 | } 325 | ], 326 | "source": [ 327 | "np.linspace(0.01, 1, 100).reshape(10, 10)" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "#### Create an array of 20 linearly spaced points between 0 and 1:" 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": 12, 340 | "metadata": {}, 341 | "outputs": [ 342 | { 343 | "data": { 344 | "text/plain": [ 345 | "array([ 0. , 0.05263158, 0.10526316, 0.15789474, 0.21052632,\n", 346 | " 0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,\n", 347 | " 0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,\n", 348 | " 0.78947368, 0.84210526, 0.89473684, 0.94736842, 1. ])" 349 | ] 350 | }, 351 | "execution_count": 12, 352 | "metadata": {}, 353 | "output_type": "execute_result" 354 | } 355 | ], 356 | "source": [ 357 | "np.linspace(0, 1, 20)" 358 | ] 359 | }, 360 | { 361 | "cell_type": "markdown", 362 | "metadata": {}, 363 | "source": [ 364 | "## Numpy Indexing and Selection\n", 365 | "\n", 366 | "Now you will be given a few matrices, and be asked to replicate the resulting matrix outputs:" 367 | ] 368 | }, 369 | { 370 | "cell_type": "code", 371 | "execution_count": 13, 372 | "metadata": {}, 373 | "outputs": [ 374 | { 375 | "data": { 376 | "text/plain": [ 377 | "array([[ 1, 2, 3, 4, 5],\n", 378 | " [ 6, 7, 8, 9, 10],\n", 379 | " [11, 12, 13, 14, 15],\n", 380 | " [16, 17, 18, 19, 20],\n", 381 | " [21, 22, 23, 24, 25]])" 382 | ] 383 | }, 384 | "execution_count": 13, 385 | "metadata": {}, 386 | "output_type": "execute_result" 387 | } 388 | ], 389 | "source": [ 390 | "mat = np.arange(1, 26).reshape(5, 5)\n", 391 | "mat" 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "execution_count": 25, 397 | "metadata": {}, 398 | "outputs": [ 399 | { 400 | "data": { 401 | "text/plain": [ 402 | "array([[12, 13, 14, 15],\n", 403 | " [17, 18, 19, 20],\n", 404 | " [22, 23, 24, 25]])" 405 | ] 406 | }, 407 | "execution_count": 25, 408 | "metadata": {}, 409 | "output_type": "execute_result" 410 | } 411 | ], 412 | "source": [ 413 | "# WRITE CODE HERE THAT REPRODUCES THE OUTPUT OF THE CELL BELOW\n", 414 | "# BE CAREFUL NOT TO RUN THE CELL BELOW, OTHERWISE YOU WON'T\n", 415 | "# BE ABLE TO SEE THE OUTPUT ANY MORE\n", 416 | "mat[2:, 1:]" 417 | ] 418 | }, 419 | { 420 | "cell_type": "code", 421 | "execution_count": 40, 422 | "metadata": {}, 423 | "outputs": [ 424 | { 425 | "data": { 426 | "text/plain": [ 427 | "array([[12, 13, 14, 15],\n", 428 | " [17, 18, 19, 20],\n", 429 | " [22, 23, 24, 25]])" 430 | ] 431 | }, 432 | "execution_count": 40, 433 | "metadata": {}, 434 | "output_type": "execute_result" 435 | } 436 | ], 437 | "source": [] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": 26, 442 | "metadata": {}, 443 | "outputs": [ 444 | { 445 | "data": { 446 | "text/plain": [ 447 | "20" 448 | ] 449 | }, 450 | "execution_count": 26, 451 | "metadata": {}, 452 | "output_type": "execute_result" 453 | } 454 | ], 455 | "source": [ 456 | "# WRITE CODE HERE THAT REPRODUCES THE OUTPUT OF THE CELL BELOW\n", 457 | "# BE CAREFUL NOT TO RUN THE CELL BELOW, OTHERWISE YOU WON'T\n", 458 | "# BE ABLE TO SEE THE OUTPUT ANY MORE\n", 459 | "mat[3][4]" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": 41, 465 | "metadata": {}, 466 | "outputs": [ 467 | { 468 | "data": { 469 | "text/plain": [ 470 | "20" 471 | ] 472 | }, 473 | "execution_count": 41, 474 | "metadata": {}, 475 | "output_type": "execute_result" 476 | } 477 | ], 478 | "source": [] 479 | }, 480 | { 481 | "cell_type": "code", 482 | "execution_count": 29, 483 | "metadata": {}, 484 | "outputs": [ 485 | { 486 | "data": { 487 | "text/plain": [ 488 | "array([[ 2],\n", 489 | " [ 7],\n", 490 | " [12]])" 491 | ] 492 | }, 493 | "execution_count": 29, 494 | "metadata": {}, 495 | "output_type": "execute_result" 496 | } 497 | ], 498 | "source": [ 499 | "# WRITE CODE HERE THAT REPRODUCES THE OUTPUT OF THE CELL BELOW\n", 500 | "# BE CAREFUL NOT TO RUN THE CELL BELOW, OTHERWISE YOU WON'T\n", 501 | "# BE ABLE TO SEE THE OUTPUT ANY MORE\n", 502 | "mat[:3,1].reshape(3,1)" 503 | ] 504 | }, 505 | { 506 | "cell_type": "code", 507 | "execution_count": 42, 508 | "metadata": {}, 509 | "outputs": [ 510 | { 511 | "data": { 512 | "text/plain": [ 513 | "array([[ 2],\n", 514 | " [ 7],\n", 515 | " [12]])" 516 | ] 517 | }, 518 | "execution_count": 42, 519 | "metadata": {}, 520 | "output_type": "execute_result" 521 | } 522 | ], 523 | "source": [] 524 | }, 525 | { 526 | "cell_type": "code", 527 | "execution_count": 31, 528 | "metadata": {}, 529 | "outputs": [ 530 | { 531 | "data": { 532 | "text/plain": [ 533 | "array([21, 22, 23, 24, 25])" 534 | ] 535 | }, 536 | "execution_count": 31, 537 | "metadata": {}, 538 | "output_type": "execute_result" 539 | } 540 | ], 541 | "source": [ 542 | "# WRITE CODE HERE THAT REPRODUCES THE OUTPUT OF THE CELL BELOW\n", 543 | "# BE CAREFUL NOT TO RUN THE CELL BELOW, OTHERWISE YOU WON'T\n", 544 | "# BE ABLE TO SEE THE OUTPUT ANY MORE\n", 545 | "mat[-1]" 546 | ] 547 | }, 548 | { 549 | "cell_type": "code", 550 | "execution_count": 46, 551 | "metadata": {}, 552 | "outputs": [ 553 | { 554 | "data": { 555 | "text/plain": [ 556 | "array([21, 22, 23, 24, 25])" 557 | ] 558 | }, 559 | "execution_count": 46, 560 | "metadata": {}, 561 | "output_type": "execute_result" 562 | } 563 | ], 564 | "source": [] 565 | }, 566 | { 567 | "cell_type": "code", 568 | "execution_count": 32, 569 | "metadata": {}, 570 | "outputs": [ 571 | { 572 | "data": { 573 | "text/plain": [ 574 | "array([[16, 17, 18, 19, 20],\n", 575 | " [21, 22, 23, 24, 25]])" 576 | ] 577 | }, 578 | "execution_count": 32, 579 | "metadata": {}, 580 | "output_type": "execute_result" 581 | } 582 | ], 583 | "source": [ 584 | "# WRITE CODE HERE THAT REPRODUCES THE OUTPUT OF THE CELL BELOW\n", 585 | "# BE CAREFUL NOT TO RUN THE CELL BELOW, OTHERWISE YOU WON'T\n", 586 | "# BE ABLE TO SEE THE OUTPUT ANY MORE\n", 587 | "mat[-2:]" 588 | ] 589 | }, 590 | { 591 | "cell_type": "code", 592 | "execution_count": 49, 593 | "metadata": {}, 594 | "outputs": [ 595 | { 596 | "data": { 597 | "text/plain": [ 598 | "array([[16, 17, 18, 19, 20],\n", 599 | " [21, 22, 23, 24, 25]])" 600 | ] 601 | }, 602 | "execution_count": 49, 603 | "metadata": {}, 604 | "output_type": "execute_result" 605 | } 606 | ], 607 | "source": [] 608 | }, 609 | { 610 | "cell_type": "markdown", 611 | "metadata": {}, 612 | "source": [ 613 | "### Now do the following" 614 | ] 615 | }, 616 | { 617 | "cell_type": "markdown", 618 | "metadata": {}, 619 | "source": [ 620 | "#### Get the sum of all the values in mat" 621 | ] 622 | }, 623 | { 624 | "cell_type": "code", 625 | "execution_count": 35, 626 | "metadata": {}, 627 | "outputs": [ 628 | { 629 | "data": { 630 | "text/plain": [ 631 | "325" 632 | ] 633 | }, 634 | "execution_count": 35, 635 | "metadata": {}, 636 | "output_type": "execute_result" 637 | } 638 | ], 639 | "source": [ 640 | "np.sum(mat)" 641 | ] 642 | }, 643 | { 644 | "cell_type": "markdown", 645 | "metadata": {}, 646 | "source": [ 647 | "#### Get the standard deviation of the values in mat" 648 | ] 649 | }, 650 | { 651 | "cell_type": "code", 652 | "execution_count": 34, 653 | "metadata": {}, 654 | "outputs": [ 655 | { 656 | "data": { 657 | "text/plain": [ 658 | "7.2111025509279782" 659 | ] 660 | }, 661 | "execution_count": 34, 662 | "metadata": {}, 663 | "output_type": "execute_result" 664 | } 665 | ], 666 | "source": [ 667 | "np.std(mat)" 668 | ] 669 | }, 670 | { 671 | "cell_type": "markdown", 672 | "metadata": {}, 673 | "source": [ 674 | "#### Get the sum of all the columns in mat" 675 | ] 676 | }, 677 | { 678 | "cell_type": "code", 679 | "execution_count": 36, 680 | "metadata": {}, 681 | "outputs": [ 682 | { 683 | "data": { 684 | "text/plain": [ 685 | "array([55, 60, 65, 70, 75])" 686 | ] 687 | }, 688 | "execution_count": 36, 689 | "metadata": {}, 690 | "output_type": "execute_result" 691 | } 692 | ], 693 | "source": [ 694 | "np.sum(mat, axis=0)" 695 | ] 696 | }, 697 | { 698 | "cell_type": "markdown", 699 | "metadata": { 700 | "collapsed": true 701 | }, 702 | "source": [ 703 | "# Great Job!" 704 | ] 705 | } 706 | ], 707 | "metadata": { 708 | "kernelspec": { 709 | "display_name": "Python 3", 710 | "language": "python", 711 | "name": "python3" 712 | }, 713 | "language_info": { 714 | "codemirror_mode": { 715 | "name": "ipython", 716 | "version": 3 717 | }, 718 | "file_extension": ".py", 719 | "mimetype": "text/x-python", 720 | "name": "python", 721 | "nbconvert_exporter": "python", 722 | "pygments_lexer": "ipython3", 723 | "version": "3.6.1" 724 | } 725 | }, 726 | "nbformat": 4, 727 | "nbformat_minor": 1 728 | } 729 | -------------------------------------------------------------------------------- /Python-for-Data-Analysis/Pandas/Excel_Sample.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/thomasdunlap/Python-DS-And-ML-Bootcamp/172dd5038aaf199d0e2e3170dbd90d0627937a9c/Python-for-Data-Analysis/Pandas/Excel_Sample.xlsx -------------------------------------------------------------------------------- /Python-for-Data-Analysis/Pandas/Pandas Exercises/Ecommerce Purchases Exercise .ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "___\n", 8 | "\n", 9 | " \n", 10 | "___\n", 11 | "# Ecommerce Purchases Exercise\n", 12 | "\n", 13 | "In this Exercise you will be given some Fake Data about some purchases done through Amazon! Just go ahead and follow the directions and try your best to answer the questions and complete the tasks. Feel free to reference the solutions. Most of the tasks can be solved in different ways. For the most part, the questions get progressively harder.\n", 14 | "\n", 15 | "Please excuse anything that doesn't make \"Real-World\" sense in the dataframe, all the data is fake and made-up.\n", 16 | "\n", 17 | "Also note that all of these questions can be answered with one line of code.\n", 18 | "____\n", 19 | "** Import pandas and read in the Ecommerce Purchases csv file and set it to a DataFrame called ecom. **" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 1, 25 | "metadata": { 26 | "collapsed": true 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "import pandas as pd" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 2, 36 | "metadata": { 37 | "collapsed": true 38 | }, 39 | "outputs": [], 40 | "source": [ 41 | "df = pd.read_csv(\"./Ecommerce Purchases\")" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": {}, 47 | "source": [ 48 | "**Check the head of the DataFrame.**" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 3, 54 | "metadata": {}, 55 | "outputs": [ 56 | { 57 | "data": { 58 | "text/html": [ 59 | "
\n", 60 | "\n", 73 | "\n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | "
AddressLotAM or PMBrowser InfoCompanyCredit CardCC Exp DateCC Security CodeCC ProviderEmailJobIP AddressLanguagePurchase Price
016629 Pace Camp Apt. 448\\nAlexisborough, NE 77...46 inPMOpera/9.56.(X11; Linux x86_64; sl-SI) Presto/2...Martinez-Herman601192906112340602/20900JCB 16 digitpdunlap@yahoo.comScientist, product/process development149.146.147.205el98.14
19374 Jasmine Spurs Suite 508\\nSouth John, TN 8...28 rnPMOpera/8.93.(Windows 98; Win 9x 4.90; en-US) Pr...Fletcher, Richards and Whitaker333775816964535611/18561Mastercardanthony41@reed.comDrilling engineer15.160.41.51fr70.73
2Unit 0065 Box 5052\\nDPO AP 2745094 vEPMMozilla/5.0 (compatible; MSIE 9.0; Windows NT ...Simpson, Williams and Pham67595766612508/19699JCB 16 digitamymiller@morales-harrison.comCustomer service manager132.207.160.22de0.95
37780 Julia Fords\\nNew Stacy, WA 4579836 vmPMMozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0 ...Williams, Marshall and Buchanan601157850443071002/24384Discoverbrent16@olson-robinson.infoDrilling engineer30.250.74.19es78.04
423012 Munoz Drive Suite 337\\nNew Cynthia, TX 5...20 IEAMOpera/9.58.(X11; Linux x86_64; it-IT) Presto/2...Brown, Watson and Andrews601145662320799810/25678Diners Club / Carte Blanchechristopherwright@gmail.comFine artist24.140.33.94es77.82
\n", 181 | "
" 182 | ], 183 | "text/plain": [ 184 | " Address Lot AM or PM \\\n", 185 | "0 16629 Pace Camp Apt. 448\\nAlexisborough, NE 77... 46 in PM \n", 186 | "1 9374 Jasmine Spurs Suite 508\\nSouth John, TN 8... 28 rn PM \n", 187 | "2 Unit 0065 Box 5052\\nDPO AP 27450 94 vE PM \n", 188 | "3 7780 Julia Fords\\nNew Stacy, WA 45798 36 vm PM \n", 189 | "4 23012 Munoz Drive Suite 337\\nNew Cynthia, TX 5... 20 IE AM \n", 190 | "\n", 191 | " Browser Info \\\n", 192 | "0 Opera/9.56.(X11; Linux x86_64; sl-SI) Presto/2... \n", 193 | "1 Opera/8.93.(Windows 98; Win 9x 4.90; en-US) Pr... \n", 194 | "2 Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ... \n", 195 | "3 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0 ... \n", 196 | "4 Opera/9.58.(X11; Linux x86_64; it-IT) Presto/2... \n", 197 | "\n", 198 | " Company Credit Card CC Exp Date \\\n", 199 | "0 Martinez-Herman 6011929061123406 02/20 \n", 200 | "1 Fletcher, Richards and Whitaker 3337758169645356 11/18 \n", 201 | "2 Simpson, Williams and Pham 675957666125 08/19 \n", 202 | "3 Williams, Marshall and Buchanan 6011578504430710 02/24 \n", 203 | "4 Brown, Watson and Andrews 6011456623207998 10/25 \n", 204 | "\n", 205 | " CC Security Code CC Provider \\\n", 206 | "0 900 JCB 16 digit \n", 207 | "1 561 Mastercard \n", 208 | "2 699 JCB 16 digit \n", 209 | "3 384 Discover \n", 210 | "4 678 Diners Club / Carte Blanche \n", 211 | "\n", 212 | " Email Job \\\n", 213 | "0 pdunlap@yahoo.com Scientist, product/process development \n", 214 | "1 anthony41@reed.com Drilling engineer \n", 215 | "2 amymiller@morales-harrison.com Customer service manager \n", 216 | "3 brent16@olson-robinson.info Drilling engineer \n", 217 | "4 christopherwright@gmail.com Fine artist \n", 218 | "\n", 219 | " IP Address Language Purchase Price \n", 220 | "0 149.146.147.205 el 98.14 \n", 221 | "1 15.160.41.51 fr 70.73 \n", 222 | "2 132.207.160.22 de 0.95 \n", 223 | "3 30.250.74.19 es 78.04 \n", 224 | "4 24.140.33.94 es 77.82 " 225 | ] 226 | }, 227 | "execution_count": 3, 228 | "metadata": {}, 229 | "output_type": "execute_result" 230 | } 231 | ], 232 | "source": [ 233 | "df.head()" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "** How many rows and columns are there? **" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": 7, 246 | "metadata": {}, 247 | "outputs": [ 248 | { 249 | "name": "stdout", 250 | "output_type": "stream", 251 | "text": [ 252 | "\n", 253 | "RangeIndex: 10000 entries, 0 to 9999\n", 254 | "Data columns (total 14 columns):\n", 255 | "Address 10000 non-null object\n", 256 | "Lot 10000 non-null object\n", 257 | "AM or PM 10000 non-null object\n", 258 | "Browser Info 10000 non-null object\n", 259 | "Company 10000 non-null object\n", 260 | "Credit Card 10000 non-null int64\n", 261 | "CC Exp Date 10000 non-null object\n", 262 | "CC Security Code 10000 non-null int64\n", 263 | "CC Provider 10000 non-null object\n", 264 | "Email 10000 non-null object\n", 265 | "Job 10000 non-null object\n", 266 | "IP Address 10000 non-null object\n", 267 | "Language 10000 non-null object\n", 268 | "Purchase Price 10000 non-null float64\n", 269 | "dtypes: float64(1), int64(2), object(11)\n", 270 | "memory usage: 1.1+ MB\n" 271 | ] 272 | } 273 | ], 274 | "source": [ 275 | "df.info()" 276 | ] 277 | }, 278 | { 279 | "cell_type": "markdown", 280 | "metadata": {}, 281 | "source": [ 282 | "** What is the average Purchase Price? **" 283 | ] 284 | }, 285 | { 286 | "cell_type": "code", 287 | "execution_count": 8, 288 | "metadata": {}, 289 | "outputs": [ 290 | { 291 | "data": { 292 | "text/plain": [ 293 | "50.34730200000025" 294 | ] 295 | }, 296 | "execution_count": 8, 297 | "metadata": {}, 298 | "output_type": "execute_result" 299 | } 300 | ], 301 | "source": [ 302 | "df['Purchase Price'].mean()" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": {}, 308 | "source": [ 309 | "** What were the highest and lowest purchase prices? **" 310 | ] 311 | }, 312 | { 313 | "cell_type": "code", 314 | "execution_count": 9, 315 | "metadata": {}, 316 | "outputs": [ 317 | { 318 | "data": { 319 | "text/plain": [ 320 | "99.989999999999995" 321 | ] 322 | }, 323 | "execution_count": 9, 324 | "metadata": {}, 325 | "output_type": "execute_result" 326 | } 327 | ], 328 | "source": [ 329 | "df['Purchase Price'].max()" 330 | ] 331 | }, 332 | { 333 | "cell_type": "code", 334 | "execution_count": 10, 335 | "metadata": {}, 336 | "outputs": [ 337 | { 338 | "data": { 339 | "text/plain": [ 340 | "0.0" 341 | ] 342 | }, 343 | "execution_count": 10, 344 | "metadata": {}, 345 | "output_type": "execute_result" 346 | } 347 | ], 348 | "source": [ 349 | "df['Purchase Price'].min()" 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": {}, 355 | "source": [ 356 | "** How many people have English 'en' as their Language of choice on the website? **" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": 11, 362 | "metadata": {}, 363 | "outputs": [ 364 | { 365 | "data": { 366 | "text/plain": [ 367 | "Address 1098\n", 368 | "Lot 1098\n", 369 | "AM or PM 1098\n", 370 | "Browser Info 1098\n", 371 | "Company 1098\n", 372 | "Credit Card 1098\n", 373 | "CC Exp Date 1098\n", 374 | "CC Security Code 1098\n", 375 | "CC Provider 1098\n", 376 | "Email 1098\n", 377 | "Job 1098\n", 378 | "IP Address 1098\n", 379 | "Language 1098\n", 380 | "Purchase Price 1098\n", 381 | "dtype: int64" 382 | ] 383 | }, 384 | "execution_count": 11, 385 | "metadata": {}, 386 | "output_type": "execute_result" 387 | } 388 | ], 389 | "source": [ 390 | "df[df['Language'] == 'en'].count()" 391 | ] 392 | }, 393 | { 394 | "cell_type": "markdown", 395 | "metadata": {}, 396 | "source": [ 397 | "** How many people have the job title of \"Lawyer\" ? **\n" 398 | ] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "execution_count": 12, 403 | "metadata": {}, 404 | "outputs": [ 405 | { 406 | "name": "stdout", 407 | "output_type": "stream", 408 | "text": [ 409 | "\n", 410 | "Int64Index: 30 entries, 470 to 9979\n", 411 | "Data columns (total 14 columns):\n", 412 | "Address 30 non-null object\n", 413 | "Lot 30 non-null object\n", 414 | "AM or PM 30 non-null object\n", 415 | "Browser Info 30 non-null object\n", 416 | "Company 30 non-null object\n", 417 | "Credit Card 30 non-null int64\n", 418 | "CC Exp Date 30 non-null object\n", 419 | "CC Security Code 30 non-null int64\n", 420 | "CC Provider 30 non-null object\n", 421 | "Email 30 non-null object\n", 422 | "Job 30 non-null object\n", 423 | "IP Address 30 non-null object\n", 424 | "Language 30 non-null object\n", 425 | "Purchase Price 30 non-null float64\n", 426 | "dtypes: float64(1), int64(2), object(11)\n", 427 | "memory usage: 3.5+ KB\n" 428 | ] 429 | } 430 | ], 431 | "source": [ 432 | "df[df['Job'] == 'Lawyer'].info()" 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": {}, 438 | "source": [ 439 | "** How many people made the purchase during the AM and how many people made the purchase during PM ? **\n", 440 | "\n", 441 | "**(Hint: Check out [value_counts()](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html) ) **" 442 | ] 443 | }, 444 | { 445 | "cell_type": "code", 446 | "execution_count": 13, 447 | "metadata": {}, 448 | "outputs": [ 449 | { 450 | "data": { 451 | "text/plain": [ 452 | "PM 5068\n", 453 | "AM 4932\n", 454 | "Name: AM or PM, dtype: int64" 455 | ] 456 | }, 457 | "execution_count": 13, 458 | "metadata": {}, 459 | "output_type": "execute_result" 460 | } 461 | ], 462 | "source": [ 463 | "df['AM or PM'].value_counts()" 464 | ] 465 | }, 466 | { 467 | "cell_type": "markdown", 468 | "metadata": {}, 469 | "source": [ 470 | "** What are the 5 most common Job Titles? **" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": 15, 476 | "metadata": {}, 477 | "outputs": [ 478 | { 479 | "data": { 480 | "text/plain": [ 481 | "Interior and spatial designer 31\n", 482 | "Lawyer 30\n", 483 | "Social researcher 28\n", 484 | "Designer, jewellery 27\n", 485 | "Purchasing manager 27\n", 486 | "Name: Job, dtype: int64" 487 | ] 488 | }, 489 | "execution_count": 15, 490 | "metadata": {}, 491 | "output_type": "execute_result" 492 | } 493 | ], 494 | "source": [ 495 | "df['Job'].value_counts().head(5)" 496 | ] 497 | }, 498 | { 499 | "cell_type": "markdown", 500 | "metadata": {}, 501 | "source": [ 502 | "** Someone made a purchase that came from Lot: \"90 WT\" , what was the Purchase Price for this transaction? **" 503 | ] 504 | }, 505 | { 506 | "cell_type": "code", 507 | "execution_count": 16, 508 | "metadata": {}, 509 | "outputs": [ 510 | { 511 | "data": { 512 | "text/plain": [ 513 | "513 75.1\n", 514 | "Name: Purchase Price, dtype: float64" 515 | ] 516 | }, 517 | "execution_count": 16, 518 | "metadata": {}, 519 | "output_type": "execute_result" 520 | } 521 | ], 522 | "source": [ 523 | "df[df['Lot'] == \"90 WT\"]['Purchase Price']" 524 | ] 525 | }, 526 | { 527 | "cell_type": "markdown", 528 | "metadata": {}, 529 | "source": [ 530 | "** What is the email of the person with the following Credit Card Number: 4926535242672853 **" 531 | ] 532 | }, 533 | { 534 | "cell_type": "code", 535 | "execution_count": 19, 536 | "metadata": {}, 537 | "outputs": [ 538 | { 539 | "data": { 540 | "text/plain": [ 541 | "1234 bondellen@williams-garza.com\n", 542 | "Name: Email, dtype: object" 543 | ] 544 | }, 545 | "execution_count": 19, 546 | "metadata": {}, 547 | "output_type": "execute_result" 548 | } 549 | ], 550 | "source": [ 551 | "df[df['Credit Card'] == 4926535242672853]['Email']" 552 | ] 553 | }, 554 | { 555 | "cell_type": "markdown", 556 | "metadata": {}, 557 | "source": [ 558 | "** How many people have American Express as their Credit Card Provider *and* made a purchase above $95 ?**" 559 | ] 560 | }, 561 | { 562 | "cell_type": "code", 563 | "execution_count": 24, 564 | "metadata": {}, 565 | "outputs": [ 566 | { 567 | "data": { 568 | "text/plain": [ 569 | "Address 39\n", 570 | "Lot 39\n", 571 | "AM or PM 39\n", 572 | "Browser Info 39\n", 573 | "Company 39\n", 574 | "Credit Card 39\n", 575 | "CC Exp Date 39\n", 576 | "CC Security Code 39\n", 577 | "CC Provider 39\n", 578 | "Email 39\n", 579 | "Job 39\n", 580 | "IP Address 39\n", 581 | "Language 39\n", 582 | "Purchase Price 39\n", 583 | "dtype: int64" 584 | ] 585 | }, 586 | "execution_count": 24, 587 | "metadata": {}, 588 | "output_type": "execute_result" 589 | } 590 | ], 591 | "source": [ 592 | "df[(df['CC Provider'] == \"American Express\") & (df['Purchase Price'] > 95)].count()" 593 | ] 594 | }, 595 | { 596 | "cell_type": "markdown", 597 | "metadata": {}, 598 | "source": [ 599 | "** Hard: How many people have a credit card that expires in 2025? **" 600 | ] 601 | }, 602 | { 603 | "cell_type": "code", 604 | "execution_count": 32, 605 | "metadata": {}, 606 | "outputs": [ 607 | { 608 | "data": { 609 | "text/plain": [ 610 | "1033" 611 | ] 612 | }, 613 | "execution_count": 32, 614 | "metadata": {}, 615 | "output_type": "execute_result" 616 | } 617 | ], 618 | "source": [ 619 | "sum(df['CC Exp Date'].apply(lambda x: x[-2:]) == '25')" 620 | ] 621 | }, 622 | { 623 | "cell_type": "markdown", 624 | "metadata": {}, 625 | "source": [ 626 | "** Hard: What are the top 5 most popular email providers/hosts (e.g. gmail.com, yahoo.com, etc...) **" 627 | ] 628 | }, 629 | { 630 | "cell_type": "code", 631 | "execution_count": 33, 632 | "metadata": {}, 633 | "outputs": [ 634 | { 635 | "data": { 636 | "text/plain": [ 637 | "hotmail.com 1638\n", 638 | "yahoo.com 1616\n", 639 | "gmail.com 1605\n", 640 | "smith.com 42\n", 641 | "williams.com 37\n", 642 | "Name: Email, dtype: int64" 643 | ] 644 | }, 645 | "execution_count": 33, 646 | "metadata": {}, 647 | "output_type": "execute_result" 648 | } 649 | ], 650 | "source": [ 651 | "df['Email'].apply(lambda x: x.split('@')[-1]).value_counts().head(5)" 652 | ] 653 | }, 654 | { 655 | "cell_type": "markdown", 656 | "metadata": {}, 657 | "source": [ 658 | "# Great Job!" 659 | ] 660 | } 661 | ], 662 | "metadata": { 663 | "kernelspec": { 664 | "display_name": "Python 3", 665 | "language": "python", 666 | "name": "python3" 667 | }, 668 | "language_info": { 669 | "codemirror_mode": { 670 | "name": "ipython", 671 | "version": 3 672 | }, 673 | "file_extension": ".py", 674 | "mimetype": "text/x-python", 675 | "name": "python", 676 | "nbconvert_exporter": "python", 677 | "pygments_lexer": "ipython3", 678 | "version": "3.6.1" 679 | } 680 | }, 681 | "nbformat": 4, 682 | "nbformat_minor": 1 683 | } 684 | -------------------------------------------------------------------------------- /Python-for-Data-Analysis/Pandas/Pandas Exercises/SF Salaries Exercise.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "___\n", 8 | "\n", 9 | " \n", 10 | "___" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": [ 17 | "# SF Salaries Exercise \n", 18 | "\n", 19 | "Welcome to a quick exercise for you to practice your pandas skills! We will be using the [SF Salaries Dataset](https://www.kaggle.com/kaggle/sf-salaries) from Kaggle! Just follow along and complete the tasks outlined in bold below. The tasks will get harder and harder as you go along." 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "** Import pandas as pd.**" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 1, 32 | "metadata": { 33 | "collapsed": true 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "import pandas as pd" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "** Read Salaries.csv as a dataframe called sal.**" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 2, 50 | "metadata": { 51 | "collapsed": true 52 | }, 53 | "outputs": [], 54 | "source": [ 55 | "df = pd.read_csv('Salaries.csv')" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "** Check the head of the DataFrame. **" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": 27, 68 | "metadata": {}, 69 | "outputs": [ 70 | { 71 | "data": { 72 | "text/html": [ 73 | "
\n", 74 | "\n", 87 | "\n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | "
IdEmployeeNameJobTitleBasePayOvertimePayOtherPayBenefitsTotalPayTotalPayBenefitsYearNotesAgencyStatus
01NATHANIEL FORDGENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY167411.180.00400184.25NaN567595.43567595.432011NaNSan FranciscoNaN
12GARY JIMENEZCAPTAIN III (POLICE DEPARTMENT)155966.02245131.88137811.38NaN538909.28538909.282011NaNSan FranciscoNaN
23ALBERT PARDINICAPTAIN III (POLICE DEPARTMENT)212739.13106088.1816452.60NaN335279.91335279.912011NaNSan FranciscoNaN
34CHRISTOPHER CHONGWIRE ROPE CABLE MAINTENANCE MECHANIC77916.0056120.71198306.90NaN332343.61332343.612011NaNSan FranciscoNaN
45PATRICK GARDNERDEPUTY CHIEF OF DEPARTMENT,(FIRE DEPARTMENT)134401.609737.00182234.59NaN326373.19326373.192011NaNSan FranciscoNaN
\n", 189 | "
" 190 | ], 191 | "text/plain": [ 192 | " Id EmployeeName JobTitle \\\n", 193 | "0 1 NATHANIEL FORD GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY \n", 194 | "1 2 GARY JIMENEZ CAPTAIN III (POLICE DEPARTMENT) \n", 195 | "2 3 ALBERT PARDINI CAPTAIN III (POLICE DEPARTMENT) \n", 196 | "3 4 CHRISTOPHER CHONG WIRE ROPE CABLE MAINTENANCE MECHANIC \n", 197 | "4 5 PATRICK GARDNER DEPUTY CHIEF OF DEPARTMENT,(FIRE DEPARTMENT) \n", 198 | "\n", 199 | " BasePay OvertimePay OtherPay Benefits TotalPay TotalPayBenefits \\\n", 200 | "0 167411.18 0.00 400184.25 NaN 567595.43 567595.43 \n", 201 | "1 155966.02 245131.88 137811.38 NaN 538909.28 538909.28 \n", 202 | "2 212739.13 106088.18 16452.60 NaN 335279.91 335279.91 \n", 203 | "3 77916.00 56120.71 198306.90 NaN 332343.61 332343.61 \n", 204 | "4 134401.60 9737.00 182234.59 NaN 326373.19 326373.19 \n", 205 | "\n", 206 | " Year Notes Agency Status \n", 207 | "0 2011 NaN San Francisco NaN \n", 208 | "1 2011 NaN San Francisco NaN \n", 209 | "2 2011 NaN San Francisco NaN \n", 210 | "3 2011 NaN San Francisco NaN \n", 211 | "4 2011 NaN San Francisco NaN " 212 | ] 213 | }, 214 | "execution_count": 27, 215 | "metadata": {}, 216 | "output_type": "execute_result" 217 | } 218 | ], 219 | "source": [ 220 | "df.head()" 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": {}, 226 | "source": [ 227 | "** Use the .info() method to find out how many entries there are.**" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": 4, 233 | "metadata": {}, 234 | "outputs": [ 235 | { 236 | "name": "stdout", 237 | "output_type": "stream", 238 | "text": [ 239 | "\n", 240 | "RangeIndex: 148654 entries, 0 to 148653\n", 241 | "Data columns (total 13 columns):\n", 242 | "Id 148654 non-null int64\n", 243 | "EmployeeName 148654 non-null object\n", 244 | "JobTitle 148654 non-null object\n", 245 | "BasePay 148045 non-null float64\n", 246 | "OvertimePay 148650 non-null float64\n", 247 | "OtherPay 148650 non-null float64\n", 248 | "Benefits 112491 non-null float64\n", 249 | "TotalPay 148654 non-null float64\n", 250 | "TotalPayBenefits 148654 non-null float64\n", 251 | "Year 148654 non-null int64\n", 252 | "Notes 0 non-null float64\n", 253 | "Agency 148654 non-null object\n", 254 | "Status 0 non-null float64\n", 255 | "dtypes: float64(8), int64(2), object(3)\n", 256 | "memory usage: 14.7+ MB\n" 257 | ] 258 | } 259 | ], 260 | "source": [ 261 | "df.info()" 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "metadata": {}, 267 | "source": [ 268 | "**What is the average BasePay ?**" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": 5, 274 | "metadata": {}, 275 | "outputs": [ 276 | { 277 | "data": { 278 | "text/plain": [ 279 | "66325.44884050643" 280 | ] 281 | }, 282 | "execution_count": 5, 283 | "metadata": {}, 284 | "output_type": "execute_result" 285 | } 286 | ], 287 | "source": [ 288 | "df['BasePay'].mean()" 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": {}, 294 | "source": [ 295 | "** What is the highest amount of OvertimePay in the dataset ? **" 296 | ] 297 | }, 298 | { 299 | "cell_type": "code", 300 | "execution_count": 6, 301 | "metadata": {}, 302 | "outputs": [ 303 | { 304 | "data": { 305 | "text/plain": [ 306 | "245131.88" 307 | ] 308 | }, 309 | "execution_count": 6, 310 | "metadata": {}, 311 | "output_type": "execute_result" 312 | } 313 | ], 314 | "source": [ 315 | "df['OvertimePay'].max()" 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": {}, 321 | "source": [ 322 | "** What is the job title of JOSEPH DRISCOLL ? Note: Use all caps, otherwise you may get an answer that doesn't match up (there is also a lowercase Joseph Driscoll). **" 323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": 7, 328 | "metadata": {}, 329 | "outputs": [ 330 | { 331 | "data": { 332 | "text/plain": [ 333 | "24 CAPTAIN, FIRE SUPPRESSION\n", 334 | "Name: JobTitle, dtype: object" 335 | ] 336 | }, 337 | "execution_count": 7, 338 | "metadata": {}, 339 | "output_type": "execute_result" 340 | } 341 | ], 342 | "source": [ 343 | "df[df['EmployeeName'] == 'JOSEPH DRISCOLL']['JobTitle']" 344 | ] 345 | }, 346 | { 347 | "cell_type": "markdown", 348 | "metadata": {}, 349 | "source": [ 350 | "** How much does JOSEPH DRISCOLL make (including benefits)? **" 351 | ] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "execution_count": 8, 356 | "metadata": {}, 357 | "outputs": [ 358 | { 359 | "data": { 360 | "text/plain": [ 361 | "24 270324.91\n", 362 | "Name: TotalPayBenefits, dtype: float64" 363 | ] 364 | }, 365 | "execution_count": 8, 366 | "metadata": {}, 367 | "output_type": "execute_result" 368 | } 369 | ], 370 | "source": [ 371 | "df[df['EmployeeName'] == 'JOSEPH DRISCOLL']['TotalPayBenefits']" 372 | ] 373 | }, 374 | { 375 | "cell_type": "markdown", 376 | "metadata": {}, 377 | "source": [ 378 | "** What is the name of highest paid person (including benefits)?**" 379 | ] 380 | }, 381 | { 382 | "cell_type": "code", 383 | "execution_count": 9, 384 | "metadata": {}, 385 | "outputs": [ 386 | { 387 | "data": { 388 | "text/html": [ 389 | "
\n", 390 | "\n", 403 | "\n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | "
IdEmployeeNameJobTitleBasePayOvertimePayOtherPayBenefitsTotalPayTotalPayBenefitsYearNotesAgencyStatus
01NATHANIEL FORDGENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY167411.180.0400184.25NaN567595.43567595.432011NaNSan FranciscoNaN
\n", 441 | "
" 442 | ], 443 | "text/plain": [ 444 | " Id EmployeeName JobTitle \\\n", 445 | "0 1 NATHANIEL FORD GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY \n", 446 | "\n", 447 | " BasePay OvertimePay OtherPay Benefits TotalPay TotalPayBenefits \\\n", 448 | "0 167411.18 0.0 400184.25 NaN 567595.43 567595.43 \n", 449 | "\n", 450 | " Year Notes Agency Status \n", 451 | "0 2011 NaN San Francisco NaN " 452 | ] 453 | }, 454 | "execution_count": 9, 455 | "metadata": {}, 456 | "output_type": "execute_result" 457 | } 458 | ], 459 | "source": [ 460 | "df[df['TotalPayBenefits'] == df['TotalPayBenefits'].max()]" 461 | ] 462 | }, 463 | { 464 | "cell_type": "markdown", 465 | "metadata": {}, 466 | "source": [ 467 | "** What is the name of lowest paid person (including benefits)? Do you notice something strange about how much he or she is paid?**" 468 | ] 469 | }, 470 | { 471 | "cell_type": "code", 472 | "execution_count": 10, 473 | "metadata": {}, 474 | "outputs": [ 475 | { 476 | "data": { 477 | "text/html": [ 478 | "
\n", 479 | "\n", 492 | "\n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 506 | " \n", 507 | " \n", 508 | " \n", 509 | " \n", 510 | " \n", 511 | " \n", 512 | " \n", 513 | " \n", 514 | " \n", 515 | " \n", 516 | " \n", 517 | " \n", 518 | " \n", 519 | " \n", 520 | " \n", 521 | " \n", 522 | " \n", 523 | " \n", 524 | " \n", 525 | " \n", 526 | " \n", 527 | " \n", 528 | " \n", 529 | "
IdEmployeeNameJobTitleBasePayOvertimePayOtherPayBenefitsTotalPayTotalPayBenefitsYearNotesAgencyStatus
148653148654Joe LopezCounselor, Log Cabin Ranch0.00.0-618.130.0-618.13-618.132014NaNSan FranciscoNaN
\n", 530 | "
" 531 | ], 532 | "text/plain": [ 533 | " Id EmployeeName JobTitle BasePay OvertimePay \\\n", 534 | "148653 148654 Joe Lopez Counselor, Log Cabin Ranch 0.0 0.0 \n", 535 | "\n", 536 | " OtherPay Benefits TotalPay TotalPayBenefits Year Notes \\\n", 537 | "148653 -618.13 0.0 -618.13 -618.13 2014 NaN \n", 538 | "\n", 539 | " Agency Status \n", 540 | "148653 San Francisco NaN " 541 | ] 542 | }, 543 | "execution_count": 10, 544 | "metadata": {}, 545 | "output_type": "execute_result" 546 | } 547 | ], 548 | "source": [ 549 | "df[df['TotalPayBenefits'] == df['TotalPayBenefits'].min()]" 550 | ] 551 | }, 552 | { 553 | "cell_type": "markdown", 554 | "metadata": {}, 555 | "source": [ 556 | "** What was the average (mean) BasePay of all employees per year? (2011-2014) ? **" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": 12, 562 | "metadata": {}, 563 | "outputs": [ 564 | { 565 | "data": { 566 | "text/plain": [ 567 | "Year\n", 568 | "2011 63595.956517\n", 569 | "2012 65436.406857\n", 570 | "2013 69630.030216\n", 571 | "2014 66564.421924\n", 572 | "Name: BasePay, dtype: float64" 573 | ] 574 | }, 575 | "execution_count": 12, 576 | "metadata": {}, 577 | "output_type": "execute_result" 578 | } 579 | ], 580 | "source": [ 581 | "df.groupby('Year').mean()['BasePay']" 582 | ] 583 | }, 584 | { 585 | "cell_type": "markdown", 586 | "metadata": {}, 587 | "source": [ 588 | "** How many unique job titles are there? **" 589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": 13, 594 | "metadata": {}, 595 | "outputs": [ 596 | { 597 | "data": { 598 | "text/plain": [ 599 | "2159" 600 | ] 601 | }, 602 | "execution_count": 13, 603 | "metadata": {}, 604 | "output_type": "execute_result" 605 | } 606 | ], 607 | "source": [ 608 | "df['JobTitle'].nunique()" 609 | ] 610 | }, 611 | { 612 | "cell_type": "markdown", 613 | "metadata": {}, 614 | "source": [ 615 | "** What are the top 5 most common jobs? **" 616 | ] 617 | }, 618 | { 619 | "cell_type": "code", 620 | "execution_count": 28, 621 | "metadata": {}, 622 | "outputs": [ 623 | { 624 | "data": { 625 | "text/plain": [ 626 | "Transit Operator 7036\n", 627 | "Special Nurse 4389\n", 628 | "Registered Nurse 3736\n", 629 | "Public Svc Aide-Public Works 2518\n", 630 | "Police Officer 3 2421\n", 631 | "Name: JobTitle, dtype: int64" 632 | ] 633 | }, 634 | "execution_count": 28, 635 | "metadata": {}, 636 | "output_type": "execute_result" 637 | } 638 | ], 639 | "source": [ 640 | "df['JobTitle'].value_counts().head()" 641 | ] 642 | }, 643 | { 644 | "cell_type": "markdown", 645 | "metadata": {}, 646 | "source": [ 647 | "** How many Job Titles were represented by only one person in 2013? (e.g. Job Titles with only one occurence in 2013?) **" 648 | ] 649 | }, 650 | { 651 | "cell_type": "code", 652 | "execution_count": 21, 653 | "metadata": {}, 654 | "outputs": [ 655 | { 656 | "data": { 657 | "text/plain": [ 658 | "202" 659 | ] 660 | }, 661 | "execution_count": 21, 662 | "metadata": {}, 663 | "output_type": "execute_result" 664 | } 665 | ], 666 | "source": [ 667 | "sum(df[df['Year']==2013]['JobTitle'].value_counts() == 1)" 668 | ] 669 | }, 670 | { 671 | "cell_type": "markdown", 672 | "metadata": {}, 673 | "source": [ 674 | "** How many people have the word Chief in their job title? (This is pretty tricky) **" 675 | ] 676 | }, 677 | { 678 | "cell_type": "code", 679 | "execution_count": 38, 680 | "metadata": {}, 681 | "outputs": [ 682 | { 683 | "data": { 684 | "text/plain": [ 685 | "627" 686 | ] 687 | }, 688 | "execution_count": 38, 689 | "metadata": {}, 690 | "output_type": "execute_result" 691 | } 692 | ], 693 | "source": [ 694 | "len(df[df['JobTitle'].apply(lambda x: 'chief' in x.lower())])" 695 | ] 696 | }, 697 | { 698 | "cell_type": "markdown", 699 | "metadata": {}, 700 | "source": [ 701 | "** Bonus: Is there a correlation between length of the Job Title string and Salary? **" 702 | ] 703 | }, 704 | { 705 | "cell_type": "code", 706 | "execution_count": 41, 707 | "metadata": { 708 | "collapsed": true 709 | }, 710 | "outputs": [], 711 | "source": [ 712 | "df['LengthOfTitle'] = df['JobTitle'].apply(len)" 713 | ] 714 | }, 715 | { 716 | "cell_type": "code", 717 | "execution_count": 42, 718 | "metadata": {}, 719 | "outputs": [ 720 | { 721 | "data": { 722 | "text/html": [ 723 | "
\n", 724 | "\n", 737 | "\n", 738 | " \n", 739 | " \n", 740 | " \n", 741 | " \n", 742 | " \n", 743 | " \n", 744 | " \n", 745 | " \n", 746 | " \n", 747 | " \n", 748 | " \n", 749 | " \n", 750 | " \n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | " \n", 757 | "
LengthOfTitleTotalPayBenefits
LengthOfTitle1.000000-0.036878
TotalPayBenefits-0.0368781.000000
\n", 758 | "
" 759 | ], 760 | "text/plain": [ 761 | " LengthOfTitle TotalPayBenefits\n", 762 | "LengthOfTitle 1.000000 -0.036878\n", 763 | "TotalPayBenefits -0.036878 1.000000" 764 | ] 765 | }, 766 | "execution_count": 42, 767 | "metadata": {}, 768 | "output_type": "execute_result" 769 | } 770 | ], 771 | "source": [ 772 | "df[['LengthOfTitle', 'TotalPayBenefits']].corr()" 773 | ] 774 | }, 775 | { 776 | "cell_type": "markdown", 777 | "metadata": {}, 778 | "source": [ 779 | "# Great Job!" 780 | ] 781 | } 782 | ], 783 | "metadata": { 784 | "kernelspec": { 785 | "display_name": "Python 3", 786 | "language": "python", 787 | "name": "python3" 788 | }, 789 | "language_info": { 790 | "codemirror_mode": { 791 | "name": "ipython", 792 | "version": 3 793 | }, 794 | "file_extension": ".py", 795 | "mimetype": "text/x-python", 796 | "name": "python", 797 | "nbconvert_exporter": "python", 798 | "pygments_lexer": "ipython3", 799 | "version": "3.6.1" 800 | } 801 | }, 802 | "nbformat": 4, 803 | "nbformat_minor": 1 804 | } 805 | -------------------------------------------------------------------------------- /Python-for-Data-Analysis/Pandas/gal.csv: -------------------------------------------------------------------------------- 1 | Bucket,Quotes,Views 2 | Baseline,32,595 3 | Variation 1,30,599 4 | Variation 2,18,622 5 | Variation 3,51,606 6 | Variation 4,38,578 7 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Python for Data Science and Machine Learning Bootcamp 2 | 3 | Here is the link to the [Udemy course](https://www.udemy.com/python-for-data-science-and-machine-learning-bootcamp/). Jose Portilla is a great teacher and it is well worth the cost. 4 | 5 | ### Directories 6 | * [Big Data and Spark](https://github.com/thomasdunlap/Python-DS-And-ML-Bootcamp/tree/master/Big-Data-and-Spark) 7 | * [Machine Learning Sections](https://github.com/thomasdunlap/Python-DS-And-ML-Bootcamp/tree/master/Machine%20Learning%20Sections) 8 | * Recommender Systems 9 | * Natural Language Processing 10 | * Principal Component Analysis 11 | * Cross Validation Bias Variance 12 | * Support Vector Machines 13 | * Decision Trees and Random Forests 14 | * K-Means Clustering 15 | * K-Nearest Neighbors 16 | * Linear Regression 17 | * Logistic Regression 18 | * [Python Data Analysis](https://github.com/thomasdunlap/Python-DS-And-ML-Bootcamp/tree/master/Python-for-Data-Analysis) 19 | * Pandas 20 | * NumPy 21 | * [Python Data Visualization](https://github.com/thomasdunlap/Python-DS-And-ML-Bootcamp/tree/master/Python-for-Data-Visualization) 22 | * Seaborn 23 | * Geographical Plotting 24 | * Matplotlib 25 | * Pandas Built-in Visualization 26 | * [Python Crash Course](https://github.com/thomasdunlap/Python-DS-And-ML-Bootcamp/tree/master/Python-Crash-Course) 27 | --------------------------------------------------------------------------------