├── .gitattributes
├── .gitignore
├── LICENSE.txt
├── README.md
├── aggregate.py
├── association_rules.py
├── average.py
├── basic_filter.py
├── data
│   ├── dna_seq.txt
│   ├── retail.txt
│   ├── titanic.csv
│   └── words.txt
├── figures
│   ├── 20newsgroups.png
│   ├── als.png
│   ├── lda_topics.png
│   ├── mlp.png
│   ├── random_forrest.png
│   └── spark.png
├── lda.py
├── log_reg.py
├── mapper.py
├── mlp.py
├── outliers.py
├── pi_est.py
├── random_forest.py
├── recommender.py
├── scala
│   └── pi_est.scala
├── term_doc.py
└── word_count.py
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
4 | # Custom for Visual Studio
5 | *.cs diff=csharp
6 |
7 | # Standard to msysgit
8 | *.doc diff=astextplain
9 | *.DOC diff=astextplain
10 | *.docx diff=astextplain
11 | *.DOCX diff=astextplain
12 | *.dot diff=astextplain
13 | *.DOT diff=astextplain
14 | *.pdf diff=astextplain
15 | *.PDF diff=astextplain
16 | *.rtf diff=astextplain
17 | *.RTF diff=astextplain
18 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Windows image file caches
2 | Thumbs.db
3 | ehthumbs.db
4 |
5 | # Folder config file
6 | Desktop.ini
7 |
8 | # Recycle Bin used on file shares
9 | $RECYCLE.BIN/
10 |
11 | # Windows Installer files
12 | *.cab
13 | *.msi
14 | *.msm
15 | *.msp
16 |
17 | # Windows shortcuts
18 | *.lnk
19 |
20 | # =========================
21 | # Operating System Files
22 | # =========================
23 |
24 | # OSX
25 | # =========================
26 |
27 | .DS_Store
28 | .AppleDouble
29 | .LSOverride
30 |
31 | # Thumbnails
32 | ._*
33 |
34 | # Files that might appear in the root of a volume
35 | .DocumentRevisions-V100
36 | .fseventsd
37 | .Spotlight-V100
38 | .TemporaryItems
39 | .Trashes
40 | .VolumeIcon.icns
41 |
42 | # Directories potentially created on remote AFP share
43 | .AppleDB
44 | .AppleDesktop
45 | Network Trash Folder
46 | Temporary Items
47 | .apdisk
48 |
--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
1 | Copyright (c) 2017, Vadim Smolyakov
2 |
3 | Permission is hereby granted, free of charge, to any person obtaining a copy
4 | of this software and associated documentation files (the "Software"), to deal
5 | in the Software without restriction, including without limitation the rights
6 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
7 | copies of the Software, and to permit persons to whom the Software is
8 | furnished to do so, subject to the following conditions:
9 |
10 | The above copyright notice and this permission notice shall be included in all
11 | copies or substantial portions of the Software.
12 |
13 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
16 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
17 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
18 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
19 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # spark
2 | pyspark and scala spark
3 |
4 | ### Description
5 |
6 | This repo is a collection of Spark resources.
7 |
8 |
9 |
10 |
11 |
12 | References:
13 | *https://spark.apache.org/docs/latest/programming-guide.html#transformations*
14 | *https://spark.apache.org/docs/latest/ml-guide.html*
15 |
16 | **ALS Recommender System**
17 |
18 | An Alternating Least Squares (ALS) matrix factorization recommender system was used to predict top movies for a new user given the ratings in the MovieLens dataset.
19 |
20 |
21 |
22 |
23 |
24 | RMSE was computed for different ALS ranks in order to select the best model, which was then used to predict movie ratings and recommend the highest-rated movies to a new user.
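The rank-selection step described above can be sketched in plain Python 3, without Spark. This is a minimal illustration, not the repository's code: the helper names `rmse` and `best_rank` are hypothetical, and the ratings and per-rank predictions are made-up numbers that stand in for the output of ALS models trained at each rank.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / float(len(squared)))

def best_rank(predictions_by_rank, actual):
    """Return (rank, rmse) for the rank with the lowest validation RMSE."""
    scores = {rank: rmse(pred, actual) for rank, pred in predictions_by_rank.items()}
    rank = min(scores, key=scores.get)
    return rank, scores[rank]

# Hypothetical held-out ratings and per-rank predictions (made-up numbers).
actual = [4.0, 3.0, 5.0, 2.0]
predictions_by_rank = {
    4: [3.0, 3.0, 4.0, 3.0],
    8: [4.0, 3.0, 4.0, 2.0],
    12: [5.0, 2.0, 4.0, 3.0],
}
rank, score = best_rank(predictions_by_rank, actual)
```

With these illustrative numbers, rank 8 has the lowest error. In a real run the predictions for each rank would come from a model fitted on the training split and evaluated on a held-out split.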
25 |
26 | References:
27 | *https://spark.apache.org/docs/latest/mllib-collaborative-filtering.html*
28 |
29 | **Distributed Logistic Regression**
30 |
31 | Logistic regression trained with L-BFGS was used to classify documents in the 20newsgroups dataset into one of 20 topics. Each document in the corpus was converted to a tf-idf vector labelled with its topic for training.
32 |
33 |
34 |
35 |
36 |
37 | Test accuracy was computed by predicting topic labels from the tf-idf vectors of the test documents. The figure above shows a t-SNE visualization of the 20newsgroups corpus.
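To make the tf-idf step concrete, here is a minimal plain Python 3 sketch of turning tokenized documents into tf-idf weights. It assumes a smoothed IDF of the form log((N + 1) / (df + 1)); that choice is an assumption made for this sketch, so consult the Spark documentation for the exact formula the library applies. The function name `tf_idf` and the two toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Map each tokenized document to a {term: tf-idf weight} dict."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch): a term that appears
    # in every document gets weight 0.
    idf = {t: math.log((n_docs + 1.0) / (df + 1.0)) for t, df in doc_freq.items()}
    # Term frequency is the raw count of the term in the document.
    return [{t: count * idf[t] for t, count in Counter(doc).items()} for doc in docs]

# Two toy documents (hypothetical data).
docs = [
    ["spark", "rdd", "spark"],
    ["spark", "dataframe"],
]
vectors = tf_idf(docs)
```

Here "spark" appears in both documents and so carries no weight, while "rdd" and "dataframe" each distinguish one document from the other.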
38 |
39 | References:
40 | *https://spark.apache.org/docs/latest/mllib-linear-methods.html#logistic-regression*
41 |
42 | **Distributed Random Forest**
43 |
44 | A random forest classifier was used to predict survival on the Titanic using features such as age, class, ticket fare and others. The dataset was converted to a Spark DataFrame and the features were combined into a single vector column with VectorAssembler.
45 |
46 |
47 |
48 |
49 |
50 | A random forest with 100 trees and a max depth of 6 was used to make binary predictions using the Spark ML library.
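The feature-assembly step can be illustrated without Spark. The plain Python 3 sketch below reads two rows copied from `data/titanic.csv` and builds a (label, feature vector) pair for each. The column selection in `NUMERIC_COLS`, the 0/1 encoding of `Sex`, the `assemble` helper, and the fill value for a missing `Age` are all assumptions for illustration; the repository's script may choose differently.

```python
import csv
import io

# Two rows copied from data/titanic.csv; the second has a missing Age.
SAMPLE = '''PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
'''

# Hypothetical feature selection; the repository's exact columns may differ.
NUMERIC_COLS = ["Pclass", "Age", "SibSp", "Parch", "Fare"]

def assemble(row, default_age):
    """Return (label, features) for one CSV row, filling a missing Age."""
    values = dict(row)
    if values["Age"] == "":
        values["Age"] = default_age
    features = [float(values[c]) for c in NUMERIC_COLS]
    # Encode Sex as a 0/1 indicator so every feature is numeric.
    features.append(1.0 if values["Sex"] == "female" else 0.0)
    return int(values["Survived"]), features

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
dataset = [assemble(r, default_age=30.0) for r in rows]
```

The result is one flat numeric vector per passenger, which is the shape a classifier expects. The missing-value handling matters because many rows in the dataset have an empty `Age` field.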
51 |
52 | References:
53 | *https://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier*
54 |
55 | **Multi Layer Perceptron**
56 |
57 | A Multi Layer Perceptron (MLP) was used to predict a binary label based on the Kaggle Titanic dataset. Spark DataFrames were used to read in and prepare the data for classification.
58 |
59 |
60 |
61 |
62 |
63 | The MLP was configured with two hidden layers, 7 input and 2 output neurons. It achieved an accuracy of over 80% on the validation set with 100 training iterations.
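The layer layout can be made concrete with a small forward pass in plain Python 3. The 7 inputs and 2 outputs come from the description above; the hidden sizes of 8 and 8 are hypothetical, since the text does not state them. Sigmoid hidden activations and a softmax output are assumed for this sketch, and the weights are zero-initialized only to show the shapes involved.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Affine layer; weights[j][i] connects input i to output j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(x, params):
    """Apply sigmoid hidden layers, then a softmax output layer."""
    for weights, biases in params[:-1]:
        x = [sigmoid(v) for v in dense(x, weights, biases)]
    weights, biases = params[-1]
    return softmax(dense(x, weights, biases))

def param_count(layers):
    """Number of weights plus biases for a fully connected layer spec."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))

# 7 inputs and 2 outputs as described; hidden sizes 8 and 8 are hypothetical.
layers = [7, 8, 8, 2]
# Zero-initialized parameters, used only to demonstrate the shapes.
params = [
    ([[0.0] * n_in for _ in range(n_out)], [0.0] * n_out)
    for n_in, n_out in zip(layers, layers[1:])
]
probs = forward([0.0] * 7, params)
n_params = param_count(layers)
```

With all-zero parameters the network cannot tell the classes apart, so the two output probabilities are equal. Training replaces these zeros with learned values.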
64 |
65 | References:
66 | *https://spark.apache.org/docs/latest/ml-classification-regression.html#multilayer-perceptron-classifier*
67 |
68 |
69 | **Distributed LDA**
70 |
71 | A distributed Latent Dirichlet Allocation (LDA) topic model was fit on the 20 newsgroups dataset. The training data was preprocessed using a tokenizer, stop-word remover and a tf-idf transformer.
72 |
73 |
74 |
75 |
76 |
77 | The number of topics was set to K = 20. The figure above shows a word-cloud of topics learned from the 20 newsgroups dataset.
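The preprocessing steps mentioned above (tokenizing and removing stop words before weighting terms) can be sketched in plain Python 3. The stop-word list, the regular expression, and the function names below are hypothetical choices for illustration and are not taken from the repository.

```python
import re
from collections import Counter

# Tiny hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that carry little topical information."""
    return [t for t in tokens if t not in stop_words]

def term_counts(text):
    """Bag-of-words counts for one document after preprocessing."""
    return Counter(remove_stop_words(tokenize(text)))

counts = term_counts("The launch of the shuttle is delayed; the shuttle crew waits.")
```

The resulting counts are the per-document term frequencies that a tf-idf transformer would then reweight before the topic model is fitted.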
78 |
79 | References:
80 | *https://spark.apache.org/docs/latest/ml-clustering.html#latent-dirichlet-allocation-lda*
81 |
82 |
83 | **Misc**
84 |
85 | [RDD aggregation](https://github.com/vsmolyakov/pyspark/blob/master/aggregate.py), [RDD filter](https://github.com/vsmolyakov/pyspark/blob/master/basic_filter.py), [RDD mapper](https://github.com/vsmolyakov/pyspark/blob/master/mapper.py),
86 | [word count](https://github.com/vsmolyakov/pyspark/blob/master/word_count.py), [term document matrix](https://github.com/vsmolyakov/pyspark/blob/master/term_doc.py), [average](https://github.com/vsmolyakov/pyspark/blob/master/average.py), [outliers](https://github.com/vsmolyakov/pyspark/blob/master/outliers.py), [pi_est](https://github.com/vsmolyakov/pyspark/blob/master/pi_est.py)
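The RDD aggregation example relies on two functions: one that folds each value into an accumulator within a partition, and one that merges the accumulators from different partitions. The plain Python 3 sketch below imitates that behaviour without Spark. The `aggregate` helper and the split into three partitions are hypothetical; the zero value and both lambdas mirror the ones in `aggregate.py`.

```python
from functools import reduce

def aggregate(partitions, zero, seq_op, comb_op):
    """Fold each partition with seq_op, then merge the partials with comb_op."""
    partials = [reduce(seq_op, part, zero) for part in partitions]
    return reduce(comb_op, partials, zero)

# The numbers 1..10 split into three hypothetical partitions.
partitions = [[1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]

sum_cnt = aggregate(
    partitions,
    (0, 0),                                           # (running sum, running count)
    lambda acc, value: (acc[0] + value, acc[1] + 1),  # combine a value with an accumulator
    lambda a, b: (a[0] + b[0], a[1] + b[1]),          # combine two accumulators
)
mean = sum_cnt[0] / float(sum_cnt[1])
```

Because both functions are associative and the zero value is neutral, the result does not depend on how the data is partitioned.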
87 |
88 | ### Dependencies
89 |
90 | PySpark 2.1.1
91 | Python 2.7
92 |
93 |
--------------------------------------------------------------------------------
/aggregate.py:
--------------------------------------------------------------------------------
1 | '''
2 | >>> sum_cnt
3 | (55, 10)
4 | '''
5 |
6 | from pyspark import SparkContext
7 |
8 | if __name__ == "__main__":
9 |
10 |     sc = SparkContext('local', 'aggregate')
11 |     nums = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
12 |
13 |     sum_cnt = nums.aggregate(
14 |         (0, 0),  # initial value
15 |         (lambda acc, value: (acc[0] + value, acc[1] + 1)),  # combine value with acc
16 |         (lambda acc1, acc2: (acc1[0] + acc2[0], acc1[1] + acc2[1]))  # combine accumulators
17 |     )
18 |
19 |     print "mean: ", round(sum_cnt[0]/float(sum_cnt[1]),4)
20 |
21 |
22 |
--------------------------------------------------------------------------------
/association_rules.py:
--------------------------------------------------------------------------------
1 | from pyspark import SparkContext
2 | from pyspark.sql import SQLContext
3 | from pyspark.sql import SparkSession
4 | from pyspark.sql import Row
5 |
6 | import numpy as np
7 | import seaborn as sns
8 | import matplotlib.pyplot as plt
9 |
10 | from pyspark.ml.fpm import FPGrowth
11 |
12 | if __name__ == "__main__":
13 |
14 |     sc = SparkContext('local', 'arules')
15 |     sqlContext = SQLContext(sc)
16 |
17 |     spark = SparkSession\
18 |         .builder\
19 |         .appName("arules")\
20 |         .getOrCreate()
21 |
22 |     #dataset = sc.textFile("./data/retail.txt")
23 |     df = spark.createDataFrame([
24 |         (0, [1, 2, 5]),
25 |         (1, [1, 2, 3, 5]),
26 |         (2, [1, 2])
27 |     ], ["id", "items"])
28 |
29 |     fpGrowth = FPGrowth(itemsCol="items", minSupport=0.5, minConfidence=0.6)
30 |     model = fpGrowth.fit(df)
31 |
32 |     #display frequent itemsets
33 |     model.freqItemsets.show()
34 |
35 |     #display generated association rules
36 |     model.associationRules.show()
37 |
38 |     #apply transform
39 |     model.transform(df).show()
40 |
41 |
--------------------------------------------------------------------------------
/average.py:
--------------------------------------------------------------------------------
1 | '''
2 | >>> sum_count
3 | (55, 10)
4 | >>> average
5 | 5.5
6 | '''
7 |
8 | from pyspark import SparkContext
9 |
10 | if __name__ == "__main__":
11 |
12 |     sc = SparkContext('local', 'average')
13 |     nums = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
14 |     sum_count = nums.map(lambda x: (x, 1)).fold((0,0), (lambda x, y: (x[0]+y[0], x[1]+y[1])))
15 |     average = sum_count[0] / float(sum_count[1])
16 |     print average
17 |
--------------------------------------------------------------------------------
/basic_filter.py:
--------------------------------------------------------------------------------
1 | '''
2 | >>> print res
3 | [4, 16, 36, 64]
4 | '''
5 |
6 | from pyspark import SparkContext
7 |
8 | def even_squares(num):
9 |     return num.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
10 |
11 |
12 | if __name__ == "__main__":
13 |
14 |     sc = SparkContext('local', 'basic_filter')
15 |     nums = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8])
16 |     res = sorted(even_squares(nums).collect())
17 |     print res
18 |
--------------------------------------------------------------------------------
/data/dna_seq.txt:
--------------------------------------------------------------------------------
1 | ATATCCCCGGGAT
2 | ATCGATCGATATG
--------------------------------------------------------------------------------
/data/titanic.csv:
--------------------------------------------------------------------------------
1 | PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
2 | 1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
3 | 2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
4 | 3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
5 | 4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
6 | 5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
7 | 6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
8 | 7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
9 | 8,0,3,"Palsson, Master. Gosta Leonard",male,2,3,1,349909,21.075,,S
10 | 9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27,0,2,347742,11.1333,,S
11 | 10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14,1,0,237736,30.0708,,C
12 | 11,1,3,"Sandstrom, Miss. Marguerite Rut",female,4,1,1,PP 9549,16.7,G6,S
13 | 12,1,1,"Bonnell, Miss. Elizabeth",female,58,0,0,113783,26.55,C103,S
14 | 13,0,3,"Saundercock, Mr. William Henry",male,20,0,0,A/5. 2151,8.05,,S
15 | 14,0,3,"Andersson, Mr. Anders Johan",male,39,1,5,347082,31.275,,S
16 | 15,0,3,"Vestrom, Miss. Hulda Amanda Adolfina",female,14,0,0,350406,7.8542,,S
17 | 16,1,2,"Hewlett, Mrs. (Mary D Kingcome) ",female,55,0,0,248706,16,,S
18 | 17,0,3,"Rice, Master. Eugene",male,2,4,1,382652,29.125,,Q
19 | 18,1,2,"Williams, Mr. Charles Eugene",male,,0,0,244373,13,,S
20 | 19,0,3,"Vander Planke, Mrs. Julius (Emelia Maria Vandemoortele)",female,31,1,0,345763,18,,S
21 | 20,1,3,"Masselmani, Mrs. Fatima",female,,0,0,2649,7.225,,C
22 | 21,0,2,"Fynney, Mr. Joseph J",male,35,0,0,239865,26,,S
23 | 22,1,2,"Beesley, Mr. Lawrence",male,34,0,0,248698,13,D56,S
24 | 23,1,3,"McGowan, Miss. Anna ""Annie""",female,15,0,0,330923,8.0292,,Q
25 | 24,1,1,"Sloper, Mr. William Thompson",male,28,0,0,113788,35.5,A6,S
26 | 25,0,3,"Palsson, Miss. Torborg Danira",female,8,3,1,349909,21.075,,S
27 | 26,1,3,"Asplund, Mrs. Carl Oscar (Selma Augusta Emilia Johansson)",female,38,1,5,347077,31.3875,,S
28 | 27,0,3,"Emir, Mr. Farred Chehab",male,,0,0,2631,7.225,,C
29 | 28,0,1,"Fortune, Mr. Charles Alexander",male,19,3,2,19950,263,C23 C25 C27,S
30 | 29,1,3,"O'Dwyer, Miss. Ellen ""Nellie""",female,,0,0,330959,7.8792,,Q
31 | 30,0,3,"Todoroff, Mr. Lalio",male,,0,0,349216,7.8958,,S
32 | 31,0,1,"Uruchurtu, Don. Manuel E",male,40,0,0,PC 17601,27.7208,,C
33 | 32,1,1,"Spencer, Mrs. William Augustus (Marie Eugenie)",female,,1,0,PC 17569,146.5208,B78,C
34 | 33,1,3,"Glynn, Miss. Mary Agatha",female,,0,0,335677,7.75,,Q
35 | 34,0,2,"Wheadon, Mr. Edward H",male,66,0,0,C.A. 24579,10.5,,S
36 | 35,0,1,"Meyer, Mr. Edgar Joseph",male,28,1,0,PC 17604,82.1708,,C
37 | 36,0,1,"Holverson, Mr. Alexander Oskar",male,42,1,0,113789,52,,S
38 | 37,1,3,"Mamee, Mr. Hanna",male,,0,0,2677,7.2292,,C
39 | 38,0,3,"Cann, Mr. Ernest Charles",male,21,0,0,A./5. 2152,8.05,,S
40 | 39,0,3,"Vander Planke, Miss. Augusta Maria",female,18,2,0,345764,18,,S
41 | 40,1,3,"Nicola-Yarred, Miss. Jamila",female,14,1,0,2651,11.2417,,C
42 | 41,0,3,"Ahlin, Mrs. Johan (Johanna Persdotter Larsson)",female,40,1,0,7546,9.475,,S
43 | 42,0,2,"Turpin, Mrs. William John Robert (Dorothy Ann Wonnacott)",female,27,1,0,11668,21,,S
44 | 43,0,3,"Kraeff, Mr. Theodor",male,,0,0,349253,7.8958,,C
45 | 44,1,2,"Laroche, Miss. Simonne Marie Anne Andree",female,3,1,2,SC/Paris 2123,41.5792,,C
46 | 45,1,3,"Devaney, Miss. Margaret Delia",female,19,0,0,330958,7.8792,,Q
47 | 46,0,3,"Rogers, Mr. William John",male,,0,0,S.C./A.4. 23567,8.05,,S
48 | 47,0,3,"Lennon, Mr. Denis",male,,1,0,370371,15.5,,Q
49 | 48,1,3,"O'Driscoll, Miss. Bridget",female,,0,0,14311,7.75,,Q
50 | 49,0,3,"Samaan, Mr. Youssef",male,,2,0,2662,21.6792,,C
51 | 50,0,3,"Arnold-Franchi, Mrs. Josef (Josefine Franchi)",female,18,1,0,349237,17.8,,S
52 | 51,0,3,"Panula, Master. Juha Niilo",male,7,4,1,3101295,39.6875,,S
53 | 52,0,3,"Nosworthy, Mr. Richard Cater",male,21,0,0,A/4. 39886,7.8,,S
54 | 53,1,1,"Harper, Mrs. Henry Sleeper (Myna Haxtun)",female,49,1,0,PC 17572,76.7292,D33,C
55 | 54,1,2,"Faunthorpe, Mrs. Lizzie (Elizabeth Anne Wilkinson)",female,29,1,0,2926,26,,S
56 | 55,0,1,"Ostby, Mr. Engelhart Cornelius",male,65,0,1,113509,61.9792,B30,C
57 | 56,1,1,"Woolner, Mr. Hugh",male,,0,0,19947,35.5,C52,S
58 | 57,1,2,"Rugg, Miss. Emily",female,21,0,0,C.A. 31026,10.5,,S
59 | 58,0,3,"Novel, Mr. Mansouer",male,28.5,0,0,2697,7.2292,,C
60 | 59,1,2,"West, Miss. Constance Mirium",female,5,1,2,C.A. 34651,27.75,,S
61 | 60,0,3,"Goodwin, Master. William Frederick",male,11,5,2,CA 2144,46.9,,S
62 | 61,0,3,"Sirayanian, Mr. Orsen",male,22,0,0,2669,7.2292,,C
63 | 62,1,1,"Icard, Miss. Amelie",female,38,0,0,113572,80,B28,
64 | 63,0,1,"Harris, Mr. Henry Birkhardt",male,45,1,0,36973,83.475,C83,S
65 | 64,0,3,"Skoog, Master. Harald",male,4,3,2,347088,27.9,,S
66 | 65,0,1,"Stewart, Mr. Albert A",male,,0,0,PC 17605,27.7208,,C
67 | 66,1,3,"Moubarek, Master. Gerios",male,,1,1,2661,15.2458,,C
68 | 67,1,2,"Nye, Mrs. (Elizabeth Ramell)",female,29,0,0,C.A. 29395,10.5,F33,S
69 | 68,0,3,"Crease, Mr. Ernest James",male,19,0,0,S.P. 3464,8.1583,,S
70 | 69,1,3,"Andersson, Miss. Erna Alexandra",female,17,4,2,3101281,7.925,,S
71 | 70,0,3,"Kink, Mr. Vincenz",male,26,2,0,315151,8.6625,,S
72 | 71,0,2,"Jenkin, Mr. Stephen Curnow",male,32,0,0,C.A. 33111,10.5,,S
73 | 72,0,3,"Goodwin, Miss. Lillian Amy",female,16,5,2,CA 2144,46.9,,S
74 | 73,0,2,"Hood, Mr. Ambrose Jr",male,21,0,0,S.O.C. 14879,73.5,,S
75 | 74,0,3,"Chronopoulos, Mr. Apostolos",male,26,1,0,2680,14.4542,,C
76 | 75,1,3,"Bing, Mr. Lee",male,32,0,0,1601,56.4958,,S
77 | 76,0,3,"Moen, Mr. Sigurd Hansen",male,25,0,0,348123,7.65,F G73,S
78 | 77,0,3,"Staneff, Mr. Ivan",male,,0,0,349208,7.8958,,S
79 | 78,0,3,"Moutal, Mr. Rahamin Haim",male,,0,0,374746,8.05,,S
80 | 79,1,2,"Caldwell, Master. Alden Gates",male,0.83,0,2,248738,29,,S
81 | 80,1,3,"Dowdell, Miss. Elizabeth",female,30,0,0,364516,12.475,,S
82 | 81,0,3,"Waelens, Mr. Achille",male,22,0,0,345767,9,,S
83 | 82,1,3,"Sheerlinck, Mr. Jan Baptist",male,29,0,0,345779,9.5,,S
84 | 83,1,3,"McDermott, Miss. Brigdet Delia",female,,0,0,330932,7.7875,,Q
85 | 84,0,1,"Carrau, Mr. Francisco M",male,28,0,0,113059,47.1,,S
86 | 85,1,2,"Ilett, Miss. Bertha",female,17,0,0,SO/C 14885,10.5,,S
87 | 86,1,3,"Backstrom, Mrs. Karl Alfred (Maria Mathilda Gustafsson)",female,33,3,0,3101278,15.85,,S
88 | 87,0,3,"Ford, Mr. William Neal",male,16,1,3,W./C. 6608,34.375,,S
89 | 88,0,3,"Slocovski, Mr. Selman Francis",male,,0,0,SOTON/OQ 392086,8.05,,S
90 | 89,1,1,"Fortune, Miss. Mabel Helen",female,23,3,2,19950,263,C23 C25 C27,S
91 | 90,0,3,"Celotti, Mr. Francesco",male,24,0,0,343275,8.05,,S
92 | 91,0,3,"Christmann, Mr. Emil",male,29,0,0,343276,8.05,,S
93 | 92,0,3,"Andreasson, Mr. Paul Edvin",male,20,0,0,347466,7.8542,,S
94 | 93,0,1,"Chaffee, Mr. Herbert Fuller",male,46,1,0,W.E.P. 5734,61.175,E31,S
95 | 94,0,3,"Dean, Mr. Bertram Frank",male,26,1,2,C.A. 2315,20.575,,S
96 | 95,0,3,"Coxon, Mr. Daniel",male,59,0,0,364500,7.25,,S
97 | 96,0,3,"Shorney, Mr. Charles Joseph",male,,0,0,374910,8.05,,S
98 | 97,0,1,"Goldschmidt, Mr. George B",male,71,0,0,PC 17754,34.6542,A5,C
99 | 98,1,1,"Greenfield, Mr. William Bertram",male,23,0,1,PC 17759,63.3583,D10 D12,C
100 | 99,1,2,"Doling, Mrs. John T (Ada Julia Bone)",female,34,0,1,231919,23,,S
101 | 100,0,2,"Kantor, Mr. Sinai",male,34,1,0,244367,26,,S
102 | 101,0,3,"Petranec, Miss. Matilda",female,28,0,0,349245,7.8958,,S
103 | 102,0,3,"Petroff, Mr. Pastcho (""Pentcho"")",male,,0,0,349215,7.8958,,S
104 | 103,0,1,"White, Mr. Richard Frasar",male,21,0,1,35281,77.2875,D26,S
105 | 104,0,3,"Johansson, Mr. Gustaf Joel",male,33,0,0,7540,8.6542,,S
106 | 105,0,3,"Gustafsson, Mr. Anders Vilhelm",male,37,2,0,3101276,7.925,,S
107 | 106,0,3,"Mionoff, Mr. Stoytcho",male,28,0,0,349207,7.8958,,S
108 | 107,1,3,"Salkjelsvik, Miss. Anna Kristine",female,21,0,0,343120,7.65,,S
109 | 108,1,3,"Moss, Mr. Albert Johan",male,,0,0,312991,7.775,,S
110 | 109,0,3,"Rekic, Mr. Tido",male,38,0,0,349249,7.8958,,S
111 | 110,1,3,"Moran, Miss. Bertha",female,,1,0,371110,24.15,,Q
112 | 111,0,1,"Porter, Mr. Walter Chamberlain",male,47,0,0,110465,52,C110,S
113 | 112,0,3,"Zabour, Miss. Hileni",female,14.5,1,0,2665,14.4542,,C
114 | 113,0,3,"Barton, Mr. David John",male,22,0,0,324669,8.05,,S
115 | 114,0,3,"Jussila, Miss. Katriina",female,20,1,0,4136,9.825,,S
116 | 115,0,3,"Attalah, Miss. Malake",female,17,0,0,2627,14.4583,,C
117 | 116,0,3,"Pekoniemi, Mr. Edvard",male,21,0,0,STON/O 2. 3101294,7.925,,S
118 | 117,0,3,"Connors, Mr. Patrick",male,70.5,0,0,370369,7.75,,Q
119 | 118,0,2,"Turpin, Mr. William John Robert",male,29,1,0,11668,21,,S
120 | 119,0,1,"Baxter, Mr. Quigg Edmond",male,24,0,1,PC 17558,247.5208,B58 B60,C
121 | 120,0,3,"Andersson, Miss. Ellis Anna Maria",female,2,4,2,347082,31.275,,S
122 | 121,0,2,"Hickman, Mr. Stanley George",male,21,2,0,S.O.C. 14879,73.5,,S
123 | 122,0,3,"Moore, Mr. Leonard Charles",male,,0,0,A4. 54510,8.05,,S
124 | 123,0,2,"Nasser, Mr. Nicholas",male,32.5,1,0,237736,30.0708,,C
125 | 124,1,2,"Webber, Miss. Susan",female,32.5,0,0,27267,13,E101,S
126 | 125,0,1,"White, Mr. Percival Wayland",male,54,0,1,35281,77.2875,D26,S
127 | 126,1,3,"Nicola-Yarred, Master. Elias",male,12,1,0,2651,11.2417,,C
128 | 127,0,3,"McMahon, Mr. Martin",male,,0,0,370372,7.75,,Q
129 | 128,1,3,"Madsen, Mr. Fridtjof Arne",male,24,0,0,C 17369,7.1417,,S
130 | 129,1,3,"Peter, Miss. Anna",female,,1,1,2668,22.3583,F E69,C
131 | 130,0,3,"Ekstrom, Mr. Johan",male,45,0,0,347061,6.975,,S
132 | 131,0,3,"Drazenoic, Mr. Jozef",male,33,0,0,349241,7.8958,,C
133 | 132,0,3,"Coelho, Mr. Domingos Fernandeo",male,20,0,0,SOTON/O.Q. 3101307,7.05,,S
134 | 133,0,3,"Robins, Mrs. Alexander A (Grace Charity Laury)",female,47,1,0,A/5. 3337,14.5,,S
135 | 134,1,2,"Weisz, Mrs. Leopold (Mathilde Francoise Pede)",female,29,1,0,228414,26,,S
136 | 135,0,2,"Sobey, Mr. Samuel James Hayden",male,25,0,0,C.A. 29178,13,,S
137 | 136,0,2,"Richard, Mr. Emile",male,23,0,0,SC/PARIS 2133,15.0458,,C
138 | 137,1,1,"Newsom, Miss. Helen Monypeny",female,19,0,2,11752,26.2833,D47,S
139 | 138,0,1,"Futrelle, Mr. Jacques Heath",male,37,1,0,113803,53.1,C123,S
140 | 139,0,3,"Osen, Mr. Olaf Elon",male,16,0,0,7534,9.2167,,S
141 | 140,0,1,"Giglio, Mr. Victor",male,24,0,0,PC 17593,79.2,B86,C
142 | 141,0,3,"Boulos, Mrs. Joseph (Sultana)",female,,0,2,2678,15.2458,,C
143 | 142,1,3,"Nysten, Miss. Anna Sofia",female,22,0,0,347081,7.75,,S
144 | 143,1,3,"Hakkarainen, Mrs. Pekka Pietari (Elin Matilda Dolck)",female,24,1,0,STON/O2. 3101279,15.85,,S
145 | 144,0,3,"Burke, Mr. Jeremiah",male,19,0,0,365222,6.75,,Q
146 | 145,0,2,"Andrew, Mr. Edgardo Samuel",male,18,0,0,231945,11.5,,S
147 | 146,0,2,"Nicholls, Mr. Joseph Charles",male,19,1,1,C.A. 33112,36.75,,S
148 | 147,1,3,"Andersson, Mr. August Edvard (""Wennerstrom"")",male,27,0,0,350043,7.7958,,S
149 | 148,0,3,"Ford, Miss. Robina Maggie ""Ruby""",female,9,2,2,W./C. 6608,34.375,,S
150 | 149,0,2,"Navratil, Mr. Michel (""Louis M Hoffman"")",male,36.5,0,2,230080,26,F2,S
151 | 150,0,2,"Byles, Rev. Thomas Roussel Davids",male,42,0,0,244310,13,,S
152 | 151,0,2,"Bateman, Rev. Robert James",male,51,0,0,S.O.P. 1166,12.525,,S
153 | 152,1,1,"Pears, Mrs. Thomas (Edith Wearne)",female,22,1,0,113776,66.6,C2,S
154 | 153,0,3,"Meo, Mr. Alfonzo",male,55.5,0,0,A.5. 11206,8.05,,S
155 | 154,0,3,"van Billiard, Mr. Austin Blyler",male,40.5,0,2,A/5. 851,14.5,,S
156 | 155,0,3,"Olsen, Mr. Ole Martin",male,,0,0,Fa 265302,7.3125,,S
157 | 156,0,1,"Williams, Mr. Charles Duane",male,51,0,1,PC 17597,61.3792,,C
158 | 157,1,3,"Gilnagh, Miss. Katherine ""Katie""",female,16,0,0,35851,7.7333,,Q
159 | 158,0,3,"Corn, Mr. Harry",male,30,0,0,SOTON/OQ 392090,8.05,,S
160 | 159,0,3,"Smiljanic, Mr. Mile",male,,0,0,315037,8.6625,,S
161 | 160,0,3,"Sage, Master. Thomas Henry",male,,8,2,CA. 2343,69.55,,S
162 | 161,0,3,"Cribb, Mr. John Hatfield",male,44,0,1,371362,16.1,,S
163 | 162,1,2,"Watt, Mrs. James (Elizabeth ""Bessie"" Inglis Milne)",female,40,0,0,C.A. 33595,15.75,,S
164 | 163,0,3,"Bengtsson, Mr. John Viktor",male,26,0,0,347068,7.775,,S
165 | 164,0,3,"Calic, Mr. Jovo",male,17,0,0,315093,8.6625,,S
166 | 165,0,3,"Panula, Master. Eino Viljami",male,1,4,1,3101295,39.6875,,S
167 | 166,1,3,"Goldsmith, Master. Frank John William ""Frankie""",male,9,0,2,363291,20.525,,S
168 | 167,1,1,"Chibnall, Mrs. (Edith Martha Bowerman)",female,,0,1,113505,55,E33,S
169 | 168,0,3,"Skoog, Mrs. William (Anna Bernhardina Karlsson)",female,45,1,4,347088,27.9,,S
170 | 169,0,1,"Baumann, Mr. John D",male,,0,0,PC 17318,25.925,,S
171 | 170,0,3,"Ling, Mr. Lee",male,28,0,0,1601,56.4958,,S
172 | 171,0,1,"Van der hoef, Mr. Wyckoff",male,61,0,0,111240,33.5,B19,S
173 | 172,0,3,"Rice, Master. Arthur",male,4,4,1,382652,29.125,,Q
174 | 173,1,3,"Johnson, Miss. Eleanor Ileen",female,1,1,1,347742,11.1333,,S
175 | 174,0,3,"Sivola, Mr. Antti Wilhelm",male,21,0,0,STON/O 2. 3101280,7.925,,S
176 | 175,0,1,"Smith, Mr. James Clinch",male,56,0,0,17764,30.6958,A7,C
177 | 176,0,3,"Klasen, Mr. Klas Albin",male,18,1,1,350404,7.8542,,S
178 | 177,0,3,"Lefebre, Master. Henry Forbes",male,,3,1,4133,25.4667,,S
179 | 178,0,1,"Isham, Miss. Ann Elizabeth",female,50,0,0,PC 17595,28.7125,C49,C
180 | 179,0,2,"Hale, Mr. Reginald",male,30,0,0,250653,13,,S
181 | 180,0,3,"Leonard, Mr. Lionel",male,36,0,0,LINE,0,,S
182 | 181,0,3,"Sage, Miss. Constance Gladys",female,,8,2,CA. 2343,69.55,,S
183 | 182,0,2,"Pernot, Mr. Rene",male,,0,0,SC/PARIS 2131,15.05,,C
184 | 183,0,3,"Asplund, Master. Clarence Gustaf Hugo",male,9,4,2,347077,31.3875,,S
185 | 184,1,2,"Becker, Master. Richard F",male,1,2,1,230136,39,F4,S
186 | 185,1,3,"Kink-Heilmann, Miss. Luise Gretchen",female,4,0,2,315153,22.025,,S
187 | 186,0,1,"Rood, Mr. Hugh Roscoe",male,,0,0,113767,50,A32,S
188 | 187,1,3,"O'Brien, Mrs. Thomas (Johanna ""Hannah"" Godfrey)",female,,1,0,370365,15.5,,Q
189 | 188,1,1,"Romaine, Mr. Charles Hallace (""Mr C Rolmane"")",male,45,0,0,111428,26.55,,S
190 | 189,0,3,"Bourke, Mr. John",male,40,1,1,364849,15.5,,Q
191 | 190,0,3,"Turcin, Mr. Stjepan",male,36,0,0,349247,7.8958,,S
192 | 191,1,2,"Pinsky, Mrs. (Rosa)",female,32,0,0,234604,13,,S
193 | 192,0,2,"Carbines, Mr. William",male,19,0,0,28424,13,,S
194 | 193,1,3,"Andersen-Jensen, Miss. Carla Christine Nielsine",female,19,1,0,350046,7.8542,,S
195 | 194,1,2,"Navratil, Master. Michel M",male,3,1,1,230080,26,F2,S
196 | 195,1,1,"Brown, Mrs. James Joseph (Margaret Tobin)",female,44,0,0,PC 17610,27.7208,B4,C
197 | 196,1,1,"Lurette, Miss. Elise",female,58,0,0,PC 17569,146.5208,B80,C
198 | 197,0,3,"Mernagh, Mr. Robert",male,,0,0,368703,7.75,,Q
199 | 198,0,3,"Olsen, Mr. Karl Siegwart Andreas",male,42,0,1,4579,8.4042,,S
200 | 199,1,3,"Madigan, Miss. Margaret ""Maggie""",female,,0,0,370370,7.75,,Q
201 | 200,0,2,"Yrois, Miss. Henriette (""Mrs Harbeck"")",female,24,0,0,248747,13,,S
202 | 201,0,3,"Vande Walle, Mr. Nestor Cyriel",male,28,0,0,345770,9.5,,S
203 | 202,0,3,"Sage, Mr. Frederick",male,,8,2,CA. 2343,69.55,,S
204 | 203,0,3,"Johanson, Mr. Jakob Alfred",male,34,0,0,3101264,6.4958,,S
205 | 204,0,3,"Youseff, Mr. Gerious",male,45.5,0,0,2628,7.225,,C
206 | 205,1,3,"Cohen, Mr. Gurshon ""Gus""",male,18,0,0,A/5 3540,8.05,,S
207 | 206,0,3,"Strom, Miss. Telma Matilda",female,2,0,1,347054,10.4625,G6,S
208 | 207,0,3,"Backstrom, Mr. Karl Alfred",male,32,1,0,3101278,15.85,,S
209 | 208,1,3,"Albimona, Mr. Nassef Cassem",male,26,0,0,2699,18.7875,,C
210 | 209,1,3,"Carr, Miss. Helen ""Ellen""",female,16,0,0,367231,7.75,,Q
211 | 210,1,1,"Blank, Mr. Henry",male,40,0,0,112277,31,A31,C
212 | 211,0,3,"Ali, Mr. Ahmed",male,24,0,0,SOTON/O.Q. 3101311,7.05,,S
213 | 212,1,2,"Cameron, Miss. Clear Annie",female,35,0,0,F.C.C. 13528,21,,S
214 | 213,0,3,"Perkin, Mr. John Henry",male,22,0,0,A/5 21174,7.25,,S
215 | 214,0,2,"Givard, Mr. Hans Kristensen",male,30,0,0,250646,13,,S
216 | 215,0,3,"Kiernan, Mr. Philip",male,,1,0,367229,7.75,,Q
217 | 216,1,1,"Newell, Miss. Madeleine",female,31,1,0,35273,113.275,D36,C
218 | 217,1,3,"Honkanen, Miss. Eliina",female,27,0,0,STON/O2. 3101283,7.925,,S
219 | 218,0,2,"Jacobsohn, Mr. Sidney Samuel",male,42,1,0,243847,27,,S
220 | 219,1,1,"Bazzani, Miss. Albina",female,32,0,0,11813,76.2917,D15,C
221 | 220,0,2,"Harris, Mr. Walter",male,30,0,0,W/C 14208,10.5,,S
222 | 221,1,3,"Sunderland, Mr. Victor Francis",male,16,0,0,SOTON/OQ 392089,8.05,,S
223 | 222,0,2,"Bracken, Mr. James H",male,27,0,0,220367,13,,S
224 | 223,0,3,"Green, Mr. George Henry",male,51,0,0,21440,8.05,,S
225 | 224,0,3,"Nenkoff, Mr. Christo",male,,0,0,349234,7.8958,,S
226 | 225,1,1,"Hoyt, Mr. Frederick Maxfield",male,38,1,0,19943,90,C93,S
227 | 226,0,3,"Berglund, Mr. Karl Ivar Sven",male,22,0,0,PP 4348,9.35,,S
228 | 227,1,2,"Mellors, Mr. William John",male,19,0,0,SW/PP 751,10.5,,S
229 | 228,0,3,"Lovell, Mr. John Hall (""Henry"")",male,20.5,0,0,A/5 21173,7.25,,S
230 | 229,0,2,"Fahlstrom, Mr. Arne Jonas",male,18,0,0,236171,13,,S
231 | 230,0,3,"Lefebre, Miss. Mathilde",female,,3,1,4133,25.4667,,S
232 | 231,1,1,"Harris, Mrs. Henry Birkhardt (Irene Wallach)",female,35,1,0,36973,83.475,C83,S
233 | 232,0,3,"Larsson, Mr. Bengt Edvin",male,29,0,0,347067,7.775,,S
234 | 233,0,2,"Sjostedt, Mr. Ernst Adolf",male,59,0,0,237442,13.5,,S
235 | 234,1,3,"Asplund, Miss. Lillian Gertrud",female,5,4,2,347077,31.3875,,S
236 | 235,0,2,"Leyson, Mr. Robert William Norman",male,24,0,0,C.A. 29566,10.5,,S
237 | 236,0,3,"Harknett, Miss. Alice Phoebe",female,,0,0,W./C. 6609,7.55,,S
238 | 237,0,2,"Hold, Mr. Stephen",male,44,1,0,26707,26,,S
239 | 238,1,2,"Collyer, Miss. Marjorie ""Lottie""",female,8,0,2,C.A. 31921,26.25,,S
240 | 239,0,2,"Pengelly, Mr. Frederick William",male,19,0,0,28665,10.5,,S
241 | 240,0,2,"Hunt, Mr. George Henry",male,33,0,0,SCO/W 1585,12.275,,S
242 | 241,0,3,"Zabour, Miss. Thamine",female,,1,0,2665,14.4542,,C
243 | 242,1,3,"Murphy, Miss. Katherine ""Kate""",female,,1,0,367230,15.5,,Q
244 | 243,0,2,"Coleridge, Mr. Reginald Charles",male,29,0,0,W./C. 14263,10.5,,S
245 | 244,0,3,"Maenpaa, Mr. Matti Alexanteri",male,22,0,0,STON/O 2. 3101275,7.125,,S
246 | 245,0,3,"Attalah, Mr. Sleiman",male,30,0,0,2694,7.225,,C
247 | 246,0,1,"Minahan, Dr. William Edward",male,44,2,0,19928,90,C78,Q
248 | 247,0,3,"Lindahl, Miss. Agda Thorilda Viktoria",female,25,0,0,347071,7.775,,S
249 | 248,1,2,"Hamalainen, Mrs. William (Anna)",female,24,0,2,250649,14.5,,S
250 | 249,1,1,"Beckwith, Mr. Richard Leonard",male,37,1,1,11751,52.5542,D35,S
251 | 250,0,2,"Carter, Rev. Ernest Courtenay",male,54,1,0,244252,26,,S
252 | 251,0,3,"Reed, Mr. James George",male,,0,0,362316,7.25,,S
253 | 252,0,3,"Strom, Mrs. Wilhelm (Elna Matilda Persson)",female,29,1,1,347054,10.4625,G6,S
254 | 253,0,1,"Stead, Mr. William Thomas",male,62,0,0,113514,26.55,C87,S
255 | 254,0,3,"Lobb, Mr. William Arthur",male,30,1,0,A/5. 3336,16.1,,S
256 | 255,0,3,"Rosblom, Mrs. Viktor (Helena Wilhelmina)",female,41,0,2,370129,20.2125,,S
257 | 256,1,3,"Touma, Mrs. Darwis (Hanne Youssef Razi)",female,29,0,2,2650,15.2458,,C
258 | 257,1,1,"Thorne, Mrs. Gertrude Maybelle",female,,0,0,PC 17585,79.2,,C
259 | 258,1,1,"Cherry, Miss. Gladys",female,30,0,0,110152,86.5,B77,S
260 | 259,1,1,"Ward, Miss. Anna",female,35,0,0,PC 17755,512.3292,,C
261 | 260,1,2,"Parrish, Mrs. (Lutie Davis)",female,50,0,1,230433,26,,S
262 | 261,0,3,"Smith, Mr. Thomas",male,,0,0,384461,7.75,,Q
263 | 262,1,3,"Asplund, Master. Edvin Rojj Felix",male,3,4,2,347077,31.3875,,S
264 | 263,0,1,"Taussig, Mr. Emil",male,52,1,1,110413,79.65,E67,S
265 | 264,0,1,"Harrison, Mr. William",male,40,0,0,112059,0,B94,S
266 | 265,0,3,"Henry, Miss. Delia",female,,0,0,382649,7.75,,Q
267 | 266,0,2,"Reeves, Mr. David",male,36,0,0,C.A. 17248,10.5,,S
268 | 267,0,3,"Panula, Mr. Ernesti Arvid",male,16,4,1,3101295,39.6875,,S
269 | 268,1,3,"Persson, Mr. Ernst Ulrik",male,25,1,0,347083,7.775,,S
270 | 269,1,1,"Graham, Mrs. William Thompson (Edith Junkins)",female,58,0,1,PC 17582,153.4625,C125,S
271 | 270,1,1,"Bissette, Miss. Amelia",female,35,0,0,PC 17760,135.6333,C99,S
272 | 271,0,1,"Cairns, Mr. Alexander",male,,0,0,113798,31,,S
273 | 272,1,3,"Tornquist, Mr. William Henry",male,25,0,0,LINE,0,,S
274 | 273,1,2,"Mellinger, Mrs. (Elizabeth Anne Maidment)",female,41,0,1,250644,19.5,,S
275 | 274,0,1,"Natsch, Mr. Charles H",male,37,0,1,PC 17596,29.7,C118,C
276 | 275,1,3,"Healy, Miss. Hanora ""Nora""",female,,0,0,370375,7.75,,Q
277 | 276,1,1,"Andrews, Miss. Kornelia Theodosia",female,63,1,0,13502,77.9583,D7,S
278 | 277,0,3,"Lindblom, Miss. Augusta Charlotta",female,45,0,0,347073,7.75,,S
279 | 278,0,2,"Parkes, Mr. Francis ""Frank""",male,,0,0,239853,0,,S
280 | 279,0,3,"Rice, Master. Eric",male,7,4,1,382652,29.125,,Q
281 | 280,1,3,"Abbott, Mrs. Stanton (Rosa Hunt)",female,35,1,1,C.A. 2673,20.25,,S
282 | 281,0,3,"Duane, Mr. Frank",male,65,0,0,336439,7.75,,Q
283 | 282,0,3,"Olsson, Mr. Nils Johan Goransson",male,28,0,0,347464,7.8542,,S
284 | 283,0,3,"de Pelsmaeker, Mr. Alfons",male,16,0,0,345778,9.5,,S
285 | 284,1,3,"Dorking, Mr. Edward Arthur",male,19,0,0,A/5. 10482,8.05,,S
286 | 285,0,1,"Smith, Mr. Richard William",male,,0,0,113056,26,A19,S
287 | 286,0,3,"Stankovic, Mr. Ivan",male,33,0,0,349239,8.6625,,C
288 | 287,1,3,"de Mulder, Mr. Theodore",male,30,0,0,345774,9.5,,S
289 | 288,0,3,"Naidenoff, Mr. Penko",male,22,0,0,349206,7.8958,,S
290 | 289,1,2,"Hosono, Mr. Masabumi",male,42,0,0,237798,13,,S
291 | 290,1,3,"Connolly, Miss. Kate",female,22,0,0,370373,7.75,,Q
292 | 291,1,1,"Barber, Miss. Ellen ""Nellie""",female,26,0,0,19877,78.85,,S
293 | 292,1,1,"Bishop, Mrs. Dickinson H (Helen Walton)",female,19,1,0,11967,91.0792,B49,C
294 | 293,0,2,"Levy, Mr. Rene Jacques",male,36,0,0,SC/Paris 2163,12.875,D,C
295 | 294,0,3,"Haas, Miss. Aloisia",female,24,0,0,349236,8.85,,S
296 | 295,0,3,"Mineff, Mr. Ivan",male,24,0,0,349233,7.8958,,S
297 | 296,0,1,"Lewy, Mr. Ervin G",male,,0,0,PC 17612,27.7208,,C
298 | 297,0,3,"Hanna, Mr. Mansour",male,23.5,0,0,2693,7.2292,,C
299 | 298,0,1,"Allison, Miss. Helen Loraine",female,2,1,2,113781,151.55,C22 C26,S
300 | 299,1,1,"Saalfeld, Mr. Adolphe",male,,0,0,19988,30.5,C106,S
301 | 300,1,1,"Baxter, Mrs. James (Helene DeLaudeniere Chaput)",female,50,0,1,PC 17558,247.5208,B58 B60,C
302 | 301,1,3,"Kelly, Miss. Anna Katherine ""Annie Kate""",female,,0,0,9234,7.75,,Q
303 | 302,1,3,"McCoy, Mr. Bernard",male,,2,0,367226,23.25,,Q
304 | 303,0,3,"Johnson, Mr. William Cahoone Jr",male,19,0,0,LINE,0,,S
305 | 304,1,2,"Keane, Miss. Nora A",female,,0,0,226593,12.35,E101,Q
306 | 305,0,3,"Williams, Mr. Howard Hugh ""Harry""",male,,0,0,A/5 2466,8.05,,S
307 | 306,1,1,"Allison, Master. Hudson Trevor",male,0.92,1,2,113781,151.55,C22 C26,S
308 | 307,1,1,"Fleming, Miss. Margaret",female,,0,0,17421,110.8833,,C
309 | 308,1,1,"Penasco y Castellana, Mrs. Victor de Satode (Maria Josefa Perez de Soto y Vallejo)",female,17,1,0,PC 17758,108.9,C65,C
310 | 309,0,2,"Abelson, Mr. Samuel",male,30,1,0,P/PP 3381,24,,C
311 | 310,1,1,"Francatelli, Miss. Laura Mabel",female,30,0,0,PC 17485,56.9292,E36,C
312 | 311,1,1,"Hays, Miss. Margaret Bechstein",female,24,0,0,11767,83.1583,C54,C
313 | 312,1,1,"Ryerson, Miss. Emily Borie",female,18,2,2,PC 17608,262.375,B57 B59 B63 B66,C
314 | 313,0,2,"Lahtinen, Mrs. William (Anna Sylfven)",female,26,1,1,250651,26,,S
315 | 314,0,3,"Hendekovic, Mr. Ignjac",male,28,0,0,349243,7.8958,,S
316 | 315,0,2,"Hart, Mr. Benjamin",male,43,1,1,F.C.C. 13529,26.25,,S
317 | 316,1,3,"Nilsson, Miss. Helmina Josefina",female,26,0,0,347470,7.8542,,S
318 | 317,1,2,"Kantor, Mrs. Sinai (Miriam Sternin)",female,24,1,0,244367,26,,S
319 | 318,0,2,"Moraweck, Dr. Ernest",male,54,0,0,29011,14,,S
320 | 319,1,1,"Wick, Miss. Mary Natalie",female,31,0,2,36928,164.8667,C7,S
321 | 320,1,1,"Spedden, Mrs. Frederic Oakley (Margaretta Corning Stone)",female,40,1,1,16966,134.5,E34,C
322 | 321,0,3,"Dennis, Mr. Samuel",male,22,0,0,A/5 21172,7.25,,S
323 | 322,0,3,"Danoff, Mr. Yoto",male,27,0,0,349219,7.8958,,S
324 | 323,1,2,"Slayter, Miss. Hilda Mary",female,30,0,0,234818,12.35,,Q
325 | 324,1,2,"Caldwell, Mrs. Albert Francis (Sylvia Mae Harbaugh)",female,22,1,1,248738,29,,S
326 | 325,0,3,"Sage, Mr. George John Jr",male,,8,2,CA. 2343,69.55,,S
327 | 326,1,1,"Young, Miss. Marie Grice",female,36,0,0,PC 17760,135.6333,C32,C
328 | 327,0,3,"Nysveen, Mr. Johan Hansen",male,61,0,0,345364,6.2375,,S
329 | 328,1,2,"Ball, Mrs. (Ada E Hall)",female,36,0,0,28551,13,D,S
330 | 329,1,3,"Goldsmith, Mrs. Frank John (Emily Alice Brown)",female,31,1,1,363291,20.525,,S
331 | 330,1,1,"Hippach, Miss. Jean Gertrude",female,16,0,1,111361,57.9792,B18,C
332 | 331,1,3,"McCoy, Miss. Agnes",female,,2,0,367226,23.25,,Q
333 | 332,0,1,"Partner, Mr. Austen",male,45.5,0,0,113043,28.5,C124,S
334 | 333,0,1,"Graham, Mr. George Edward",male,38,0,1,PC 17582,153.4625,C91,S
335 | 334,0,3,"Vander Planke, Mr. Leo Edmondus",male,16,2,0,345764,18,,S
336 | 335,1,1,"Frauenthal, Mrs. Henry William (Clara Heinsheimer)",female,,1,0,PC 17611,133.65,,S
337 | 336,0,3,"Denkoff, Mr. Mitto",male,,0,0,349225,7.8958,,S
338 | 337,0,1,"Pears, Mr. Thomas Clinton",male,29,1,0,113776,66.6,C2,S
339 | 338,1,1,"Burns, Miss. Elizabeth Margaret",female,41,0,0,16966,134.5,E40,C
340 | 339,1,3,"Dahl, Mr. Karl Edwart",male,45,0,0,7598,8.05,,S
341 | 340,0,1,"Blackwell, Mr. Stephen Weart",male,45,0,0,113784,35.5,T,S
342 | 341,1,2,"Navratil, Master. Edmond Roger",male,2,1,1,230080,26,F2,S
343 | 342,1,1,"Fortune, Miss. Alice Elizabeth",female,24,3,2,19950,263,C23 C25 C27,S
344 | 343,0,2,"Collander, Mr. Erik Gustaf",male,28,0,0,248740,13,,S
345 | 344,0,2,"Sedgwick, Mr. Charles Frederick Waddington",male,25,0,0,244361,13,,S
346 | 345,0,2,"Fox, Mr. Stanley Hubert",male,36,0,0,229236,13,,S
347 | 346,1,2,"Brown, Miss. Amelia ""Mildred""",female,24,0,0,248733,13,F33,S
348 | 347,1,2,"Smith, Miss. Marion Elsie",female,40,0,0,31418,13,,S
349 | 348,1,3,"Davison, Mrs. Thomas Henry (Mary E Finck)",female,,1,0,386525,16.1,,S
350 | 349,1,3,"Coutts, Master. William Loch ""William""",male,3,1,1,C.A. 37671,15.9,,S
351 | 350,0,3,"Dimic, Mr. Jovan",male,42,0,0,315088,8.6625,,S
352 | 351,0,3,"Odahl, Mr. Nils Martin",male,23,0,0,7267,9.225,,S
353 | 352,0,1,"Williams-Lambert, Mr. Fletcher Fellows",male,,0,0,113510,35,C128,S
354 | 353,0,3,"Elias, Mr. Tannous",male,15,1,1,2695,7.2292,,C
355 | 354,0,3,"Arnold-Franchi, Mr. Josef",male,25,1,0,349237,17.8,,S
356 | 355,0,3,"Yousif, Mr. Wazli",male,,0,0,2647,7.225,,C
357 | 356,0,3,"Vanden Steen, Mr. Leo Peter",male,28,0,0,345783,9.5,,S
358 | 357,1,1,"Bowerman, Miss. Elsie Edith",female,22,0,1,113505,55,E33,S
359 | 358,0,2,"Funk, Miss. Annie Clemmer",female,38,0,0,237671,13,,S
360 | 359,1,3,"McGovern, Miss. Mary",female,,0,0,330931,7.8792,,Q
361 | 360,1,3,"Mockler, Miss. Helen Mary ""Ellie""",female,,0,0,330980,7.8792,,Q
362 | 361,0,3,"Skoog, Mr. Wilhelm",male,40,1,4,347088,27.9,,S
363 | 362,0,2,"del Carlo, Mr. Sebastiano",male,29,1,0,SC/PARIS 2167,27.7208,,C
364 | 363,0,3,"Barbara, Mrs. (Catherine David)",female,45,0,1,2691,14.4542,,C
365 | 364,0,3,"Asim, Mr. Adola",male,35,0,0,SOTON/O.Q. 3101310,7.05,,S
366 | 365,0,3,"O'Brien, Mr. Thomas",male,,1,0,370365,15.5,,Q
367 | 366,0,3,"Adahl, Mr. Mauritz Nils Martin",male,30,0,0,C 7076,7.25,,S
368 | 367,1,1,"Warren, Mrs. Frank Manley (Anna Sophia Atkinson)",female,60,1,0,110813,75.25,D37,C
369 | 368,1,3,"Moussa, Mrs. (Mantoura Boulos)",female,,0,0,2626,7.2292,,C
370 | 369,1,3,"Jermyn, Miss. Annie",female,,0,0,14313,7.75,,Q
371 | 370,1,1,"Aubart, Mme. Leontine Pauline",female,24,0,0,PC 17477,69.3,B35,C
372 | 371,1,1,"Harder, Mr. George Achilles",male,25,1,0,11765,55.4417,E50,C
373 | 372,0,3,"Wiklund, Mr. Jakob Alfred",male,18,1,0,3101267,6.4958,,S
374 | 373,0,3,"Beavan, Mr. William Thomas",male,19,0,0,323951,8.05,,S
375 | 374,0,1,"Ringhini, Mr. Sante",male,22,0,0,PC 17760,135.6333,,C
376 | 375,0,3,"Palsson, Miss. Stina Viola",female,3,3,1,349909,21.075,,S
377 | 376,1,1,"Meyer, Mrs. Edgar Joseph (Leila Saks)",female,,1,0,PC 17604,82.1708,,C
378 | 377,1,3,"Landergren, Miss. Aurora Adelia",female,22,0,0,C 7077,7.25,,S
379 | 378,0,1,"Widener, Mr. Harry Elkins",male,27,0,2,113503,211.5,C82,C
380 | 379,0,3,"Betros, Mr. Tannous",male,20,0,0,2648,4.0125,,C
381 | 380,0,3,"Gustafsson, Mr. Karl Gideon",male,19,0,0,347069,7.775,,S
382 | 381,1,1,"Bidois, Miss. Rosalie",female,42,0,0,PC 17757,227.525,,C
383 | 382,1,3,"Nakid, Miss. Maria (""Mary"")",female,1,0,2,2653,15.7417,,C
384 | 383,0,3,"Tikkanen, Mr. Juho",male,32,0,0,STON/O 2. 3101293,7.925,,S
385 | 384,1,1,"Holverson, Mrs. Alexander Oskar (Mary Aline Towner)",female,35,1,0,113789,52,,S
386 | 385,0,3,"Plotcharsky, Mr. Vasil",male,,0,0,349227,7.8958,,S
387 | 386,0,2,"Davies, Mr. Charles Henry",male,18,0,0,S.O.C. 14879,73.5,,S
388 | 387,0,3,"Goodwin, Master. Sidney Leonard",male,1,5,2,CA 2144,46.9,,S
389 | 388,1,2,"Buss, Miss. Kate",female,36,0,0,27849,13,,S
390 | 389,0,3,"Sadlier, Mr. Matthew",male,,0,0,367655,7.7292,,Q
391 | 390,1,2,"Lehmann, Miss. Bertha",female,17,0,0,SC 1748,12,,C
392 | 391,1,1,"Carter, Mr. William Ernest",male,36,1,2,113760,120,B96 B98,S
393 | 392,1,3,"Jansson, Mr. Carl Olof",male,21,0,0,350034,7.7958,,S
394 | 393,0,3,"Gustafsson, Mr. Johan Birger",male,28,2,0,3101277,7.925,,S
395 | 394,1,1,"Newell, Miss. Marjorie",female,23,1,0,35273,113.275,D36,C
396 | 395,1,3,"Sandstrom, Mrs. Hjalmar (Agnes Charlotta Bengtsson)",female,24,0,2,PP 9549,16.7,G6,S
397 | 396,0,3,"Johansson, Mr. Erik",male,22,0,0,350052,7.7958,,S
398 | 397,0,3,"Olsson, Miss. Elina",female,31,0,0,350407,7.8542,,S
399 | 398,0,2,"McKane, Mr. Peter David",male,46,0,0,28403,26,,S
400 | 399,0,2,"Pain, Dr. Alfred",male,23,0,0,244278,10.5,,S
401 | 400,1,2,"Trout, Mrs. William H (Jessie L)",female,28,0,0,240929,12.65,,S
402 | 401,1,3,"Niskanen, Mr. Juha",male,39,0,0,STON/O 2. 3101289,7.925,,S
403 | 402,0,3,"Adams, Mr. John",male,26,0,0,341826,8.05,,S
404 | 403,0,3,"Jussila, Miss. Mari Aina",female,21,1,0,4137,9.825,,S
405 | 404,0,3,"Hakkarainen, Mr. Pekka Pietari",male,28,1,0,STON/O2. 3101279,15.85,,S
406 | 405,0,3,"Oreskovic, Miss. Marija",female,20,0,0,315096,8.6625,,S
407 | 406,0,2,"Gale, Mr. Shadrach",male,34,1,0,28664,21,,S
408 | 407,0,3,"Widegren, Mr. Carl/Charles Peter",male,51,0,0,347064,7.75,,S
409 | 408,1,2,"Richards, Master. William Rowe",male,3,1,1,29106,18.75,,S
410 | 409,0,3,"Birkeland, Mr. Hans Martin Monsen",male,21,0,0,312992,7.775,,S
411 | 410,0,3,"Lefebre, Miss. Ida",female,,3,1,4133,25.4667,,S
412 | 411,0,3,"Sdycoff, Mr. Todor",male,,0,0,349222,7.8958,,S
413 | 412,0,3,"Hart, Mr. Henry",male,,0,0,394140,6.8583,,Q
414 | 413,1,1,"Minahan, Miss. Daisy E",female,33,1,0,19928,90,C78,Q
415 | 414,0,2,"Cunningham, Mr. Alfred Fleming",male,,0,0,239853,0,,S
416 | 415,1,3,"Sundman, Mr. Johan Julian",male,44,0,0,STON/O 2. 3101269,7.925,,S
417 | 416,0,3,"Meek, Mrs. Thomas (Annie Louise Rowley)",female,,0,0,343095,8.05,,S
418 | 417,1,2,"Drew, Mrs. James Vivian (Lulu Thorne Christian)",female,34,1,1,28220,32.5,,S
419 | 418,1,2,"Silven, Miss. Lyyli Karoliina",female,18,0,2,250652,13,,S
420 | 419,0,2,"Matthews, Mr. William John",male,30,0,0,28228,13,,S
421 | 420,0,3,"Van Impe, Miss. Catharina",female,10,0,2,345773,24.15,,S
422 | 421,0,3,"Gheorgheff, Mr. Stanio",male,,0,0,349254,7.8958,,C
423 | 422,0,3,"Charters, Mr. David",male,21,0,0,A/5. 13032,7.7333,,Q
424 | 423,0,3,"Zimmerman, Mr. Leo",male,29,0,0,315082,7.875,,S
425 | 424,0,3,"Danbom, Mrs. Ernst Gilbert (Anna Sigrid Maria Brogren)",female,28,1,1,347080,14.4,,S
426 | 425,0,3,"Rosblom, Mr. Viktor Richard",male,18,1,1,370129,20.2125,,S
427 | 426,0,3,"Wiseman, Mr. Phillippe",male,,0,0,A/4. 34244,7.25,,S
428 | 427,1,2,"Clarke, Mrs. Charles V (Ada Maria Winfield)",female,28,1,0,2003,26,,S
429 | 428,1,2,"Phillips, Miss. Kate Florence (""Mrs Kate Louise Phillips Marshall"")",female,19,0,0,250655,26,,S
430 | 429,0,3,"Flynn, Mr. James",male,,0,0,364851,7.75,,Q
431 | 430,1,3,"Pickard, Mr. Berk (Berk Trembisky)",male,32,0,0,SOTON/O.Q. 392078,8.05,E10,S
432 | 431,1,1,"Bjornstrom-Steffansson, Mr. Mauritz Hakan",male,28,0,0,110564,26.55,C52,S
433 | 432,1,3,"Thorneycroft, Mrs. Percival (Florence Kate White)",female,,1,0,376564,16.1,,S
434 | 433,1,2,"Louch, Mrs. Charles Alexander (Alice Adelaide Slow)",female,42,1,0,SC/AH 3085,26,,S
435 | 434,0,3,"Kallio, Mr. Nikolai Erland",male,17,0,0,STON/O 2. 3101274,7.125,,S
436 | 435,0,1,"Silvey, Mr. William Baird",male,50,1,0,13507,55.9,E44,S
437 | 436,1,1,"Carter, Miss. Lucile Polk",female,14,1,2,113760,120,B96 B98,S
438 | 437,0,3,"Ford, Miss. Doolina Margaret ""Daisy""",female,21,2,2,W./C. 6608,34.375,,S
439 | 438,1,2,"Richards, Mrs. Sidney (Emily Hocking)",female,24,2,3,29106,18.75,,S
440 | 439,0,1,"Fortune, Mr. Mark",male,64,1,4,19950,263,C23 C25 C27,S
441 | 440,0,2,"Kvillner, Mr. Johan Henrik Johannesson",male,31,0,0,C.A. 18723,10.5,,S
442 | 441,1,2,"Hart, Mrs. Benjamin (Esther Ada Bloomfield)",female,45,1,1,F.C.C. 13529,26.25,,S
443 | 442,0,3,"Hampe, Mr. Leon",male,20,0,0,345769,9.5,,S
444 | 443,0,3,"Petterson, Mr. Johan Emil",male,25,1,0,347076,7.775,,S
445 | 444,1,2,"Reynaldo, Ms. Encarnacion",female,28,0,0,230434,13,,S
446 | 445,1,3,"Johannesen-Bratthammer, Mr. Bernt",male,,0,0,65306,8.1125,,S
447 | 446,1,1,"Dodge, Master. Washington",male,4,0,2,33638,81.8583,A34,S
448 | 447,1,2,"Mellinger, Miss. Madeleine Violet",female,13,0,1,250644,19.5,,S
449 | 448,1,1,"Seward, Mr. Frederic Kimber",male,34,0,0,113794,26.55,,S
450 | 449,1,3,"Baclini, Miss. Marie Catherine",female,5,2,1,2666,19.2583,,C
451 | 450,1,1,"Peuchen, Major. Arthur Godfrey",male,52,0,0,113786,30.5,C104,S
452 | 451,0,2,"West, Mr. Edwy Arthur",male,36,1,2,C.A. 34651,27.75,,S
453 | 452,0,3,"Hagland, Mr. Ingvald Olai Olsen",male,,1,0,65303,19.9667,,S
454 | 453,0,1,"Foreman, Mr. Benjamin Laventall",male,30,0,0,113051,27.75,C111,C
455 | 454,1,1,"Goldenberg, Mr. Samuel L",male,49,1,0,17453,89.1042,C92,C
456 | 455,0,3,"Peduzzi, Mr. Joseph",male,,0,0,A/5 2817,8.05,,S
457 | 456,1,3,"Jalsevac, Mr. Ivan",male,29,0,0,349240,7.8958,,C
458 | 457,0,1,"Millet, Mr. Francis Davis",male,65,0,0,13509,26.55,E38,S
459 | 458,1,1,"Kenyon, Mrs. Frederick R (Marion)",female,,1,0,17464,51.8625,D21,S
460 | 459,1,2,"Toomey, Miss. Ellen",female,50,0,0,F.C.C. 13531,10.5,,S
461 | 460,0,3,"O'Connor, Mr. Maurice",male,,0,0,371060,7.75,,Q
462 | 461,1,1,"Anderson, Mr. Harry",male,48,0,0,19952,26.55,E12,S
463 | 462,0,3,"Morley, Mr. William",male,34,0,0,364506,8.05,,S
464 | 463,0,1,"Gee, Mr. Arthur H",male,47,0,0,111320,38.5,E63,S
465 | 464,0,2,"Milling, Mr. Jacob Christian",male,48,0,0,234360,13,,S
466 | 465,0,3,"Maisner, Mr. Simon",male,,0,0,A/S 2816,8.05,,S
467 | 466,0,3,"Goncalves, Mr. Manuel Estanslas",male,38,0,0,SOTON/O.Q. 3101306,7.05,,S
468 | 467,0,2,"Campbell, Mr. William",male,,0,0,239853,0,,S
469 | 468,0,1,"Smart, Mr. John Montgomery",male,56,0,0,113792,26.55,,S
470 | 469,0,3,"Scanlan, Mr. James",male,,0,0,36209,7.725,,Q
471 | 470,1,3,"Baclini, Miss. Helene Barbara",female,0.75,2,1,2666,19.2583,,C
472 | 471,0,3,"Keefe, Mr. Arthur",male,,0,0,323592,7.25,,S
473 | 472,0,3,"Cacic, Mr. Luka",male,38,0,0,315089,8.6625,,S
474 | 473,1,2,"West, Mrs. Edwy Arthur (Ada Mary Worth)",female,33,1,2,C.A. 34651,27.75,,S
475 | 474,1,2,"Jerwan, Mrs. Amin S (Marie Marthe Thuillard)",female,23,0,0,SC/AH Basle 541,13.7917,D,C
476 | 475,0,3,"Strandberg, Miss. Ida Sofia",female,22,0,0,7553,9.8375,,S
477 | 476,0,1,"Clifford, Mr. George Quincy",male,,0,0,110465,52,A14,S
478 | 477,0,2,"Renouf, Mr. Peter Henry",male,34,1,0,31027,21,,S
479 | 478,0,3,"Braund, Mr. Lewis Richard",male,29,1,0,3460,7.0458,,S
480 | 479,0,3,"Karlsson, Mr. Nils August",male,22,0,0,350060,7.5208,,S
481 | 480,1,3,"Hirvonen, Miss. Hildur E",female,2,0,1,3101298,12.2875,,S
482 | 481,0,3,"Goodwin, Master. Harold Victor",male,9,5,2,CA 2144,46.9,,S
483 | 482,0,2,"Frost, Mr. Anthony Wood ""Archie""",male,,0,0,239854,0,,S
484 | 483,0,3,"Rouse, Mr. Richard Henry",male,50,0,0,A/5 3594,8.05,,S
485 | 484,1,3,"Turkula, Mrs. (Hedwig)",female,63,0,0,4134,9.5875,,S
486 | 485,1,1,"Bishop, Mr. Dickinson H",male,25,1,0,11967,91.0792,B49,C
487 | 486,0,3,"Lefebre, Miss. Jeannie",female,,3,1,4133,25.4667,,S
488 | 487,1,1,"Hoyt, Mrs. Frederick Maxfield (Jane Anne Forby)",female,35,1,0,19943,90,C93,S
489 | 488,0,1,"Kent, Mr. Edward Austin",male,58,0,0,11771,29.7,B37,C
490 | 489,0,3,"Somerton, Mr. Francis William",male,30,0,0,A.5. 18509,8.05,,S
491 | 490,1,3,"Coutts, Master. Eden Leslie ""Neville""",male,9,1,1,C.A. 37671,15.9,,S
492 | 491,0,3,"Hagland, Mr. Konrad Mathias Reiersen",male,,1,0,65304,19.9667,,S
493 | 492,0,3,"Windelov, Mr. Einar",male,21,0,0,SOTON/OQ 3101317,7.25,,S
494 | 493,0,1,"Molson, Mr. Harry Markland",male,55,0,0,113787,30.5,C30,S
495 | 494,0,1,"Artagaveytia, Mr. Ramon",male,71,0,0,PC 17609,49.5042,,C
496 | 495,0,3,"Stanley, Mr. Edward Roland",male,21,0,0,A/4 45380,8.05,,S
497 | 496,0,3,"Yousseff, Mr. Gerious",male,,0,0,2627,14.4583,,C
498 | 497,1,1,"Eustis, Miss. Elizabeth Mussey",female,54,1,0,36947,78.2667,D20,C
499 | 498,0,3,"Shellard, Mr. Frederick William",male,,0,0,C.A. 6212,15.1,,S
500 | 499,0,1,"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)",female,25,1,2,113781,151.55,C22 C26,S
501 | 500,0,3,"Svensson, Mr. Olof",male,24,0,0,350035,7.7958,,S
502 | 501,0,3,"Calic, Mr. Petar",male,17,0,0,315086,8.6625,,S
503 | 502,0,3,"Canavan, Miss. Mary",female,21,0,0,364846,7.75,,Q
504 | 503,0,3,"O'Sullivan, Miss. Bridget Mary",female,,0,0,330909,7.6292,,Q
505 | 504,0,3,"Laitinen, Miss. Kristina Sofia",female,37,0,0,4135,9.5875,,S
506 | 505,1,1,"Maioni, Miss. Roberta",female,16,0,0,110152,86.5,B79,S
507 | 506,0,1,"Penasco y Castellana, Mr. Victor de Satode",male,18,1,0,PC 17758,108.9,C65,C
508 | 507,1,2,"Quick, Mrs. Frederick Charles (Jane Richards)",female,33,0,2,26360,26,,S
509 | 508,1,1,"Bradley, Mr. George (""George Arthur Brayton"")",male,,0,0,111427,26.55,,S
510 | 509,0,3,"Olsen, Mr. Henry Margido",male,28,0,0,C 4001,22.525,,S
511 | 510,1,3,"Lang, Mr. Fang",male,26,0,0,1601,56.4958,,S
512 | 511,1,3,"Daly, Mr. Eugene Patrick",male,29,0,0,382651,7.75,,Q
513 | 512,0,3,"Webber, Mr. James",male,,0,0,SOTON/OQ 3101316,8.05,,S
514 | 513,1,1,"McGough, Mr. James Robert",male,36,0,0,PC 17473,26.2875,E25,S
515 | 514,1,1,"Rothschild, Mrs. Martin (Elizabeth L. Barrett)",female,54,1,0,PC 17603,59.4,,C
516 | 515,0,3,"Coleff, Mr. Satio",male,24,0,0,349209,7.4958,,S
517 | 516,0,1,"Walker, Mr. William Anderson",male,47,0,0,36967,34.0208,D46,S
518 | 517,1,2,"Lemore, Mrs. (Amelia Milley)",female,34,0,0,C.A. 34260,10.5,F33,S
519 | 518,0,3,"Ryan, Mr. Patrick",male,,0,0,371110,24.15,,Q
520 | 519,1,2,"Angle, Mrs. William A (Florence ""Mary"" Agnes Hughes)",female,36,1,0,226875,26,,S
521 | 520,0,3,"Pavlovic, Mr. Stefo",male,32,0,0,349242,7.8958,,S
522 | 521,1,1,"Perreault, Miss. Anne",female,30,0,0,12749,93.5,B73,S
523 | 522,0,3,"Vovk, Mr. Janko",male,22,0,0,349252,7.8958,,S
524 | 523,0,3,"Lahoud, Mr. Sarkis",male,,0,0,2624,7.225,,C
525 | 524,1,1,"Hippach, Mrs. Louis Albert (Ida Sophia Fischer)",female,44,0,1,111361,57.9792,B18,C
526 | 525,0,3,"Kassem, Mr. Fared",male,,0,0,2700,7.2292,,C
527 | 526,0,3,"Farrell, Mr. James",male,40.5,0,0,367232,7.75,,Q
528 | 527,1,2,"Ridsdale, Miss. Lucy",female,50,0,0,W./C. 14258,10.5,,S
529 | 528,0,1,"Farthing, Mr. John",male,,0,0,PC 17483,221.7792,C95,S
530 | 529,0,3,"Salonen, Mr. Johan Werner",male,39,0,0,3101296,7.925,,S
531 | 530,0,2,"Hocking, Mr. Richard George",male,23,2,1,29104,11.5,,S
532 | 531,1,2,"Quick, Miss. Phyllis May",female,2,1,1,26360,26,,S
533 | 532,0,3,"Toufik, Mr. Nakli",male,,0,0,2641,7.2292,,C
534 | 533,0,3,"Elias, Mr. Joseph Jr",male,17,1,1,2690,7.2292,,C
535 | 534,1,3,"Peter, Mrs. Catherine (Catherine Rizk)",female,,0,2,2668,22.3583,,C
536 | 535,0,3,"Cacic, Miss. Marija",female,30,0,0,315084,8.6625,,S
537 | 536,1,2,"Hart, Miss. Eva Miriam",female,7,0,2,F.C.C. 13529,26.25,,S
538 | 537,0,1,"Butt, Major. Archibald Willingham",male,45,0,0,113050,26.55,B38,S
539 | 538,1,1,"LeRoy, Miss. Bertha",female,30,0,0,PC 17761,106.425,,C
540 | 539,0,3,"Risien, Mr. Samuel Beard",male,,0,0,364498,14.5,,S
541 | 540,1,1,"Frolicher, Miss. Hedwig Margaritha",female,22,0,2,13568,49.5,B39,C
542 | 541,1,1,"Crosby, Miss. Harriet R",female,36,0,2,WE/P 5735,71,B22,S
543 | 542,0,3,"Andersson, Miss. Ingeborg Constanzia",female,9,4,2,347082,31.275,,S
544 | 543,0,3,"Andersson, Miss. Sigrid Elisabeth",female,11,4,2,347082,31.275,,S
545 | 544,1,2,"Beane, Mr. Edward",male,32,1,0,2908,26,,S
546 | 545,0,1,"Douglas, Mr. Walter Donald",male,50,1,0,PC 17761,106.425,C86,C
547 | 546,0,1,"Nicholson, Mr. Arthur Ernest",male,64,0,0,693,26,,S
548 | 547,1,2,"Beane, Mrs. Edward (Ethel Clarke)",female,19,1,0,2908,26,,S
549 | 548,1,2,"Padro y Manent, Mr. Julian",male,,0,0,SC/PARIS 2146,13.8625,,C
550 | 549,0,3,"Goldsmith, Mr. Frank John",male,33,1,1,363291,20.525,,S
551 | 550,1,2,"Davies, Master. John Morgan Jr",male,8,1,1,C.A. 33112,36.75,,S
552 | 551,1,1,"Thayer, Mr. John Borland Jr",male,17,0,2,17421,110.8833,C70,C
553 | 552,0,2,"Sharp, Mr. Percival James R",male,27,0,0,244358,26,,S
554 | 553,0,3,"O'Brien, Mr. Timothy",male,,0,0,330979,7.8292,,Q
555 | 554,1,3,"Leeni, Mr. Fahim (""Philip Zenni"")",male,22,0,0,2620,7.225,,C
556 | 555,1,3,"Ohman, Miss. Velin",female,22,0,0,347085,7.775,,S
557 | 556,0,1,"Wright, Mr. George",male,62,0,0,113807,26.55,,S
558 | 557,1,1,"Duff Gordon, Lady. (Lucille Christiana Sutherland) (""Mrs Morgan"")",female,48,1,0,11755,39.6,A16,C
559 | 558,0,1,"Robbins, Mr. Victor",male,,0,0,PC 17757,227.525,,C
560 | 559,1,1,"Taussig, Mrs. Emil (Tillie Mandelbaum)",female,39,1,1,110413,79.65,E67,S
561 | 560,1,3,"de Messemaeker, Mrs. Guillaume Joseph (Emma)",female,36,1,0,345572,17.4,,S
562 | 561,0,3,"Morrow, Mr. Thomas Rowan",male,,0,0,372622,7.75,,Q
563 | 562,0,3,"Sivic, Mr. Husein",male,40,0,0,349251,7.8958,,S
564 | 563,0,2,"Norman, Mr. Robert Douglas",male,28,0,0,218629,13.5,,S
565 | 564,0,3,"Simmons, Mr. John",male,,0,0,SOTON/OQ 392082,8.05,,S
566 | 565,0,3,"Meanwell, Miss. (Marion Ogden)",female,,0,0,SOTON/O.Q. 392087,8.05,,S
567 | 566,0,3,"Davies, Mr. Alfred J",male,24,2,0,A/4 48871,24.15,,S
568 | 567,0,3,"Stoytcheff, Mr. Ilia",male,19,0,0,349205,7.8958,,S
569 | 568,0,3,"Palsson, Mrs. Nils (Alma Cornelia Berglund)",female,29,0,4,349909,21.075,,S
570 | 569,0,3,"Doharr, Mr. Tannous",male,,0,0,2686,7.2292,,C
571 | 570,1,3,"Jonsson, Mr. Carl",male,32,0,0,350417,7.8542,,S
572 | 571,1,2,"Harris, Mr. George",male,62,0,0,S.W./PP 752,10.5,,S
573 | 572,1,1,"Appleton, Mrs. Edward Dale (Charlotte Lamson)",female,53,2,0,11769,51.4792,C101,S
574 | 573,1,1,"Flynn, Mr. John Irwin (""Irving"")",male,36,0,0,PC 17474,26.3875,E25,S
575 | 574,1,3,"Kelly, Miss. Mary",female,,0,0,14312,7.75,,Q
576 | 575,0,3,"Rush, Mr. Alfred George John",male,16,0,0,A/4. 20589,8.05,,S
577 | 576,0,3,"Patchett, Mr. George",male,19,0,0,358585,14.5,,S
578 | 577,1,2,"Garside, Miss. Ethel",female,34,0,0,243880,13,,S
579 | 578,1,1,"Silvey, Mrs. William Baird (Alice Munger)",female,39,1,0,13507,55.9,E44,S
580 | 579,0,3,"Caram, Mrs. Joseph (Maria Elias)",female,,1,0,2689,14.4583,,C
581 | 580,1,3,"Jussila, Mr. Eiriik",male,32,0,0,STON/O 2. 3101286,7.925,,S
582 | 581,1,2,"Christy, Miss. Julie Rachel",female,25,1,1,237789,30,,S
583 | 582,1,1,"Thayer, Mrs. John Borland (Marian Longstreth Morris)",female,39,1,1,17421,110.8833,C68,C
584 | 583,0,2,"Downton, Mr. William James",male,54,0,0,28403,26,,S
585 | 584,0,1,"Ross, Mr. John Hugo",male,36,0,0,13049,40.125,A10,C
586 | 585,0,3,"Paulner, Mr. Uscher",male,,0,0,3411,8.7125,,C
587 | 586,1,1,"Taussig, Miss. Ruth",female,18,0,2,110413,79.65,E68,S
588 | 587,0,2,"Jarvis, Mr. John Denzil",male,47,0,0,237565,15,,S
589 | 588,1,1,"Frolicher-Stehli, Mr. Maxmillian",male,60,1,1,13567,79.2,B41,C
590 | 589,0,3,"Gilinski, Mr. Eliezer",male,22,0,0,14973,8.05,,S
591 | 590,0,3,"Murdlin, Mr. Joseph",male,,0,0,A./5. 3235,8.05,,S
592 | 591,0,3,"Rintamaki, Mr. Matti",male,35,0,0,STON/O 2. 3101273,7.125,,S
593 | 592,1,1,"Stephenson, Mrs. Walter Bertram (Martha Eustis)",female,52,1,0,36947,78.2667,D20,C
594 | 593,0,3,"Elsbury, Mr. William James",male,47,0,0,A/5 3902,7.25,,S
595 | 594,0,3,"Bourke, Miss. Mary",female,,0,2,364848,7.75,,Q
596 | 595,0,2,"Chapman, Mr. John Henry",male,37,1,0,SC/AH 29037,26,,S
597 | 596,0,3,"Van Impe, Mr. Jean Baptiste",male,36,1,1,345773,24.15,,S
598 | 597,1,2,"Leitch, Miss. Jessie Wills",female,,0,0,248727,33,,S
599 | 598,0,3,"Johnson, Mr. Alfred",male,49,0,0,LINE,0,,S
600 | 599,0,3,"Boulos, Mr. Hanna",male,,0,0,2664,7.225,,C
601 | 600,1,1,"Duff Gordon, Sir. Cosmo Edmund (""Mr Morgan"")",male,49,1,0,PC 17485,56.9292,A20,C
602 | 601,1,2,"Jacobsohn, Mrs. Sidney Samuel (Amy Frances Christy)",female,24,2,1,243847,27,,S
603 | 602,0,3,"Slabenoff, Mr. Petco",male,,0,0,349214,7.8958,,S
604 | 603,0,1,"Harrington, Mr. Charles H",male,,0,0,113796,42.4,,S
605 | 604,0,3,"Torber, Mr. Ernst William",male,44,0,0,364511,8.05,,S
606 | 605,1,1,"Homer, Mr. Harry (""Mr E Haven"")",male,35,0,0,111426,26.55,,C
607 | 606,0,3,"Lindell, Mr. Edvard Bengtsson",male,36,1,0,349910,15.55,,S
608 | 607,0,3,"Karaic, Mr. Milan",male,30,0,0,349246,7.8958,,S
609 | 608,1,1,"Daniel, Mr. Robert Williams",male,27,0,0,113804,30.5,,S
610 | 609,1,2,"Laroche, Mrs. Joseph (Juliette Marie Louise Lafargue)",female,22,1,2,SC/Paris 2123,41.5792,,C
611 | 610,1,1,"Shutes, Miss. Elizabeth W",female,40,0,0,PC 17582,153.4625,C125,S
612 | 611,0,3,"Andersson, Mrs. Anders Johan (Alfrida Konstantia Brogren)",female,39,1,5,347082,31.275,,S
613 | 612,0,3,"Jardin, Mr. Jose Neto",male,,0,0,SOTON/O.Q. 3101305,7.05,,S
614 | 613,1,3,"Murphy, Miss. Margaret Jane",female,,1,0,367230,15.5,,Q
615 | 614,0,3,"Horgan, Mr. John",male,,0,0,370377,7.75,,Q
616 | 615,0,3,"Brocklebank, Mr. William Alfred",male,35,0,0,364512,8.05,,S
617 | 616,1,2,"Herman, Miss. Alice",female,24,1,2,220845,65,,S
618 | 617,0,3,"Danbom, Mr. Ernst Gilbert",male,34,1,1,347080,14.4,,S
619 | 618,0,3,"Lobb, Mrs. William Arthur (Cordelia K Stanlick)",female,26,1,0,A/5. 3336,16.1,,S
620 | 619,1,2,"Becker, Miss. Marion Louise",female,4,2,1,230136,39,F4,S
621 | 620,0,2,"Gavey, Mr. Lawrence",male,26,0,0,31028,10.5,,S
622 | 621,0,3,"Yasbeck, Mr. Antoni",male,27,1,0,2659,14.4542,,C
623 | 622,1,1,"Kimball, Mr. Edwin Nelson Jr",male,42,1,0,11753,52.5542,D19,S
624 | 623,1,3,"Nakid, Mr. Sahid",male,20,1,1,2653,15.7417,,C
625 | 624,0,3,"Hansen, Mr. Henry Damsgaard",male,21,0,0,350029,7.8542,,S
626 | 625,0,3,"Bowen, Mr. David John ""Dai""",male,21,0,0,54636,16.1,,S
627 | 626,0,1,"Sutton, Mr. Frederick",male,61,0,0,36963,32.3208,D50,S
628 | 627,0,2,"Kirkland, Rev. Charles Leonard",male,57,0,0,219533,12.35,,Q
629 | 628,1,1,"Longley, Miss. Gretchen Fiske",female,21,0,0,13502,77.9583,D9,S
630 | 629,0,3,"Bostandyeff, Mr. Guentcho",male,26,0,0,349224,7.8958,,S
631 | 630,0,3,"O'Connell, Mr. Patrick D",male,,0,0,334912,7.7333,,Q
632 | 631,1,1,"Barkworth, Mr. Algernon Henry Wilson",male,80,0,0,27042,30,A23,S
633 | 632,0,3,"Lundahl, Mr. Johan Svensson",male,51,0,0,347743,7.0542,,S
634 | 633,1,1,"Stahelin-Maeglin, Dr. Max",male,32,0,0,13214,30.5,B50,C
635 | 634,0,1,"Parr, Mr. William Henry Marsh",male,,0,0,112052,0,,S
636 | 635,0,3,"Skoog, Miss. Mabel",female,9,3,2,347088,27.9,,S
637 | 636,1,2,"Davis, Miss. Mary",female,28,0,0,237668,13,,S
638 | 637,0,3,"Leinonen, Mr. Antti Gustaf",male,32,0,0,STON/O 2. 3101292,7.925,,S
639 | 638,0,2,"Collyer, Mr. Harvey",male,31,1,1,C.A. 31921,26.25,,S
640 | 639,0,3,"Panula, Mrs. Juha (Maria Emilia Ojala)",female,41,0,5,3101295,39.6875,,S
641 | 640,0,3,"Thorneycroft, Mr. Percival",male,,1,0,376564,16.1,,S
642 | 641,0,3,"Jensen, Mr. Hans Peder",male,20,0,0,350050,7.8542,,S
643 | 642,1,1,"Sagesser, Mlle. Emma",female,24,0,0,PC 17477,69.3,B35,C
644 | 643,0,3,"Skoog, Miss. Margit Elizabeth",female,2,3,2,347088,27.9,,S
645 | 644,1,3,"Foo, Mr. Choong",male,,0,0,1601,56.4958,,S
646 | 645,1,3,"Baclini, Miss. Eugenie",female,0.75,2,1,2666,19.2583,,C
647 | 646,1,1,"Harper, Mr. Henry Sleeper",male,48,1,0,PC 17572,76.7292,D33,C
648 | 647,0,3,"Cor, Mr. Liudevit",male,19,0,0,349231,7.8958,,S
649 | 648,1,1,"Simonius-Blumer, Col. Oberst Alfons",male,56,0,0,13213,35.5,A26,C
650 | 649,0,3,"Willey, Mr. Edward",male,,0,0,S.O./P.P. 751,7.55,,S
651 | 650,1,3,"Stanley, Miss. Amy Zillah Elsie",female,23,0,0,CA. 2314,7.55,,S
652 | 651,0,3,"Mitkoff, Mr. Mito",male,,0,0,349221,7.8958,,S
653 | 652,1,2,"Doling, Miss. Elsie",female,18,0,1,231919,23,,S
654 | 653,0,3,"Kalvik, Mr. Johannes Halvorsen",male,21,0,0,8475,8.4333,,S
655 | 654,1,3,"O'Leary, Miss. Hanora ""Norah""",female,,0,0,330919,7.8292,,Q
656 | 655,0,3,"Hegarty, Miss. Hanora ""Nora""",female,18,0,0,365226,6.75,,Q
657 | 656,0,2,"Hickman, Mr. Leonard Mark",male,24,2,0,S.O.C. 14879,73.5,,S
658 | 657,0,3,"Radeff, Mr. Alexander",male,,0,0,349223,7.8958,,S
659 | 658,0,3,"Bourke, Mrs. John (Catherine)",female,32,1,1,364849,15.5,,Q
660 | 659,0,2,"Eitemiller, Mr. George Floyd",male,23,0,0,29751,13,,S
661 | 660,0,1,"Newell, Mr. Arthur Webster",male,58,0,2,35273,113.275,D48,C
662 | 661,1,1,"Frauenthal, Dr. Henry William",male,50,2,0,PC 17611,133.65,,S
663 | 662,0,3,"Badt, Mr. Mohamed",male,40,0,0,2623,7.225,,C
664 | 663,0,1,"Colley, Mr. Edward Pomeroy",male,47,0,0,5727,25.5875,E58,S
665 | 664,0,3,"Coleff, Mr. Peju",male,36,0,0,349210,7.4958,,S
666 | 665,1,3,"Lindqvist, Mr. Eino William",male,20,1,0,STON/O 2. 3101285,7.925,,S
667 | 666,0,2,"Hickman, Mr. Lewis",male,32,2,0,S.O.C. 14879,73.5,,S
668 | 667,0,2,"Butler, Mr. Reginald Fenton",male,25,0,0,234686,13,,S
669 | 668,0,3,"Rommetvedt, Mr. Knud Paust",male,,0,0,312993,7.775,,S
670 | 669,0,3,"Cook, Mr. Jacob",male,43,0,0,A/5 3536,8.05,,S
671 | 670,1,1,"Taylor, Mrs. Elmer Zebley (Juliet Cummins Wright)",female,,1,0,19996,52,C126,S
672 | 671,1,2,"Brown, Mrs. Thomas William Solomon (Elizabeth Catherine Ford)",female,40,1,1,29750,39,,S
673 | 672,0,1,"Davidson, Mr. Thornton",male,31,1,0,F.C. 12750,52,B71,S
674 | 673,0,2,"Mitchell, Mr. Henry Michael",male,70,0,0,C.A. 24580,10.5,,S
675 | 674,1,2,"Wilhelms, Mr. Charles",male,31,0,0,244270,13,,S
676 | 675,0,2,"Watson, Mr. Ennis Hastings",male,,0,0,239856,0,,S
677 | 676,0,3,"Edvardsson, Mr. Gustaf Hjalmar",male,18,0,0,349912,7.775,,S
678 | 677,0,3,"Sawyer, Mr. Frederick Charles",male,24.5,0,0,342826,8.05,,S
679 | 678,1,3,"Turja, Miss. Anna Sofia",female,18,0,0,4138,9.8417,,S
680 | 679,0,3,"Goodwin, Mrs. Frederick (Augusta Tyler)",female,43,1,6,CA 2144,46.9,,S
681 | 680,1,1,"Cardeza, Mr. Thomas Drake Martinez",male,36,0,1,PC 17755,512.3292,B51 B53 B55,C
682 | 681,0,3,"Peters, Miss. Katie",female,,0,0,330935,8.1375,,Q
683 | 682,1,1,"Hassab, Mr. Hammad",male,27,0,0,PC 17572,76.7292,D49,C
684 | 683,0,3,"Olsvigen, Mr. Thor Anderson",male,20,0,0,6563,9.225,,S
685 | 684,0,3,"Goodwin, Mr. Charles Edward",male,14,5,2,CA 2144,46.9,,S
686 | 685,0,2,"Brown, Mr. Thomas William Solomon",male,60,1,1,29750,39,,S
687 | 686,0,2,"Laroche, Mr. Joseph Philippe Lemercier",male,25,1,2,SC/Paris 2123,41.5792,,C
688 | 687,0,3,"Panula, Mr. Jaako Arnold",male,14,4,1,3101295,39.6875,,S
689 | 688,0,3,"Dakic, Mr. Branko",male,19,0,0,349228,10.1708,,S
690 | 689,0,3,"Fischer, Mr. Eberhard Thelander",male,18,0,0,350036,7.7958,,S
691 | 690,1,1,"Madill, Miss. Georgette Alexandra",female,15,0,1,24160,211.3375,B5,S
692 | 691,1,1,"Dick, Mr. Albert Adrian",male,31,1,0,17474,57,B20,S
693 | 692,1,3,"Karun, Miss. Manca",female,4,0,1,349256,13.4167,,C
694 | 693,1,3,"Lam, Mr. Ali",male,,0,0,1601,56.4958,,S
695 | 694,0,3,"Saad, Mr. Khalil",male,25,0,0,2672,7.225,,C
696 | 695,0,1,"Weir, Col. John",male,60,0,0,113800,26.55,,S
697 | 696,0,2,"Chapman, Mr. Charles Henry",male,52,0,0,248731,13.5,,S
698 | 697,0,3,"Kelly, Mr. James",male,44,0,0,363592,8.05,,S
699 | 698,1,3,"Mullens, Miss. Katherine ""Katie""",female,,0,0,35852,7.7333,,Q
700 | 699,0,1,"Thayer, Mr. John Borland",male,49,1,1,17421,110.8833,C68,C
701 | 700,0,3,"Humblen, Mr. Adolf Mathias Nicolai Olsen",male,42,0,0,348121,7.65,F G63,S
702 | 701,1,1,"Astor, Mrs. John Jacob (Madeleine Talmadge Force)",female,18,1,0,PC 17757,227.525,C62 C64,C
703 | 702,1,1,"Silverthorne, Mr. Spencer Victor",male,35,0,0,PC 17475,26.2875,E24,S
704 | 703,0,3,"Barbara, Miss. Saiide",female,18,0,1,2691,14.4542,,C
705 | 704,0,3,"Gallagher, Mr. Martin",male,25,0,0,36864,7.7417,,Q
706 | 705,0,3,"Hansen, Mr. Henrik Juul",male,26,1,0,350025,7.8542,,S
707 | 706,0,2,"Morley, Mr. Henry Samuel (""Mr Henry Marshall"")",male,39,0,0,250655,26,,S
708 | 707,1,2,"Kelly, Mrs. Florence ""Fannie""",female,45,0,0,223596,13.5,,S
709 | 708,1,1,"Calderhead, Mr. Edward Pennington",male,42,0,0,PC 17476,26.2875,E24,S
710 | 709,1,1,"Cleaver, Miss. Alice",female,22,0,0,113781,151.55,,S
711 | 710,1,3,"Moubarek, Master. Halim Gonios (""William George"")",male,,1,1,2661,15.2458,,C
712 | 711,1,1,"Mayne, Mlle. Berthe Antonine (""Mrs de Villiers"")",female,24,0,0,PC 17482,49.5042,C90,C
713 | 712,0,1,"Klaber, Mr. Herman",male,,0,0,113028,26.55,C124,S
714 | 713,1,1,"Taylor, Mr. Elmer Zebley",male,48,1,0,19996,52,C126,S
715 | 714,0,3,"Larsson, Mr. August Viktor",male,29,0,0,7545,9.4833,,S
716 | 715,0,2,"Greenberg, Mr. Samuel",male,52,0,0,250647,13,,S
717 | 716,0,3,"Soholt, Mr. Peter Andreas Lauritz Andersen",male,19,0,0,348124,7.65,F G73,S
718 | 717,1,1,"Endres, Miss. Caroline Louise",female,38,0,0,PC 17757,227.525,C45,C
719 | 718,1,2,"Troutt, Miss. Edwina Celia ""Winnie""",female,27,0,0,34218,10.5,E101,S
720 | 719,0,3,"McEvoy, Mr. Michael",male,,0,0,36568,15.5,,Q
721 | 720,0,3,"Johnson, Mr. Malkolm Joackim",male,33,0,0,347062,7.775,,S
722 | 721,1,2,"Harper, Miss. Annie Jessie ""Nina""",female,6,0,1,248727,33,,S
723 | 722,0,3,"Jensen, Mr. Svend Lauritz",male,17,1,0,350048,7.0542,,S
724 | 723,0,2,"Gillespie, Mr. William Henry",male,34,0,0,12233,13,,S
725 | 724,0,2,"Hodges, Mr. Henry Price",male,50,0,0,250643,13,,S
726 | 725,1,1,"Chambers, Mr. Norman Campbell",male,27,1,0,113806,53.1,E8,S
727 | 726,0,3,"Oreskovic, Mr. Luka",male,20,0,0,315094,8.6625,,S
728 | 727,1,2,"Renouf, Mrs. Peter Henry (Lillian Jefferys)",female,30,3,0,31027,21,,S
729 | 728,1,3,"Mannion, Miss. Margareth",female,,0,0,36866,7.7375,,Q
730 | 729,0,2,"Bryhl, Mr. Kurt Arnold Gottfrid",male,25,1,0,236853,26,,S
731 | 730,0,3,"Ilmakangas, Miss. Pieta Sofia",female,25,1,0,STON/O2. 3101271,7.925,,S
732 | 731,1,1,"Allen, Miss. Elisabeth Walton",female,29,0,0,24160,211.3375,B5,S
733 | 732,0,3,"Hassan, Mr. Houssein G N",male,11,0,0,2699,18.7875,,C
734 | 733,0,2,"Knight, Mr. Robert J",male,,0,0,239855,0,,S
735 | 734,0,2,"Berriman, Mr. William John",male,23,0,0,28425,13,,S
736 | 735,0,2,"Troupiansky, Mr. Moses Aaron",male,23,0,0,233639,13,,S
737 | 736,0,3,"Williams, Mr. Leslie",male,28.5,0,0,54636,16.1,,S
738 | 737,0,3,"Ford, Mrs. Edward (Margaret Ann Watson)",female,48,1,3,W./C. 6608,34.375,,S
739 | 738,1,1,"Lesurer, Mr. Gustave J",male,35,0,0,PC 17755,512.3292,B101,C
740 | 739,0,3,"Ivanoff, Mr. Kanio",male,,0,0,349201,7.8958,,S
741 | 740,0,3,"Nankoff, Mr. Minko",male,,0,0,349218,7.8958,,S
742 | 741,1,1,"Hawksford, Mr. Walter James",male,,0,0,16988,30,D45,S
743 | 742,0,1,"Cavendish, Mr. Tyrell William",male,36,1,0,19877,78.85,C46,S
744 | 743,1,1,"Ryerson, Miss. Susan Parker ""Suzette""",female,21,2,2,PC 17608,262.375,B57 B59 B63 B66,C
745 | 744,0,3,"McNamee, Mr. Neal",male,24,1,0,376566,16.1,,S
746 | 745,1,3,"Stranden, Mr. Juho",male,31,0,0,STON/O 2. 3101288,7.925,,S
747 | 746,0,1,"Crosby, Capt. Edward Gifford",male,70,1,1,WE/P 5735,71,B22,S
748 | 747,0,3,"Abbott, Mr. Rossmore Edward",male,16,1,1,C.A. 2673,20.25,,S
749 | 748,1,2,"Sinkkonen, Miss. Anna",female,30,0,0,250648,13,,S
750 | 749,0,1,"Marvin, Mr. Daniel Warner",male,19,1,0,113773,53.1,D30,S
751 | 750,0,3,"Connaghton, Mr. Michael",male,31,0,0,335097,7.75,,Q
752 | 751,1,2,"Wells, Miss. Joan",female,4,1,1,29103,23,,S
753 | 752,1,3,"Moor, Master. Meier",male,6,0,1,392096,12.475,E121,S
754 | 753,0,3,"Vande Velde, Mr. Johannes Joseph",male,33,0,0,345780,9.5,,S
755 | 754,0,3,"Jonkoff, Mr. Lalio",male,23,0,0,349204,7.8958,,S
756 | 755,1,2,"Herman, Mrs. Samuel (Jane Laver)",female,48,1,2,220845,65,,S
757 | 756,1,2,"Hamalainen, Master. Viljo",male,0.67,1,1,250649,14.5,,S
758 | 757,0,3,"Carlsson, Mr. August Sigfrid",male,28,0,0,350042,7.7958,,S
759 | 758,0,2,"Bailey, Mr. Percy Andrew",male,18,0,0,29108,11.5,,S
760 | 759,0,3,"Theobald, Mr. Thomas Leonard",male,34,0,0,363294,8.05,,S
761 | 760,1,1,"Rothes, the Countess. of (Lucy Noel Martha Dyer-Edwards)",female,33,0,0,110152,86.5,B77,S
762 | 761,0,3,"Garfirth, Mr. John",male,,0,0,358585,14.5,,S
763 | 762,0,3,"Nirva, Mr. Iisakki Antino Aijo",male,41,0,0,SOTON/O2 3101272,7.125,,S
764 | 763,1,3,"Barah, Mr. Hanna Assi",male,20,0,0,2663,7.2292,,C
765 | 764,1,1,"Carter, Mrs. William Ernest (Lucile Polk)",female,36,1,2,113760,120,B96 B98,S
766 | 765,0,3,"Eklund, Mr. Hans Linus",male,16,0,0,347074,7.775,,S
767 | 766,1,1,"Hogeboom, Mrs. John C (Anna Andrews)",female,51,1,0,13502,77.9583,D11,S
768 | 767,0,1,"Brewe, Dr. Arthur Jackson",male,,0,0,112379,39.6,,C
769 | 768,0,3,"Mangan, Miss. Mary",female,30.5,0,0,364850,7.75,,Q
770 | 769,0,3,"Moran, Mr. Daniel J",male,,1,0,371110,24.15,,Q
771 | 770,0,3,"Gronnestad, Mr. Daniel Danielsen",male,32,0,0,8471,8.3625,,S
772 | 771,0,3,"Lievens, Mr. Rene Aime",male,24,0,0,345781,9.5,,S
773 | 772,0,3,"Jensen, Mr. Niels Peder",male,48,0,0,350047,7.8542,,S
774 | 773,0,2,"Mack, Mrs. (Mary)",female,57,0,0,S.O./P.P. 3,10.5,E77,S
775 | 774,0,3,"Elias, Mr. Dibo",male,,0,0,2674,7.225,,C
776 | 775,1,2,"Hocking, Mrs. Elizabeth (Eliza Needs)",female,54,1,3,29105,23,,S
777 | 776,0,3,"Myhrman, Mr. Pehr Fabian Oliver Malkolm",male,18,0,0,347078,7.75,,S
778 | 777,0,3,"Tobin, Mr. Roger",male,,0,0,383121,7.75,F38,Q
779 | 778,1,3,"Emanuel, Miss. Virginia Ethel",female,5,0,0,364516,12.475,,S
780 | 779,0,3,"Kilgannon, Mr. Thomas J",male,,0,0,36865,7.7375,,Q
781 | 780,1,1,"Robert, Mrs. Edward Scott (Elisabeth Walton McMillan)",female,43,0,1,24160,211.3375,B3,S
782 | 781,1,3,"Ayoub, Miss. Banoura",female,13,0,0,2687,7.2292,,C
783 | 782,1,1,"Dick, Mrs. Albert Adrian (Vera Gillespie)",female,17,1,0,17474,57,B20,S
784 | 783,0,1,"Long, Mr. Milton Clyde",male,29,0,0,113501,30,D6,S
785 | 784,0,3,"Johnston, Mr. Andrew G",male,,1,2,W./C. 6607,23.45,,S
786 | 785,0,3,"Ali, Mr. William",male,25,0,0,SOTON/O.Q. 3101312,7.05,,S
787 | 786,0,3,"Harmer, Mr. Abraham (David Lishin)",male,25,0,0,374887,7.25,,S
788 | 787,1,3,"Sjoblom, Miss. Anna Sofia",female,18,0,0,3101265,7.4958,,S
789 | 788,0,3,"Rice, Master. George Hugh",male,8,4,1,382652,29.125,,Q
790 | 789,1,3,"Dean, Master. Bertram Vere",male,1,1,2,C.A. 2315,20.575,,S
791 | 790,0,1,"Guggenheim, Mr. Benjamin",male,46,0,0,PC 17593,79.2,B82 B84,C
792 | 791,0,3,"Keane, Mr. Andrew ""Andy""",male,,0,0,12460,7.75,,Q
793 | 792,0,2,"Gaskell, Mr. Alfred",male,16,0,0,239865,26,,S
794 | 793,0,3,"Sage, Miss. Stella Anna",female,,8,2,CA. 2343,69.55,,S
795 | 794,0,1,"Hoyt, Mr. William Fisher",male,,0,0,PC 17600,30.6958,,C
796 | 795,0,3,"Dantcheff, Mr. Ristiu",male,25,0,0,349203,7.8958,,S
797 | 796,0,2,"Otter, Mr. Richard",male,39,0,0,28213,13,,S
798 | 797,1,1,"Leader, Dr. Alice (Farnham)",female,49,0,0,17465,25.9292,D17,S
799 | 798,1,3,"Osman, Mrs. Mara",female,31,0,0,349244,8.6833,,S
800 | 799,0,3,"Ibrahim Shawah, Mr. Yousseff",male,30,0,0,2685,7.2292,,C
801 | 800,0,3,"Van Impe, Mrs. Jean Baptiste (Rosalie Paula Govaert)",female,30,1,1,345773,24.15,,S
802 | 801,0,2,"Ponesell, Mr. Martin",male,34,0,0,250647,13,,S
803 | 802,1,2,"Collyer, Mrs. Harvey (Charlotte Annie Tate)",female,31,1,1,C.A. 31921,26.25,,S
804 | 803,1,1,"Carter, Master. William Thornton II",male,11,1,2,113760,120,B96 B98,S
805 | 804,1,3,"Thomas, Master. Assad Alexander",male,0.42,0,1,2625,8.5167,,C
806 | 805,1,3,"Hedman, Mr. Oskar Arvid",male,27,0,0,347089,6.975,,S
807 | 806,0,3,"Johansson, Mr. Karl Johan",male,31,0,0,347063,7.775,,S
808 | 807,0,1,"Andrews, Mr. Thomas Jr",male,39,0,0,112050,0,A36,S
809 | 808,0,3,"Pettersson, Miss. Ellen Natalia",female,18,0,0,347087,7.775,,S
810 | 809,0,2,"Meyer, Mr. August",male,39,0,0,248723,13,,S
811 | 810,1,1,"Chambers, Mrs. Norman Campbell (Bertha Griggs)",female,33,1,0,113806,53.1,E8,S
812 | 811,0,3,"Alexander, Mr. William",male,26,0,0,3474,7.8875,,S
813 | 812,0,3,"Lester, Mr. James",male,39,0,0,A/4 48871,24.15,,S
814 | 813,0,2,"Slemen, Mr. Richard James",male,35,0,0,28206,10.5,,S
815 | 814,0,3,"Andersson, Miss. Ebba Iris Alfrida",female,6,4,2,347082,31.275,,S
816 | 815,0,3,"Tomlin, Mr. Ernest Portage",male,30.5,0,0,364499,8.05,,S
817 | 816,0,1,"Fry, Mr. Richard",male,,0,0,112058,0,B102,S
818 | 817,0,3,"Heininen, Miss. Wendla Maria",female,23,0,0,STON/O2. 3101290,7.925,,S
819 | 818,0,2,"Mallet, Mr. Albert",male,31,1,1,S.C./PARIS 2079,37.0042,,C
820 | 819,0,3,"Holm, Mr. John Fredrik Alexander",male,43,0,0,C 7075,6.45,,S
821 | 820,0,3,"Skoog, Master. Karl Thorsten",male,10,3,2,347088,27.9,,S
822 | 821,1,1,"Hays, Mrs. Charles Melville (Clara Jennings Gregg)",female,52,1,1,12749,93.5,B69,S
823 | 822,1,3,"Lulic, Mr. Nikola",male,27,0,0,315098,8.6625,,S
824 | 823,0,1,"Reuchlin, Jonkheer. John George",male,38,0,0,19972,0,,S
825 | 824,1,3,"Moor, Mrs. (Beila)",female,27,0,1,392096,12.475,E121,S
826 | 825,0,3,"Panula, Master. Urho Abraham",male,2,4,1,3101295,39.6875,,S
827 | 826,0,3,"Flynn, Mr. John",male,,0,0,368323,6.95,,Q
828 | 827,0,3,"Lam, Mr. Len",male,,0,0,1601,56.4958,,S
829 | 828,1,2,"Mallet, Master. Andre",male,1,0,2,S.C./PARIS 2079,37.0042,,C
830 | 829,1,3,"McCormack, Mr. Thomas Joseph",male,,0,0,367228,7.75,,Q
831 | 830,1,1,"Stone, Mrs. George Nelson (Martha Evelyn)",female,62,0,0,113572,80,B28,
832 | 831,1,3,"Yasbeck, Mrs. Antoni (Selini Alexander)",female,15,1,0,2659,14.4542,,C
833 | 832,1,2,"Richards, Master. George Sibley",male,0.83,1,1,29106,18.75,,S
834 | 833,0,3,"Saad, Mr. Amin",male,,0,0,2671,7.2292,,C
835 | 834,0,3,"Augustsson, Mr. Albert",male,23,0,0,347468,7.8542,,S
836 | 835,0,3,"Allum, Mr. Owen George",male,18,0,0,2223,8.3,,S
837 | 836,1,1,"Compton, Miss. Sara Rebecca",female,39,1,1,PC 17756,83.1583,E49,C
838 | 837,0,3,"Pasic, Mr. Jakob",male,21,0,0,315097,8.6625,,S
839 | 838,0,3,"Sirota, Mr. Maurice",male,,0,0,392092,8.05,,S
840 | 839,1,3,"Chip, Mr. Chang",male,32,0,0,1601,56.4958,,S
841 | 840,1,1,"Marechal, Mr. Pierre",male,,0,0,11774,29.7,C47,C
842 | 841,0,3,"Alhomaki, Mr. Ilmari Rudolf",male,20,0,0,SOTON/O2 3101287,7.925,,S
843 | 842,0,2,"Mudd, Mr. Thomas Charles",male,16,0,0,S.O./P.P. 3,10.5,,S
844 | 843,1,1,"Serepeca, Miss. Augusta",female,30,0,0,113798,31,,C
845 | 844,0,3,"Lemberopolous, Mr. Peter L",male,34.5,0,0,2683,6.4375,,C
846 | 845,0,3,"Culumovic, Mr. Jeso",male,17,0,0,315090,8.6625,,S
847 | 846,0,3,"Abbing, Mr. Anthony",male,42,0,0,C.A. 5547,7.55,,S
848 | 847,0,3,"Sage, Mr. Douglas Bullen",male,,8,2,CA. 2343,69.55,,S
849 | 848,0,3,"Markoff, Mr. Marin",male,35,0,0,349213,7.8958,,C
850 | 849,0,2,"Harper, Rev. John",male,28,0,1,248727,33,,S
851 | 850,1,1,"Goldenberg, Mrs. Samuel L (Edwiga Grabowska)",female,,1,0,17453,89.1042,C92,C
852 | 851,0,3,"Andersson, Master. Sigvard Harald Elias",male,4,4,2,347082,31.275,,S
853 | 852,0,3,"Svensson, Mr. Johan",male,74,0,0,347060,7.775,,S
854 | 853,0,3,"Boulos, Miss. Nourelain",female,9,1,1,2678,15.2458,,C
855 | 854,1,1,"Lines, Miss. Mary Conover",female,16,0,1,PC 17592,39.4,D28,S
856 | 855,0,2,"Carter, Mrs. Ernest Courtenay (Lilian Hughes)",female,44,1,0,244252,26,,S
857 | 856,1,3,"Aks, Mrs. Sam (Leah Rosen)",female,18,0,1,392091,9.35,,S
858 | 857,1,1,"Wick, Mrs. George Dennick (Mary Hitchcock)",female,45,1,1,36928,164.8667,,S
859 | 858,1,1,"Daly, Mr. Peter Denis ",male,51,0,0,113055,26.55,E17,S
860 | 859,1,3,"Baclini, Mrs. Solomon (Latifa Qurban)",female,24,0,3,2666,19.2583,,C
861 | 860,0,3,"Razi, Mr. Raihed",male,,0,0,2629,7.2292,,C
862 | 861,0,3,"Hansen, Mr. Claus Peter",male,41,2,0,350026,14.1083,,S
863 | 862,0,2,"Giles, Mr. Frederick Edward",male,21,1,0,28134,11.5,,S
864 | 863,1,1,"Swift, Mrs. Frederick Joel (Margaret Welles Barron)",female,48,0,0,17466,25.9292,D17,S
865 | 864,0,3,"Sage, Miss. Dorothy Edith ""Dolly""",female,,8,2,CA. 2343,69.55,,S
866 | 865,0,2,"Gill, Mr. John William",male,24,0,0,233866,13,,S
867 | 866,1,2,"Bystrom, Mrs. (Karolina)",female,42,0,0,236852,13,,S
868 | 867,1,2,"Duran y More, Miss. Asuncion",female,27,1,0,SC/PARIS 2149,13.8583,,C
869 | 868,0,1,"Roebling, Mr. Washington Augustus II",male,31,0,0,PC 17590,50.4958,A24,S
870 | 869,0,3,"van Melkebeke, Mr. Philemon",male,,0,0,345777,9.5,,S
871 | 870,1,3,"Johnson, Master. Harold Theodor",male,4,1,1,347742,11.1333,,S
872 | 871,0,3,"Balkic, Mr. Cerin",male,26,0,0,349248,7.8958,,S
873 | 872,1,1,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",female,47,1,1,11751,52.5542,D35,S
874 | 873,0,1,"Carlsson, Mr. Frans Olof",male,33,0,0,695,5,B51 B53 B55,S
875 | 874,0,3,"Vander Cruyssen, Mr. Victor",male,47,0,0,345765,9,,S
876 | 875,1,2,"Abelson, Mrs. Samuel (Hannah Wizosky)",female,28,1,0,P/PP 3381,24,,C
877 | 876,1,3,"Najib, Miss. Adele Kiamie ""Jane""",female,15,0,0,2667,7.225,,C
878 | 877,0,3,"Gustafsson, Mr. Alfred Ossian",male,20,0,0,7534,9.8458,,S
879 | 878,0,3,"Petroff, Mr. Nedelio",male,19,0,0,349212,7.8958,,S
880 | 879,0,3,"Laleff, Mr. Kristo",male,,0,0,349217,7.8958,,S
881 | 880,1,1,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",female,56,0,1,11767,83.1583,C50,C
882 | 881,1,2,"Shelley, Mrs. William (Imanita Parrish Hall)",female,25,0,1,230433,26,,S
883 | 882,0,3,"Markun, Mr. Johann",male,33,0,0,349257,7.8958,,S
884 | 883,0,3,"Dahlberg, Miss. Gerda Ulrika",female,22,0,0,7552,10.5167,,S
885 | 884,0,2,"Banfield, Mr. Frederick James",male,28,0,0,C.A./SOTON 34068,10.5,,S
886 | 885,0,3,"Sutehall, Mr. Henry Jr",male,25,0,0,SOTON/OQ 392076,7.05,,S
887 | 886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39,0,5,382652,29.125,,Q
888 | 887,0,2,"Montvila, Rev. Juozas",male,27,0,0,211536,13,,S
889 | 888,1,1,"Graham, Miss. Margaret Edith",female,19,0,0,112053,30,B42,S
890 | 889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.45,,S
891 | 890,1,1,"Behr, Mr. Karl Howell",male,26,0,0,111369,30,C148,C
892 | 891,0,3,"Dooley, Mr. Patrick",male,32,0,0,370376,7.75,,Q
893 |
--------------------------------------------------------------------------------
/data/words.txt:
--------------------------------------------------------------------------------
1 | hello world
2 | hello pyspark
3 | spark context
4 | i like spark
5 | hadoop rdd
6 | text file
7 | word count
8 |
9 |
10 |
--------------------------------------------------------------------------------
/figures/20newsgroups.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vsmolyakov/pyspark/7106e747c0280f77403e157c42675a30f1509d10/figures/20newsgroups.png
--------------------------------------------------------------------------------
/figures/als.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vsmolyakov/pyspark/7106e747c0280f77403e157c42675a30f1509d10/figures/als.png
--------------------------------------------------------------------------------
/figures/lda_topics.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vsmolyakov/pyspark/7106e747c0280f77403e157c42675a30f1509d10/figures/lda_topics.png
--------------------------------------------------------------------------------
/figures/mlp.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vsmolyakov/pyspark/7106e747c0280f77403e157c42675a30f1509d10/figures/mlp.png
--------------------------------------------------------------------------------
/figures/random_forrest.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vsmolyakov/pyspark/7106e747c0280f77403e157c42675a30f1509d10/figures/random_forrest.png
--------------------------------------------------------------------------------
/figures/spark.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vsmolyakov/pyspark/7106e747c0280f77403e157c42675a30f1509d10/figures/spark.png
--------------------------------------------------------------------------------
/lda.py:
--------------------------------------------------------------------------------
1 |
2 | from pyspark import SparkContext
3 | from pyspark.sql import SQLContext
4 | from pyspark.sql import SparkSession
5 | from pyspark.sql import Row
6 |
7 | import re
8 | import numpy as np
9 | from time import time
10 | from sklearn.datasets import fetch_20newsgroups
11 |
12 | from pyspark.ml.feature import CountVectorizer, HashingTF, IDF
13 | from pyspark.ml.feature import Tokenizer, StopWordsRemover
14 | from pyspark.ml.clustering import LDA
15 |
16 | np.random.seed(0)
17 |
18 | if __name__ == "__main__":
19 |
20 |     sc = SparkContext('local', 'lda')
21 |     sqlContext = SQLContext(sc)
22 |
23 |     spark = SparkSession\
24 |         .builder\
25 |         .appName("LDA")\
26 |         .getOrCreate()
27 |
28 |
29 |     num_features = 8000 #vocabulary size
30 |     num_topics = 20 #fixed for LDA
31 |
32 |     print "loading 20 newsgroups dataset..."
33 |     tic = time()
34 |     dataset = fetch_20newsgroups(shuffle=True, random_state=0, remove=('headers','footers','quotes'))
35 |     train_corpus = dataset.data # a list of 11314 documents / entries
36 |     toc = time()
37 |     print "elapsed time: %.4f sec" %(toc - tic)
38 |
39 |     #distribute data
40 |     corpus_rdd = sc.parallelize(train_corpus)
41 |     corpus_rdd = corpus_rdd.map(lambda doc: re.sub(r"[^A-Za-z]", " ", doc))
42 |     corpus_rdd = corpus_rdd.map(lambda doc: u"".join(doc).encode('utf-8').strip())
43 |
44 |     rdd_row = corpus_rdd.map(lambda doc: Row(raw_corpus=str(doc)))
45 |     newsgroups = spark.createDataFrame(rdd_row)
46 |
47 |     tokenizer = Tokenizer(inputCol="raw_corpus", outputCol="tokens")
48 |     newsgroups = tokenizer.transform(newsgroups)
49 |     newsgroups = newsgroups.drop('raw_corpus')
50 |
51 |     stopwords = StopWordsRemover(inputCol="tokens", outputCol="tokens_filtered")
52 |     newsgroups = stopwords.transform(newsgroups)
53 |     newsgroups = newsgroups.drop('tokens')
54 |
55 |     count_vec = CountVectorizer(inputCol="tokens_filtered", outputCol="tf_features", vocabSize=num_features, minDF=2.0)
56 |     count_vec_model = count_vec.fit(newsgroups)
57 |     vocab = count_vec_model.vocabulary
58 |     newsgroups = count_vec_model.transform(newsgroups)
59 |     newsgroups = newsgroups.drop('tokens_filtered')
60 |
61 |     #hashingTF = HashingTF(inputCol="tokens_filtered", outputCol="tf_features", numFeatures=num_features)
62 |     #newsgroups = hashingTF.transform(newsgroups)
63 |     #newsgroups = newsgroups.drop('tokens_filtered')
64 |
65 |     idf = IDF(inputCol="tf_features", outputCol="features")
66 |     newsgroups = idf.fit(newsgroups).transform(newsgroups)
67 |     newsgroups = newsgroups.drop('tf_features')
68 |
69 |     lda = LDA(k=num_topics, featuresCol="features", seed=0)
70 |     model = lda.fit(newsgroups)
71 |
72 |     topics = model.describeTopics()
73 |     topics.show()
74 |
75 |     model.topicsMatrix()
76 |
77 |     topics_rdd = topics.rdd
78 |
79 |     topics_words = topics_rdd\
80 |         .map(lambda row: row['termIndices'])\
81 |         .map(lambda idx_list: [vocab[idx] for idx in idx_list])\
82 |         .collect()
83 |
84 |     for idx, topic in enumerate(topics_words):
85 |         print "topic: ", idx
86 |         print "----------"
87 |         for word in topic:
88 |             print word
89 |         print "----------"
90 |
91 |
92 |
93 |
94 |
95 |
96 |
97 |
98 |
--------------------------------------------------------------------------------
/log_reg.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | from time import time
4 | import matplotlib.pyplot as plt
5 |
6 | from sklearn.datasets import fetch_20newsgroups
7 | from sklearn.feature_extraction.text import TfidfVectorizer
8 |
9 | from pyspark import SparkContext
10 | from pyspark.mllib.regression import LabeledPoint
11 | from pyspark.mllib.classification import LogisticRegressionWithLBFGS
12 |
13 | def parse_doc(values):
14 |     return LabeledPoint(values[0], values[1:])
15 |
16 | def main():
17 |
18 |     #parameters
19 |     num_features = 400 #vocabulary size
20 |
21 |     #load data
22 |     print "loading 20 newsgroups dataset..."
23 |     categories = ['rec.autos','rec.sport.hockey','comp.graphics','sci.space']
24 |     tic = time()
25 |     dataset = fetch_20newsgroups(shuffle=True, random_state=0, categories=categories, remove=('headers','footers','quotes'))
26 |     train_corpus = dataset.data # a list of documents from the 4 selected categories
27 |     train_labels = dataset.target
28 |     toc = time()
29 |     print "elapsed time: %.4f sec" %(toc - tic)
30 |
31 |     #tf-idf vectorizer
32 |     tfidf = TfidfVectorizer(max_df=0.5, max_features=num_features, \
33 |                             min_df=2, stop_words='english', use_idf=True)
34 |     X_tfidf = tfidf.fit_transform(train_corpus).toarray()
35 |
36 |     #append document labels
37 |     train_labels = train_labels.reshape(-1,1)
38 |     X_all = np.hstack([train_labels, X_tfidf])
39 |
40 |     #distribute the data
41 |     sc = SparkContext('local', 'log_reg')
42 |     rdd = sc.parallelize(X_all)
43 |     labeled_corpus = rdd.map(parse_doc)
44 |     train_RDD, test_RDD = labeled_corpus.randomSplit([8, 2], seed=0)
45 |
46 |     #distributed logistic regression
47 |     print "training logistic regression..."
48 |     model = LogisticRegressionWithLBFGS.train(train_RDD, regParam=1, regType='l1', numClasses=len(categories))
49 |
50 |     #evaluate the model on test data
51 |     labels_and_preds = test_RDD.map(lambda p: (p.label, model.predict(p.features)))
52 |     test_err = labels_and_preds.filter(lambda (v, p): v != p).count() / float(test_RDD.count())
53 |     print "log-reg test error: ", test_err
54 |
55 |     #model.save(sc, './log_reg_lbfgs_model')
56 |
57 |
58 | if __name__ == "__main__":
59 |     main()
--------------------------------------------------------------------------------
/mapper.py:
--------------------------------------------------------------------------------
1 | '''
2 | >>> lines.collect()
3 | [u'ATATCCCCGGGAT', u'ATCGATCGATATG']
4 |
5 | >>> rdd.collect()
6 | [(u'A', 3), (u'C', 4), (u'T', 3), (u'G', 3), (u'A', 4), (u'C', 2), (u'T', 4), (u'G', 3)]
7 |
8 | >>> cnt.collect()
9 | [(u'A', 7), (u'C', 6), (u'T', 7), (u'G', 6)]
10 | '''
11 |
12 | def mapper(seq):
13 |     freq = dict()
14 |     for x in list(seq):
15 |         if x in freq:
16 |             freq[x] += 1
17 |         else:
18 |             freq[x] = 1
19 |
20 |     kv = [(x, freq[x]) for x in freq.keys()]
21 |     return kv
22 |
23 |
24 | from pyspark import SparkContext
25 |
26 | if __name__ == "__main__":
27 |
28 |     sc = SparkContext('local', 'mapper')
29 |     lines = sc.textFile("./data/dna_seq.txt", 1)
30 |
31 |     rdd = lines.flatMap(mapper)
32 |     cnt = rdd.reduceByKey(lambda x, y: x + y)
33 |     print cnt.collect()
34 |
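The docstring in mapper.py lists the expected per-line and total nucleotide counts. As a rough local sketch (plain Python 3, no Spark required), the same flatMap / reduceByKey flow can be emulated with a Counter. The helper name `count_bases` is hypothetical and not part of this repo; the two input sequences are the ones shown in the docstring.

```python
from collections import Counter


def mapper(seq):
    """Return (nucleotide, count) pairs for a single sequence."""
    return list(Counter(seq).items())


def count_bases(lines):
    """Emulate lines.flatMap(mapper).reduceByKey(lambda x, y: x + y) locally."""
    totals = Counter()
    for line in lines:
        for base, n in mapper(line):  # flatMap: flatten the per-line pairs
            totals[base] += n         # reduceByKey: sum the counts per key
    return dict(totals)


# The two sequences from the mapper.py docstring.
lines = ["ATATCCCCGGGAT", "ATCGATCGATATG"]
counts = count_bases(lines)
print(sorted(counts.items()))
```

The totals should agree with the `cnt.collect()` output in the docstring, although Spark does not guarantee the order of the collected pairs.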
--------------------------------------------------------------------------------
/mlp.py:
--------------------------------------------------------------------------------
1 |
2 | from pyspark import SparkContext
3 | from pyspark.sql import SQLContext
4 | from pyspark.sql import SparkSession
5 |
6 | from pyspark.ml.feature import StringIndexer, VectorAssembler
7 | from pyspark.ml.classification import MultilayerPerceptronClassifier
8 | from pyspark.ml.evaluation import MulticlassClassificationEvaluator
9 | from pyspark.sql.types import DoubleType, IntegerType
10 |
11 |
12 | if __name__ == "__main__":
13 |
14 |     sc = SparkContext('local', 'mlp')
15 |     sqlContext = SQLContext(sc)
16 |
17 |     spark = SparkSession\
18 |         .builder\
19 |         .appName("MLPClassifier")\
20 |         .getOrCreate()
21 |
22 |     #read in csv as dataframe
23 |     dataset = sqlContext.read.format('com.databricks.spark.csv').options(header='true').load('./data/titanic.csv')
24 |     dataset = dataset.drop('PassengerId','Name','Ticket','Cabin')
25 |
26 |     #set column types
27 |     dataset = dataset.withColumn("Survived", dataset["Survived"].cast(IntegerType()))
28 |     dataset = dataset.withColumn("Pclass", dataset["Pclass"].cast(IntegerType()))
29 |     dataset = dataset.withColumn("Age", dataset["Age"].cast(DoubleType()))
30 |     dataset = dataset.withColumn("SibSp", dataset["SibSp"].cast(IntegerType()))
31 |     dataset = dataset.withColumn("Parch", dataset["Parch"].cast(IntegerType()))
32 |     dataset = dataset.withColumn("Fare", dataset["Fare"].cast(DoubleType()))
33 |
34 |     #fill NaN
35 |     avg_age = round(dataset.groupBy().avg("age").collect()[0][0],2)
36 |     dataset = dataset.na.fill({'Age': avg_age})
37 |     dataset = dataset.na.drop()
38 |
39 |     #map categorical data
40 |     indexer = StringIndexer(inputCol="Sex", outputCol="SexInd")
41 |     dataset = indexer.fit(dataset).transform(dataset)
42 |
43 |     indexer = StringIndexer(inputCol="Embarked", outputCol="EmbarkedInd")
44 |     dataset = indexer.fit(dataset).transform(dataset)
45 |
46 |     #assemble features
47 |     assembler = VectorAssembler(
48 |         inputCols=["Age","Pclass","SexInd","SibSp","Parch","Fare","EmbarkedInd"],
49 |         outputCol="features")
50 |
51 |     dataset = assembler.transform(dataset)
52 |
53 |     (trainingData, testData) = dataset.randomSplit([0.8, 0.2])
54 |
55 |     #MLP
56 |     layers = [7, 8, 4, 2] #input: 7 features; output: 2 classes
57 |     mlp = MultilayerPerceptronClassifier(maxIter=100, layers=layers, labelCol="Survived", featuresCol="features", blockSize=128, seed=0)
58 |
59 |     model = mlp.fit(trainingData)
60 |     result = model.transform(testData)
61 |
62 |     prediction_label = result.select("prediction", "Survived")
63 |     evaluator = MulticlassClassificationEvaluator(labelCol="Survived", predictionCol="prediction", metricName="accuracy")
64 |     print "MLP test accuracy: " + str(evaluator.evaluate(prediction_label))
65 |
66 |
--------------------------------------------------------------------------------
/outliers.py:
--------------------------------------------------------------------------------
1 | '''
2 | >>> print output
3 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
4 | '''
5 |
6 | from pyspark import SparkContext
7 |
8 | def remove_outliers(nums):
9 |
10 |     stats = nums.stats()
11 |     stddev = stats.stdev()
12 |     return nums.filter(lambda x: abs(x-stats.mean()) < 3 * stddev)
13 |
14 | if __name__ == "__main__":
15 |
16 |     sc = SparkContext('local', 'outliers')
17 |     nums = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000])
18 |     output = sorted(remove_outliers(nums).collect())
19 |     print output
20 |
21 |
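To see why the 3-sigma rule in outliers.py drops only the value 1000, the same check can be sketched without Spark (plain Python 3). This version uses `statistics.pstdev`, the population standard deviation, on the assumption that it matches what `nums.stats().stdev()` returns in the script above.

```python
import statistics


def remove_outliers(nums):
    """Keep values within three population standard deviations of the mean."""
    mean = statistics.mean(nums)
    stddev = statistics.pstdev(nums)  # population std dev (assumed to mirror stats().stdev())
    return [x for x in nums if abs(x - mean) < 3 * stddev]


nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]
output = sorted(remove_outliers(nums))
print(output)
```

Here the mean is about 95.9 and the standard deviation about 285.9, so 1000 lies roughly 904 from the mean, just beyond the 3-sigma threshold of about 857.7.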
--------------------------------------------------------------------------------
/pi_est.py:
--------------------------------------------------------------------------------
1 |
2 | import random
3 | from pyspark import SparkContext
4 |
5 | NUM_SAMPLES = 1000000
6 |
7 | def inside(p):
8 |     x, y = random.random(), random.random()
9 |     return x*x + y*y < 1
10 |
11 | if __name__ == "__main__":
12 |
13 |     sc = SparkContext('local', 'pi_est')
14 |
15 |     count = sc.parallelize(xrange(0,NUM_SAMPLES)).filter(inside).count()
16 |     print "Pi estimate: %2.6f \n" %(4 * count / float(NUM_SAMPLES))
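The estimator in pi_est.py samples points uniformly in the unit square and counts the fraction that falls inside the quarter circle of radius 1; that fraction approaches pi/4. A minimal single-process sketch of the same idea (plain Python 3, no Spark) is shown below. The function name `estimate_pi`, the fixed seed, and the smaller sample size are illustrative choices, not part of the repo.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Monte Carlo estimate of pi from points in the unit square."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    count = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1:  # point falls inside the quarter circle
            count += 1
    return 4.0 * count / num_samples


pi_estimate = estimate_pi(100000)
print("Pi estimate: %2.6f" % pi_estimate)
```

The Spark version distributes the sampling loop with `parallelize(...).filter(inside).count()`; the arithmetic is the same.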
--------------------------------------------------------------------------------
/random_forest.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import pandas as pd
3 |
4 | from pyspark import SparkContext
5 | from pyspark.sql import SQLContext
6 | from pyspark.sql import SparkSession
7 | from pyspark.sql import Row
8 |
9 | from sklearn.preprocessing import LabelEncoder
10 | from pyspark.ml.classification import RandomForestClassifier
11 | from pyspark.ml.feature import VectorAssembler
12 | from pyspark.ml.evaluation import MulticlassClassificationEvaluator
13 |
14 |
15 | if __name__ == "__main__":
16 |
17 |     sc = SparkContext('local', 'dataframe')
18 |     sqlContext = SQLContext(sc)
19 |
20 |     spark = SparkSession\
21 |         .builder\
22 |         .appName("RandomForestClassifier")\
23 |         .getOrCreate()
24 |
25 |     #dataset = sc.textFile("./data/titanic.csv", 1)
26 |     #dataset = sqlContext.read.format('com.databricks.spark.csv').options(header='true').load('./data/titanic.csv')
27 |     dataset = pd.read_csv("./data/titanic.csv")
28 |     dataset = dataset.drop(['PassengerId','Name','Ticket','Cabin'], axis=1)
29 |
30 |     #map categorical data
31 |     for col in dataset.columns:
32 |         if dataset[col].dtype == 'object':
33 |             lbl = LabelEncoder()
34 |             lbl.fit(dataset[col])
35 |             dataset[col] = lbl.transform(dataset[col])
36 |
37 |     #fill NaN
38 |     median_age = dataset['Age'].dropna().median()
39 |     dataset['Age'] = dataset['Age'].fillna(median_age)
40 |     #dataset.isnull().sum()
41 |
42 |     rdd_data = sc.parallelize(dataset.values)
43 |     rdd_row = rdd_data.map(lambda p: Row(survived=int(p[0]),pclass=int(p[1]),sex=int(p[2]),age=float(p[3]),sibsp=int(p[4]),parch=int(p[5]),fare=float(p[6]),embarked=int(p[7])))
44 |     titanic = spark.createDataFrame(rdd_row)
45 |
46 |     assembler = VectorAssembler(
47 |         inputCols=["age","embarked","fare","parch","pclass","sex","sibsp"],
48 |         outputCol="features")
49 |
50 |     titanic = assembler.transform(titanic)
51 |     (trainingData, testData) = titanic.randomSplit([0.8, 0.2])
52 |
53 |     #random forest classifier
54 |     rfc = RandomForestClassifier(numTrees=100, maxDepth=6, labelCol="survived", featuresCol="features", seed=0)
55 |     model = rfc.fit(trainingData)
56 |
57 |     #feature importances (same order as the assembler inputCols)
58 |     print model.featureImportances
59 |
60 |     #predictions
61 |     predictions = model.transform(testData)
62 |     predictions.select("survived", "probability", "prediction").show(truncate=False)
63 |
64 |     #compute test accuracy
65 |     evaluator = MulticlassClassificationEvaluator(labelCol="survived", predictionCol="prediction", metricName="accuracy")
66 |     accuracy = evaluator.evaluate(predictions)
67 |     print "RFC accuracy = %2.4f" % accuracy
68 |
69 |
--------------------------------------------------------------------------------
/recommender.py:
--------------------------------------------------------------------------------
1 |
2 | import os
3 | import math
4 | from time import time
5 | from pyspark import SparkContext
6 | from pyspark.mllib.recommendation import ALS
7 |
8 | DATASET_PATH = '/data/vision/fisher/data1/MovieLens/'
9 |
10 | def main():
11 |
12 |     sc = SparkContext('local', 'als')
13 |
14 |     small_ratings_file = os.path.join(DATASET_PATH,'ml-latest-small','ratings.csv')
15 |     small_ratings_raw_data = sc.textFile(small_ratings_file)
16 |     small_ratings_raw_data_header = small_ratings_raw_data.take(1)[0]
17 |
18 |     #userId, movieId, rating, timestamp
19 |     small_ratings_data = small_ratings_raw_data.filter(lambda line: line!=small_ratings_raw_data_header)\
20 |         .map(lambda line: line.split(",")).map(lambda tokens: (tokens[0],tokens[1],tokens[2])).cache()
21 |
22 |     small_movies_file = os.path.join(DATASET_PATH, 'ml-latest-small', 'movies.csv')
23 |     small_movies_raw_data = sc.textFile(small_movies_file)
24 |     small_movies_raw_data_header = small_movies_raw_data.take(1)[0]
25 |
26 |     #movieId, title, genre
27 |     small_movies_data = small_movies_raw_data.filter(lambda line: line!=small_movies_raw_data_header)\
28 |         .map(lambda line: line.split(",")).map(lambda tokens: (tokens[0],tokens[1])).cache()
29 |     small_movies_titles = small_movies_data.map(lambda x: (int(x[0]),x[1]))
30 |
31 |
32 |     #training, validation, test split
33 |     training_RDD, validation_RDD, test_RDD = small_ratings_data.randomSplit([6, 2, 2], seed=0L)
34 |     validation_for_predict_RDD = validation_RDD.map(lambda x: (x[0], x[1]))
35 |     test_for_predict_RDD = test_RDD.map(lambda x: (x[0], x[1]))
36 |
37 |     #ALS parameters and model selection
38 |     iterations = 10
39 |     regularization_parameter = 0.1
40 |     ranks = [4, 8, 12]
41 |     tolerance = 0.02
42 |     errors = []
43 |
44 |     min_error = float('inf')
45 |     best_rank = -1
46 |     best_iteration = -1
47 |
48 |     for rank in ranks:
49 |         model = ALS.train(training_RDD, rank, seed=0, iterations=iterations, lambda_=regularization_parameter)
50 |         predictions = model.predictAll(validation_for_predict_RDD).map(lambda r: ((r[0], r[1]), r[2])) #((user, product), rating)
51 |         rates_and_preds = validation_RDD.map(lambda r: ((int(r[0]), int(r[1])), float(r[2]))).join(predictions) #((user, product), (rating1, rating2))
52 |         error = math.sqrt(rates_and_preds.map(lambda r: (r[1][0] - r[1][1])**2).mean()) #sqrt[mean((rating1 - rating2)**2)]
53 |         errors.append(error)
54 |
55 |         print 'For rank %s the RMSE is %s' % (rank, error)
56 |         if error < min_error:
57 |             min_error = error
58 |             best_rank = rank
59 |
60 |     print 'The best model was trained with rank %s' % best_rank
61 |
62 |     #new user ratings
63 |     new_user_ID = 0
64 |     #userId, movieId, rating
65 |     new_user_ratings = [
66 |         (0,260,9), # Star Wars (1977)
67 |         (0,1,8), # Toy Story (1995)
68 |         (0,16,7), # Casino (1995)
69 |         (0,25,8), # Leaving Las Vegas (1995)
70 |         (0,32,9), # Twelve Monkeys (a.k.a. 12 Monkeys) (1995)
71 |         (0,335,4), # Flintstones, The (1994)
72 |         (0,379,3), # Timecop (1994)
73 |         (0,296,7), # Pulp Fiction (1994)
74 |         (0,858,10) , # Godfather, The (1972)
75 |         (0,50,8) # Usual Suspects, The (1995)
76 |     ]
77 |     new_user_ratings_RDD = sc.parallelize(new_user_ratings)
78 |     print 'New user ratings: %s' % new_user_ratings_RDD.take(10)
79 |
80 |     #re-train the model
81 |     small_data_with_new_ratings_RDD = small_ratings_data.union(new_user_ratings_RDD)
82 |     t0 = time()
83 |     new_ratings_model = ALS.train(small_data_with_new_ratings_RDD, best_rank, seed=0, iterations=iterations, lambda_=regularization_parameter)
84 |     tt = time() - t0
85 |     print "New model trained in %s seconds" % round(tt,3)
86 |
87 |     #getting top recommendations
88 |     new_user_ratings_ids = map(lambda x: x[1], new_user_ratings) # get just movie IDs
89 |
90 |     #(userId, movieId) for unrated movies; movie IDs are parsed as strings, so cast to int before comparing
91 |     new_user_unrated_movies_RDD = (small_movies_data.filter(lambda x: int(x[0]) not in new_user_ratings_ids).map(lambda x: (new_user_ID, int(x[0]))))
92 |     new_user_recommendations_RDD = new_ratings_model.predictAll(new_user_unrated_movies_RDD)
93 |
94 |     #movieId, rating
95 |     new_user_recommendations_rating_RDD = new_user_recommendations_RDD.map(lambda x: (x.product, x.rating))
96 |     new_user_recommendations_rating_title_RDD = new_user_recommendations_rating_RDD.join(small_movies_titles)
97 |     new_user_recommendations_rating_title_RDD = new_user_recommendations_rating_title_RDD.map(lambda r: (r[1][1], r[1][0]))
98 |
99 |     top_movies = new_user_recommendations_rating_title_RDD.takeOrdered(25, key=lambda x: -x[1])
100 |     print "top recommended movies:\n %s" % '\n'.join(map(str, top_movies))
101 |
102 |
103 |
104 |
105 |
106 |
107 |
108 |
109 |
110 | if __name__ == "__main__":
111 | main()
--------------------------------------------------------------------------------
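Note on the RMSE step in recommender.py: the script joins actual ratings with predictions on the (user, product) key and takes the square root of the mean squared difference. The following is a minimal sketch of that calculation in plain Python 3, without Spark. The function name `rmse` and the sample dictionaries are hypothetical and exist only for illustration.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error over the (user, product) keys present in both dicts."""
    # Keeping only shared keys mimics the inner join on (user, product).
    shared = set(actual) & set(predicted)
    if not shared:
        raise ValueError("no overlapping (user, product) keys")
    squared = [(actual[key] - predicted[key]) ** 2 for key in shared]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical ratings and predictions, keyed by (user, product).
actual = {(1, 10): 4.0, (1, 20): 3.0, (2, 10): 5.0}
predicted = {(1, 10): 3.0, (1, 20): 3.0, (2, 30): 2.0}

error = rmse(actual, predicted)
print("RMSE over shared keys: %s" % error)
```

Pairs that appear on only one side are dropped, just as an inner join would drop them, so the error is computed only where a prediction exists for a rated pair.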
/scala/pi_est.scala:
--------------------------------------------------------------------------------
1 | import org.apache.spark.sql.SparkSession
2 | import scala.math.random
3 |
4 | object pi_est {
5 |
6 | def inside(p: Int): Boolean = {
7 | val x = random * 2 - 1
8 | val y = random * 2 - 1
9 | x * x + y * y <= 1 // p is only the sample index; the last expression is the return value
10 | }
11 |
12 | def main(args: Array[String]): Unit = {
13 | val spark = SparkSession.builder.appName("pi_est").getOrCreate()
14 | val NUM_SAMPLES = 1000000
15 | val count = spark.sparkContext.parallelize(Range(0, NUM_SAMPLES)).filter(inside).count()
16 | val pi_est = 4.0 * count / NUM_SAMPLES
17 | println(s"Pi est = $pi_est")
18 | }
19 | }
--------------------------------------------------------------------------------
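Note on pi_est.scala: the job draws points uniformly from the square [-1, 1] x [-1, 1], counts those that fall inside the unit circle, and multiplies the fraction by 4. A minimal single-process sketch of the same estimate in Python 3 follows. The function name `estimate_pi` and the fixed seed are assumptions added for reproducibility; they are not part of the Scala code.

```python
import random

def estimate_pi(num_samples, seed=0):
    """Monte Carlo estimate of pi from points sampled in [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    inside = 0
    for _ in range(num_samples):
        x = rng.random() * 2 - 1
        y = rng.random() * 2 - 1
        if x * x + y * y <= 1:  # the point lies inside the unit circle
            inside += 1
    # The circle covers pi/4 of the square, so scale the hit ratio by 4.
    return 4.0 * inside / num_samples

pi_estimate = estimate_pi(100000)
print("Pi est = %s" % pi_estimate)
```

The Spark version distributes the same loop across partitions with `filter(inside).count()`; the arithmetic is unchanged.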
/term_doc.py:
--------------------------------------------------------------------------------
1 | '''
2 | >>> tokens.first()
3 | ['It', 'is', 'the', 'east,', 'and', 'Juliet', 'is', 'the', 'sun.']
4 |
5 | >>> local_vocab_map
6 | {'and': 0, 'A': 1, 'fit': 14, 'for': 13, 'of': 3, 'is': 4, 'gods.': 7, 'It': 11,\
7 | 'Brevity': 10, 'soul': 12, 'sun.': 8, 'dish': 2, 'east,': 9, 'the': 5, 'wit.': 6, 'Juliet': 15}
8 |
9 | >>> for doc in term_document_matrix.collect():
10 | print doc
11 | (16,[0,4,5,8,9,11,15],[1.0,2.0,2.0,1.0,1.0,1.0,1.0])
12 | (16,[1,2,5,7,13,14],[1.0,1.0,1.0,1.0,1.0,1.0])
13 | (16,[3,4,5,6,10,12],[1.0,1.0,1.0,1.0,1.0,1.0])
14 |
15 | '''
16 |
17 | from pyspark.mllib.linalg import SparseVector
18 | from collections import Counter
19 |
20 | from pyspark import SparkContext
21 |
22 | if __name__ == "__main__":
23 |
24 | sc = SparkContext('local', 'term_doc')
25 | corpus = sc.parallelize([
26 | "It is the east, and Juliet is the sun.",
27 | "A dish fit for the gods.",
28 | "Brevity is the soul of wit."])
29 |
30 | tokens = corpus.map(lambda raw_text: raw_text.split()).cache()
31 | local_vocab_map = tokens.flatMap(lambda doc_tokens: doc_tokens).distinct().zipWithIndex().collectAsMap()
32 |
33 | vocab_map = sc.broadcast(local_vocab_map)
34 | vocab_size = sc.broadcast(len(local_vocab_map))
35 |
36 | term_document_matrix = tokens \
37 | .map(Counter) \
38 | .map(lambda counts: {vocab_map.value[token]: float(counts[token]) for token in counts}) \
39 | .map(lambda index_counts: SparseVector(vocab_size.value, index_counts))
40 |
41 | for doc in term_document_matrix.collect():
42 | print doc
43 |
--------------------------------------------------------------------------------
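Note on term_doc.py: the script maps every distinct token to an index, counts tokens per document, and stores each document as a sparse vector of (size, indices, values). The sketch below reproduces those steps in plain Python 3 without Spark. The helper name `to_sparse` is hypothetical, and the vocabulary is sorted here so that the indices are deterministic, which differs from the `zipWithIndex` ordering in the script.

```python
from collections import Counter

corpus = [
    "It is the east, and Juliet is the sun.",
    "A dish fit for the gods.",
    "Brevity is the soul of wit.",
]

# Whitespace tokenization, as in the script.
tokens = [doc.split() for doc in corpus]

# Sorted vocabulary (an assumption made for determinism).
vocab = sorted({token for doc in tokens for token in doc})
vocab_map = {token: index for index, token in enumerate(vocab)}

def to_sparse(doc_tokens):
    """Return (vocab_size, indices, values) for one tokenized document."""
    counts = Counter(doc_tokens)
    pairs = sorted((vocab_map[token], float(count)) for token, count in counts.items())
    return (len(vocab_map), [i for i, _ in pairs], [v for _, v in pairs])

term_document_matrix = [to_sparse(doc) for doc in tokens]
for row in term_document_matrix:
    print(row)
```

Each row has the same shape as the `SparseVector` output in the docstring, although the index assigned to a given token will generally differ.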
/word_count.py:
--------------------------------------------------------------------------------
1 | '''
2 | >>> lines.collect()
3 | [u'hello world', u'hello pyspark', u'spark context', u'i like spark', u'hadoop rdd', u'text file', u'word count', u'', u'']
4 |
5 | >>> words.collect()
6 | [u'hello', u'world', u'hello', u'pyspark', u'spark', u'context', u'i', u'like', u'spark', u'hadoop', u'rdd', u'text', u'file', u'word', u'count', u'', u'']
7 |
8 | >>> ones.take(2)
9 | [(u'hello', 1), (u'world', 1)]
10 |
11 | >>> counts.takeSample(True, 2)
12 | [(u'spark', 2), (u'hello', 2)]
13 | '''
14 |
15 | from pyspark import SparkContext
16 |
17 | if __name__ == "__main__":
18 |
19 | sc = SparkContext('local', 'word_count')
20 | lines = sc.textFile("./data/words.txt", 1)
21 |
22 | words = lines.flatMap(lambda x: x.split(' '))
23 | ones = words.map(lambda x: (x, 1))
24 | counts = ones.reduceByKey(lambda x, y: x + y)
25 | counts = counts.sortByKey(ascending=True)
26 |
27 | counts.saveAsTextFile("./data/word_counts.txt") # Spark creates a directory of part files at this path
--------------------------------------------------------------------------------
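Note on word_count.py: the pipeline is flatMap (split lines into words), map (pair each word with 1), reduceByKey (sum per word), and sortByKey. A minimal plain-Python 3 sketch of the same result follows. The input lines are a subset of those shown in the docstring, and `Counter` stands in for the map and reduceByKey steps.

```python
from collections import Counter

# A few of the lines from the docstring, including one empty line.
lines = ["hello world", "hello pyspark", "spark context", "i like spark", ""]

# Equivalent of flatMap: split(' ') on an empty line yields one empty string.
words = [word for line in lines for word in line.split(" ")]

# Equivalent of map + reduceByKey, followed by sortByKey in ascending order.
counts = sorted(Counter(words).items())

for word, count in counts:
    print("%r: %d" % (word, count))
```

As in the Spark job, the empty line contributes an empty-string key; filtering out empty strings before counting would remove it.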