├── 25. PySpark ML (1).ipynb
├── Allstate Claims Severity.dbc
├── Allstate Claims Severity.ipynb
├── Apache Spark Action Examples with Python.ipynb
├── Apache Spark with Python Quick Start.ipynb
├── BigData.pdf
├── DF-3.ipynb
├── DF-4.ipynb
├── DataFrame Manupulation.ipynb
├── DataFrameProcessing.ipynb
├── DataFrames 2.ipynb
├── DataFrames Introduction using Python.dbc
├── Datamodeling.ipynb
├── Day 2 - Kabbage.ipynb
├── Day 2 - RDD.ipynb
├── Day 3 - Kabbage DF.ipynb
├── Day 3 - RDD.ipynb
├── GraphFrame Application.dbc
├── GraphFrame Application.ipynb
├── GraphFrame Basics.dbc
├── GraphFrame Basics.ipynb
├── HDFS.pdf
├── Infant Survival using Spark ML.dbc
├── Infant Survival using Spark ML.ipynb
├── KDD Cup Analysis.ipynb
├── KDD-Analysis.ipynb
├── Kabbage Project.dbc
├── Kabbage Project.ipynb
├── More to PySpark ML.dbc
├── More to PySpark ML.ipynb
├── Project-PySpark.zip
├── PySpark ML.dbc
├── PySpark ML.ipynb
├── PySpark SQL 2.ipynb
├── PySpark SQL.ipynb
├── README.md
├── Spark with Amazon S3.ipynb
├── Spark-SQL-CSV-with-Python.ipynb
├── Spark-Transformers-With-Spark.ipynb
├── Structured Streaming.dbc
├── Structured Streaming.ipynb
├── Test.ipynb
├── Understanding DataFrame.ipynb
├── YARN.pdf
├── all_state.py
├── data
    ├── Baby_Names__Beginning_2007.csv
    ├── README.md
    ├── Uber-Jan-Feb-FOIL.csv
    ├── airport-codes-na.txt
    ├── all-world-cup-players.json
    ├── allstate_test.csv.zip
    ├── allstate_train.csv.zip
    ├── baby_names_reduced.csv
    ├── births_transformed.csv.gz
    ├── departuredelays.csv
    ├── indian-premier-league-csv-dataset.zip
    ├── kddcup.data_10_percent.gz
    ├── test_sales.zip
    ├── train_sales.zip
    └── train_sales
    │   └── train.csv
├── first.py
├── hello.py
├── income_analysis.ipynb
├── kddcup.data_10_percent.gz
├── spark-business-case.pdf
├── uberstats.py
└── zekeLabs_Logo.png

/25. PySpark ML (1).ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/25. PySpark ML (1).ipynb
--------------------------------------------------------------------------------
/Allstate Claims Severity.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Allstate Claims Severity.dbc
--------------------------------------------------------------------------------
/Allstate Claims Severity.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Allstate Claims Severity.ipynb
--------------------------------------------------------------------------------
/Apache Spark Action Examples with Python.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Apache Spark Action Examples with Python.ipynb
--------------------------------------------------------------------------------
/Apache Spark with Python Quick Start.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Apache Spark with Python Quick Start.ipynb
--------------------------------------------------------------------------------
/BigData.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/BigData.pdf
--------------------------------------------------------------------------------
/DF-3.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/DF-3.ipynb
--------------------------------------------------------------------------------
/DF-4.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/DF-4.ipynb
--------------------------------------------------------------------------------
/DataFrame Manupulation.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/DataFrame Manupulation.ipynb
--------------------------------------------------------------------------------
/DataFrameProcessing.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/DataFrameProcessing.ipynb
--------------------------------------------------------------------------------
/DataFrames 2.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/DataFrames 2.ipynb
--------------------------------------------------------------------------------
/DataFrames Introduction using Python.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/DataFrames Introduction using Python.dbc
--------------------------------------------------------------------------------
/Datamodeling.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Datamodeling.ipynb
--------------------------------------------------------------------------------
/Day 2 - Kabbage.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Day 2 - Kabbage.ipynb
--------------------------------------------------------------------------------
/Day 2 - RDD.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Day 2 - RDD.ipynb
--------------------------------------------------------------------------------
/Day 3 - Kabbage DF.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Day 3 - Kabbage DF.ipynb
--------------------------------------------------------------------------------
/Day 3 - RDD.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Day 3 - RDD.ipynb
--------------------------------------------------------------------------------
/GraphFrame Application.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/GraphFrame Application.dbc
--------------------------------------------------------------------------------
/GraphFrame Application.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/GraphFrame Application.ipynb
--------------------------------------------------------------------------------
/GraphFrame Basics.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/GraphFrame Basics.dbc
--------------------------------------------------------------------------------
/GraphFrame Basics.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/GraphFrame Basics.ipynb
--------------------------------------------------------------------------------
/HDFS.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/HDFS.pdf
--------------------------------------------------------------------------------
/Infant Survival using Spark ML.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Infant Survival using Spark ML.dbc
--------------------------------------------------------------------------------
/Infant Survival using Spark ML.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Infant Survival using Spark ML.ipynb
--------------------------------------------------------------------------------
/KDD Cup Analysis.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/KDD Cup Analysis.ipynb
--------------------------------------------------------------------------------
/KDD-Analysis.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/KDD-Analysis.ipynb
--------------------------------------------------------------------------------
/Kabbage Project.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Kabbage Project.dbc
--------------------------------------------------------------------------------
/Kabbage Project.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Kabbage Project.ipynb
--------------------------------------------------------------------------------
/More to PySpark ML.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/More to PySpark ML.dbc
--------------------------------------------------------------------------------
/More to PySpark ML.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/More to PySpark ML.ipynb
--------------------------------------------------------------------------------
/Project-PySpark.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Project-PySpark.zip
--------------------------------------------------------------------------------
/PySpark ML.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/PySpark ML.dbc
--------------------------------------------------------------------------------
/PySpark ML.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/PySpark ML.ipynb
--------------------------------------------------------------------------------
/PySpark SQL 2.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/PySpark SQL 2.ipynb
--------------------------------------------------------------------------------
/PySpark SQL.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/PySpark SQL.ipynb
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/README.md
--------------------------------------------------------------------------------
/Spark with Amazon S3.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Spark with Amazon S3.ipynb
--------------------------------------------------------------------------------
/Spark-SQL-CSV-with-Python.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Spark-SQL-CSV-with-Python.ipynb
--------------------------------------------------------------------------------
/Spark-Transformers-With-Spark.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Spark-Transformers-With-Spark.ipynb
--------------------------------------------------------------------------------
/Structured Streaming.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Structured Streaming.dbc
--------------------------------------------------------------------------------
/Structured Streaming.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Structured Streaming.ipynb
--------------------------------------------------------------------------------
/Test.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Test.ipynb
--------------------------------------------------------------------------------
/Understanding DataFrame.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/Understanding DataFrame.ipynb
--------------------------------------------------------------------------------
/YARN.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/YARN.pdf
--------------------------------------------------------------------------------
/all_state.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/all_state.py
--------------------------------------------------------------------------------
/data/Baby_Names__Beginning_2007.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/Baby_Names__Beginning_2007.csv
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/README.md
--------------------------------------------------------------------------------
/data/Uber-Jan-Feb-FOIL.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/Uber-Jan-Feb-FOIL.csv
--------------------------------------------------------------------------------
/data/airport-codes-na.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/airport-codes-na.txt
--------------------------------------------------------------------------------
/data/all-world-cup-players.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/all-world-cup-players.json
--------------------------------------------------------------------------------
/data/allstate_test.csv.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/allstate_test.csv.zip
--------------------------------------------------------------------------------
/data/allstate_train.csv.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/allstate_train.csv.zip
--------------------------------------------------------------------------------
/data/baby_names_reduced.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/baby_names_reduced.csv
--------------------------------------------------------------------------------
/data/births_transformed.csv.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/births_transformed.csv.gz
--------------------------------------------------------------------------------
/data/departuredelays.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/departuredelays.csv
--------------------------------------------------------------------------------
/data/indian-premier-league-csv-dataset.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/indian-premier-league-csv-dataset.zip
--------------------------------------------------------------------------------
/data/kddcup.data_10_percent.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/kddcup.data_10_percent.gz
--------------------------------------------------------------------------------
/data/test_sales.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/test_sales.zip
--------------------------------------------------------------------------------
/data/train_sales.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/train_sales.zip
--------------------------------------------------------------------------------
/data/train_sales/train.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/data/train_sales/train.csv
--------------------------------------------------------------------------------
/first.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/first.py
--------------------------------------------------------------------------------
/hello.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/hello.py
--------------------------------------------------------------------------------
/income_analysis.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/income_analysis.ipynb
--------------------------------------------------------------------------------
/kddcup.data_10_percent.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/kddcup.data_10_percent.gz
--------------------------------------------------------------------------------
/spark-business-case.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/spark-business-case.pdf
--------------------------------------------------------------------------------
/uberstats.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/uberstats.py
--------------------------------------------------------------------------------
/zekeLabs_Logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/awantik/pyspark-learning/HEAD/zekeLabs_Logo.png
--------------------------------------------------------------------------------