├── Code
    ├── 02-Spark RDDs
    │   ├── 02-Creating Spark RDD.py
    │   ├── 04 , 05-MAP.py
    │   ├── 07-Solution 1 (Map).py
    │   ├── 08-Solution 2 (Map).py
    │   ├── 09-RDD FlatMap.py
    │   ├── 10-RDD Filter.py
    │   ├── 12-Solution (Filter).py
    │   ├── 13-RDD Distinct.py
    │   ├── 14-RDD GroupByKey.py
    │   ├── 15-RDD ReduceByKey.py
    │   ├── 17-Solution (Word Count).py
    │   ├── 18-RDD (Count and CountByValue).py
    │   ├── 19-RDD (saveAsTextFile).py
    │   ├── 20-RDD (Partition).py
    │   ├── 22-Finding Average-2.py
    │   ├── 24-Solution (Average).py
    │   ├── 25-Finding Min and Max.py
    │   ├── 27-Solution (Min and Max).py
    │   ├── 28-36 Project Code.py
    │   ├── StudentData.csv
    │   ├── average_quiz_sample.csv
    │   └── movie_ratings.csv
    ├── 03-Spark DFs
    │   ├── 02-Creating Spark DFs.py
    │   ├── 03-Spark Infer Schema.py
    │   ├── 04-Spark Provide Schema.py
    │   ├── 05-Create DF from Rdd.py
    │   ├── 06-Rectifying the Error.py
    │   ├── 07-Select DF Colums.py
    │   ├── 08-Spark DF withColumn.py
    │   ├── 09-Spark DF withColumnRenamed and Alias.py
    │   ├── 10-Spark DF Filter rows.py
    │   ├── 12-Solution (select, withColumn, filter).py
    │   ├── 13-Spark DF (Count, Distinct, Duplicate).py
    │   ├── 15-Solution (Distinct, Duplicate).py
    │   ├── 16-Spark DF (sort, orderBy).py
    │   ├── 18-Solution (sort, orderBy).py
    │   ├── 19-Spark DF (Group By).py
    │   ├── 20-Spark DF (Group By - Multiple Rows and Aggregations).py
    │   ├── 22-Spark DF (Group By - Filtering).py
    │   ├── 24-Solution (Group By).py
    │   ├── 26-Solution (Word Count).py
    │   ├── 27-Spark DF (UDFs).py
    │   ├── 29-Solution (UDFs).py
    │   ├── 30-Solution (Cache and Presist).py
    │   ├── 31-Spark DF (DF to RDD).py
    │   ├── 32-Spark DF (Spark SQL).py
    │   ├── 33-Spark DF (Write DF).py
    │   ├── 34-39 Project.py
    │   ├── OfficeData.csv
    │   ├── OfficeDataProject.csv
    │   ├── StudentData.csv
    │   └── WordData.txt
    ├── 04-Collaborative filtering
    │   ├── Collaborative filtering.py
    │   └── Data Set
    │   │   └── Colaborative Filtering Dataset
    │   │       ├── movies.csv
    │   │       └── ratings.csv
    ├── 05-Spark Streaming
    │   ├── 2-6 Spark Streaming RDD.py
    │   ├── 7-9 Spark Streaming DF.py
    │   └── WordData.txt
    ├── 06-ETL
    │   ├── 1-13 ETL.py
    │   └── WordData.txt
    ├── 07-Project (CDC)
    │   ├── Dump
    │   ├── Glue.py
    │   └── lambda_function.py
    ├── 8-Chatbots Development with Amazon Lex.zip
    ├── Sample.txt
    └── Spark Setup.docx
└── PPT
    └── Slides.pptx

/Code/02-Spark RDDs/02-Creating Spark RDD.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/02-Creating Spark RDD.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/04 , 05-MAP.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/04 , 05-MAP.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/07-Solution 1 (Map).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/07-Solution 1 (Map).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/08-Solution 2 (Map).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/08-Solution 2 (Map).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/09-RDD FlatMap.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/09-RDD FlatMap.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/10-RDD Filter.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/10-RDD Filter.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/12-Solution (Filter).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/12-Solution (Filter).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/13-RDD Distinct.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/13-RDD Distinct.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/14-RDD GroupByKey.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/14-RDD GroupByKey.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/15-RDD ReduceByKey.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/15-RDD ReduceByKey.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/17-Solution (Word Count).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/17-Solution (Word Count).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/18-RDD (Count and CountByValue).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/18-RDD (Count and CountByValue).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/19-RDD (saveAsTextFile).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/19-RDD (saveAsTextFile).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/20-RDD (Partition).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/20-RDD (Partition).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/22-Finding Average-2.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/22-Finding Average-2.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/24-Solution (Average).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/24-Solution (Average).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/25-Finding Min and Max.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/25-Finding Min and Max.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/27-Solution (Min and Max).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/27-Solution (Min and Max).py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/28-36 Project Code.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/28-36 Project Code.py
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/StudentData.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/StudentData.csv
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/average_quiz_sample.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/average_quiz_sample.csv
--------------------------------------------------------------------------------
/Code/02-Spark RDDs/movie_ratings.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/02-Spark RDDs/movie_ratings.csv
--------------------------------------------------------------------------------
/Code/03-Spark DFs/02-Creating Spark DFs.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/02-Creating Spark DFs.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/03-Spark Infer Schema.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/03-Spark Infer Schema.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/04-Spark Provide Schema.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/04-Spark Provide Schema.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/05-Create DF from Rdd.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/05-Create DF from Rdd.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/06-Rectifying the Error.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/06-Rectifying the Error.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/07-Select DF Colums.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/07-Select DF Colums.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/08-Spark DF withColumn.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/08-Spark DF withColumn.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/09-Spark DF withColumnRenamed and Alias.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/09-Spark DF withColumnRenamed and Alias.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/10-Spark DF Filter rows.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/10-Spark DF Filter rows.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/12-Solution (select, withColumn, filter).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/12-Solution (select, withColumn, filter).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/13-Spark DF (Count, Distinct, Duplicate).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/13-Spark DF (Count, Distinct, Duplicate).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/15-Solution (Distinct, Duplicate).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/15-Solution (Distinct, Duplicate).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/16-Spark DF (sort, orderBy).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/16-Spark DF (sort, orderBy).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/18-Solution (sort, orderBy).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/18-Solution (sort, orderBy).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/19-Spark DF (Group By).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/19-Spark DF (Group By).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/20-Spark DF (Group By - Multiple Rows and Aggregations).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/20-Spark DF (Group By - Multiple Rows and Aggregations).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/22-Spark DF (Group By - Filtering).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/22-Spark DF (Group By - Filtering).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/24-Solution (Group By).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/24-Solution (Group By).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/26-Solution (Word Count).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/26-Solution (Word Count).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/27-Spark DF (UDFs).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/27-Spark DF (UDFs).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/29-Solution (UDFs).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/29-Solution (UDFs).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/30-Solution (Cache and Presist).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/30-Solution (Cache and Presist).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/31-Spark DF (DF to RDD).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/31-Spark DF (DF to RDD).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/32-Spark DF (Spark SQL).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/32-Spark DF (Spark SQL).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/33-Spark DF (Write DF).py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/33-Spark DF (Write DF).py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/34-39 Project.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/34-39 Project.py
--------------------------------------------------------------------------------
/Code/03-Spark DFs/OfficeData.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/OfficeData.csv
--------------------------------------------------------------------------------
/Code/03-Spark DFs/OfficeDataProject.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/OfficeDataProject.csv
--------------------------------------------------------------------------------
/Code/03-Spark DFs/StudentData.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/StudentData.csv
--------------------------------------------------------------------------------
/Code/03-Spark DFs/WordData.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/03-Spark DFs/WordData.txt
--------------------------------------------------------------------------------
/Code/04-Collaborative filtering/Collaborative filtering.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/04-Collaborative filtering/Collaborative filtering.py
--------------------------------------------------------------------------------
/Code/04-Collaborative filtering/Data Set/Colaborative Filtering Dataset/movies.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/04-Collaborative filtering/Data Set/Colaborative Filtering Dataset/movies.csv
--------------------------------------------------------------------------------
/Code/04-Collaborative filtering/Data Set/Colaborative Filtering Dataset/ratings.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/04-Collaborative filtering/Data Set/Colaborative Filtering Dataset/ratings.csv
--------------------------------------------------------------------------------
/Code/05-Spark Streaming/2-6 Spark Streaming RDD.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/05-Spark Streaming/2-6 Spark Streaming RDD.py
--------------------------------------------------------------------------------
/Code/05-Spark Streaming/7-9 Spark Streaming DF.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/05-Spark Streaming/7-9 Spark Streaming DF.py
--------------------------------------------------------------------------------
/Code/05-Spark Streaming/WordData.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/05-Spark Streaming/WordData.txt
--------------------------------------------------------------------------------
/Code/06-ETL/1-13 ETL.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/06-ETL/1-13 ETL.py
--------------------------------------------------------------------------------
/Code/06-ETL/WordData.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/06-ETL/WordData.txt
--------------------------------------------------------------------------------
/Code/07-Project (CDC)/Dump:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/07-Project (CDC)/Dump
--------------------------------------------------------------------------------
/Code/07-Project (CDC)/Glue.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/07-Project (CDC)/Glue.py
--------------------------------------------------------------------------------
/Code/07-Project (CDC)/lambda_function.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/07-Project (CDC)/lambda_function.py
--------------------------------------------------------------------------------
/Code/8-Chatbots Development with Amazon Lex.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/8-Chatbots Development with Amazon Lex.zip
--------------------------------------------------------------------------------
/Code/Sample.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/Sample.txt
--------------------------------------------------------------------------------
/Code/Spark Setup.docx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/Code/Spark Setup.docx
--------------------------------------------------------------------------------
/PPT/Slides.pptx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AISCIENCES/course-master-big-data-with-pyspark-and-aws/HEAD/PPT/Slides.pptx
--------------------------------------------------------------------------------