├── .gitignore
├── LICENSE
├── Pic
    └── DAGScheduler划分及提交stage-代码调用过程.jpg
├── README.md
├── Spark_With_Scala_Testing
    ├── .cache-main
    ├── .classpath
    ├── .gitignore
    ├── .project
    ├── .settings
    │   └── org.scala-ide.sdt.core.prefs
    ├── data
    │   ├── employees.json
    │   ├── hello.txt
    │   ├── people.csv
    │   ├── people.json
    │   ├── people.txt
    │   ├── result
    │   │   ├── ._SUCCESS.crc
    │   │   ├── .part-00000.crc
    │   │   ├── _SUCCESS
    │   │   └── part-00000
    │   ├── secondarySort.txt
    │   ├── spark.md
    │   ├── test
    │   │   ├── ._SUCCESS.crc
    │   │   ├── .part-00000-305efd32-4d97-4a4d-acf9-fb22c9c6e05e-c000.csv.crc
    │   │   ├── _SUCCESS
    │   │   └── part-00000-305efd32-4d97-4a4d-acf9-fb22c9c6e05e-c000.csv
    │   ├── topN.txt
    │   └── users.parquet
    └── src
    │   ├── sparkCore
    │       ├── Action.scala
    │       ├── MyKey.scala
    │       ├── Persist.scala
    │       ├── SecondarySort.scala
    │       ├── TopN .scala
    │       ├── Transformation.scala
    │       └── sortedWordCount.scala
    │   ├── sparkSql
    │       ├── DataFrameOperations.scala
    │       ├── LoadAndSave.scala
    │       ├── RDDtoDataFrame.scala
    │       ├── RDDtoDataFrame2.scala
    │       └── SqlContextTest.scala
    │   └── test
    │       └── Test .scala
└── notes
    ├── LearningSpark(1)数据来源.md
    ├── LearningSpark(2)spark-submit可选参数.md
    ├── LearningSpark(3)RDD操作.md
    ├── LearningSpark(4)Spark持久化操作.md
    ├── LearningSpark(5)Spark共享变量.md
    ├── LearningSpark(6)Spark内核架构剖析.md
    ├── LearningSpark(7)SparkSQL之DataFrame学习.md
    ├── LearningSpark(8)RDD如何转化为DataFrame.md
    ├── LearningSpark(9)SparkSQL数据来源.md
    ├── RDD如何作为参数传给函数.md
    ├── Scala排序函数使用.md
    ├── Spark DataFrame如何更改列column的类型.md
    ├── Spark2.4+Hive使用Hive现有仓库.md
    ├── assets
        ├── 1550161394643.png
        ├── 1550670063721.png
        ├── 20190215002003.png
        ├── cache.png
        ├── cluster-client.png
        ├── source.jpg
        └── 导入spark.png
    ├── eclipse中Attach Source找不到源码,该如何查看jar包源码.md
    ├── eclipse如何导入Spark源码方便阅读.md
    ├── 使用JDBC将DataFrame写入mysql.md
    ├── 判断RDD是否为空.md
    ├── 报错和问题归纳.md
    └── 高级排序和topN问题.md


/.gitignore:
--------------------------------------------------------------------------------
*.class
*.log
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/LICENSE
--------------------------------------------------------------------------------
/Pic/DAGScheduler划分及提交stage-代码调用过程.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Pic/DAGScheduler划分及提交stage-代码调用过程.jpg
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/README.md
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/.cache-main:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/.cache-main
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/.classpath:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/.classpath
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/.gitignore:
--------------------------------------------------------------------------------
/bin/
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/.project:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/.project
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/.settings/org.scala-ide.sdt.core.prefs:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/.settings/org.scala-ide.sdt.core.prefs
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/employees.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/employees.json
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/hello.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/hello.txt
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/people.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/people.csv
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/people.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/people.json
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/people.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/people.txt
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/result/._SUCCESS.crc:
--------------------------------------------------------------------------------
crc
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/result/.part-00000.crc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/result/.part-00000.crc
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/result/_SUCCESS:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/result/part-00000:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/result/part-00000
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/secondarySort.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/secondarySort.txt
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/spark.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/spark.md
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/test/._SUCCESS.crc:
--------------------------------------------------------------------------------
crc
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/test/.part-00000-305efd32-4d97-4a4d-acf9-fb22c9c6e05e-c000.csv.crc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/test/.part-00000-305efd32-4d97-4a4d-acf9-fb22c9c6e05e-c000.csv.crc
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/test/_SUCCESS:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/test/part-00000-305efd32-4d97-4a4d-acf9-fb22c9c6e05e-c000.csv:
--------------------------------------------------------------------------------
age;name
"";Michael
30;Andy
19;Justin
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/topN.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/topN.txt
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/data/users.parquet:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/data/users.parquet
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkCore/Action.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkCore/Action.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkCore/MyKey.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkCore/MyKey.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkCore/Persist.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkCore/Persist.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkCore/SecondarySort.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkCore/SecondarySort.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkCore/TopN .scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkCore/TopN .scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkCore/Transformation.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkCore/Transformation.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkCore/sortedWordCount.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkCore/sortedWordCount.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkSql/DataFrameOperations.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkSql/DataFrameOperations.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkSql/LoadAndSave.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkSql/LoadAndSave.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkSql/RDDtoDataFrame.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkSql/RDDtoDataFrame.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkSql/RDDtoDataFrame2.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkSql/RDDtoDataFrame2.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/sparkSql/SqlContextTest.scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/sparkSql/SqlContextTest.scala
--------------------------------------------------------------------------------
/Spark_With_Scala_Testing/src/test/Test .scala:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/Spark_With_Scala_Testing/src/test/Test .scala
--------------------------------------------------------------------------------
/notes/LearningSpark(1)数据来源.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(1)数据来源.md
--------------------------------------------------------------------------------
/notes/LearningSpark(2)spark-submit可选参数.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(2)spark-submit可选参数.md
--------------------------------------------------------------------------------
/notes/LearningSpark(3)RDD操作.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(3)RDD操作.md
--------------------------------------------------------------------------------
/notes/LearningSpark(4)Spark持久化操作.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(4)Spark持久化操作.md
--------------------------------------------------------------------------------
/notes/LearningSpark(5)Spark共享变量.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(5)Spark共享变量.md
--------------------------------------------------------------------------------
/notes/LearningSpark(6)Spark内核架构剖析.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(6)Spark内核架构剖析.md
--------------------------------------------------------------------------------
/notes/LearningSpark(7)SparkSQL之DataFrame学习.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(7)SparkSQL之DataFrame学习.md
--------------------------------------------------------------------------------
/notes/LearningSpark(8)RDD如何转化为DataFrame.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(8)RDD如何转化为DataFrame.md
--------------------------------------------------------------------------------
/notes/LearningSpark(9)SparkSQL数据来源.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/LearningSpark(9)SparkSQL数据来源.md
--------------------------------------------------------------------------------
/notes/RDD如何作为参数传给函数.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/RDD如何作为参数传给函数.md
--------------------------------------------------------------------------------
/notes/Scala排序函数使用.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/Scala排序函数使用.md
--------------------------------------------------------------------------------
/notes/Spark DataFrame如何更改列column的类型.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/Spark DataFrame如何更改列column的类型.md
--------------------------------------------------------------------------------
/notes/Spark2.4+Hive使用Hive现有仓库.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/Spark2.4+Hive使用Hive现有仓库.md
--------------------------------------------------------------------------------
/notes/assets/1550161394643.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/assets/1550161394643.png
--------------------------------------------------------------------------------
/notes/assets/1550670063721.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/assets/1550670063721.png
--------------------------------------------------------------------------------
/notes/assets/20190215002003.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/assets/20190215002003.png
--------------------------------------------------------------------------------
/notes/assets/cache.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/assets/cache.png
--------------------------------------------------------------------------------
/notes/assets/cluster-client.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/assets/cluster-client.png
--------------------------------------------------------------------------------
/notes/assets/source.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/assets/source.jpg
--------------------------------------------------------------------------------
/notes/assets/导入spark.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/assets/导入spark.png
--------------------------------------------------------------------------------
/notes/eclipse中Attach Source找不到源码,该如何查看jar包源码.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/eclipse中Attach Source找不到源码,该如何查看jar包源码.md
--------------------------------------------------------------------------------
/notes/eclipse如何导入Spark源码方便阅读.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/eclipse如何导入Spark源码方便阅读.md
--------------------------------------------------------------------------------
/notes/使用JDBC将DataFrame写入mysql.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/使用JDBC将DataFrame写入mysql.md
--------------------------------------------------------------------------------
/notes/判断RDD是否为空.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/判断RDD是否为空.md
--------------------------------------------------------------------------------
/notes/报错和问题归纳.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/报错和问题归纳.md
--------------------------------------------------------------------------------
/notes/高级排序和topN问题.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josonle/Learning-Spark/HEAD/notes/高级排序和topN问题.md
--------------------------------------------------------------------------------