├── LICENSE
├── README.md
├── SUMMARY.md
├── deploying
    ├── running-spark-on-yarn.md
    ├── spark-standalone-mode.md
    └── submitting-applications.md
├── graphx-programming-guide
    ├── README.md
    ├── examples.md
    ├── getting-started.md
    ├── graph-algorithms.md
    ├── graph-builders.md
    ├── graph-operators.md
    ├── pregel-api.md
    ├── property-graph.md
    └── vertex-and-edge-rdds.md
├── img
    ├── data_parallel_vs_graph_parallel.png
    ├── flume.png
    ├── graph_analytics_pipeline.png
    ├── property_graph.png
    ├── streaming-arch.png
    ├── streaming-dstream-ops.png
    ├── streaming-dstream-window.png
    ├── streaming-dstream.png
    ├── streaming-flow.png
    ├── streaming-kinesis-arch.png
    ├── tables_and_graphs.png
    └── triplet.png
├── more
    ├── spark-configuration.md
    └── spark-tuning.md
├── programming-guide
    ├── README.md
    ├── from-here.md
    ├── initializing-spark.md
    ├── linking-with-spark.md
    ├── rdds
    │   ├── README.md
    │   ├── actions.md
    │   ├── external-datasets.md
    │   ├── parallelized-collections.md
    │   ├── passing-functions-to-spark.md
    │   ├── rdd-operations.md
    │   ├── rdd-persistences.md
    │   ├── transformations.md
    │   └── working-with-key-value-pairs.md
    └── shared-variables.md
├── quick-start
    ├── README.md
    ├── standalone-applications.md
    ├── using-spark-shell.md
    └── where-to-go-from-here.md
├── spark-sql
    ├── README.md
    ├── compatibility-with-other-systems
    │   ├── compatibility-with-apache-hive.md
    │   └── migration-guide-shark-user.md
    ├── data-sources
    │   ├── README.md
    │   ├── hive-tables.md
    │   ├── jSON-datasets.md
    │   ├── parquet-files.md
    │   └── rdds.md
    ├── getting-started.md
    ├── other-sql-interfaces.md
    ├── performance-tuning.md
    ├── spark-sql-dataType-reference.md
    └── writing-language-integrated-relational-queries.md
└── spark-streaming
    ├── README.md
    ├── a-quick-example.md
    ├── basic-concepts
        ├── README.md
        ├── caching-persistence.md
        ├── checkpointing.md
        ├── custom-receiver.md
        ├── deploying-applications.md
        ├── discretized-streams.md
        ├── flume-integration-guide.md
        ├── initializing-StreamingContext.md
        ├── input-DStreams.md
        ├── kafka-integration-guide.md
        ├── kinesis-integration.md
        ├── linking.md
        ├── monitoring-applications.md
        ├── output-operations-on-DStreams.md
        └── transformations-on-DStreams.md
    ├── fault-tolerance-semantics
        └── README.md
    └── performance-tuning
        ├── README.md
        ├── memory-tuning.md
        ├── reducing-processing-time.md
        └── setting-right-batch-size.md

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/README.md
--------------------------------------------------------------------------------
/SUMMARY.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/SUMMARY.md
--------------------------------------------------------------------------------
/deploying/running-spark-on-yarn.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/deploying/running-spark-on-yarn.md
--------------------------------------------------------------------------------
/deploying/spark-standalone-mode.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/deploying/spark-standalone-mode.md
--------------------------------------------------------------------------------
/deploying/submitting-applications.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/deploying/submitting-applications.md
--------------------------------------------------------------------------------
/graphx-programming-guide/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/README.md
--------------------------------------------------------------------------------
/graphx-programming-guide/examples.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/examples.md
--------------------------------------------------------------------------------
/graphx-programming-guide/getting-started.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/getting-started.md
--------------------------------------------------------------------------------
/graphx-programming-guide/graph-algorithms.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/graph-algorithms.md
--------------------------------------------------------------------------------
/graphx-programming-guide/graph-builders.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/graph-builders.md
--------------------------------------------------------------------------------
/graphx-programming-guide/graph-operators.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/graph-operators.md
--------------------------------------------------------------------------------
/graphx-programming-guide/pregel-api.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/pregel-api.md
--------------------------------------------------------------------------------
/graphx-programming-guide/property-graph.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/property-graph.md
--------------------------------------------------------------------------------
/graphx-programming-guide/vertex-and-edge-rdds.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/graphx-programming-guide/vertex-and-edge-rdds.md
--------------------------------------------------------------------------------
/img/data_parallel_vs_graph_parallel.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/data_parallel_vs_graph_parallel.png
--------------------------------------------------------------------------------
/img/flume.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/flume.png
--------------------------------------------------------------------------------
/img/graph_analytics_pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/graph_analytics_pipeline.png
--------------------------------------------------------------------------------
/img/property_graph.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/property_graph.png
--------------------------------------------------------------------------------
/img/streaming-arch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/streaming-arch.png
--------------------------------------------------------------------------------
/img/streaming-dstream-ops.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/streaming-dstream-ops.png
--------------------------------------------------------------------------------
/img/streaming-dstream-window.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/streaming-dstream-window.png
--------------------------------------------------------------------------------
/img/streaming-dstream.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/streaming-dstream.png
--------------------------------------------------------------------------------
/img/streaming-flow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/streaming-flow.png
--------------------------------------------------------------------------------
/img/streaming-kinesis-arch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/streaming-kinesis-arch.png
--------------------------------------------------------------------------------
/img/tables_and_graphs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/tables_and_graphs.png
--------------------------------------------------------------------------------
/img/triplet.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/img/triplet.png
--------------------------------------------------------------------------------
/more/spark-configuration.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/more/spark-configuration.md
--------------------------------------------------------------------------------
/more/spark-tuning.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/more/spark-tuning.md
--------------------------------------------------------------------------------
/programming-guide/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/README.md
--------------------------------------------------------------------------------
/programming-guide/from-here.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/from-here.md
--------------------------------------------------------------------------------
/programming-guide/initializing-spark.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/initializing-spark.md
--------------------------------------------------------------------------------
/programming-guide/linking-with-spark.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/linking-with-spark.md
--------------------------------------------------------------------------------
/programming-guide/rdds/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/README.md
--------------------------------------------------------------------------------
/programming-guide/rdds/actions.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/actions.md
--------------------------------------------------------------------------------
/programming-guide/rdds/external-datasets.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/external-datasets.md
--------------------------------------------------------------------------------
/programming-guide/rdds/parallelized-collections.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/parallelized-collections.md
--------------------------------------------------------------------------------
/programming-guide/rdds/passing-functions-to-spark.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/passing-functions-to-spark.md
--------------------------------------------------------------------------------
/programming-guide/rdds/rdd-operations.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/rdd-operations.md
--------------------------------------------------------------------------------
/programming-guide/rdds/rdd-persistences.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/rdd-persistences.md
--------------------------------------------------------------------------------
/programming-guide/rdds/transformations.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/transformations.md
--------------------------------------------------------------------------------
/programming-guide/rdds/working-with-key-value-pairs.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/rdds/working-with-key-value-pairs.md
--------------------------------------------------------------------------------
/programming-guide/shared-variables.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/programming-guide/shared-variables.md
--------------------------------------------------------------------------------
/quick-start/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/quick-start/README.md
--------------------------------------------------------------------------------
/quick-start/standalone-applications.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/quick-start/standalone-applications.md
--------------------------------------------------------------------------------
/quick-start/using-spark-shell.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/quick-start/using-spark-shell.md
--------------------------------------------------------------------------------
/quick-start/where-to-go-from-here.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/quick-start/where-to-go-from-here.md
--------------------------------------------------------------------------------
/spark-sql/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/README.md
--------------------------------------------------------------------------------
/spark-sql/compatibility-with-other-systems/compatibility-with-apache-hive.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/compatibility-with-other-systems/compatibility-with-apache-hive.md
--------------------------------------------------------------------------------
/spark-sql/compatibility-with-other-systems/migration-guide-shark-user.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/compatibility-with-other-systems/migration-guide-shark-user.md
--------------------------------------------------------------------------------
/spark-sql/data-sources/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/data-sources/README.md
--------------------------------------------------------------------------------
/spark-sql/data-sources/hive-tables.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/data-sources/hive-tables.md
--------------------------------------------------------------------------------
/spark-sql/data-sources/jSON-datasets.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/data-sources/jSON-datasets.md
--------------------------------------------------------------------------------
/spark-sql/data-sources/parquet-files.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/data-sources/parquet-files.md
--------------------------------------------------------------------------------
/spark-sql/data-sources/rdds.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/data-sources/rdds.md
--------------------------------------------------------------------------------
/spark-sql/getting-started.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/getting-started.md
--------------------------------------------------------------------------------
/spark-sql/other-sql-interfaces.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/other-sql-interfaces.md
--------------------------------------------------------------------------------
/spark-sql/performance-tuning.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/performance-tuning.md
--------------------------------------------------------------------------------
/spark-sql/spark-sql-dataType-reference.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/spark-sql-dataType-reference.md
--------------------------------------------------------------------------------
/spark-sql/writing-language-integrated-relational-queries.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-sql/writing-language-integrated-relational-queries.md
--------------------------------------------------------------------------------
/spark-streaming/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/README.md
--------------------------------------------------------------------------------
/spark-streaming/a-quick-example.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/a-quick-example.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/README.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/caching-persistence.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/caching-persistence.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/checkpointing.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/checkpointing.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/custom-receiver.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/custom-receiver.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/deploying-applications.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/deploying-applications.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/discretized-streams.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/discretized-streams.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/flume-integration-guide.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/flume-integration-guide.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/initializing-StreamingContext.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/initializing-StreamingContext.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/input-DStreams.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/input-DStreams.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/kafka-integration-guide.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/kafka-integration-guide.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/kinesis-integration.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/kinesis-integration.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/linking.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/linking.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/monitoring-applications.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/monitoring-applications.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/output-operations-on-DStreams.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/output-operations-on-DStreams.md
--------------------------------------------------------------------------------
/spark-streaming/basic-concepts/transformations-on-DStreams.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/basic-concepts/transformations-on-DStreams.md
--------------------------------------------------------------------------------
/spark-streaming/fault-tolerance-semantics/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/fault-tolerance-semantics/README.md
--------------------------------------------------------------------------------
/spark-streaming/performance-tuning/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/performance-tuning/README.md
--------------------------------------------------------------------------------
/spark-streaming/performance-tuning/memory-tuning.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/performance-tuning/memory-tuning.md
--------------------------------------------------------------------------------
/spark-streaming/performance-tuning/reducing-processing-time.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/performance-tuning/reducing-processing-time.md
--------------------------------------------------------------------------------
/spark-streaming/performance-tuning/setting-right-batch-size.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/endymecy/spark-programming-guide-zh-cn/HEAD/spark-streaming/performance-tuning/setting-right-batch-size.md
--------------------------------------------------------------------------------