├── README.md
└── labs
    ├── 01-starting-kafka.md
    └── 02-create-topic.md

/README.md:
--------------------------------------------------------------------------------
# kafka-learning


## Overview
- [Home](https://kafka.apache.org/)
- [QuickStart](https://kafka.apache.org/quickstart)

## Tutorials
- [Apache Kafka Tutorial](https://www.youtube.com/playlist?list=PLkz1SCf5iB4enAR00Z46JwY9GGkaS2NON)
- [Udemy Apache Kafka Series - Learning Apache Kafka for Beginners](https://www.udemy.com/apache-kafka-series-kafka-from-beginner-to-intermediate)
- [Udemy Apache Kafka Series - Kafka Connect Hands-on Learning](https://www.udemy.com/apache-kafka-series-kafka-connect-hands-on-learning/)

## Books
- [Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale](https://www.amazon.com/Kafka-Definitive-Real-Time-Stream-Processing/dp/1491936169)

## Videos
- [ETL Is Dead, Long Live Streams: real-time streams w/ Apache Kafka](https://www.youtube.com/watch?v=I32hmY4diFY) (2017.02)
- [Airstream: Spark Streaming At Airbnb](https://youtu.be/tJ1uIHQtoNc) (2016.07)
- [Developing Real-Time Data Pipelines with Apache Kafka](https://www.youtube.com/watch?v=GRPLRONVDWY) (2016.03)
- [Building Realtime Data Pipelines with Kafka Connect and Spark Streaming](https://youtu.be/wMLAlJimPzk) (2016.02)
- [Putting Apache Kafka to Use for Event Streams](https://www.youtube.com/watch?v=el-SqcZLZlI) (2015.04)

## Links
- [Wiki](https://cwiki.apache.org/confluence/display/KAFKA/Index)
- [FAQ](https://cwiki.apache.org/confluence/display/KAFKA/FAQ)
- [Committers](http://kafka.apache.org/committers)
- [Ecosystem](https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem)
- [Papers & Talks](https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations)
- [Awesome Kafka](https://github.com/infoslack/awesome-kafka)
--------------------------------------------------------------------------------

/labs/01-starting-kafka.md:
--------------------------------------------------------------------------------
Docker for Mac >= 1.12, Linux, Docker for Windows 10:

    docker run --rm -it \
      -p 2181:2181 -p 3030:3030 -p 8081:8081 \
      -p 8082:8082 -p 8083:8083 -p 9092:9092 \
      -e ADV_HOST=127.0.0.1 \
      landoop/fast-data-dev

Docker Toolbox:

    docker run --rm -it \
      -p 2181:2181 -p 3030:3030 -p 8081:8081 \
      -p 8082:8082 -p 8083:8083 -p 9092:9092 \
      -e ADV_HOST=192.168.99.100 \
      landoop/fast-data-dev

Kafka command line tools:

    docker run --rm -it --net=host landoop/fast-data-dev bash
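A quick sanity check, not part of the original lab: assuming the fast-data-dev stack from the first command is already running and `--net=host` makes it reachable on localhost, you can list topics from inside the tools container to confirm ZooKeeper and the broker are up:

    root@fast-data-dev / $ kafka-topics --zookeeper 127.0.0.1:2181 --list

If this hangs or errors, the stack is most likely still starting; the web UI at http://127.0.0.1:3030 (mentioned in the startup output below) should show each service coming up.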
Results (from the first command above):

    docker run --rm -it \
      -p 2181:2181 -p 3030:3030 -p 8081:8081 \
      -p 8082:8082 -p 8083:8083 -p 9092:9092 \
      -e ADV_HOST=127.0.0.1 \
      landoop/fast-data-dev
    Setting advertised host to 127.0.0.1.
    Operating system RAM available is 1755 MiB, which is less than the lowest
    recommended of 5120 MiB. Your system performance may be seriously impacted.
    Starting services.
    This is landoop’s fast-data-dev. Kafka 0.10.2.1, Confluent OSS 3.2.1.
    You may visit http://127.0.0.1:3030 in about a minute.
    2017-05-20 02:23:39,015 CRIT Supervisor running as root (no user in config file)
    2017-05-20 02:23:39,015 WARN No file matches via include "/etc/supervisord.d/*.conf"
    2017-05-20 02:23:39,020 INFO supervisord started with pid 6
    2017-05-20 02:23:40,028 INFO spawned: 'sample-data' with pid 94
    2017-05-20 02:23:40,032 INFO spawned: 'zookeeper' with pid 95
    2017-05-20 02:23:40,041 INFO spawned: 'caddy' with pid 97
    2017-05-20 02:23:40,044 INFO spawned: 'broker' with pid 99
    2017-05-20 02:23:40,047 INFO spawned: 'smoke-tests' with pid 103
    2017-05-20 02:23:40,050 INFO spawned: 'connect-distributed' with pid 106
    2017-05-20 02:23:40,053 INFO spawned: 'logs-to-kafka' with pid 107
    2017-05-20 02:23:40,057 INFO spawned: 'schema-registry' with pid 109
    2017-05-20 02:23:40,060 INFO spawned: 'rest-proxy' with pid 110
    2017-05-20 02:23:41,112 INFO success: sample-data entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,112 INFO success: zookeeper entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,113 INFO success: caddy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,113 INFO success: broker entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,113 INFO success: smoke-tests entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,113 INFO success: connect-distributed entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,113 INFO success: logs-to-kafka entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,114 INFO success: schema-registry entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:41,114 INFO success: rest-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:42,757 INFO exited: schema-registry (exit status 1; not expected)
    2017-05-20 02:23:42,765 INFO spawned: 'schema-registry' with pid 291
    2017-05-20 02:23:43,055 INFO exited: rest-proxy (exit status 1; not expected)
    2017-05-20 02:23:43,537 INFO spawned: 'rest-proxy' with pid 330
    2017-05-20 02:23:44,147 INFO success: schema-registry entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2017-05-20 02:23:44,562 INFO success: rest-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
--------------------------------------------------------------------------------

/labs/02-create-topic.md:
--------------------------------------------------------------------------------
To get Kafka command line tools:

    docker run --rm -it --net=host landoop/fast-data-dev bash

Query all the commands in this Kafka box (type `kafka-` and press Tab):

    root@fast-data-dev / $ kafka-
    kafka-acls                          kafka-replica-verification
    kafka-avro-console-consumer         kafka-rest-run-class
    kafka-avro-console-producer         kafka-rest-start
    kafka-configs                       kafka-rest-stop
    kafka-console-consumer              kafka-rest-stop-service
    kafka-console-producer              kafka-run-class
    kafka-consumer-groups               kafka-server-start
    kafka-consumer-offset-checker       kafka-server-stop
    kafka-consumer-perf-test            kafka-simple-consumer-shell
    kafka-mirror-maker                  kafka-streams-application-reset
    kafka-preferred-replica-election    kafka-topics
    kafka-producer-perf-test            kafka-verifiable-consumer
    kafka-reassign-partitions           kafka-verifiable-producer
    kafka-replay-log-producer
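Any of these tools can be run straight from this shell. As an aside not in the original lab — a sketch assuming the broker is advertised on `127.0.0.1:9092` as configured in the previous lab, and that `first_topic` from the create step below already exists — the console producer and consumer give a quick end-to-end test:

    root@fast-data-dev / $ kafka-console-producer --broker-list 127.0.0.1:9092 --topic first_topic
    root@fast-data-dev / $ kafka-console-consumer --bootstrap-server 127.0.0.1:9092 --topic first_topic --from-beginning

Type a few lines into the producer, then start the consumer in a second shell to read them back; Ctrl+C stops either one.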
Options for `kafka-topics`:

    root@fast-data-dev / $ kafka-topics
    Create, delete, describe, or change a topic.
    Option                                   Description
    ------                                   -----------
    --alter                                  Alter the number of partitions,
                                               replica assignment, and/or
                                               configuration for the topic.
    --config                                 A topic configuration override for the
                                               topic being created or altered.The
                                               following is a list of valid
                                               configurations:
                                               cleanup.policy
                                               compression.type
                                               delete.retention.ms
                                               file.delete.delay.ms
                                               flush.messages
                                               flush.ms
                                               follower.replication.throttled.
                                                 replicas
                                               index.interval.bytes
                                               leader.replication.throttled.replicas
                                               max.message.bytes
                                               message.format.version
                                               message.timestamp.difference.max.ms
                                               message.timestamp.type
                                               min.cleanable.dirty.ratio
                                               min.compaction.lag.ms
                                               min.insync.replicas
                                               preallocate
                                               retention.bytes
                                               retention.ms
                                               segment.bytes
                                               segment.index.bytes
                                               segment.jitter.ms
                                               segment.ms
                                               unclean.leader.election.enable
                                             See the Kafka documentation for full
                                               details on the topic configs.
    --create                                 Create a new topic.
    --delete                                 Delete a topic
    --delete-config                          A topic configuration override to be
                                               removed for an existing topic (see
                                               the list of configurations under the
                                               --config option).
    --describe                               List details for the given topics.
    --disable-rack-aware                     Disable rack aware replica assignment
    --force                                  Suppress console prompts
    --help                                   Print usage information.
    --if-exists                              if set when altering or deleting
                                               topics, the action will only execute
                                               if the topic exists
    --if-not-exists                          if set when creating topics, the
                                               action will only execute if the
                                               topic does not already exist
    --list                                   List all available topics.
    --partitions                             The number of partitions for the topic
                                               being created or altered (WARNING:
                                               If partitions are increased for a
                                               topic that has a key, the partition
                                               logic or ordering of the messages
                                               will be affected
    --replica-assignment                     A list of manual partition-to-broker
                                               assignments for the topic being
                                               created or altered.
    --replication-factor                     The replication factor for each
                                               partition in the topic being created.
    --topic                                  The topic to be create, alter or
                                               describe. Can also accept a regular
                                               expression except for --create option
    --topics-with-overrides                  if set when describing topics, only
                                               show topics that have overridden
                                               configs
    --unavailable-partitions                 if set when describing topics, only
                                               show partitions whose leader is not
                                               available
    --under-replicated-partitions            if set when describing topics, only
                                               show under replicated partitions
    --zookeeper                              REQUIRED: The connection string for
                                               the zookeeper connection in the form
                                               host:port. Multiple URLS can be
                                               given to allow fail-over.

Create a topic:

    root@fast-data-dev / $ kafka-topics --zookeeper 127.0.0.1:2181 --create --topic first_topic --partitions 3 --replication-factor 1
    WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
    Created topic "first_topic".

Describe a topic:

    root@fast-data-dev / $ kafka-topics --zookeeper 127.0.0.1:2181 --describe --topic first_topic
    Topic:first_topic  PartitionCount:3  ReplicationFactor:1  Configs:
        Topic: first_topic  Partition: 0  Leader: 0  Replicas: 0  Isr: 0
        Topic: first_topic  Partition: 1  Leader: 0  Replicas: 0  Isr: 0
        Topic: first_topic  Partition: 2  Leader: 0  Replicas: 0  Isr: 0
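The delete step below targets `second_topic`, which is not created anywhere above. A minimal sketch (not in the original lab) that mirrors the `first_topic` command so the delete has something to act on, then confirms both topics with `--list`:

    root@fast-data-dev / $ kafka-topics --zookeeper 127.0.0.1:2181 --create --topic second_topic --partitions 3 --replication-factor 1
    root@fast-data-dev / $ kafka-topics --zookeeper 127.0.0.1:2181 --list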
Delete a topic:

    root@fast-data-dev / $ kafka-topics --zookeeper 127.0.0.1:2181 --topic second_topic --delete
    Topic second_topic is marked for deletion.
    Note: This will have no impact if delete.topic.enable is not set to true.
--------------------------------------------------------------------------------