├── .gitignore
├── Dockerfile
├── Makefile
├── README.md
├── install.sh
├── integration.go
├── internal
│   └── kafka
│       ├── client.go
│       └── config.go
├── main.go
└── package.sh

/.gitignore:
--------------------------------------------------------------------------------
connect-kafka
target/
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
FROM ubuntu

COPY target/connect-kafka-linux-amd64 /connect-kafka

ENTRYPOINT ["/connect-kafka"]
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
build: main.go
	@mkdir -p target
	@go build -o target/connect-kafka

clean:
	@rm -rf target

docker:
	@./package.sh linux > /dev/null
	@docker build -t segment/connect-kafka . > /dev/null

docker-push: docker
	@docker push segment/connect-kafka

.PHONY: build clean docker docker-push
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# connect-kafka

This program is an example implementation of a [Segment](https://segment.com/) [Webhook](https://segment.com/docs/integrations/webhooks) consumer that publishes events to [Kafka](http://kafka.apache.org/).

This is not an officially supported Segment product, but it demonstrates a simple server that you can fork or emulate to route Segment data to your internal systems. It may even suit your needs as is!

## Features

`connect-kafka` is a simple server that you deploy in your infrastructure and expose to the internet. It listens for Segment events and forwards them to the Kafka topic of your choice.

- Easily forward web, mobile, and server analytics events to your Kafka instance
- Deploys in your infrastructure
- Supports any Kafka cluster
- Built with [Heroku Kafka](https://www.heroku.com/kafka) support in mind (with public/private space support)
- Supports both SSL and plaintext connections to your cluster
- Supports all Segment standard methods (`identify`, `track`, `page`, `screen`, `group`)

## Quickstart

1. *Connect to Kafka* - connect `connect-kafka` to your Kafka instance.
2. *Setup Webhook* - enter connect-kafka's listen address into your Segment webhook settings.

## FAQ

#### Does this support shared secret authentication?

Not yet, though we'd love a contribution that adds it!

#### How do Segment Webhooks behave if my server goes down?

If your server becomes unavailable, we will retry each request five times over the course of an hour.

#### Will the events arrive in order?

Because we're dealing with unbounded streaming data, we can't guarantee that your events arrive in the absolute order in which they were collected on your client devices. We therefore recommend using the `timestamp` field on each message with event-time windowing approaches in your destinations and streaming data applications.
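
For example, a downstream consumer might bucket events into one-minute windows keyed by `timestamp` rather than by arrival time. The Go sketch below is illustrative only: the message shape is trimmed to the fields it needs, and `sampleValues` is a hypothetical stand-in for your Kafka consumer loop.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Only the fields this sketch needs; real Segment messages carry more.
type message struct {
	Type      string    `json:"type"`
	Event     string    `json:"event"`
	Timestamp time.Time `json:"timestamp"` // RFC 3339, set close to collection time
}

func main() {
	counts := map[time.Time]int{}
	for _, raw := range sampleValues() {
		var m message
		if err := json.Unmarshal(raw, &m); err != nil {
			continue // skip malformed messages
		}
		// Bucket by event time, not by the time the message arrived.
		window := m.Timestamp.Truncate(time.Minute)
		counts[window]++
	}
	for w, n := range counts {
		fmt.Printf("%s %d\n", w.Format(time.RFC3339), n)
	}
}

// sampleValues is a hypothetical stand-in for messages read from the Kafka topic.
func sampleValues() [][]byte {
	return [][]byte{
		[]byte(`{"type":"track","event":"Signed Up","timestamp":"2016-06-01T12:00:30Z"}`),
		[]byte(`{"type":"track","event":"Signed Up","timestamp":"2016-06-01T12:00:59Z"}`),
	}
}
```
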
### Connect to Kafka

Download and install `connect-kafka` using curl:

```bash
curl -s http://connect.segment.com/install-connect-kafka.sh | bash
```

If you'd rather download the binary and install it yourself (substitute your platform for `darwin-amd64`):

```bash
curl -O http://connect.segment.com/connect-kafka-darwin-amd64
```

You can also use Docker:

```bash
make docker
docker run segment/connect-kafka [...]
```

You can connect to any internal Kafka deployment.

```
$ connect-kafka -h

Usage:
  connect-kafka
    [--debug]
    --topic=<topic>
    --broker=<url>...
    [--listen=<addr>]
    [--trusted-cert=<path> --client-cert=<path> --client-cert-key=<path>]
  connect-kafka -h | --help
  connect-kafka --version

Options:
  -h --help        Show this screen
  --version        Show version
  --topic=<topic>  Kafka topic name
  --listen=<addr>  Address to listen on [default: localhost:3000]
  --broker=<url>   Kafka broker URL
```

#### Heroku Kafka

Below is an example of connecting to Heroku Kafka in a public space (via SSL):

```bash
go get -u github.com/segment-integrations/connect-kafka
heroku config:get KAFKA_URL -a kafka-integration-demo # copy the kafka broker urls into the command below
heroku config:get KAFKA_TRUSTED_CERT -a kafka-integration-demo > kafka_trusted_cert.cer
heroku config:get KAFKA_CLIENT_CERT -a kafka-integration-demo > kafka_client_cert.cer
heroku config:get KAFKA_CLIENT_CERT_KEY -a kafka-integration-demo > kafka_client_key_cert.cer
connect-kafka \
    --debug \
    --topic=segment \
    --broker=kafka+ssl://ec2-51-16-10-109.compute-1.amazonaws.com:9096 \
    --broker=kafka+ssl://ec2-62-7-61-181.compute-1.amazonaws.com:9096 \
    --broker=kafka+ssl://ec2-33-20-240-35.compute-1.amazonaws.com:9096 \
    --trusted-cert=kafka_trusted_cert.cer \
    --client-cert=kafka_client_cert.cer \
    --client-cert-key=kafka_client_key_cert.cer
```

### Setup Webhook

1. Go to Segment.com and select the source you want to connect to Kafka.
2. Add your `connect-kafka` server's address to the webhook integration's settings.

![](http://g.recordit.co/XcyIz2fqJv.gif)

## Testing

### via localtunnel

You can open up a localtunnel to your local machine while you're testing:

```
npm install -g localtunnel
lt --port 3000
```

Enter the resulting localtunnel url as the Segment webhook with `/listen` appended, like: `https://aqjujyhnck.localtunnel.me/listen`
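
You can also exercise the server end to end without Segment by posting a sample event yourself. The Go sketch below assumes the default `localhost:3000` listen address and the `/listen` path mentioned above; the payload is an illustrative `track` call, not a complete Segment message.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
)

func main() {
	// An illustrative track call; real Segment webhook payloads carry more fields.
	payload := []byte(`{
		"type": "track",
		"event": "Test Event",
		"userId": "test-user",
		"timestamp": "2016-06-01T12:00:00Z"
	}`)

	// Post to the default listen address; the event should then appear on your Kafka topic.
	resp, err := http.Post("http://localhost:3000/listen", "application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```
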
## License

MIT
--------------------------------------------------------------------------------
/install.sh:
--------------------------------------------------------------------------------
#!/bin/bash
# Install Script
# Usage: curl -s $addr | bash

platform='unknown'
host='https://connect.segment.com.s3-us-west-2.amazonaws.com'
binary='connect-kafka'
install_dir='/usr/local/bin'
arch='amd64'

if [[ "$OSTYPE" == "linux-gnu" ]]; then
  platform='linux'
elif [[ "$OSTYPE" == "darwin"* ]]; then
  platform='darwin'
elif [[ "$OSTYPE" == "openbsd"* ]]; then
  platform='openbsd'
elif [[ "$OSTYPE" == "freebsd"* ]]; then
  platform='freebsd'
else
  # No Windows binaries are published (see the platform list in package.sh),
  # so cygwin, msys, and win32 are unsupported too.
  echo 'Platform is not supported!'
  exit 1
fi

echo "Installing $binary for $platform/$arch..."
echo "Downloading $host/$binary-$platform-$arch"

# Overwrite (not append to) any previously installed binary.
curl -s "$host/$binary-$platform-$arch" > "$install_dir/$binary"
chmod +x "$install_dir/$binary"

size=$(wc -c <"$install_dir/$binary")

echo "Size: $size"

echo "$binary was installed successfully to $install_dir"
--------------------------------------------------------------------------------
/integration.go:
--------------------------------------------------------------------------------
package main

import (
	"crypto/tls"
	"crypto/x509"
	"io"
	"io/ioutil"

	"github.com/Shopify/sarama"
	log "github.com/Sirupsen/logrus"
	"github.com/segment-integrations/connect-kafka/internal/kafka"
	"github.com/tj/docopt"
)

type KafkaIntegration struct {
	topic    string
	producer sarama.SyncProducer
}

// newTLSFromConfig builds a TLS configuration from the --trusted-cert,
// --client-cert, and --client-cert-key flags. It returns nil (plaintext
// connection) when none of them are set.
func (k *KafkaIntegration) newTLSFromConfig(m map[string]interface{}) *tls.Config {
	trustedCertPath, _ := m["--trusted-cert"].(string)
	clientCertPath, _ := m["--client-cert"].(string)
	clientCertKeyPath, _ := m["--client-cert-key"].(string)

	if trustedCertPath == "" && clientCertPath == "" && clientCertKeyPath == "" {
		return nil
	}

	trustedCertBytes, err := ioutil.ReadFile(trustedCertPath)
	if err != nil {
		log.Fatal(err)
	}

	clientCertBytes, err := ioutil.ReadFile(clientCertPath)
	if err != nil {
		log.Fatal(err)
	}

	clientCertKeyBytes, err := ioutil.ReadFile(clientCertKeyPath)
	if err != nil {
		log.Fatal(err)
	}

	cert, err := tls.X509KeyPair(clientCertBytes, clientCertKeyBytes)
	if err != nil {
		log.Fatal(err)
	}
	certPool := x509.NewCertPool()
	certPool.AppendCertsFromPEM(trustedCertBytes)

	tlsConfig := &tls.Config{
		Certificates: []tls.Certificate{cert},
		// Heroku Kafka broker certificates don't match the broker hostnames,
		// so this example disables server certificate verification entirely.
		// Note that with InsecureSkipVerify set, RootCAs is not consulted.
		InsecureSkipVerify: true,
		RootCAs:            certPool,
	}
	tlsConfig.BuildNameToCertificate()

	return tlsConfig
}

func (k *KafkaIntegration) Init() error {
	m, err := docopt.Parse(usage, nil, true, Version, false)
	if err != nil {
		return err
	}

	kafkaConfig := &kafka.Config{BrokerAddresses: m["--broker"].([]string)}
	kafkaConfig.TLSConfig = k.newTLSFromConfig(m)

	producer, err := kafka.NewProducer(kafkaConfig)
	if err != nil {
		return err
	}

	k.producer = producer
	k.topic = m["--topic"].(string)

	return nil
}

// Process reads one webhook request body and publishes it verbatim to the
// configured Kafka topic.
func (k *KafkaIntegration) Process(r io.ReadCloser) error {
	defer r.Close()
	b, err := ioutil.ReadAll(r)
	if err != nil {
		return err
	}

	_, _, err = k.producer.SendMessage(&sarama.ProducerMessage{
		Topic: k.topic,
		Value: sarama.ByteEncoder(b),
	})

	return err
}
--------------------------------------------------------------------------------
/internal/kafka/client.go:
--------------------------------------------------------------------------------
package kafka

import (
	"github.com/Shopify/sarama"
	"github.com/satori/go.uuid"
)

// NewProducer creates a synchronous Kafka producer from the given config.
func NewProducer(c *Config) (sarama.SyncProducer, error) {
	config := sarama.NewConfig()
	// A sarama SyncProducer requires both error and success returns to be enabled.
	config.Producer.Return.Errors = true
	config.Producer.Return.Successes = true
	config.ClientID = uuid.NewV4().String()

	// TLS
	if c.TLSConfig != nil {
		config.Net.TLS.Config = c.TLSConfig
		config.Net.TLS.Enable = true
	}

	if err := config.Validate(); err != nil {
		return nil, err
	}

	producer, err := sarama.NewSyncProducer(c.getBrokers(), config)
	if err != nil {
		return nil, err
	}

	return producer, nil
}
--------------------------------------------------------------------------------
/internal/kafka/config.go:
--------------------------------------------------------------------------------
package kafka

import (
	"crypto/tls"
	"log"
	"net/url"
)

type Config struct {
	BrokerAddresses []string
	TLSConfig       *tls.Config
}

// getBrokers reduces broker URLs like kafka+ssl://host:9096 to the host:port
// form that sarama expects.
func (c *Config) getBrokers() []string {
	addrs := make([]string, len(c.BrokerAddresses))
	for i, v := range c.BrokerAddresses {
		u, err := url.Parse(v)
		if err != nil {
			log.Fatal(err)
		}
		addrs[i] = u.Host
	}
	return addrs
}
--------------------------------------------------------------------------------
/main.go:
--------------------------------------------------------------------------------
package main

import (
	_ "net/http/pprof" // register pprof handlers on the default mux

	"github.com/segmentio/connect"
)

const (
	Version = "0.0.1-beta"
)

var usage = `
Usage:
  connect-kafka
    --topic=<topic>
    --broker=<url>...
    [--trusted-cert=<path> --client-cert=<path> --client-cert-key=<path>]
  connect-kafka -h | --help
  connect-kafka --version

Options:
  -h --help        Show this screen
  --version        Show version
  --topic=<topic>  Kafka topic name
  --broker=<url>   Kafka broker URL
`

func main() {
	connect.Run(&KafkaIntegration{})
}
--------------------------------------------------------------------------------
/package.sh:
--------------------------------------------------------------------------------
#!/bin/bash
# The more popular operating systems that Go supports.
platforms=(darwin openbsd freebsd linux)
arch=amd64
bucket='connect.segment.com'
install="install-connect-kafka.sh"
name="connect-kafka"
host='https://connect.segment.com.s3-us-west-2.amazonaws.com'

if hash gpg 2>/dev/null; then
  echo 'gpg was found, skipping install...'
else
  echo 'gpg is not installed. Using homebrew to install it...'
  brew install gpg
fi

build() {
  local platform=$1

  mkdir -p target

  echo "Building for $platform on $arch"
  GOOS=$platform GOARCH=$arch go build -ldflags "-s -w" -o "target/$name-$platform-$arch"
}

build_all() {
  # Clean once up front, so binaries built for earlier platforms aren't
  # wiped by later iterations of the loop.
  rm -rf target
  for i in "${platforms[@]}"; do
    build "$i"
  done
}

upload() {
  echo "Uploading artifacts to s3..."

  # Upload the targets to the production S3 bucket.
  aws-vault exec production -- aws s3 cp target/ "s3://$bucket/" --recursive
  aws-vault exec production -- aws s3 cp install.sh "s3://$bucket/$install"

  printf '\n\nInstall script available at %s/%s\n' "$host" "$install"
  printf '\nTo install, run:\n'
  echo "-------------------------------------------"
  echo " $ curl -s $host/$install | bash "
  echo "-------------------------------------------"
}

case "$1" in
  linux)
    rm -rf target
    build 'linux'
    ;;
  *)
    build_all
    upload
    ;;
esac
--------------------------------------------------------------------------------