├── .gitignore
├── Dockerfile
├── Makefile
├── README.md
├── install.sh
├── integration.go
├── internal
│   └── kafka
│       ├── client.go
│       └── config.go
├── main.go
└── package.sh
/.gitignore:
--------------------------------------------------------------------------------
connect-kafka
target/
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
FROM ubuntu

COPY target/connect-kafka-linux-amd64 /connect-kafka

ENTRYPOINT ["/connect-kafka"]
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
build:
	@mkdir -p target
	@go build -o target/connect-kafka

clean:
	@rm -rf target

docker:
	@./package.sh linux > /dev/null
	@docker build -t segment/connect-kafka . > /dev/null

docker-push: docker
	@docker push segment/connect-kafka

.PHONY: build clean docker docker-push
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# connect-kafka

This program is an example implementation of a [Segment](https://segment.com/) [Webhook](https://segment.com/docs/integrations/webhooks) consumer that publishes events to [Kafka](http://kafka.apache.org/).

This is not an officially supported Segment product, but is meant to demonstrate a simple server that you can fork or emulate to route Segment data to your internal systems. It may even suit your needs as is!

## Features

`connect-kafka` is a simple server that you deploy in your infrastructure and expose to the internet. It listens for Segment events and forwards them to the Kafka topic of your choice (the core loop is sketched after this list).

- Easily forward web, mobile, and server analytics events to your Kafka instance
- Deploys in your infrastructure
- Supports any Kafka cluster
- Built with [Heroku Kafka](https://www.heroku.com/kafka) support in mind (with public/private space support)
- Supports both SSL and plaintext connections to your cluster
- Supports all Segment standard methods (`identify`, `track`, `page`, `screen`, `group`)

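To make the data flow concrete, the sketch below condenses what `main.go` and `integration.go` do: every webhook POST body is published verbatim to a single topic. The broker address, topic, and listen address are hardcoded here purely for illustration; the real binary takes them as flags.

```go
package main

import (
	"io/ioutil"
	"log"
	"net/http"

	"github.com/Shopify/sarama"
)

func main() {
	// With a nil config, sarama enables the Successes channel that a
	// SyncProducer requires.
	producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, nil)
	if err != nil {
		log.Fatal(err)
	}

	http.HandleFunc("/listen", func(w http.ResponseWriter, r *http.Request) {
		body, err := ioutil.ReadAll(r.Body)
		r.Body.Close()
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// Forward the raw Segment payload, unmodified.
		_, _, err = producer.SendMessage(&sarama.ProducerMessage{
			Topic: "segment",
			Value: sarama.ByteEncoder(body),
		})
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})

	log.Fatal(http.ListenAndServe("localhost:3000", nil))
}
```
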
## Quickstart

1. *Connect to Kafka* - connect `connect-kafka` to your Kafka instance.
2. *Set up Webhook* - enter connect-kafka's listen address into your Segment webhook settings.

## FAQ

#### Does this support shared secret authentication?

Not yet, though we'd love a contribution that adds it!
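
If you'd like to experiment in a fork, below is a rough sketch of what verification could look like. Treat the details as assumptions rather than Segment's documented contract: the `X-Signature` header name and the HMAC-SHA1 digest are placeholders to check against Segment's current webhook docs.

```go
package main

import (
	"bytes"
	"crypto/hmac"
	"crypto/sha1"
	"encoding/hex"
	"io/ioutil"
	"log"
	"net/http"
)

// verifySignature is a hypothetical middleware that rejects requests whose
// body does not carry a valid HMAC signature. Header name and digest are
// assumptions, not Segment's documented contract.
func verifySignature(secret string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, err := ioutil.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "bad body", http.StatusBadRequest)
			return
		}
		mac := hmac.New(sha1.New, []byte(secret))
		mac.Write(body)
		want := hex.EncodeToString(mac.Sum(nil))
		if !hmac.Equal([]byte(want), []byte(r.Header.Get("X-Signature"))) {
			http.Error(w, "bad signature", http.StatusUnauthorized)
			return
		}
		// Restore the body for downstream handlers.
		r.Body = ioutil.NopCloser(bytes.NewReader(body))
		next.ServeHTTP(w, r)
	})
}

func main() {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	http.Handle("/listen", verifySignature("my-shared-secret", ok))
	log.Fatal(http.ListenAndServe("localhost:3000", nil))
}
```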

#### How do Segment Webhooks behave if my server goes down?

If your server becomes unavailable, we retry each request 5 times over the course of an hour.

#### Will the events arrive in order?

Because we're dealing with unbounded streaming data, we can't guarantee that your events arrive in the absolute order in which they were collected on your client devices. We therefore recommend using the `timestamp` field on each message with event-time windowing approaches in your destinations and streaming data applications.

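For example, a downstream consumer can bucket events by their `timestamp` field rather than by arrival time. A minimal sketch of that idea (the field name comes from the Segment spec; the one-minute window and sample event are arbitrary):

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// windowKey buckets a raw Segment message into a fixed event-time window
// using its `timestamp` field instead of its arrival time.
func windowKey(raw []byte, window time.Duration) (time.Time, error) {
	var msg struct {
		Timestamp time.Time `json:"timestamp"`
	}
	if err := json.Unmarshal(raw, &msg); err != nil {
		return time.Time{}, err
	}
	return msg.Timestamp.Truncate(window), nil
}

func main() {
	event := []byte(`{"type":"track","event":"Signed Up","timestamp":"2016-06-02T00:19:19.292Z"}`)
	key, _ := windowKey(event, time.Minute)
	fmt.Println(key) // 2016-06-02 00:19:00 +0000 UTC
}
```
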
### Connect to Kafka

Download and install `connect-kafka` using curl:

```bash
curl -s http://connect.segment.com/install-connect-kafka.sh | bash
```

If you'd rather fetch just the binary and install it yourself:

```bash
curl -O http://connect.segment.com/connect-kafka-darwin-amd64
```

You can also use Docker:

```bash
make docker
docker run segment/connect-kafka [...]
```

You can connect to any Kafka deployment, internal or hosted:

```
$ connect-kafka -h

Usage:
  connect-kafka
    [--debug]
    --topic=<topic>
    --broker=<url>...
    [--listen=<addr>]
    [--trusted-cert=<path> --client-cert=<path> --client-cert-key=<path>]
  connect-kafka -h | --help
  connect-kafka --version

Options:
  -h --help        Show this screen
  --version        Show version
  --topic=<topic>  Kafka topic name
  --listen=<addr>  Address to listen on [default: localhost:3000]
  --broker=<url>   Kafka broker URL
```
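
The `docker` make target cross-compiles a Linux binary via `./package.sh linux` and copies it into the image (see the Dockerfile), so building the image requires nothing beyond Go's own cross-compilation.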

#### Heroku Kafka

Below is an example of connecting to Heroku Kafka in a public space (via SSL):

```bash
go get -u github.com/segment-integrations/connect-kafka
heroku config:get KAFKA_URL -a kafka-integration-demo # copy the kafka broker urls into the command below
heroku config:get KAFKA_TRUSTED_CERT -a kafka-integration-demo > kafka_trusted_cert.cer
heroku config:get KAFKA_CLIENT_CERT -a kafka-integration-demo > kafka_client_cert.cer
heroku config:get KAFKA_CLIENT_CERT_KEY -a kafka-integration-demo > kafka_client_cert_key.cer
connect-kafka \
  --debug \
  --topic=segment \
  --broker=kafka+ssl://ec2-51-16-10-109.compute-1.amazonaws.com:9096 \
  --broker=kafka+ssl://ec2-62-7-61-181.compute-1.amazonaws.com:9096 \
  --broker=kafka+ssl://ec2-33-20-240-35.compute-1.amazonaws.com:9096 \
  --trusted-cert=kafka_trusted_cert.cer \
  --client-cert=kafka_client_cert.cer \
  --client-cert-key=kafka_client_cert_key.cer
```
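
Under the hood, the three certificate flags are read into a `tls.Config` (see `integration.go`), and each `--broker` URL is parsed down to its `host:port` before being handed to the Kafka client (see `internal/kafka/config.go`). The `kafka+ssl://` scheme is accepted but not itself interpreted; TLS is enabled only when the certificate flags are present.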

### Set up Webhook

1. Go to Segment.com and select the source you want to connect to Kafka.
2. Add your `connect-kafka` server's address to the webhook integration's settings.



## Testing

### via localtunnel

You can open up a localtunnel on your local machine while you're testing:

```
npm install -g localtunnel
lt --port 3000
```

Enter the resulting localtunnel URL, with `/listen` appended, as the Segment webhook, e.g. `https://aqjujyhnck.localtunnel.me/listen`.
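
Once the webhook points at your tunnel, you can also exercise the endpoint directly. A tiny smoke test, assuming the default listen address and the `/listen` path above (the payload is an arbitrary example):

```go
package main

import (
	"bytes"
	"log"
	"net/http"
)

func main() {
	payload := []byte(`{"type":"track","event":"Test Event","userId":"user-1"}`)
	resp, err := http.Post("http://localhost:3000/listen", "application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status) // expect a 2xx if the message reached Kafka
}
```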

## License

MIT
--------------------------------------------------------------------------------
/install.sh:
--------------------------------------------------------------------------------
#!/bin/bash
# Install Script
# Usage: curl -s $addr | bash

platform='unknown'
host='https://connect.segment.com.s3-us-west-2.amazonaws.com'
binary='connect-kafka'
install_dir='/usr/local/bin'
arch='amd64'

if [[ "$OSTYPE" == "linux-gnu" ]]; then
  platform='linux'
elif [[ "$OSTYPE" == "darwin"* ]]; then
  platform='darwin'
elif [[ "$OSTYPE" == "cygwin" ]]; then
  platform='windows'
elif [[ "$OSTYPE" == "msys" ]]; then
  platform='windows'
elif [[ "$OSTYPE" == "win32" ]]; then
  echo 'Platform is not supported!'
  exit 1
elif [[ "$OSTYPE" == "freebsd"* ]]; then
  platform='freebsd'
else
  echo 'Platform is not supported!'
  exit 1
fi

echo "Installing $binary for $platform/$arch..."
echo "Fetching $host/$binary-$platform-$arch"

# -f fails on HTTP errors instead of saving the error page as the binary,
# and -o overwrites any previous install rather than appending to it.
curl -fsS "$host/$binary-$platform-$arch" -o "$install_dir/$binary"
chmod +x "$install_dir/$binary"

size=$(wc -c <"$install_dir/$binary")

echo "Size: $size"

echo "$binary was installed successfully to $install_dir"
--------------------------------------------------------------------------------
/integration.go:
--------------------------------------------------------------------------------
package main

import (
	"crypto/tls"
	"crypto/x509"
	"io"
	"io/ioutil"

	"github.com/Shopify/sarama"
	log "github.com/Sirupsen/logrus"
	"github.com/segment-integrations/connect-kafka/internal/kafka"
	"github.com/tj/docopt"
)

// KafkaIntegration receives Segment webhook payloads and publishes them
// to a single Kafka topic.
type KafkaIntegration struct {
	topic    string
	producer sarama.SyncProducer
}

// newTLSFromConfig builds a tls.Config from the --trusted-cert,
// --client-cert and --client-cert-key flags. It returns nil when none of
// the flags are set, which disables TLS entirely.
func (k *KafkaIntegration) newTLSFromConfig(m map[string]interface{}) *tls.Config {
	trustedCertPath, _ := m["--trusted-cert"].(string)
	clientCertPath, _ := m["--client-cert"].(string)
	clientCertKeyPath, _ := m["--client-cert-key"].(string)

	if trustedCertPath == "" && clientCertPath == "" && clientCertKeyPath == "" {
		return nil
	}

	trustedCertBytes, err := ioutil.ReadFile(trustedCertPath)
	if err != nil {
		log.Fatal(err)
	}

	clientCertBytes, err := ioutil.ReadFile(clientCertPath)
	if err != nil {
		log.Fatal(err)
	}

	clientCertKeyBytes, err := ioutil.ReadFile(clientCertKeyPath)
	if err != nil {
		log.Fatal(err)
	}

	cert, err := tls.X509KeyPair(clientCertBytes, clientCertKeyBytes)
	if err != nil {
		log.Fatal(err)
	}
	certPool := x509.NewCertPool()
	certPool.AppendCertsFromPEM(trustedCertBytes)

	// InsecureSkipVerify disables Go's built-in certificate and hostname
	// checks (Heroku Kafka brokers present certificates that don't match
	// their hostnames); the client certificate is still sent to the broker.
	tlsConfig := &tls.Config{
		Certificates:       []tls.Certificate{cert},
		InsecureSkipVerify: true,
		RootCAs:            certPool,
	}
	tlsConfig.BuildNameToCertificate()

	return tlsConfig
}

// Init parses the command-line flags and connects the producer.
func (k *KafkaIntegration) Init() error {
	m, err := docopt.Parse(usage, nil, true, Version, false)
	if err != nil {
		return err
	}

	kafkaConfig := &kafka.Config{BrokerAddresses: m["--broker"].([]string)}
	kafkaConfig.TLSConfig = k.newTLSFromConfig(m)

	producer, err := kafka.NewProducer(kafkaConfig)
	if err != nil {
		return err
	}

	k.producer = producer
	k.topic = m["--topic"].(string)

	return nil
}

// Process publishes the raw request body as a single Kafka message.
func (k *KafkaIntegration) Process(r io.ReadCloser) error {
	defer r.Close()
	b, err := ioutil.ReadAll(r)
	if err != nil {
		return err
	}

	_, _, err = k.producer.SendMessage(&sarama.ProducerMessage{
		Topic: k.topic,
		Value: sarama.ByteEncoder(b),
	})

	return err
}
--------------------------------------------------------------------------------
/internal/kafka/client.go:
--------------------------------------------------------------------------------
package kafka

import (
	"github.com/Shopify/sarama"
	"github.com/satori/go.uuid"
)

// NewProducer creates a synchronous producer from the given config,
// enabling TLS when a tls.Config is present.
func NewProducer(c *Config) (sarama.SyncProducer, error) {
	config := sarama.NewConfig()
	// sarama's SyncProducer requires both Errors and Successes to be
	// returned; NewSyncProducer rejects the config otherwise.
	config.Producer.Return.Errors = true
	config.Producer.Return.Successes = true
	config.ClientID = uuid.NewV4().String()

	// TLS
	if c.TLSConfig != nil {
		config.Net.TLS.Config = c.TLSConfig
		config.Net.TLS.Enable = true
	}

	err := config.Validate()
	if err != nil {
		return nil, err
	}

	producer, err := sarama.NewSyncProducer(c.getBrokers(), config)
	if err != nil {
		return nil, err
	}

	return producer, nil
}
--------------------------------------------------------------------------------
/internal/kafka/config.go:
--------------------------------------------------------------------------------
package kafka

import (
	"crypto/tls"
	"log"
	"net/url"
)

// Config carries everything needed to reach the cluster: one or more
// broker URLs plus an optional TLS configuration.
type Config struct {
	BrokerAddresses []string
	TLSConfig       *tls.Config
}

// getBrokers reduces each broker URL (e.g. kafka+ssl://host:9096) to the
// host:port form that sarama expects; the scheme itself is ignored.
func (c *Config) getBrokers() []string {
	addrs := make([]string, len(c.BrokerAddresses))
	for i, v := range c.BrokerAddresses {
		u, err := url.Parse(v)
		if err != nil {
			log.Fatal(err)
		}
		addrs[i] = u.Host
	}
	return addrs
}
--------------------------------------------------------------------------------
/main.go:
--------------------------------------------------------------------------------
package main

import (
	_ "net/http/pprof"

	"github.com/segmentio/connect"
)

const (
	Version = "0.0.1-beta"
)

var usage = `
Usage:
  connect-kafka
    [--debug]
    --topic=<topic>
    --broker=<url>...
    [--listen=<addr>]
    [--trusted-cert=<path> --client-cert=<path> --client-cert-key=<path>]
  connect-kafka -h | --help
  connect-kafka --version

Options:
  -h --help        Show this screen
  --version        Show version
  --topic=<topic>  Kafka topic name
  --listen=<addr>  Address to listen on [default: localhost:3000]
  --broker=<url>   Kafka broker URL
`

func main() {
	connect.Run(&KafkaIntegration{})
}
--------------------------------------------------------------------------------
/package.sh:
--------------------------------------------------------------------------------
#!/usr/bin/env bash
# Cross-compiles connect-kafka and uploads the artifacts to S3.

# The more popular operating systems that Go supports.
platforms=(darwin openbsd freebsd linux)
arch=amd64
bucket='connect.segment.com'
install="install-connect-kafka.sh"
name="connect-kafka"
host='https://connect.segment.com.s3-us-west-2.amazonaws.com'

if hash gpg 2>/dev/null; then
  echo 'gpg was found, skipping...'
else
  echo 'gpg is not installed. Using homebrew to install it...'
  brew install gpg
fi


build() {
  local platform=$1

  mkdir -p target

  echo "Building for $platform on $arch"
  GOOS=$platform GOARCH=$arch go build -ldflags "-s -w" -o "target/$name-$platform-$arch"
}

build_all() {
  # Clean once up front; cleaning inside build() would throw away every
  # platform's binary except the last one built.
  rm -rf target
  for platform in "${platforms[@]}"; do
    build "$platform"
  done
}

upload() {
  echo "Uploading artifacts to s3..."

  # Upload the targets to the production S3 bucket.
  aws-vault exec production -- aws s3 cp target/ "s3://$bucket/" --recursive
  aws-vault exec production -- aws s3 cp install.sh "s3://$bucket/$install"

  # echo -e so the \n escapes print as actual newlines.
  echo -e "\n\nInstall script available at $host/$install"
  echo -e "\nTo install, run:"
  echo "-------------------------------------------"
  echo " $ curl -s $host/$install | bash "
  echo "-------------------------------------------"
}

case $1 in
  linux)
    rm -rf target
    build 'linux'
    ;;
  *)
    build_all
    upload
    ;;
esac
--------------------------------------------------------------------------------