├── .github
│   └── workflows
│       └── docs.yml
├── .gitignore
├── ARCHITECTURE.md
├── Dockerfile
├── Dockerfile.default
├── LICENSE
├── Makefile
├── README.md
├── aggregator
│   ├── cluster.go
│   ├── data.go
│   ├── kafka
│   │   ├── crc32_field.go
│   │   ├── decoder.go
│   │   ├── decompress.go
│   │   ├── errors.go
│   │   ├── fetch_response.go
│   │   ├── length_field.go
│   │   ├── message.go
│   │   ├── message_set.go
│   │   ├── produce_request.go
│   │   ├── real_decoder.go
│   │   ├── record.go
│   │   ├── record_batch.go
│   │   ├── records.go
│   │   ├── request.go
│   │   ├── response_header.go
│   │   ├── timestamp.go
│   │   ├── versions.go
│   │   └── ztsd.go
│   ├── persist.go
│   ├── pg_test.go
│   ├── sock_line_test.go
│   ├── sock_num_line.go
│   └── socket.go
├── config
│   └── db.go
├── cri
│   ├── container.go
│   └── cri.go
├── datastore
│   ├── backend.go
│   ├── datastore.go
│   ├── dto.go
│   └── payload.go
├── docs
│   └── syscalls.txt
├── ebpf-builder
│   └── Dockerfile
├── ebpf
│   ├── bpf.go
│   ├── c
│   │   ├── amqp.c
│   │   ├── bpf.c
│   │   ├── bpf_bpfeb.go
│   │   ├── bpf_bpfeb.o
│   │   ├── bpf_bpfel.go
│   │   ├── bpf_bpfel.o
│   │   ├── generate.go
│   │   ├── go_internal.h
│   │   ├── http.c
│   │   ├── http2.c
│   │   ├── kafka.c
│   │   ├── l7.c
│   │   ├── loader.go
│   │   ├── macros.h
│   │   ├── map.h
│   │   ├── mongo.c
│   │   ├── mysql.c
│   │   ├── openssl.c
│   │   ├── postgres.c
│   │   ├── proc.c
│   │   ├── redis.c
│   │   ├── struct.h
│   │   ├── tcp.c
│   │   └── tcp_sock.c
│   ├── collector.go
│   ├── headers
│   │   ├── bpf.h
│   │   ├── bpf_core_read.h
│   │   ├── bpf_helper_defs.h
│   │   ├── bpf_helpers.h
│   │   ├── common.h
│   │   ├── l7_req.h
│   │   ├── log.h
│   │   ├── pt_regs.h
│   │   ├── tcp.h
│   │   └── vmlinux.h
│   ├── l7_req
│   │   └── l7.go
│   ├── proc
│   │   └── proc.go
│   ├── ssllib.go
│   ├── ssllib_test.go
│   └── tcp_state
│       └── tcp.go
├── go.mod
├── go.sum
├── gpu
│   ├── collector.go
│   └── nvml.go
├── k8s
│   ├── daemonset.go
│   ├── deployment.go
│   ├── endpoints.go
│   ├── informer.go
│   ├── pod.go
│   ├── replicaset.go
│   ├── service.go
│   └── statefulset.go
├── log
│   └── logger.go
├── logstreamer
│   ├── caCert.go
│   ├── pool.go
│   └── stream.go
├── main.go
├── main_benchmark_test.go
├── resources
│   └── alaz.yaml
└── testconfig
    └── config1.json
/.github/workflows/docs.yml: -------------------------------------------------------------------------------- 1 | name: Documentation 2 | 3 | on: 4 | push: 5 | branches: 6 | - master 7 | - develop 8 | pull_request: 9 | branches: 10 | - master 11 | - develop 12 | 13 | jobs: 14 | link-checker: 15 | name: Check links 16 | runs-on: ubuntu-latest 17 | steps: 18 | - name: Checkout the repository 19 | uses: actions/checkout@v4 20 | 21 | - name: Check the links 22 | uses: lycheeverse/lychee-action@v1 23 | with: 24 | args: --max-concurrency 1 -v *.md 25 | fail: true 26 | env: 27 | GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 28 | 29 | spelling-checker: 30 | name: Check spelling 31 | runs-on: ubuntu-latest 32 | steps: 33 | - name: Checkout the repository 34 | uses: actions/checkout@v4 35 | 36 | - name: Check spelling mistakes 37 | uses: codespell-project/actions-codespell@master 38 | with: 39 | check_filenames: true 40 | check_hidden: true 41 | path: "*.md" 42 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | .vscode/settings.json 3 | .vscode/launch.json 4 | -------------------------------------------------------------------------------- /ARCHITECTURE.md: -------------------------------------------------------------------------------- 1 | # Alaz Architecture 2 | 3 | 4 | 5 | - [1. Kubernetes Client](#1-kubernetes-client) 6 | - [2. Container Runtimes (`containerd`)](#2-container-runtimes-containerd) 7 | - [3. eBPF Programs](#3-ebpf-programs) 8 | - [Note](#note) 9 | - [How to Build](#how-to-build) 10 | - [How to Deploy](#how-to-deploy) 11 | 12 | 13 | 14 | Alaz is designed to run in a Kubernetes cluster as an agent, deployed as a DaemonSet (it runs on each cluster node separately). 15 | 16 | It watches and pulls data from the cluster to gain visibility into it. 17 | 18 | It gathers information from 3 different sources: 19 | 20 | ## 1. Kubernetes Client 21 | 22 | Using the Kubernetes client, it watches for different types of events related to Kubernetes resources, such as **ADD, UPDATE, DELETE** events for any kind of K8s resource (**Pods, Deployments, Services**, etc.). 23 | 24 | We use the following packages: 25 | 26 | - `k8s.io/api/core/v1` 27 | - `k8s.io/apimachinery/pkg/util/runtime` 28 | - `k8s.io/client-go` 29 |
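Below is a minimal, hypothetical sketch of how such a watcher can be wired up with `client-go` shared informers (this is not Alaz's actual code; the handler bodies and the resync interval are illustrative):

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig() // the agent runs inside the cluster
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// A shared informer factory delivers ADD/UPDATE/DELETE events per resource kind.
	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { fmt.Println("ADD", obj.(*corev1.Pod).Name) },
		UpdateFunc: func(_, newObj interface{}) { fmt.Println("UPDATE", newObj.(*corev1.Pod).Name) },
		DeleteFunc: func(obj interface{}) { fmt.Println("DELETE") }, // obj may be a tombstone
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	<-stop // block; a real agent would tie this to shutdown handling
}
```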
30 | ## 2. Container Runtimes (`containerd`) 31 | 32 | There are different container runtimes available for K8s clusters, like containerd, CRI-O, Docker, etc. 33 | By connecting to the chosen container runtime's socket, Alaz is able to gather more detailed information on the containers running on a node: 34 | 35 | - the log directory of the container, 36 | - information related to its sandbox, 37 | - PID, 38 | - cgroups, 39 | - environment variables, 40 | - etc. 41 | 42 | > We do not currently take container runtime data into consideration; we do not need it for today's objectives. It will be used later on for collecting more detailed data. 43 | 44 | ## 3. eBPF Programs 45 | 46 | In Alaz's eBPF directory there are a couple of eBPF programs written in C using libbpf. 47 | 48 | In order to compile these programs, we have an **eBPF-builder image** that contains the necessary dependencies, such as clang, llvm, libbpf, and Go. 49 | 50 | > eBPF programs are compiled in the aforementioned container, leveraging the [Cilium bpf2go package](https://github.com/cilium/ebpf/tree/main/cmd/bpf2go). 51 | 52 | Using the go generate directive with `bpf2go`, we compile the eBPF programs and generate the Go helper files needed to interact with them: 53 | 54 | - Link the program to a tracepoint or a kprobe. 55 | - Read BPF maps from user space and pass the data on for interpretation. 56 | 57 | The packages used from Cilium are: 58 | 59 | - `github.com/cilium/ebpf/link` 60 | - `github.com/cilium/ebpf/perf` 61 | - `github.com/cilium/ebpf/rlimit` 62 | 63 | eBPF programs: 64 | 65 | - `tcp_state`: Detects newly established, closed, and listening TCP connections, and keeps track of which remote address each socket of a given process (PID) is connected to. Keeping this data together with the file descriptor is useful. 66 | - `l7_req`: Monitors both incoming and outgoing payloads by tracking the read/write syscalls and uprobes. We then use `tcp_state` to aggregate the data we receive, allowing us to determine who sent which request to where. A sketch of the load-and-attach flow follows below.
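To make the flow above concrete, here is a minimal, hypothetical sketch of loading a compiled object and attaching one program with the Cilium packages (this is not Alaz's actual loader; the object path, program name `sys_enter_write`, and map name `l7_events` are illustrative, and bpf2go normally generates typed wrappers for the loading step):

```go
package main

import (
	"log"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/perf"
	"github.com/cilium/ebpf/rlimit"
)

func main() {
	// Allow locking memory for eBPF maps on older kernels.
	if err := rlimit.RemoveMemlock(); err != nil {
		log.Fatal(err)
	}

	// Load a compiled (and normally embedded) object file.
	spec, err := ebpf.LoadCollectionSpec("bpf_bpfel.o")
	if err != nil {
		log.Fatal(err)
	}
	coll, err := ebpf.NewCollection(spec)
	if err != nil {
		log.Fatal(err)
	}
	defer coll.Close()

	// Attach one of the programs to a syscall tracepoint.
	tp, err := link.Tracepoint("syscalls", "sys_enter_write", coll.Programs["sys_enter_write"], nil)
	if err != nil {
		log.Fatal(err)
	}
	defer tp.Close()

	// Read events that the program pushes through a perf event array map.
	rd, err := perf.NewReader(coll.Maps["l7_events"], 4096)
	if err != nil {
		log.Fatal(err)
	}
	defer rd.Close()

	for {
		rec, err := rd.Read()
		if err != nil {
			log.Fatal(err)
		}
		log.Printf("got %d bytes from the kernel", len(rec.RawSample))
	}
}
```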
67 | 68 | Current programs are generally attached to kernel tracepoints like: 69 | 70 | ``` 71 | tracepoint/syscalls/sys_enter_write (l7_req) 72 | tracepoint/syscalls/sys_exit_write (l7_req) 73 | tracepoint/syscalls/sys_enter_sendto (l7_req) 74 | tracepoint/syscalls/sys_exit_sendto (l7_req) 75 | tracepoint/syscalls/sys_enter_read (l7_req) 76 | tracepoint/syscalls/sys_exit_read (l7_req) 77 | tracepoint/syscalls/sys_enter_recvfrom (l7_req) 78 | tracepoint/syscalls/sys_exit_recvfrom (l7_req) 79 | tracepoint/sock/inet_sock_set_state (tcp_state) 80 | tracepoint/syscalls/sys_enter_connect (tcp_state) 81 | tracepoint/syscalls/sys_exit_connect (tcp_state) 82 | ``` 83 | 84 | uprobes: 85 | 86 | ``` 87 | SSL_write 88 | SSL_read 89 | crypto/tls.(*Conn).Write 90 | crypto/tls.(*Conn).Read 91 | ``` 92 | 93 | ### Note 94 | 95 | Uretprobes crash Go applications. 96 | 97 | That's why, as a workaround, we disassemble the executable, find the addresses of return instructions, and attach classic uprobes to them. 98 | 99 | ## How to Build 100 | 101 | Alaz embeds the compiled eBPF programs in its binary. After the compilation process on the eBPF-builder is done, the compiled programs are placed in the project structure. 102 | 103 | Using Go's **//go:embed** directive, we embed the _.o_ files and load them into the kernel using the [Cilium eBPF package](https://github.com/cilium/ebpf). 104 | 105 | Then we build Alaz more or less like an ordinary Go app, since the compiled objects are embedded. 106 | 107 | ## How to Deploy 108 | 109 | Alaz is deployed as a privileged DaemonSet resource on the cluster. It is required to run as a privileged container since it needs read access to the `/proc` directory of the host machine. 110 | 111 | Alaz's `serviceAccount` must also be associated with `ClusterRole` and `ClusterRoleBinding` resources in order to be able to talk to the K8s API server. 112 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM golang:1.22.4-bullseye AS builder 2 | WORKDIR /app 3 | COPY . ./ 4 | RUN apt update 5 | 6 | ARG VERSION 7 | ENV GOCACHE=/root/.cache/go-build 8 | RUN go mod tidy -v 9 | RUN --mount=type=cache,target="/root/.cache/go-build" GOOS=linux go build -ldflags="-X 'github.com/ddosify/alaz/datastore.tag=$VERSION'" -o alaz 10 | 11 | FROM registry.access.redhat.com/ubi9/ubi-minimal:9.3-1552 12 | RUN microdnf update -y && microdnf install procps ca-certificates -y && microdnf clean all 13 | 14 | COPY --chown=1001:0 --from=builder /app/alaz ./bin/ 15 | COPY --chown=1001:0 LICENSE /licenses/LICENSE 16 | 17 | USER 1001 18 | ENTRYPOINT ["alaz"] 19 | -------------------------------------------------------------------------------- /Dockerfile.default: -------------------------------------------------------------------------------- 1 | FROM golang:1.22.5-bullseye AS builder 2 | WORKDIR /app 3 | COPY . 
./ 4 | RUN apt update 5 | 6 | ARG VERSION 7 | ENV GOCACHE=/root/.cache/go-build 8 | RUN go mod tidy -v 9 | RUN --mount=type=cache,target="/root/.cache/go-build" GOOS=linux go build -ldflags="-X 'github.com/ddosify/alaz/datastore.tag=$VERSION'" -o alaz 10 | 11 | FROM debian:12.6-slim 12 | RUN apt-get update && apt-get install -y procps ca-certificates && rm -rf /var/lib/apt/lists/* 13 | 14 | COPY --chown=0:0 --from=builder /app/alaz ./bin/ 15 | ENTRYPOINT ["alaz"] 16 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # The development version of clang is distributed as the 'clang' binary, 2 | # while stable/released versions have a version number attached. 3 | # Pin the default clang to a stable version. 4 | CLANG ?= clang-14 5 | STRIP ?= llvm-strip-14 6 | OBJCOPY ?= llvm-objcopy-14 7 | TARGET_ARCH ?= arm64 # x86 or arm64 8 | CFLAGS := -O2 -g -Wall -Werror -D__TARGET_ARCH_$(TARGET_ARCH) $(CFLAGS) 9 | 10 | # Obtain an absolute path to the directory of the Makefile. 11 | # Assume the Makefile is in the root of the repository. 12 | REPODIR := $(shell dirname $(realpath $(firstword $(MAKEFILE_LIST)))) 13 | UIDGID := $(shell stat -c '%u:%g' ${REPODIR}) 14 | 15 | # Prefer podman if installed, otherwise use docker. 16 | # Note: Setting the var at runtime will always override. 17 | CONTAINER_ENGINE ?= docker 18 | CONTAINER_RUN_ARGS ?= $(--user "${UIDGID}") 19 | 20 | IMAGE_GENERATE := ebpf-builder 21 | VERSION_GENERATE := v1-1.22.1 22 | GENERATE_DOCKERFILE := ebpf-builder/Dockerfile 23 | 24 | # clang <8 doesn't tag relocs properly (STT_NOTYPE) 25 | # clang 9 is the first version emitting BTF 26 | TARGETS := \ 27 | 28 | .PHONY: go_builder_image_build 29 | go_builder_image_build: 30 | docker build -t ${IMAGE_GENERATE}:${VERSION_GENERATE} -f ${GENERATE_DOCKERFILE} . 31 | 32 | 33 | .PHONY: all clean go_generate container-shell generate 34 | 35 | .DEFAULT_TARGET = go_generate 36 | 37 | # Build all ELF binaries using a containerized LLVM toolchain. 38 | go_generate: 39 | +${CONTAINER_ENGINE} run --rm ${CONTAINER_RUN_ARGS} \ 40 | -v "${REPODIR}":/ebpf -w /ebpf --env MAKEFLAGS \ 41 | --env CFLAGS="-fdebug-prefix-map=/ebpf=." \ 42 | --env HOME="/tmp" \ 43 | "${IMAGE_GENERATE}:${VERSION_GENERATE}" \ 44 | make all 45 | 46 | # (debug) Drop the user into a shell inside the container as root. 47 | container-shell: 48 | ${CONTAINER_ENGINE} run --rm -ti \ 49 | -v "${REPODIR}":/ebpf -w /ebpf \ 50 | "${IMAGE_GENERATE}:${VERSION_GENERATE}" 51 | 52 | 53 | all: generate 54 | 55 | # $BPF_CLANG is used in go:generate invocations. 56 | generate: export BPF_CLANG := $(CLANG) 57 | generate: export BPF_CFLAGS := $(CFLAGS) 58 | generate: 59 | go generate ./... 
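# Note: the pattern rules below compile a single .c source into little-endian
# (bpfel) and big-endian (bpfeb) eBPF object files; e.g. an (illustrative)
# `make ebpf/c/bpf-el.elf` would build ebpf/c/bpf.c for little-endian targets.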
60 | 61 | %-el.elf: %.c 62 | $(CLANG) $(CFLAGS) -target bpfel -g -c $< -o $@ 63 | $(STRIP) -g $@ 64 | 65 | %-eb.elf : %.c 66 | $(CLANG) $(CFLAGS) -target bpfeb -c $< -o $@ 67 | $(STRIP) -g $@ 68 | 69 | 70 | ## Alaz Image 71 | 72 | ALAZ_IMAGE_NAME := alaz 73 | ALAZ_TAG ?= latest 74 | REGISTRY ?= ddosify 75 | ALAZ_DOCKERFILE ?= Dockerfile.default 76 | BUILDX_BUILDER := buildx-multi-arch 77 | 78 | ifeq ($(TARGET_ARCH), arm64) 79 | DOCKER_PLATFORM := linux/arm64 80 | else 81 | DOCKER_PLATFORM := linux/amd64 82 | endif 83 | 84 | .PHONY: build_push_buildx 85 | build_push_buildx: 86 | docker buildx inspect $(BUILDX_BUILDER) || \ 87 | docker buildx create --name=$(BUILDX_BUILDER) && \ 88 | docker buildx build --push --platform=$(DOCKER_PLATFORM) --builder=$(BUILDX_BUILDER) --build-arg ALAZ_TAG=$(ALAZ_TAG) --build-arg VERSION=$(ALAZ_TAG) --tag=$(REGISTRY)/$(ALAZ_IMAGE_NAME):$(ALAZ_TAG)-$(TARGET_ARCH) -f $(ALAZ_DOCKERFILE) . 89 | 90 | .PHONY: docker_merge_platforms 91 | docker_merge_platforms: 92 | docker buildx imagetools create --tag $(REGISTRY)/$(ALAZ_IMAGE_NAME):$(ALAZ_TAG) $(REGISTRY)/$(ALAZ_IMAGE_NAME):$(ALAZ_TAG)-arm64 $(REGISTRY)/$(ALAZ_IMAGE_NAME):$(ALAZ_TAG)-x86 93 | 94 | .PHONY: build_push 95 | build_push: 96 | docker build --build-arg VERSION=$(ALAZ_TAG) -t $(REGISTRY)/$(ALAZ_IMAGE_NAME):$(ALAZ_TAG) -f $(ALAZ_DOCKERFILE) . 97 | docker push $(REGISTRY)/$(ALAZ_IMAGE_NAME):$(ALAZ_TAG) 98 | 99 | # make go_builder_image_build 100 | # ALAZ_TAG=latest 101 | # make go_generate TARGET_ARCH=arm64 102 | # make build_push_buildx TARGET_ARCH=arm64 ALAZ_TAG=$ALAZ_TAG 103 | 104 | # make go_generate TARGET_ARCH=x86 105 | # make build_push_buildx TARGET_ARCH=x86 ALAZ_TAG=$ALAZ_TAG 106 | 107 | # make docker_merge_platforms ALAZ_TAG=$ALAZ_TAG 108 | -------------------------------------------------------------------------------- /aggregator/cluster.go: -------------------------------------------------------------------------------- 1 | package aggregator 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "sync" 7 | "sync/atomic" 8 | 9 | "github.com/ddosify/alaz/log" 10 | "k8s.io/apimachinery/pkg/types" 11 | ) 12 | 13 | type ClusterInfo struct { 14 | k8smu sync.RWMutex 15 | PodIPToPodUid map[string]types.UID `json:"podIPToPodUid"` 16 | ServiceIPToServiceUid map[string]types.UID `json:"serviceIPToServiceUid"` 17 | 18 | // Pid -> SocketMap 19 | // pid -> fd -> {saddr, sport, daddr, dport} 20 | SocketMaps []*SocketMap // index symbolizes pid 21 | socketMapsmu sync.Mutex 22 | 23 | // Below mutexes guard socketMaps, set to mu inside SocketMap struct 24 | // Used to find the correct mutex for the process, some pids can share the same mutex 25 | muIndex atomic.Uint64 26 | muArray []*sync.RWMutex 27 | 28 | signalChan chan uint32 // pids are signaled on this channel to notify clusterInfo struct to initialize a SocketMap 29 | } 30 | 31 | func newClusterInfo(liveProcCount int) *ClusterInfo { 32 | ci := &ClusterInfo{ 33 | PodIPToPodUid: map[string]types.UID{}, 34 | ServiceIPToServiceUid: map[string]types.UID{}, 35 | } 36 | ci.signalChan = make(chan uint32) 37 | sockMaps := make([]*SocketMap, maxPid+1) // index=pid 38 | ci.SocketMaps = sockMaps 39 | ci.muIndex = atomic.Uint64{} 40 | 41 | // initialize mutex array 42 | 43 | // normally, mutex per pid is straightforward solution 44 | // on regular systems, maxPid is around 32768 45 | // so, we allocate 32768 mutexes, which is 32768 * 24 bytes = 786KB 46 | // but on 64-bit systems, maxPid can be 4194304 47 | // and we don't want to allocate 4194304 mutexes, it adds up to 4194304 
* 24 bytes = 100MB 48 | // So, some process will have to share the mutex 49 | 50 | // assume liveprocesses can increase up to 100 times of current count 51 | // if processes exceeds the count of mutex, they will share the mutex 52 | countMuArray := liveProcCount * 100 53 | if countMuArray > maxPid { 54 | countMuArray = maxPid 55 | } 56 | // for 2k processes, 200k mutex => 200k * 24 bytes = 4.80MB 57 | // in case of maxPid is 32678, 32678 * 24 bytes = 784KB, pick the smaller one 58 | ci.muArray = make([]*sync.RWMutex, countMuArray) 59 | go ci.handleSocketMapCreation() 60 | return ci 61 | } 62 | 63 | func (ci *ClusterInfo) SignalSocketMapCreation(pid uint32) { 64 | ci.signalChan <- pid 65 | } 66 | 67 | // events will be processed sequentially here in one goroutine. 68 | // in order to prevent race. 69 | func (ci *ClusterInfo) handleSocketMapCreation() { 70 | for pid := range ci.signalChan { 71 | if ci.SocketMaps[pid] != nil { 72 | continue 73 | } 74 | 75 | ctxPid := context.WithValue(context.Background(), log.LOG_CONTEXT, fmt.Sprint(pid)) 76 | 77 | sockMap := &SocketMap{ 78 | mu: nil, // set below 79 | pid: pid, 80 | M: map[uint64]*SocketLine{}, 81 | waitingFds: make(chan uint64, 1000), 82 | processedFds: map[uint64]struct{}{}, 83 | processedFdsmu: sync.RWMutex{}, 84 | closeCh: make(chan struct{}, 1), 85 | ctx: ctxPid, 86 | } 87 | 88 | ci.muIndex.Add(1) 89 | i := (ci.muIndex.Load()) % uint64(len(ci.muArray)) 90 | ci.muArray[i] = &sync.RWMutex{} 91 | sockMap.mu = ci.muArray[i] 92 | ci.SocketMaps[pid] = sockMap 93 | go sockMap.ProcessSocketLineCreationRequests() 94 | } 95 | } 96 | 97 | func (ci *ClusterInfo) clearProc(pid uint32) { 98 | sm := ci.SocketMaps[pid] 99 | if sm == nil { 100 | return 101 | } 102 | 103 | // stop waiting for socketline creation requests 104 | sm.mu.Lock() 105 | sm.closeCh <- struct{}{} 106 | sm.mu.Unlock() 107 | 108 | // reset 109 | ci.SocketMaps[pid] = nil 110 | } 111 | -------------------------------------------------------------------------------- /aggregator/kafka/crc32_field.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "encoding/binary" 5 | "fmt" 6 | "hash/crc32" 7 | "sync" 8 | ) 9 | 10 | type crcPolynomial int8 11 | 12 | const ( 13 | crcIEEE crcPolynomial = iota 14 | crcCastagnoli 15 | ) 16 | 17 | var crc32FieldPool = sync.Pool{} 18 | 19 | func acquireCrc32Field(polynomial crcPolynomial) *crc32Field { 20 | val := crc32FieldPool.Get() 21 | if val != nil { 22 | c := val.(*crc32Field) 23 | c.polynomial = polynomial 24 | return c 25 | } 26 | return newCRC32Field(polynomial) 27 | } 28 | 29 | func releaseCrc32Field(c *crc32Field) { 30 | crc32FieldPool.Put(c) 31 | } 32 | 33 | var castagnoliTable = crc32.MakeTable(crc32.Castagnoli) 34 | 35 | // crc32Field implements the pushEncoder and pushDecoder interfaces for calculating CRC32s. 
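// The polynomial field selects the lookup table: the legacy message format
// pushes this field with crcIEEE, while the newer RecordBatch format uses
// crcCastagnoli (CRC-32C).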
36 | type crc32Field struct { 37 | startOffset int 38 | polynomial crcPolynomial 39 | } 40 | 41 | func (c *crc32Field) saveOffset(in int) { 42 | c.startOffset = in 43 | } 44 | 45 | func (c *crc32Field) reserveLength() int { 46 | return 4 47 | } 48 | 49 | func newCRC32Field(polynomial crcPolynomial) *crc32Field { 50 | return &crc32Field{polynomial: polynomial} 51 | } 52 | 53 | func (c *crc32Field) check(curOffset int, buf []byte) error { 54 | crc, err := c.crc(curOffset, buf) 55 | if err != nil { 56 | return err 57 | } 58 | 59 | expected := binary.BigEndian.Uint32(buf[c.startOffset:]) 60 | if crc != expected { 61 | return PacketDecodingError{fmt.Sprintf("CRC didn't match expected %#x got %#x", expected, crc)} 62 | } 63 | 64 | return nil 65 | } 66 | 67 | func (c *crc32Field) crc(curOffset int, buf []byte) (uint32, error) { 68 | var tab *crc32.Table 69 | switch c.polynomial { 70 | case crcIEEE: 71 | tab = crc32.IEEETable 72 | case crcCastagnoli: 73 | tab = castagnoliTable 74 | default: 75 | return 0, PacketDecodingError{"invalid CRC type"} 76 | } 77 | return crc32.Checksum(buf[c.startOffset+4:curOffset], tab), nil 78 | } 79 | -------------------------------------------------------------------------------- /aggregator/kafka/decoder.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | type versionedDecoder interface { 4 | decode(pd packetDecoder, version int16) error 5 | } 6 | 7 | type packetDecoder interface { 8 | // Primitives 9 | getInt8() (int8, error) 10 | getInt16() (int16, error) 11 | getInt32() (int32, error) 12 | getInt64() (int64, error) 13 | getVarint() (int64, error) 14 | getUVarint() (uint64, error) 15 | getFloat64() (float64, error) 16 | getArrayLength() (int, error) 17 | getCompactArrayLength() (int, error) 18 | getBool() (bool, error) 19 | getEmptyTaggedFieldArray() (int, error) 20 | 21 | // Collections 22 | getBytes() ([]byte, error) 23 | getVarintBytes() ([]byte, error) 24 | getCompactBytes() ([]byte, error) 25 | getRawBytes(length int) ([]byte, error) 26 | getString() (string, error) 27 | getNullableString() (*string, error) 28 | getCompactString() (string, error) 29 | getCompactNullableString() (*string, error) 30 | getCompactInt32Array() ([]int32, error) 31 | getInt32Array() ([]int32, error) 32 | getInt64Array() ([]int64, error) 33 | getStringArray() ([]string, error) 34 | 35 | // Subsets 36 | remaining() int 37 | getSubset(length int) (packetDecoder, error) 38 | peek(offset, length int) (packetDecoder, error) // similar to getSubset, but it doesn't advance the offset 39 | peekInt8(offset int) (int8, error) // similar to peek, but just one byte 40 | 41 | // Stacks, see PushDecoder 42 | push(in pushDecoder) error 43 | pop() error 44 | } 45 | 46 | // PushDecoder is the interface for decoding fields like CRCs and lengths where the validity 47 | // of the field depends on what is after it in the packet. Start them with PacketDecoder.Push() where 48 | // the actual value is located in the packet, then PacketDecoder.Pop() them when all the bytes they 49 | // depend upon have been decoded. 50 | type pushDecoder interface { 51 | // Saves the offset into the input buffer as the location to actually read the calculated value when able. 52 | saveOffset(in int) 53 | 54 | // Returns the length of data to reserve for the input of this encoder (e.g. 4 bytes for a CRC32). 55 | reserveLength() int 56 | 57 | // Indicates that all required data is now available to calculate and check the field. 
58 | // SaveOffset is guaranteed to have been called first. The implementation should read ReserveLength() bytes 59 | // of data from the saved offset, and verify it based on the data between the saved offset and curOffset. 60 | check(curOffset int, buf []byte) error 61 | } 62 | 63 | // dynamicPushDecoder extends the interface of pushDecoder for uses cases where the length of the 64 | // fields itself is unknown until its value was decoded (for instance varint encoded length 65 | // fields). 66 | // During push, dynamicPushDecoder.decode() method will be called instead of reserveLength() 67 | type dynamicPushDecoder interface { 68 | pushDecoder 69 | decoder 70 | } 71 | 72 | type decoder interface { 73 | decode(pd packetDecoder) error 74 | } 75 | 76 | // decode takes bytes and a decoder and fills the fields of the decoder from the bytes, 77 | // interpreted using Kafka's encoding rules. 78 | func decode(buf []byte, in decoder) error { 79 | if buf == nil { 80 | return nil 81 | } 82 | 83 | helper := realDecoder{ 84 | raw: buf, 85 | } 86 | err := in.decode(&helper) 87 | if err != nil { 88 | return err 89 | } 90 | 91 | if helper.off != len(buf) { 92 | return PacketDecodingError{"invalid length"} 93 | } 94 | 95 | return nil 96 | } 97 | 98 | func VersionedDecode(buf []byte, in versionedDecoder, version int16) (int, error) { 99 | if buf == nil { 100 | return 0, nil 101 | } 102 | 103 | helper := realDecoder{ 104 | raw: buf, 105 | } 106 | 107 | err := in.decode(&helper, version) 108 | if err != nil { 109 | return helper.off, err 110 | } 111 | 112 | // if helper.off != len(buf) { 113 | // return helper.off, PacketDecodingError{ 114 | // Info: fmt.Sprintf("invalid length (off=%d, len=%d)", helper.off, len(buf)), 115 | // } 116 | // } 117 | 118 | return helper.off, nil 119 | } 120 | -------------------------------------------------------------------------------- /aggregator/kafka/decompress.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "bytes" 5 | "fmt" 6 | "sync" 7 | 8 | snappy "github.com/eapache/go-xerial-snappy" 9 | "github.com/klauspost/compress/gzip" 10 | "github.com/pierrec/lz4/v4" 11 | ) 12 | 13 | var ( 14 | lz4ReaderPool = sync.Pool{ 15 | New: func() interface{} { 16 | return lz4.NewReader(nil) 17 | }, 18 | } 19 | 20 | gzipReaderPool sync.Pool 21 | 22 | bufferPool = sync.Pool{ 23 | New: func() interface{} { 24 | return new(bytes.Buffer) 25 | }, 26 | } 27 | 28 | bytesPool = sync.Pool{ 29 | New: func() interface{} { 30 | res := make([]byte, 0, 4096) 31 | return &res 32 | }, 33 | } 34 | ) 35 | 36 | func decompress(cc CompressionCodec, data []byte) ([]byte, error) { 37 | switch cc { 38 | case CompressionNone: 39 | return data, nil 40 | case CompressionGZIP: 41 | var err error 42 | reader, ok := gzipReaderPool.Get().(*gzip.Reader) 43 | if !ok { 44 | reader, err = gzip.NewReader(bytes.NewReader(data)) 45 | } else { 46 | err = reader.Reset(bytes.NewReader(data)) 47 | } 48 | 49 | if err != nil { 50 | return nil, err 51 | } 52 | 53 | buffer := bufferPool.Get().(*bytes.Buffer) 54 | _, err = buffer.ReadFrom(reader) 55 | // copy the buffer to a new slice with the correct length 56 | // reuse gzipReader and buffer 57 | gzipReaderPool.Put(reader) 58 | res := make([]byte, buffer.Len()) 59 | copy(res, buffer.Bytes()) 60 | buffer.Reset() 61 | bufferPool.Put(buffer) 62 | 63 | return res, err 64 | case CompressionSnappy: 65 | return snappy.Decode(data) 66 | case CompressionLZ4: 67 | reader, ok := lz4ReaderPool.Get().(*lz4.Reader) 68 | 
if !ok { 69 | reader = lz4.NewReader(bytes.NewReader(data)) 70 | } else { 71 | reader.Reset(bytes.NewReader(data)) 72 | } 73 | buffer := bufferPool.Get().(*bytes.Buffer) 74 | _, err := buffer.ReadFrom(reader) 75 | // copy the buffer to a new slice with the correct length 76 | // reuse lz4Reader and buffer 77 | lz4ReaderPool.Put(reader) 78 | res := make([]byte, buffer.Len()) 79 | copy(res, buffer.Bytes()) 80 | buffer.Reset() 81 | bufferPool.Put(buffer) 82 | 83 | return res, err 84 | case CompressionZSTD: 85 | buffer := *bytesPool.Get().(*[]byte) 86 | var err error 87 | buffer, err = zstdDecompress(ZstdDecoderParams{}, buffer, data) 88 | // copy the buffer to a new slice with the correct length and reuse buffer 89 | res := make([]byte, len(buffer)) 90 | copy(res, buffer) 91 | buffer = buffer[:0] 92 | bytesPool.Put(&buffer) 93 | 94 | return res, err 95 | default: 96 | return nil, PacketDecodingError{fmt.Sprintf("invalid compression specified (%d)", cc)} 97 | } 98 | } 99 | -------------------------------------------------------------------------------- /aggregator/kafka/fetch_response.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "errors" 5 | 6 | "github.com/ddosify/alaz/log" 7 | 8 | "time" 9 | ) 10 | 11 | type AbortedTransaction struct { 12 | // ProducerID contains the producer id associated with the aborted transaction. 13 | ProducerID int64 14 | // FirstOffset contains the first offset in the aborted transaction. 15 | FirstOffset int64 16 | } 17 | 18 | func (t *AbortedTransaction) decode(pd packetDecoder) (err error) { 19 | if t.ProducerID, err = pd.getInt64(); err != nil { 20 | return err 21 | } 22 | 23 | if t.FirstOffset, err = pd.getInt64(); err != nil { 24 | return err 25 | } 26 | 27 | return nil 28 | } 29 | 30 | type FetchResponseBlock struct { 31 | // Err contains the error code, or 0 if there was no fetch error. 32 | Err KError 33 | // HighWatermarkOffset contains the current high water mark. 34 | HighWaterMarkOffset int64 35 | // LastStableOffset contains the last stable offset (or LSO) of the 36 | // partition. This is the last offset such that the state of all 37 | // transactional records prior to this offset have been decided (ABORTED or 38 | // COMMITTED) 39 | LastStableOffset int64 40 | LastRecordsBatchOffset *int64 41 | // LogStartOffset contains the current log start offset. 42 | LogStartOffset int64 43 | // AbortedTransactions contains the aborted transactions. 44 | AbortedTransactions []*AbortedTransaction 45 | // PreferredReadReplica contains the preferred read replica for the 46 | // consumer to use on its next fetch request 47 | PreferredReadReplica int32 48 | // RecordsSet contains the record data. 
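// Each element holds one record batch (or legacy message set) decoded
// sequentially from the partition's raw data.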
49 | RecordsSet []*Records 50 | 51 | Partial bool 52 | Records *Records // deprecated: use FetchResponseBlock.RecordsSet 53 | } 54 | 55 | func (b *FetchResponseBlock) decode(pd packetDecoder, version int16) (err error) { 56 | tmp, err := pd.getInt16() 57 | if err != nil { 58 | return err 59 | } 60 | b.Err = KError(tmp) 61 | 62 | b.HighWaterMarkOffset, err = pd.getInt64() 63 | if err != nil { 64 | return err 65 | } 66 | 67 | if version >= 4 { 68 | b.LastStableOffset, err = pd.getInt64() 69 | if err != nil { 70 | return err 71 | } 72 | 73 | if version >= 5 { 74 | b.LogStartOffset, err = pd.getInt64() 75 | if err != nil { 76 | return err 77 | } 78 | } 79 | 80 | numTransact, err := pd.getArrayLength() 81 | if err != nil { 82 | return err 83 | } 84 | 85 | if numTransact >= 0 { 86 | b.AbortedTransactions = make([]*AbortedTransaction, numTransact) 87 | } 88 | 89 | for i := 0; i < numTransact; i++ { 90 | transact := new(AbortedTransaction) 91 | if err = transact.decode(pd); err != nil { 92 | return err 93 | } 94 | b.AbortedTransactions[i] = transact 95 | } 96 | } 97 | 98 | if version >= 11 { 99 | b.PreferredReadReplica, err = pd.getInt32() 100 | if err != nil { 101 | return err 102 | } 103 | } else { 104 | b.PreferredReadReplica = -1 105 | } 106 | 107 | recordsSize, err := pd.getInt32() 108 | if err != nil { 109 | return err 110 | } 111 | 112 | recordsDecoder, err := pd.getSubset(int(recordsSize)) 113 | if err != nil { 114 | return err 115 | } 116 | 117 | b.RecordsSet = []*Records{} 118 | 119 | for recordsDecoder.remaining() > 0 { 120 | records := &Records{} 121 | if err := records.decode(recordsDecoder); err != nil { 122 | // If we have at least one decoded records, this is not an error 123 | if errors.Is(err, ErrInsufficientData) { 124 | if len(b.RecordsSet) == 0 { 125 | b.Partial = true 126 | } 127 | break 128 | } 129 | return err 130 | } 131 | 132 | b.LastRecordsBatchOffset, err = records.recordsOffset() 133 | if err != nil { 134 | return err 135 | } 136 | 137 | partial, err := records.isPartial() 138 | if err != nil { 139 | return err 140 | } 141 | 142 | n, err := records.numRecords() 143 | if err != nil { 144 | return err 145 | } 146 | 147 | if n > 0 || (partial && len(b.RecordsSet) == 0) { 148 | b.RecordsSet = append(b.RecordsSet, records) 149 | 150 | if b.Records == nil { 151 | b.Records = records 152 | } 153 | } 154 | 155 | overflow, err := records.isOverflow() 156 | if err != nil { 157 | return err 158 | } 159 | 160 | if partial || overflow { 161 | break 162 | } 163 | } 164 | 165 | return nil 166 | } 167 | 168 | type FetchResponse struct { 169 | // Version defines the protocol version to use for encode and decode 170 | Version int16 171 | // ThrottleTime contains the duration in milliseconds for which the request 172 | // was throttled due to a quota violation, or zero if the request did not 173 | // violate any quota. 174 | ThrottleTime time.Duration 175 | // ErrorCode contains the top level response error code. 176 | ErrorCode int16 177 | // SessionID contains the fetch session ID, or 0 if this is not part of a fetch session. 178 | SessionID int32 179 | // Blocks contains the response topics. 
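// The outer map is keyed by topic name and the inner map by partition ID.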
180 | Blocks map[string]map[int32]*FetchResponseBlock 181 | 182 | LogAppendTime bool 183 | Timestamp time.Time 184 | } 185 | 186 | func (r *FetchResponse) decode(pd packetDecoder, version int16) (err error) { 187 | r.Version = version 188 | 189 | if r.Version >= 1 { 190 | throttle, err := pd.getInt32() 191 | if err != nil { 192 | return err 193 | } 194 | r.ThrottleTime = time.Duration(throttle) * time.Millisecond 195 | } 196 | 197 | if r.Version >= 7 { 198 | r.ErrorCode, err = pd.getInt16() 199 | if err != nil { 200 | return err 201 | } 202 | r.SessionID, err = pd.getInt32() 203 | if err != nil { 204 | return err 205 | } 206 | } 207 | 208 | numTopics, err := pd.getArrayLength() 209 | if err != nil { 210 | return err 211 | } 212 | 213 | log.Logger.Warn().Msgf("sarama-numTopics: %d", numTopics) 214 | 215 | r.Blocks = make(map[string]map[int32]*FetchResponseBlock, numTopics) 216 | for i := 0; i < numTopics; i++ { 217 | name, err := pd.getString() 218 | if err != nil { 219 | return err 220 | } 221 | 222 | numBlocks, err := pd.getArrayLength() 223 | if err != nil { 224 | return err 225 | } 226 | 227 | r.Blocks[name] = make(map[int32]*FetchResponseBlock, numBlocks) 228 | 229 | for j := 0; j < numBlocks; j++ { 230 | id, err := pd.getInt32() 231 | if err != nil { 232 | return err 233 | } 234 | 235 | block := new(FetchResponseBlock) 236 | err = block.decode(pd, version) 237 | if err != nil { 238 | return err 239 | } 240 | r.Blocks[name][id] = block 241 | } 242 | } 243 | 244 | return nil 245 | } 246 | 247 | func (r *FetchResponse) key() int16 { 248 | return 1 249 | } 250 | 251 | func (r *FetchResponse) version() int16 { 252 | return r.Version 253 | } 254 | 255 | func (r *FetchResponse) headerVersion() int16 { 256 | return 0 257 | } 258 | 259 | func (r *FetchResponse) isValidVersion() bool { 260 | return r.Version >= 0 && r.Version <= 11 261 | } 262 | 263 | func (r *FetchResponse) requiredVersion() KafkaVersion { 264 | switch r.Version { 265 | case 11: 266 | return V2_3_0_0 267 | case 9, 10: 268 | return V2_1_0_0 269 | case 8: 270 | return V2_0_0_0 271 | case 7: 272 | return V1_1_0_0 273 | case 6: 274 | return V1_0_0_0 275 | case 4, 5: 276 | return V0_11_0_0 277 | case 3: 278 | return V0_10_1_0 279 | case 2: 280 | return V0_10_0_0 281 | case 1: 282 | return V0_9_0_0 283 | case 0: 284 | return V0_8_2_0 285 | default: 286 | return V2_3_0_0 287 | } 288 | } 289 | -------------------------------------------------------------------------------- /aggregator/kafka/length_field.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "encoding/binary" 5 | "sync" 6 | ) 7 | 8 | // LengthField implements the PushEncoder and PushDecoder interfaces for calculating 4-byte lengths. 
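// On decode, the declared length is validated against the bytes actually
// consumed between push() and pop().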
9 | type lengthField struct { 10 | startOffset int 11 | length int32 12 | } 13 | 14 | var lengthFieldPool = sync.Pool{} 15 | 16 | func acquireLengthField() *lengthField { 17 | val := lengthFieldPool.Get() 18 | if val != nil { 19 | return val.(*lengthField) 20 | } 21 | return &lengthField{} 22 | } 23 | 24 | func releaseLengthField(m *lengthField) { 25 | lengthFieldPool.Put(m) 26 | } 27 | 28 | func (l *lengthField) decode(pd packetDecoder) error { 29 | var err error 30 | l.length, err = pd.getInt32() 31 | if err != nil { 32 | return err 33 | } 34 | if l.length > int32(pd.remaining()) { 35 | return ErrInsufficientData 36 | } 37 | return nil 38 | } 39 | 40 | func (l *lengthField) saveOffset(in int) { 41 | l.startOffset = in 42 | } 43 | 44 | func (l *lengthField) reserveLength() int { 45 | return 4 46 | } 47 | 48 | func (l *lengthField) check(curOffset int, buf []byte) error { 49 | if int32(curOffset-l.startOffset-4) != l.length { 50 | return PacketDecodingError{"length field invalid"} 51 | } 52 | 53 | return nil 54 | } 55 | 56 | type varintLengthField struct { 57 | startOffset int 58 | length int64 59 | } 60 | 61 | func (l *varintLengthField) decode(pd packetDecoder) error { 62 | var err error 63 | l.length, err = pd.getVarint() 64 | return err 65 | } 66 | 67 | func (l *varintLengthField) saveOffset(in int) { 68 | l.startOffset = in 69 | } 70 | 71 | func (l *varintLengthField) reserveLength() int { 72 | var tmp [binary.MaxVarintLen64]byte 73 | return binary.PutVarint(tmp[:], l.length) 74 | } 75 | 76 | func (l *varintLengthField) check(curOffset int, buf []byte) error { 77 | if int64(curOffset-l.startOffset-l.reserveLength()) != l.length { 78 | return PacketDecodingError{"length field invalid"} 79 | } 80 | 81 | return nil 82 | } 83 | -------------------------------------------------------------------------------- /aggregator/kafka/message.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "fmt" 5 | "time" 6 | ) 7 | 8 | const ( 9 | // CompressionNone no compression 10 | CompressionNone CompressionCodec = iota 11 | // CompressionGZIP compression using GZIP 12 | CompressionGZIP 13 | // CompressionSnappy compression using snappy 14 | CompressionSnappy 15 | // CompressionLZ4 compression using LZ4 16 | CompressionLZ4 17 | // CompressionZSTD compression using ZSTD 18 | CompressionZSTD 19 | 20 | // The lowest 3 bits contain the compression codec used for the message 21 | compressionCodecMask int8 = 0x07 22 | 23 | // Bit 3 set for "LogAppend" timestamps 24 | timestampTypeMask = 0x08 25 | 26 | // CompressionLevelDefault is the constant to use in CompressionLevel 27 | // to have the default compression level for any codec. The value is picked 28 | // that we don't use any existing compression levels. 29 | CompressionLevelDefault = -1000 30 | ) 31 | 32 | // CompressionCodec represents the various compression codecs recognized by Kafka in messages. 33 | type CompressionCodec int8 34 | 35 | func (cc CompressionCodec) String() string { 36 | return []string{ 37 | "none", 38 | "gzip", 39 | "snappy", 40 | "lz4", 41 | "zstd", 42 | }[int(cc)] 43 | } 44 | 45 | // UnmarshalText returns a CompressionCodec from its string representation. 
46 | func (cc *CompressionCodec) UnmarshalText(text []byte) error { 47 | codecs := map[string]CompressionCodec{ 48 | "none": CompressionNone, 49 | "gzip": CompressionGZIP, 50 | "snappy": CompressionSnappy, 51 | "lz4": CompressionLZ4, 52 | "zstd": CompressionZSTD, 53 | } 54 | codec, ok := codecs[string(text)] 55 | if !ok { 56 | return fmt.Errorf("cannot parse %q as a compression codec", string(text)) 57 | } 58 | *cc = codec 59 | return nil 60 | } 61 | 62 | // MarshalText transforms a CompressionCodec into its string representation. 63 | func (cc CompressionCodec) MarshalText() ([]byte, error) { 64 | return []byte(cc.String()), nil 65 | } 66 | 67 | // Message is a kafka message type 68 | type Message struct { 69 | Codec CompressionCodec // codec used to compress the message contents 70 | CompressionLevel int // compression level 71 | LogAppendTime bool // the used timestamp is LogAppendTime 72 | Key []byte // the message key, may be nil 73 | Value []byte // the message contents 74 | Set *MessageSet // the message set a message might wrap 75 | Version int8 // v1 requires Kafka 0.10 76 | Timestamp time.Time // the timestamp of the message (version 1+ only) 77 | 78 | // compressedCache []byte 79 | // compressedSize int // used for computing the compression ratio metrics 80 | } 81 | 82 | func (m *Message) decode(pd packetDecoder) (err error) { 83 | crc32Decoder := acquireCrc32Field(crcIEEE) 84 | defer releaseCrc32Field(crc32Decoder) 85 | 86 | err = pd.push(crc32Decoder) 87 | if err != nil { 88 | return err 89 | } 90 | 91 | m.Version, err = pd.getInt8() 92 | if err != nil { 93 | return err 94 | } 95 | 96 | if m.Version > 1 { 97 | return PacketDecodingError{fmt.Sprintf("unknown magic byte (%v)", m.Version)} 98 | } 99 | 100 | attribute, err := pd.getInt8() 101 | if err != nil { 102 | return err 103 | } 104 | m.Codec = CompressionCodec(attribute & compressionCodecMask) 105 | m.LogAppendTime = attribute&timestampTypeMask == timestampTypeMask 106 | 107 | if m.Version == 1 { 108 | if err := (Timestamp{&m.Timestamp}).decode(pd); err != nil { 109 | return err 110 | } 111 | } 112 | 113 | m.Key, err = pd.getBytes() 114 | if err != nil { 115 | return err 116 | } 117 | 118 | m.Value, err = pd.getBytes() 119 | if err != nil { 120 | return err 121 | } 122 | 123 | // Required for deep equal assertion during tests but might be useful 124 | // for future metrics about the compression ratio in fetch requests 125 | // m.compressedSize = len(m.Value) 126 | 127 | if m.Value != nil && m.Codec != CompressionNone { 128 | m.Value, err = decompress(m.Codec, m.Value) 129 | if err != nil { 130 | return err 131 | } 132 | 133 | if err := m.decodeSet(); err != nil { 134 | return err 135 | } 136 | } 137 | 138 | return pd.pop() 139 | } 140 | 141 | // decodes a message set from a previously encoded bulk-message 142 | func (m *Message) decodeSet() (err error) { 143 | pd := realDecoder{raw: m.Value} 144 | m.Set = &MessageSet{} 145 | return m.Set.decode(&pd) 146 | } 147 | -------------------------------------------------------------------------------- /aggregator/kafka/message_set.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import "errors" 4 | 5 | type MessageBlock struct { 6 | Offset int64 7 | Msg *Message 8 | } 9 | 10 | // Messages convenience helper which returns either all the 11 | // messages that are wrapped in this block 12 | func (msb *MessageBlock) Messages() []*MessageBlock { 13 | if msb.Msg.Set != nil { 14 | return msb.Msg.Set.Messages 15 | } 16 | return 
[]*MessageBlock{msb} 17 | } 18 | 19 | func (msb *MessageBlock) decode(pd packetDecoder) (err error) { 20 | if msb.Offset, err = pd.getInt64(); err != nil { 21 | return err 22 | } 23 | 24 | lengthDecoder := acquireLengthField() 25 | defer releaseLengthField(lengthDecoder) 26 | 27 | if err = pd.push(lengthDecoder); err != nil { 28 | return err 29 | } 30 | 31 | msb.Msg = new(Message) 32 | if err = msb.Msg.decode(pd); err != nil { 33 | return err 34 | } 35 | 36 | if err = pd.pop(); err != nil { 37 | return err 38 | } 39 | 40 | return nil 41 | } 42 | 43 | type MessageSet struct { 44 | PartialTrailingMessage bool // whether the set on the wire contained an incomplete trailing MessageBlock 45 | OverflowMessage bool // whether the set on the wire contained an overflow message 46 | Messages []*MessageBlock 47 | } 48 | 49 | func (ms *MessageSet) decode(pd packetDecoder) (err error) { 50 | ms.Messages = nil 51 | 52 | for pd.remaining() > 0 { 53 | magic, err := magicValue(pd) 54 | if err != nil { 55 | if errors.Is(err, ErrInsufficientData) { 56 | ms.PartialTrailingMessage = true 57 | return nil 58 | } 59 | return err 60 | } 61 | 62 | if magic > 1 { 63 | return nil 64 | } 65 | 66 | msb := new(MessageBlock) 67 | err = msb.decode(pd) 68 | if err == nil { 69 | ms.Messages = append(ms.Messages, msb) 70 | } else if errors.Is(err, ErrInsufficientData) { 71 | // As an optimization the server is allowed to return a partial message at the 72 | // end of the message set. Clients should handle this case. So we just ignore such things. 73 | if msb.Offset == -1 { 74 | // This is an overflow message caused by chunked down conversion 75 | ms.OverflowMessage = true 76 | } else { 77 | ms.PartialTrailingMessage = true 78 | } 79 | return nil 80 | } else { 81 | return err 82 | } 83 | } 84 | 85 | return nil 86 | } 87 | -------------------------------------------------------------------------------- /aggregator/kafka/produce_request.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | // RequiredAcks is used in Produce Requests to tell the broker how many replica acknowledgements 4 | // it must see before responding. Any of the constants defined here are valid. On broker versions 5 | // prior to 0.8.2.0 any other positive int16 is also valid (the broker will wait for that many 6 | // acknowledgements) but in 0.8.2.0 and later this will raise an exception (it has been replaced 7 | // by setting the `min.isr` value in the brokers configuration). 8 | type RequiredAcks int16 9 | 10 | const ( 11 | // NoResponse doesn't send any response, the TCP ACK is all you get. 12 | NoResponse RequiredAcks = 0 13 | // WaitForLocal waits for only the local commit to succeed before responding. 14 | WaitForLocal RequiredAcks = 1 15 | // WaitForAll waits for all in-sync replicas to commit before responding. 16 | // The minimum number of in-sync replicas is configured on the broker via 17 | // the `min.insync.replicas` configuration key. 
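// This corresponds to acks=-1 (acks=all) in the Kafka producer configuration.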
18 | WaitForAll RequiredAcks = -1 19 | ) 20 | 21 | type ProduceRequest struct { 22 | TransactionalID *string 23 | RequiredAcks RequiredAcks 24 | Timeout int32 25 | Version int16 // v1 requires Kafka 0.9, v2 requires Kafka 0.10, v3 requires Kafka 0.11 26 | Records map[string]map[int32]Records 27 | } 28 | 29 | func (r *ProduceRequest) decode(pd packetDecoder, version int16) error { 30 | r.Version = version 31 | 32 | if version >= 3 { 33 | id, err := pd.getNullableString() 34 | if err != nil { 35 | return err 36 | } 37 | r.TransactionalID = id 38 | } 39 | requiredAcks, err := pd.getInt16() 40 | if err != nil { 41 | return err 42 | } 43 | r.RequiredAcks = RequiredAcks(requiredAcks) 44 | if r.Timeout, err = pd.getInt32(); err != nil { 45 | return err 46 | } 47 | topicCount, err := pd.getArrayLength() 48 | if err != nil { 49 | return err 50 | } 51 | if topicCount == 0 { 52 | return nil 53 | } 54 | 55 | r.Records = make(map[string]map[int32]Records) 56 | for i := 0; i < topicCount; i++ { 57 | topic, err := pd.getString() 58 | if err != nil { 59 | return err 60 | } 61 | partitionCount, err := pd.getArrayLength() 62 | if err != nil { 63 | return err 64 | } 65 | r.Records[topic] = make(map[int32]Records) 66 | 67 | for j := 0; j < partitionCount; j++ { 68 | partition, err := pd.getInt32() 69 | if err != nil { 70 | return err 71 | } 72 | size, err := pd.getInt32() 73 | if err != nil { 74 | return err 75 | } 76 | recordsDecoder, err := pd.getSubset(int(size)) 77 | if err != nil { 78 | return err 79 | } 80 | var records Records 81 | if err := records.decode(recordsDecoder); err != nil { 82 | return err 83 | } 84 | r.Records[topic][partition] = records 85 | } 86 | } 87 | 88 | return nil 89 | } 90 | 91 | func (r *ProduceRequest) key() int16 { 92 | return 0 93 | } 94 | 95 | func (r *ProduceRequest) version() int16 { 96 | return r.Version 97 | } 98 | 99 | func (r *ProduceRequest) headerVersion() int16 { 100 | return 1 101 | } 102 | 103 | func (r *ProduceRequest) isValidVersion() bool { 104 | return r.Version >= 0 && r.Version <= 7 105 | } 106 | 107 | func (r *ProduceRequest) requiredVersion() KafkaVersion { 108 | switch r.Version { 109 | case 7: 110 | return V2_1_0_0 111 | case 6: 112 | return V2_0_0_0 113 | case 4, 5: 114 | return V1_0_0_0 115 | case 3: 116 | return V0_11_0_0 117 | case 2: 118 | return V0_10_0_0 119 | case 1: 120 | return V0_9_0_0 121 | case 0: 122 | return V0_8_2_0 123 | default: 124 | return V2_1_0_0 125 | } 126 | } 127 | -------------------------------------------------------------------------------- /aggregator/kafka/record.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "encoding/binary" 5 | "time" 6 | ) 7 | 8 | const ( 9 | isTransactionalMask = 0x10 10 | controlMask = 0x20 11 | maximumRecordOverhead = 5*binary.MaxVarintLen32 + binary.MaxVarintLen64 + 1 12 | ) 13 | 14 | // RecordHeader stores key and value for a record header 15 | type RecordHeader struct { 16 | Key []byte 17 | Value []byte 18 | } 19 | 20 | func (h *RecordHeader) decode(pd packetDecoder) (err error) { 21 | if h.Key, err = pd.getVarintBytes(); err != nil { 22 | return err 23 | } 24 | 25 | if h.Value, err = pd.getVarintBytes(); err != nil { 26 | return err 27 | } 28 | return nil 29 | } 30 | 31 | // Record is kafka record type 32 | type Record struct { 33 | Headers []*RecordHeader 34 | 35 | Attributes int8 36 | TimestampDelta time.Duration 37 | OffsetDelta int64 38 | Key []byte 39 | Value []byte 40 | length varintLengthField 41 | } 42 | 43 | func 
(r *Record) decode(pd packetDecoder) (err error) { 44 | if err = pd.push(&r.length); err != nil { 45 | return err 46 | } 47 | 48 | if r.Attributes, err = pd.getInt8(); err != nil { 49 | return err 50 | } 51 | 52 | timestamp, err := pd.getVarint() 53 | if err != nil { 54 | return err 55 | } 56 | r.TimestampDelta = time.Duration(timestamp) * time.Millisecond 57 | 58 | if r.OffsetDelta, err = pd.getVarint(); err != nil { 59 | return err 60 | } 61 | 62 | if r.Key, err = pd.getVarintBytes(); err != nil { 63 | return err 64 | } 65 | 66 | if r.Value, err = pd.getVarintBytes(); err != nil { 67 | return err 68 | } 69 | 70 | numHeaders, err := pd.getVarint() 71 | if err != nil { 72 | return err 73 | } 74 | 75 | if numHeaders >= 0 { 76 | r.Headers = make([]*RecordHeader, numHeaders) 77 | } 78 | for i := int64(0); i < numHeaders; i++ { 79 | hdr := new(RecordHeader) 80 | if err := hdr.decode(pd); err != nil { 81 | return err 82 | } 83 | r.Headers[i] = hdr 84 | } 85 | 86 | return pd.pop() 87 | } 88 | -------------------------------------------------------------------------------- /aggregator/kafka/record_batch.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "errors" 5 | "time" 6 | ) 7 | 8 | const recordBatchOverhead = 49 9 | 10 | type recordsArray []*Record 11 | 12 | func (e recordsArray) decode(pd packetDecoder) error { 13 | records := make([]Record, len(e)) 14 | for i := range e { 15 | if err := records[i].decode(pd); err != nil { 16 | return err 17 | } 18 | e[i] = &records[i] 19 | } 20 | return nil 21 | } 22 | 23 | type RecordBatch struct { 24 | FirstOffset int64 25 | PartitionLeaderEpoch int32 26 | Version int8 27 | Codec CompressionCodec 28 | CompressionLevel int 29 | Control bool 30 | LogAppendTime bool 31 | LastOffsetDelta int32 32 | FirstTimestamp time.Time 33 | MaxTimestamp time.Time 34 | ProducerID int64 35 | ProducerEpoch int16 36 | FirstSequence int32 37 | Records []*Record 38 | PartialTrailingRecord bool 39 | IsTransactional bool 40 | 41 | compressedRecords []byte 42 | recordsLen int // uncompressed records size 43 | } 44 | 45 | func (b *RecordBatch) LastOffset() int64 { 46 | return b.FirstOffset + int64(b.LastOffsetDelta) 47 | } 48 | 49 | func (b *RecordBatch) decode(pd packetDecoder) (err error) { 50 | if b.FirstOffset, err = pd.getInt64(); err != nil { 51 | return err 52 | } 53 | 54 | batchLen, err := pd.getInt32() 55 | if err != nil { 56 | return err 57 | } 58 | 59 | if b.PartitionLeaderEpoch, err = pd.getInt32(); err != nil { 60 | return err 61 | } 62 | 63 | if b.Version, err = pd.getInt8(); err != nil { 64 | return err 65 | } 66 | 67 | crc32Decoder := acquireCrc32Field(crcCastagnoli) 68 | defer releaseCrc32Field(crc32Decoder) 69 | 70 | if err = pd.push(crc32Decoder); err != nil { 71 | return err 72 | } 73 | 74 | attributes, err := pd.getInt16() 75 | if err != nil { 76 | return err 77 | } 78 | b.Codec = CompressionCodec(int8(attributes) & compressionCodecMask) 79 | b.Control = attributes&controlMask == controlMask 80 | b.LogAppendTime = attributes&timestampTypeMask == timestampTypeMask 81 | b.IsTransactional = attributes&isTransactionalMask == isTransactionalMask 82 | 83 | if b.LastOffsetDelta, err = pd.getInt32(); err != nil { 84 | return err 85 | } 86 | 87 | if err = (Timestamp{&b.FirstTimestamp}).decode(pd); err != nil { 88 | return err 89 | } 90 | 91 | if err = (Timestamp{&b.MaxTimestamp}).decode(pd); err != nil { 92 | return err 93 | } 94 | 95 | if b.ProducerID, err = pd.getInt64(); err != nil { 96 | return err 
97 | } 98 | 99 | if b.ProducerEpoch, err = pd.getInt16(); err != nil { 100 | return err 101 | } 102 | 103 | if b.FirstSequence, err = pd.getInt32(); err != nil { 104 | return err 105 | } 106 | 107 | numRecs, err := pd.getArrayLength() 108 | if err != nil { 109 | return err 110 | } 111 | if numRecs >= 0 { 112 | b.Records = make([]*Record, numRecs) 113 | } 114 | 115 | bufSize := int(batchLen) - recordBatchOverhead 116 | recBuffer, err := pd.getRawBytes(bufSize) 117 | if err != nil { 118 | if errors.Is(err, ErrInsufficientData) { 119 | b.PartialTrailingRecord = true 120 | b.Records = nil 121 | return nil 122 | } 123 | return err 124 | } 125 | 126 | if err = pd.pop(); err != nil { 127 | return err 128 | } 129 | 130 | recBuffer, err = decompress(b.Codec, recBuffer) 131 | if err != nil { 132 | return err 133 | } 134 | 135 | b.recordsLen = len(recBuffer) 136 | err = decode(recBuffer, recordsArray(b.Records)) 137 | if errors.Is(err, ErrInsufficientData) { 138 | b.PartialTrailingRecord = true 139 | b.Records = nil 140 | return nil 141 | } 142 | return err 143 | } 144 | 145 | func (b *RecordBatch) addRecord(r *Record) { 146 | b.Records = append(b.Records, r) 147 | } 148 | -------------------------------------------------------------------------------- /aggregator/kafka/records.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import "fmt" 4 | 5 | const ( 6 | unknownRecords = iota 7 | legacyRecords 8 | defaultRecords 9 | 10 | magicOffset = 16 11 | ) 12 | 13 | // Records implements a union type containing either a RecordBatch or a legacy MessageSet. 14 | type Records struct { 15 | recordsType int 16 | MsgSet *MessageSet 17 | RecordBatch *RecordBatch 18 | } 19 | 20 | // setTypeFromFields sets type of Records depending on which of MsgSet or RecordBatch is not nil. 21 | // The first return value indicates whether both fields are nil (and the type is not set). 22 | // If both fields are not nil, it returns an error. 
23 | func (r *Records) setTypeFromFields() (bool, error) { 24 | if r.MsgSet == nil && r.RecordBatch == nil { 25 | return true, nil 26 | } 27 | if r.MsgSet != nil && r.RecordBatch != nil { 28 | return false, fmt.Errorf("both MsgSet and RecordBatch are set, but record type is unknown") 29 | } 30 | r.recordsType = defaultRecords 31 | if r.MsgSet != nil { 32 | r.recordsType = legacyRecords 33 | } 34 | return false, nil 35 | } 36 | 37 | func (r *Records) setTypeFromMagic(pd packetDecoder) error { 38 | magic, err := magicValue(pd) 39 | if err != nil { 40 | return err 41 | } 42 | 43 | r.recordsType = defaultRecords 44 | if magic < 2 { 45 | r.recordsType = legacyRecords 46 | } 47 | 48 | return nil 49 | } 50 | 51 | func (r *Records) decode(pd packetDecoder) error { 52 | if r.recordsType == unknownRecords { 53 | if err := r.setTypeFromMagic(pd); err != nil { 54 | return err 55 | } 56 | } 57 | 58 | switch r.recordsType { 59 | case legacyRecords: 60 | r.MsgSet = &MessageSet{} 61 | return r.MsgSet.decode(pd) 62 | case defaultRecords: 63 | r.RecordBatch = &RecordBatch{} 64 | return r.RecordBatch.decode(pd) 65 | } 66 | return fmt.Errorf("unknown records type: %v", r.recordsType) 67 | } 68 | 69 | func (r *Records) numRecords() (int, error) { 70 | if r.recordsType == unknownRecords { 71 | if empty, err := r.setTypeFromFields(); err != nil || empty { 72 | return 0, err 73 | } 74 | } 75 | 76 | switch r.recordsType { 77 | case legacyRecords: 78 | if r.MsgSet == nil { 79 | return 0, nil 80 | } 81 | return len(r.MsgSet.Messages), nil 82 | case defaultRecords: 83 | if r.RecordBatch == nil { 84 | return 0, nil 85 | } 86 | return len(r.RecordBatch.Records), nil 87 | } 88 | return 0, fmt.Errorf("unknown records type: %v", r.recordsType) 89 | } 90 | 91 | func (r *Records) isPartial() (bool, error) { 92 | if r.recordsType == unknownRecords { 93 | if empty, err := r.setTypeFromFields(); err != nil || empty { 94 | return false, err 95 | } 96 | } 97 | 98 | switch r.recordsType { 99 | case unknownRecords: 100 | return false, nil 101 | case legacyRecords: 102 | if r.MsgSet == nil { 103 | return false, nil 104 | } 105 | return r.MsgSet.PartialTrailingMessage, nil 106 | case defaultRecords: 107 | if r.RecordBatch == nil { 108 | return false, nil 109 | } 110 | return r.RecordBatch.PartialTrailingRecord, nil 111 | } 112 | return false, fmt.Errorf("unknown records type: %v", r.recordsType) 113 | } 114 | 115 | func (r *Records) isOverflow() (bool, error) { 116 | if r.recordsType == unknownRecords { 117 | if empty, err := r.setTypeFromFields(); err != nil || empty { 118 | return false, err 119 | } 120 | } 121 | 122 | switch r.recordsType { 123 | case unknownRecords: 124 | return false, nil 125 | case legacyRecords: 126 | if r.MsgSet == nil { 127 | return false, nil 128 | } 129 | return r.MsgSet.OverflowMessage, nil 130 | case defaultRecords: 131 | return false, nil 132 | } 133 | return false, fmt.Errorf("unknown records type: %v", r.recordsType) 134 | } 135 | 136 | func (r *Records) recordsOffset() (*int64, error) { 137 | switch r.recordsType { 138 | case unknownRecords: 139 | return nil, nil 140 | case legacyRecords: 141 | return nil, nil 142 | case defaultRecords: 143 | if r.RecordBatch == nil { 144 | return nil, nil 145 | } 146 | return &r.RecordBatch.FirstOffset, nil 147 | } 148 | return nil, fmt.Errorf("unknown records type: %v", r.recordsType) 149 | } 150 | 151 | func magicValue(pd packetDecoder) (int8, error) { 152 | return pd.peekInt8(magicOffset) 153 | } 154 | 
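Why peeking at offset 16 works for both wire formats: in the legacy format the bytes before the magic are offset (8) + message size (4) + CRC (4), and in the batch format they are base offset (8) + batch length (4) + partition leader epoch (4), so the magic byte lands at byte 16 either way. A standalone sketch of the same dispatch decision (illustrative, not part of the package):

```go
package main

import "fmt"

// recordsFormat mirrors the dispatch in Records.setTypeFromMagic: the byte at
// offset 16 is the magic value in both the legacy and the batch wire formats.
func recordsFormat(partitionData []byte) (string, error) {
	const magicOffset = 16
	if len(partitionData) <= magicOffset {
		return "", fmt.Errorf("insufficient data")
	}
	if int8(partitionData[magicOffset]) < 2 {
		return "legacy MessageSet (magic v0/v1)", nil
	}
	return "RecordBatch (magic v2)", nil
}

func main() {
	batch := make([]byte, 61)
	batch[16] = 2 // magic v2
	fmt.Println(recordsFormat(batch))
}
```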
-------------------------------------------------------------------------------- /aggregator/kafka/request.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "encoding/binary" 5 | "fmt" 6 | "io" 7 | ) 8 | 9 | // KafkaVersion instances represent versions of the upstream Kafka broker. 10 | type KafkaVersion struct { 11 | // it's a struct rather than just typing the array directly to make it opaque and stop people 12 | // generating their own arbitrary versions 13 | version [4]uint 14 | } 15 | 16 | type ProtocolBody interface { 17 | // encoder 18 | versionedDecoder 19 | key() int16 20 | version() int16 21 | headerVersion() int16 22 | isValidVersion() bool 23 | requiredVersion() KafkaVersion 24 | } 25 | 26 | const MaxRequestSize int32 = 100 * 1024 * 1024 27 | 28 | func (r *Request) decode(pd packetDecoder) (err error) { 29 | key, err := pd.getInt16() 30 | if err != nil { 31 | return err 32 | } 33 | 34 | version, err := pd.getInt16() 35 | if err != nil { 36 | return err 37 | } 38 | 39 | r.CorrelationID, err = pd.getInt32() 40 | if err != nil { 41 | return err 42 | } 43 | 44 | r.ClientID, err = pd.getString() 45 | if err != nil { 46 | return err 47 | } 48 | 49 | r.Body = allocateBody(key, version) 50 | if r.Body == nil { 51 | return fmt.Errorf(fmt.Sprintf("unknown request key (%d)", key)) 52 | } 53 | 54 | if r.Body.headerVersion() >= 2 { 55 | // tagged field 56 | _, err = pd.getUVarint() 57 | if err != nil { 58 | return err 59 | } 60 | } 61 | 62 | return r.Body.decode(pd, version) 63 | } 64 | 65 | type Request struct { 66 | CorrelationID int32 67 | ClientID string 68 | Body ProtocolBody 69 | } 70 | 71 | func allocateBody(key, version int16) ProtocolBody { 72 | switch key { 73 | case 0: 74 | return &ProduceRequest{Version: version} 75 | // case 1: 76 | // return &FetchRequest{Version: version} 77 | // case 2: 78 | // return &OffsetRequest{Version: version} 79 | // case 3: 80 | // return &MetadataRequest{Version: version} 81 | // // 4: LeaderAndIsrRequest 82 | // // 5: StopReplicaRequest 83 | // // 6: UpdateMetadataRequest 84 | // // 7: ControlledShutdownRequest 85 | // case 8: 86 | // return &OffsetCommitRequest{Version: version} 87 | // case 9: 88 | // return &OffsetFetchRequest{Version: version} 89 | // case 10: 90 | // return &FindCoordinatorRequest{Version: version} 91 | // case 11: 92 | // return &JoinGroupRequest{Version: version} 93 | // case 12: 94 | // return &HeartbeatRequest{Version: version} 95 | // case 13: 96 | // return &LeaveGroupRequest{Version: version} 97 | // case 14: 98 | // return &SyncGroupRequest{Version: version} 99 | // case 15: 100 | // return &DescribeGroupsRequest{Version: version} 101 | // case 16: 102 | // return &ListGroupsRequest{Version: version} 103 | // case 17: 104 | // return &SaslHandshakeRequest{Version: version} 105 | // case 18: 106 | // return &ApiVersionsRequest{Version: version} 107 | // case 19: 108 | // return &CreateTopicsRequest{Version: version} 109 | // case 20: 110 | // return &DeleteTopicsRequest{Version: version} 111 | // case 21: 112 | // return &DeleteRecordsRequest{Version: version} 113 | // case 22: 114 | // return &InitProducerIDRequest{Version: version} 115 | // // 23: OffsetForLeaderEpochRequest 116 | // case 24: 117 | // return &AddPartitionsToTxnRequest{Version: version} 118 | // case 25: 119 | // return &AddOffsetsToTxnRequest{Version: version} 120 | // case 26: 121 | // return &EndTxnRequest{Version: version} 122 | // // 27: WriteTxnMarkersRequest 123 | // case 
28: 124 | // return &TxnOffsetCommitRequest{Version: version} 125 | // case 29: 126 | // return &DescribeAclsRequest{Version: int(version)} 127 | // case 30: 128 | // return &CreateAclsRequest{Version: version} 129 | // case 31: 130 | // return &DeleteAclsRequest{Version: int(version)} 131 | // case 32: 132 | // return &DescribeConfigsRequest{Version: version} 133 | // case 33: 134 | // return &AlterConfigsRequest{Version: version} 135 | // // 34: AlterReplicaLogDirsRequest 136 | // case 35: 137 | // return &DescribeLogDirsRequest{Version: version} 138 | // case 36: 139 | // return &SaslAuthenticateRequest{Version: version} 140 | // case 37: 141 | // return &CreatePartitionsRequest{Version: version} 142 | // // 38: CreateDelegationTokenRequest 143 | // // 39: RenewDelegationTokenRequest 144 | // // 40: ExpireDelegationTokenRequest 145 | // // 41: DescribeDelegationTokenRequest 146 | // case 42: 147 | // return &DeleteGroupsRequest{Version: version} 148 | // // 43: ElectLeadersRequest 149 | // case 44: 150 | // return &IncrementalAlterConfigsRequest{Version: version} 151 | // case 45: 152 | // return &AlterPartitionReassignmentsRequest{Version: version} 153 | // case 46: 154 | // return &ListPartitionReassignmentsRequest{Version: version} 155 | // case 47: 156 | // return &DeleteOffsetsRequest{Version: version} 157 | // case 48: 158 | // return &DescribeClientQuotasRequest{Version: version} 159 | // case 49: 160 | // return &AlterClientQuotasRequest{Version: version} 161 | // case 50: 162 | // return &DescribeUserScramCredentialsRequest{Version: version} 163 | // case 51: 164 | // return &AlterUserScramCredentialsRequest{Version: version} 165 | // 52: VoteRequest 166 | // 53: BeginQuorumEpochRequest 167 | // 54: EndQuorumEpochRequest 168 | // 55: DescribeQuorumRequest 169 | // 56: AlterPartitionRequest 170 | // 57: UpdateFeaturesRequest 171 | // 58: EnvelopeRequest 172 | // 59: FetchSnapshotRequest 173 | // 60: DescribeClusterRequest 174 | // 61: DescribeProducersRequest 175 | // 62: BrokerRegistrationRequest 176 | // 63: BrokerHeartbeatRequest 177 | // 64: UnregisterBrokerRequest 178 | // 65: DescribeTransactionsRequest 179 | // 66: ListTransactionsRequest 180 | // 67: AllocateProducerIdsRequest 181 | // 68: ConsumerGroupHeartbeatRequest 182 | } 183 | return nil 184 | } 185 | 186 | func DecodeRequest(r io.Reader) (*Request, int, error) { 187 | var ( 188 | bytesRead int 189 | lengthBytes = make([]byte, 4) 190 | ) 191 | 192 | if _, err := io.ReadFull(r, lengthBytes); err != nil { 193 | return nil, bytesRead, err 194 | } 195 | 196 | bytesRead += len(lengthBytes) 197 | length := int32(binary.BigEndian.Uint32(lengthBytes)) 198 | 199 | if length <= 4 || length > MaxRequestSize { 200 | return nil, bytesRead, PacketDecodingError{fmt.Sprintf("message of length %d too large or too small", length)} 201 | } 202 | 203 | encodedReq := make([]byte, length) 204 | if _, err := io.ReadFull(r, encodedReq); err != nil { 205 | return nil, bytesRead, err 206 | } 207 | 208 | bytesRead += len(encodedReq) 209 | 210 | req := &Request{} 211 | if err := decode(encodedReq, req); err != nil { 212 | return nil, bytesRead, err 213 | } 214 | 215 | return req, bytesRead, nil 216 | } 217 | -------------------------------------------------------------------------------- /aggregator/kafka/response_header.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import "fmt" 4 | 5 | const MaxResponseSize int32 = 100 * 1024 * 1024 6 | 7 | // headerVersion derives the header 
version from the request api key and request api version 8 | // 9 | //nolint:funlen,gocognit,gocyclo,cyclop,maintidx 10 | func ResponseHeaderVersion(apiKey, apiVersion int16) int16 { 11 | switch apiKey { 12 | case 0: // Produce 13 | if apiVersion >= 9 { 14 | return 1 15 | } 16 | return 0 17 | case 1: // Fetch 18 | if apiVersion >= 12 { 19 | return 1 20 | } 21 | return 0 22 | case 2: // ListOffsets 23 | if apiVersion >= 6 { 24 | return 1 25 | } 26 | return 0 27 | case 3: // Metadata 28 | if apiVersion >= 9 { 29 | return 1 30 | } 31 | return 0 32 | case 4: // LeaderAndIsr 33 | if apiVersion >= 4 { 34 | return 1 35 | } 36 | return 0 37 | case 5: // StopReplica 38 | if apiVersion >= 2 { 39 | return 1 40 | } 41 | return 0 42 | case 6: // UpdateMetadata 43 | if apiVersion >= 6 { 44 | return 1 45 | } 46 | return 0 47 | case 7: // ControlledShutdown 48 | if apiVersion >= 3 { 49 | return 1 50 | } 51 | return 0 52 | case 8: // OffsetCommit 53 | if apiVersion >= 8 { 54 | return 1 55 | } 56 | return 0 57 | case 9: // OffsetFetch 58 | if apiVersion >= 6 { 59 | return 1 60 | } 61 | return 0 62 | case 10: // FindCoordinator 63 | if apiVersion >= 3 { 64 | return 1 65 | } 66 | return 0 67 | case 11: // JoinGroup 68 | if apiVersion >= 6 { 69 | return 1 70 | } 71 | return 0 72 | case 12: // Heartbeat 73 | if apiVersion >= 4 { 74 | return 1 75 | } 76 | return 0 77 | case 13: // LeaveGroup 78 | if apiVersion >= 4 { 79 | return 1 80 | } 81 | return 0 82 | case 14: // SyncGroup 83 | if apiVersion >= 4 { 84 | return 1 85 | } 86 | return 0 87 | case 15: // DescribeGroups 88 | if apiVersion >= 5 { 89 | return 1 90 | } 91 | return 0 92 | case 16: // ListGroups 93 | if apiVersion >= 3 { 94 | return 1 95 | } 96 | return 0 97 | case 17: // SaslHandshake 98 | return 0 99 | case 18: // ApiVersions 100 | // ApiVersionsResponse always includes a v0 header. 101 | // See KIP-511 for details. 
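
In practice this table tells a response parser whether, after the 4-byte correlation id, a flexible-version (v1) header also carries a tagged-field section (KIP-482); ApiVersions is the deliberate exception noted above. A minimal Go sketch of that difference, assuming a simple big-endian buffer (decodeResponseHeader is a hypothetical helper for illustration, not this package's decoder, and it only accepts an empty tagged-field section):

package main

import (
	"encoding/binary"
	"fmt"
)

// decodeResponseHeader consumes a response header according to the header
// version returned by ResponseHeaderVersion. For v1 headers an empty
// tagged-field section is a single 0x00 varint count; real tagged fields
// carry payloads and would need fuller handling.
func decodeResponseHeader(buf []byte, headerVersion int16) (corrID int32, n int, err error) {
	if len(buf) < 4 {
		return 0, 0, fmt.Errorf("short buffer")
	}
	corrID = int32(binary.BigEndian.Uint32(buf[:4]))
	n = 4
	if headerVersion >= 1 {
		count, k := binary.Uvarint(buf[n:])
		if k <= 0 || count != 0 {
			return 0, 0, fmt.Errorf("tagged fields present or malformed")
		}
		n += k
	}
	return corrID, n, nil
}

func main() {
	v1 := []byte{0x00, 0x00, 0x00, 0x2a, 0x00} // correlation id 42 + empty tagged fields
	id, n, err := decodeResponseHeader(v1, 1)
	fmt.Println(id, n, err) // 42 5 <nil>
}

Passing headerVersion 0 for the same correlation id would stop after four bytes, which is exactly the distinction the switch above encodes per API key.
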
102 | return 0 103 | case 19: // CreateTopics 104 | if apiVersion >= 5 { 105 | return 1 106 | } 107 | return 0 108 | case 20: // DeleteTopics 109 | if apiVersion >= 4 { 110 | return 1 111 | } 112 | return 0 113 | case 21: // DeleteRecords 114 | if apiVersion >= 2 { 115 | return 1 116 | } 117 | return 0 118 | case 22: // InitProducerId 119 | if apiVersion >= 2 { 120 | return 1 121 | } 122 | return 0 123 | case 23: // OffsetForLeaderEpoch 124 | if apiVersion >= 4 { 125 | return 1 126 | } 127 | return 0 128 | case 24: // AddPartitionsToTxn 129 | if apiVersion >= 3 { 130 | return 1 131 | } 132 | return 0 133 | case 25: // AddOffsetsToTxn 134 | if apiVersion >= 3 { 135 | return 1 136 | } 137 | return 0 138 | case 26: // EndTxn 139 | if apiVersion >= 3 { 140 | return 1 141 | } 142 | return 0 143 | case 27: // WriteTxnMarkers 144 | if apiVersion >= 1 { 145 | return 1 146 | } 147 | return 0 148 | case 28: // TxnOffsetCommit 149 | if apiVersion >= 3 { 150 | return 1 151 | } 152 | return 0 153 | case 29: // DescribeAcls 154 | if apiVersion >= 2 { 155 | return 1 156 | } 157 | return 0 158 | case 30: // CreateAcls 159 | if apiVersion >= 2 { 160 | return 1 161 | } 162 | return 0 163 | case 31: // DeleteAcls 164 | if apiVersion >= 2 { 165 | return 1 166 | } 167 | return 0 168 | case 32: // DescribeConfigs 169 | if apiVersion >= 4 { 170 | return 1 171 | } 172 | return 0 173 | case 33: // AlterConfigs 174 | if apiVersion >= 2 { 175 | return 1 176 | } 177 | return 0 178 | case 34: // AlterReplicaLogDirs 179 | if apiVersion >= 2 { 180 | return 1 181 | } 182 | return 0 183 | case 35: // DescribeLogDirs 184 | if apiVersion >= 2 { 185 | return 1 186 | } 187 | return 0 188 | case 36: // SaslAuthenticate 189 | if apiVersion >= 2 { 190 | return 1 191 | } 192 | return 0 193 | case 37: // CreatePartitions 194 | if apiVersion >= 2 { 195 | return 1 196 | } 197 | return 0 198 | case 38: // CreateDelegationToken 199 | if apiVersion >= 2 { 200 | return 1 201 | } 202 | return 0 203 | case 39: // RenewDelegationToken 204 | if apiVersion >= 2 { 205 | return 1 206 | } 207 | return 0 208 | case 40: // ExpireDelegationToken 209 | if apiVersion >= 2 { 210 | return 1 211 | } 212 | return 0 213 | case 41: // DescribeDelegationToken 214 | if apiVersion >= 2 { 215 | return 1 216 | } 217 | return 0 218 | case 42: // DeleteGroups 219 | if apiVersion >= 2 { 220 | return 1 221 | } 222 | return 0 223 | case 43: // ElectLeaders 224 | if apiVersion >= 2 { 225 | return 1 226 | } 227 | return 0 228 | case 44: // IncrementalAlterConfigs 229 | if apiVersion >= 1 { 230 | return 1 231 | } 232 | return 0 233 | case 45: // AlterPartitionReassignments 234 | return 1 235 | case 46: // ListPartitionReassignments 236 | return 1 237 | case 47: // OffsetDelete 238 | return 0 239 | case 48: // DescribeClientQuotas 240 | if apiVersion >= 1 { 241 | return 1 242 | } 243 | return 0 244 | case 49: // AlterClientQuotas 245 | if apiVersion >= 1 { 246 | return 1 247 | } 248 | return 0 249 | case 50: // DescribeUserScramCredentials 250 | return 1 251 | case 51: // AlterUserScramCredentials 252 | return 1 253 | case 52: // Vote 254 | return 1 255 | case 53: // BeginQuorumEpoch 256 | return 0 257 | case 54: // EndQuorumEpoch 258 | return 0 259 | case 55: // DescribeQuorum 260 | return 1 261 | case 56: // AlterIsr 262 | return 1 263 | case 57: // UpdateFeatures 264 | return 1 265 | case 58: // Envelope 266 | return 1 267 | case 59: // FetchSnapshot 268 | return 1 269 | case 60: // DescribeCluster 270 | return 1 271 | case 61: // DescribeProducers 272 | return 1 
273 | case 62: // BrokerRegistration 274 | return 1 275 | case 63: // BrokerHeartbeat 276 | return 1 277 | case 64: // UnregisterBroker 278 | return 1 279 | case 65: // DescribeTransactions 280 | return 1 281 | case 66: // ListTransactions 282 | return 1 283 | case 67: // AllocateProducerIds 284 | return 1 285 | default: 286 | return -1 287 | } 288 | } 289 | 290 | type ResponseHeader struct { 291 | Length int32 292 | CorrelationID int32 293 | } 294 | 295 | func (r *ResponseHeader) decode(pd packetDecoder, version int16) (err error) { 296 | r.Length, err = pd.getInt32() 297 | if err != nil { 298 | return err 299 | } 300 | if r.Length <= 4 || r.Length > MaxResponseSize { 301 | return PacketDecodingError{fmt.Sprintf("message of length %d too large or too small", r.Length)} 302 | } 303 | 304 | r.CorrelationID, err = pd.getInt32() 305 | 306 | if version >= 1 { 307 | if _, err := pd.getEmptyTaggedFieldArray(); err != nil { 308 | return err 309 | } 310 | } 311 | 312 | return err 313 | } 314 | -------------------------------------------------------------------------------- /aggregator/kafka/timestamp.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "time" 5 | ) 6 | 7 | type Timestamp struct { 8 | *time.Time 9 | } 10 | 11 | func (t Timestamp) decode(pd packetDecoder) error { 12 | millis, err := pd.getInt64() 13 | if err != nil { 14 | return err 15 | } 16 | 17 | // negative timestamps are invalid, in these cases we should return 18 | // a zero time 19 | timestamp := time.Time{} 20 | if millis >= 0 { 21 | timestamp = time.Unix(millis/1000, (millis%1000)*int64(time.Millisecond)) 22 | } 23 | 24 | *t.Time = timestamp 25 | return nil 26 | } 27 | -------------------------------------------------------------------------------- /aggregator/kafka/versions.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "bufio" 5 | "fmt" 6 | "net" 7 | "regexp" 8 | ) 9 | 10 | type none struct{} 11 | 12 | // make []int32 sortable so we can sort partition numbers 13 | type int32Slice []int32 14 | 15 | func (slice int32Slice) Len() int { 16 | return len(slice) 17 | } 18 | 19 | func (slice int32Slice) Less(i, j int) bool { 20 | return slice[i] < slice[j] 21 | } 22 | 23 | func (slice int32Slice) Swap(i, j int) { 24 | slice[i], slice[j] = slice[j], slice[i] 25 | } 26 | 27 | func dupInt32Slice(input []int32) []int32 { 28 | ret := make([]int32, 0, len(input)) 29 | ret = append(ret, input...) 30 | return ret 31 | } 32 | 33 | // Encoder is a simple interface for any type that can be encoded as an array of bytes 34 | // in order to be sent as the key or value of a Kafka message. Length() is provided as an 35 | // optimization, and must return the same as len() on the result of Encode(). 36 | type Encoder interface { 37 | Encode() ([]byte, error) 38 | Length() int 39 | } 40 | 41 | // make strings and byte slices encodable for convenience so they can be used as keys 42 | // and/or values in kafka messages 43 | 44 | // StringEncoder implements the Encoder interface for Go strings so that they can be used 45 | // as the Key or Value in a ProducerMessage. 
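
The Encoder contract above is small but strict: Length() must report exactly what Encode() will later produce, because callers pre-size buffers from it. A self-contained sketch exercising that invariant (the interface and string encoder are copied locally so the snippet compiles on its own):

package main

import "fmt"

// Local copies of the package's Encoder contract and string encoder.
type Encoder interface {
	Encode() ([]byte, error)
	Length() int
}

type StringEncoder string

func (s StringEncoder) Encode() ([]byte, error) { return []byte(s), nil }
func (s StringEncoder) Length() int             { return len(s) }

func main() {
	var e Encoder = StringEncoder("order-42")
	b, _ := e.Encode()
	// The invariant callers rely on: Length matches Encode's output.
	fmt.Println(e.Length() == len(b)) // true
}
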
46 | type StringEncoder string 47 | 48 | func (s StringEncoder) Encode() ([]byte, error) { 49 | return []byte(s), nil 50 | } 51 | 52 | func (s StringEncoder) Length() int { 53 | return len(s) 54 | } 55 | 56 | // ByteEncoder implements the Encoder interface for Go byte slices so that they can be used 57 | // as the Key or Value in a ProducerMessage. 58 | type ByteEncoder []byte 59 | 60 | func (b ByteEncoder) Encode() ([]byte, error) { 61 | return b, nil 62 | } 63 | 64 | func (b ByteEncoder) Length() int { 65 | return len(b) 66 | } 67 | 68 | // bufConn wraps a net.Conn with a buffer for reads to reduce the number of 69 | // reads that trigger syscalls. 70 | type bufConn struct { 71 | net.Conn 72 | buf *bufio.Reader 73 | } 74 | 75 | func newBufConn(conn net.Conn) *bufConn { 76 | return &bufConn{ 77 | Conn: conn, 78 | buf: bufio.NewReader(conn), 79 | } 80 | } 81 | 82 | func (bc *bufConn) Read(b []byte) (n int, err error) { 83 | return bc.buf.Read(b) 84 | } 85 | 86 | func newKafkaVersion(major, minor, veryMinor, patch uint) KafkaVersion { 87 | return KafkaVersion{ 88 | version: [4]uint{major, minor, veryMinor, patch}, 89 | } 90 | } 91 | 92 | // IsAtLeast return true if and only if the version it is called on is 93 | // greater than or equal to the version passed in: 94 | // 95 | // V1.IsAtLeast(V2) // false 96 | // V2.IsAtLeast(V1) // true 97 | func (v KafkaVersion) IsAtLeast(other KafkaVersion) bool { 98 | for i := range v.version { 99 | if v.version[i] > other.version[i] { 100 | return true 101 | } else if v.version[i] < other.version[i] { 102 | return false 103 | } 104 | } 105 | return true 106 | } 107 | 108 | // Effective constants defining the supported kafka versions. 109 | var ( 110 | V0_8_2_0 = newKafkaVersion(0, 8, 2, 0) 111 | V0_8_2_1 = newKafkaVersion(0, 8, 2, 1) 112 | V0_8_2_2 = newKafkaVersion(0, 8, 2, 2) 113 | V0_9_0_0 = newKafkaVersion(0, 9, 0, 0) 114 | V0_9_0_1 = newKafkaVersion(0, 9, 0, 1) 115 | V0_10_0_0 = newKafkaVersion(0, 10, 0, 0) 116 | V0_10_0_1 = newKafkaVersion(0, 10, 0, 1) 117 | V0_10_1_0 = newKafkaVersion(0, 10, 1, 0) 118 | V0_10_1_1 = newKafkaVersion(0, 10, 1, 1) 119 | V0_10_2_0 = newKafkaVersion(0, 10, 2, 0) 120 | V0_10_2_1 = newKafkaVersion(0, 10, 2, 1) 121 | V0_10_2_2 = newKafkaVersion(0, 10, 2, 2) 122 | V0_11_0_0 = newKafkaVersion(0, 11, 0, 0) 123 | V0_11_0_1 = newKafkaVersion(0, 11, 0, 1) 124 | V0_11_0_2 = newKafkaVersion(0, 11, 0, 2) 125 | V1_0_0_0 = newKafkaVersion(1, 0, 0, 0) 126 | V1_0_1_0 = newKafkaVersion(1, 0, 1, 0) 127 | V1_0_2_0 = newKafkaVersion(1, 0, 2, 0) 128 | V1_1_0_0 = newKafkaVersion(1, 1, 0, 0) 129 | V1_1_1_0 = newKafkaVersion(1, 1, 1, 0) 130 | V2_0_0_0 = newKafkaVersion(2, 0, 0, 0) 131 | V2_0_1_0 = newKafkaVersion(2, 0, 1, 0) 132 | V2_1_0_0 = newKafkaVersion(2, 1, 0, 0) 133 | V2_1_1_0 = newKafkaVersion(2, 1, 1, 0) 134 | V2_2_0_0 = newKafkaVersion(2, 2, 0, 0) 135 | V2_2_1_0 = newKafkaVersion(2, 2, 1, 0) 136 | V2_2_2_0 = newKafkaVersion(2, 2, 2, 0) 137 | V2_3_0_0 = newKafkaVersion(2, 3, 0, 0) 138 | V2_3_1_0 = newKafkaVersion(2, 3, 1, 0) 139 | V2_4_0_0 = newKafkaVersion(2, 4, 0, 0) 140 | V2_4_1_0 = newKafkaVersion(2, 4, 1, 0) 141 | V2_5_0_0 = newKafkaVersion(2, 5, 0, 0) 142 | V2_5_1_0 = newKafkaVersion(2, 5, 1, 0) 143 | V2_6_0_0 = newKafkaVersion(2, 6, 0, 0) 144 | V2_6_1_0 = newKafkaVersion(2, 6, 1, 0) 145 | V2_6_2_0 = newKafkaVersion(2, 6, 2, 0) 146 | V2_6_3_0 = newKafkaVersion(2, 6, 3, 0) 147 | V2_7_0_0 = newKafkaVersion(2, 7, 0, 0) 148 | V2_7_1_0 = newKafkaVersion(2, 7, 1, 0) 149 | V2_7_2_0 = newKafkaVersion(2, 7, 2, 0) 150 | V2_8_0_0 = 
newKafkaVersion(2, 8, 0, 0) 151 | V2_8_1_0 = newKafkaVersion(2, 8, 1, 0) 152 | V2_8_2_0 = newKafkaVersion(2, 8, 2, 0) 153 | V3_0_0_0 = newKafkaVersion(3, 0, 0, 0) 154 | V3_0_1_0 = newKafkaVersion(3, 0, 1, 0) 155 | V3_0_2_0 = newKafkaVersion(3, 0, 2, 0) 156 | V3_1_0_0 = newKafkaVersion(3, 1, 0, 0) 157 | V3_1_1_0 = newKafkaVersion(3, 1, 1, 0) 158 | V3_1_2_0 = newKafkaVersion(3, 1, 2, 0) 159 | V3_2_0_0 = newKafkaVersion(3, 2, 0, 0) 160 | V3_2_1_0 = newKafkaVersion(3, 2, 1, 0) 161 | V3_2_2_0 = newKafkaVersion(3, 2, 2, 0) 162 | V3_2_3_0 = newKafkaVersion(3, 2, 3, 0) 163 | V3_3_0_0 = newKafkaVersion(3, 3, 0, 0) 164 | V3_3_1_0 = newKafkaVersion(3, 3, 1, 0) 165 | V3_3_2_0 = newKafkaVersion(3, 3, 2, 0) 166 | V3_4_0_0 = newKafkaVersion(3, 4, 0, 0) 167 | V3_4_1_0 = newKafkaVersion(3, 4, 1, 0) 168 | V3_5_0_0 = newKafkaVersion(3, 5, 0, 0) 169 | V3_5_1_0 = newKafkaVersion(3, 5, 1, 0) 170 | V3_6_0_0 = newKafkaVersion(3, 6, 0, 0) 171 | 172 | SupportedVersions = []KafkaVersion{ 173 | V0_8_2_0, 174 | V0_8_2_1, 175 | V0_8_2_2, 176 | V0_9_0_0, 177 | V0_9_0_1, 178 | V0_10_0_0, 179 | V0_10_0_1, 180 | V0_10_1_0, 181 | V0_10_1_1, 182 | V0_10_2_0, 183 | V0_10_2_1, 184 | V0_10_2_2, 185 | V0_11_0_0, 186 | V0_11_0_1, 187 | V0_11_0_2, 188 | V1_0_0_0, 189 | V1_0_1_0, 190 | V1_0_2_0, 191 | V1_1_0_0, 192 | V1_1_1_0, 193 | V2_0_0_0, 194 | V2_0_1_0, 195 | V2_1_0_0, 196 | V2_1_1_0, 197 | V2_2_0_0, 198 | V2_2_1_0, 199 | V2_2_2_0, 200 | V2_3_0_0, 201 | V2_3_1_0, 202 | V2_4_0_0, 203 | V2_4_1_0, 204 | V2_5_0_0, 205 | V2_5_1_0, 206 | V2_6_0_0, 207 | V2_6_1_0, 208 | V2_6_2_0, 209 | V2_7_0_0, 210 | V2_7_1_0, 211 | V2_8_0_0, 212 | V2_8_1_0, 213 | V2_8_2_0, 214 | V3_0_0_0, 215 | V3_0_1_0, 216 | V3_0_2_0, 217 | V3_1_0_0, 218 | V3_1_1_0, 219 | V3_1_2_0, 220 | V3_2_0_0, 221 | V3_2_1_0, 222 | V3_2_2_0, 223 | V3_2_3_0, 224 | V3_3_0_0, 225 | V3_3_1_0, 226 | V3_3_2_0, 227 | V3_4_0_0, 228 | V3_4_1_0, 229 | V3_5_0_0, 230 | V3_5_1_0, 231 | V3_6_0_0, 232 | } 233 | MinVersion = V0_8_2_0 234 | MaxVersion = V3_6_0_0 235 | DefaultVersion = V2_1_0_0 236 | 237 | // reduced set of protocol versions to matrix test 238 | fvtRangeVersions = []KafkaVersion{ 239 | V0_8_2_2, 240 | V0_10_2_2, 241 | V1_0_2_0, 242 | V1_1_1_0, 243 | V2_0_1_0, 244 | V2_2_2_0, 245 | V2_4_1_0, 246 | V2_6_2_0, 247 | V2_8_2_0, 248 | V3_1_2_0, 249 | V3_3_2_0, 250 | V3_6_0_0, 251 | } 252 | ) 253 | 254 | var ( 255 | // This regex validates that a string complies with the pre kafka 1.0.0 format for version strings, for example 0.11.0.3 256 | validPreKafka1Version = regexp.MustCompile(`^0\.\d+\.\d+\.\d+$`) 257 | 258 | // This regex validates that a string complies with the post Kafka 1.0.0 format, for example 1.0.0 259 | validPostKafka1Version = regexp.MustCompile(`^\d+\.\d+\.\d+$`) 260 | ) 261 | 262 | // ParseKafkaVersion parses and returns kafka version or error from a string 263 | func ParseKafkaVersion(s string) (KafkaVersion, error) { 264 | if len(s) < 5 { 265 | return DefaultVersion, fmt.Errorf("invalid version `%s`", s) 266 | } 267 | var major, minor, veryMinor, patch uint 268 | var err error 269 | if s[0] == '0' { 270 | err = scanKafkaVersion(s, validPreKafka1Version, "0.%d.%d.%d", [3]*uint{&minor, &veryMinor, &patch}) 271 | } else { 272 | err = scanKafkaVersion(s, validPostKafka1Version, "%d.%d.%d", [3]*uint{&major, &minor, &veryMinor}) 273 | } 274 | if err != nil { 275 | return DefaultVersion, err 276 | } 277 | return newKafkaVersion(major, minor, veryMinor, patch), nil 278 | } 279 | 280 | func scanKafkaVersion(s string, pattern *regexp.Regexp, format string, v [3]*uint) error 
{ 281 | if !pattern.MatchString(s) { 282 | return fmt.Errorf("invalid version `%s`", s) 283 | } 284 | _, err := fmt.Sscanf(s, format, v[0], v[1], v[2]) 285 | return err 286 | } 287 | 288 | func (v KafkaVersion) String() string { 289 | if v.version[0] == 0 { 290 | return fmt.Sprintf("0.%d.%d.%d", v.version[1], v.version[2], v.version[3]) 291 | } 292 | 293 | return fmt.Sprintf("%d.%d.%d", v.version[0], v.version[1], v.version[2]) 294 | } 295 | -------------------------------------------------------------------------------- /aggregator/kafka/ztsd.go: -------------------------------------------------------------------------------- 1 | package kafka 2 | 3 | import ( 4 | "sync" 5 | 6 | "github.com/klauspost/compress/zstd" 7 | ) 8 | 9 | // zstdMaxBufferedEncoders maximum number of not-in-use zstd encoders 10 | // If the pool of encoders is exhausted then new encoders will be created on the fly 11 | const zstdMaxBufferedEncoders = 1 12 | 13 | type ZstdEncoderParams struct { 14 | Level int 15 | } 16 | type ZstdDecoderParams struct { 17 | } 18 | 19 | var zstdDecMap sync.Map 20 | 21 | var zstdAvailableEncoders sync.Map 22 | 23 | func getZstdEncoderChannel(params ZstdEncoderParams) chan *zstd.Encoder { 24 | if c, ok := zstdAvailableEncoders.Load(params); ok { 25 | return c.(chan *zstd.Encoder) 26 | } 27 | c, _ := zstdAvailableEncoders.LoadOrStore(params, make(chan *zstd.Encoder, zstdMaxBufferedEncoders)) 28 | return c.(chan *zstd.Encoder) 29 | } 30 | 31 | func getZstdEncoder(params ZstdEncoderParams) *zstd.Encoder { 32 | select { 33 | case enc := <-getZstdEncoderChannel(params): 34 | return enc 35 | default: 36 | encoderLevel := zstd.SpeedDefault 37 | if params.Level != CompressionLevelDefault { 38 | encoderLevel = zstd.EncoderLevelFromZstd(params.Level) 39 | } 40 | zstdEnc, _ := zstd.NewWriter(nil, zstd.WithZeroFrames(true), 41 | zstd.WithEncoderLevel(encoderLevel), 42 | zstd.WithEncoderConcurrency(1)) 43 | return zstdEnc 44 | } 45 | } 46 | 47 | func releaseEncoder(params ZstdEncoderParams, enc *zstd.Encoder) { 48 | select { 49 | case getZstdEncoderChannel(params) <- enc: 50 | default: 51 | } 52 | } 53 | 54 | func getDecoder(params ZstdDecoderParams) *zstd.Decoder { 55 | if ret, ok := zstdDecMap.Load(params); ok { 56 | return ret.(*zstd.Decoder) 57 | } 58 | // It's possible to race and create multiple new readers. 59 | // Only one will survive GC after use. 
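
That comment marks a deliberate trade-off: a plain Load-then-Store, as used just below, tolerates briefly constructing duplicate readers, whereas sync.Map's LoadOrStore guarantees a single shared winner at the cost of still building a value that may be thrown away. A standalone sketch of the stricter idiom (hypothetical cache, not this file's code):

package main

import (
	"fmt"
	"sync"
)

var cache sync.Map

// getOrCreate returns exactly one shared value per key even under races:
// losers of LoadOrStore simply discard the value they just built.
func getOrCreate(key string) *int {
	if v, ok := cache.Load(key); ok {
		return v.(*int)
	}
	fresh := new(int) // may be constructed concurrently by several goroutines
	v, _ := cache.LoadOrStore(key, fresh)
	return v.(*int)
}

func main() {
	a, b := getOrCreate("decoder"), getOrCreate("decoder")
	fmt.Println(a == b) // true: both callers share one cached value
}
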
60 | zstdDec, _ := zstd.NewReader(nil, zstd.WithDecoderConcurrency(0)) 61 | zstdDecMap.Store(params, zstdDec) 62 | return zstdDec 63 | } 64 | 65 | func zstdDecompress(params ZstdDecoderParams, dst, src []byte) ([]byte, error) { 66 | return getDecoder(params).DecodeAll(src, dst) 67 | } 68 | -------------------------------------------------------------------------------- /aggregator/socket.go: -------------------------------------------------------------------------------- 1 | package aggregator 2 | 3 | import ( 4 | "context" 5 | "sync" 6 | 7 | "github.com/ddosify/alaz/log" 8 | ) 9 | 10 | type AddressPair struct { 11 | Saddr string `json:"saddr"` 12 | Sport uint16 `json:"sport"` 13 | Daddr string `json:"daddr"` 14 | Dport uint16 `json:"dport"` 15 | } 16 | 17 | // We need to keep track of the following 18 | // in order to find relationships between 19 | // connections and pods/services 20 | type SockInfo struct { 21 | Pid uint32 `json:"pid"` 22 | Fd uint64 `json:"fd"` 23 | Saddr string `json:"saddr"` 24 | Sport uint16 `json:"sport"` 25 | Daddr string `json:"daddr"` 26 | Dport uint16 `json:"dport"` 27 | } 28 | 29 | // type SocketMap 30 | type SocketMap struct { 31 | mu *sync.RWMutex 32 | pid uint32 33 | M map[uint64]*SocketLine `json:"fdToSockLine"` // fd -> SockLine 34 | waitingFds chan uint64 35 | 36 | processedFds map[uint64]struct{} 37 | processedFdsmu sync.RWMutex 38 | closeCh chan struct{} 39 | ctx context.Context 40 | } 41 | 42 | // only one worker can create socket lines for a particular process (SocketMap) 43 | func (sm *SocketMap) ProcessSocketLineCreationRequests() { 44 | for { 45 | select { 46 | case <-sm.closeCh: 47 | sm.M = nil 48 | return 49 | case fd := <-sm.waitingFds: 50 | if _, ok := sm.M[fd]; !ok { 51 | sm.createSocketLine(fd, true) 52 | log.Logger.Debug().Ctx(sm.ctx). 53 | Uint32("pid", sm.pid). 54 | Uint64("fd", fd). 
55 | Msgf("created socket line for fd:%d", fd) 56 | } 57 | } 58 | } 59 | } 60 | 61 | func (sm *SocketMap) SignalSocketLine(ctx context.Context, fd uint64) { 62 | sm.processedFdsmu.RLock() 63 | if _, ok := sm.processedFds[fd]; ok { 64 | sm.processedFdsmu.RUnlock() 65 | return 66 | } else { 67 | sm.processedFdsmu.RUnlock() 68 | 69 | sm.processedFdsmu.Lock() 70 | sm.processedFds[fd] = struct{}{} 71 | sm.processedFdsmu.Unlock() 72 | } 73 | 74 | sm.waitingFds <- fd 75 | } 76 | 77 | func (sm *SocketMap) createSocketLine(fd uint64, fetch bool) { 78 | skLine := NewSocketLine(sm.ctx, sm.pid, fd, fetch) 79 | sm.mu.Lock() 80 | sm.M[fd] = skLine 81 | sm.mu.Unlock() 82 | } 83 | -------------------------------------------------------------------------------- /config/db.go: -------------------------------------------------------------------------------- 1 | package config 2 | 3 | type PostgresConfig struct { 4 | Host string 5 | Port string 6 | Username string 7 | Password string 8 | DBName string 9 | } 10 | 11 | type BackendDSConfig struct { 12 | Host string 13 | MetricsExport bool 14 | GpuMetricsExport bool 15 | MetricsExportInterval int // in seconds 16 | 17 | ReqBufferSize int 18 | ConnBufferSize int 19 | KafkaEventBufferSize int 20 | } 21 | -------------------------------------------------------------------------------- /datastore/datastore.go: -------------------------------------------------------------------------------- 1 | package datastore 2 | 3 | type DataStore interface { 4 | PersistPod(pod Pod, eventType string) error 5 | PersistService(service Service, eventType string) error 6 | PersistReplicaSet(rs ReplicaSet, eventType string) error 7 | PersistDeployment(d Deployment, eventType string) error 8 | PersistEndpoints(e Endpoints, eventType string) error 9 | PersistContainer(c Container, eventType string) error 10 | PersistDaemonSet(ds DaemonSet, eventType string) error 11 | PersistStatefulSet(ss StatefulSet, eventType string) error 12 | 13 | PersistRequest(request *Request) error 14 | 15 | PersistKafkaEvent(request *KafkaEvent) error 16 | 17 | // PersistTraceEvent(trace *l7_req.TraceEvent) error 18 | 19 | PersistAliveConnection(trace *AliveConnection) error 20 | } 21 | -------------------------------------------------------------------------------- /datastore/dto.go: -------------------------------------------------------------------------------- 1 | package datastore 2 | 3 | type Pod struct { 4 | UID string // Pod UID 5 | Name string // Pod Name 6 | Namespace string // Namespace 7 | Image string // Main container image 8 | IP string // Pod IP 9 | OwnerType string // ReplicaSet or nil 10 | OwnerID string // ReplicaSet UID 11 | OwnerName string // ReplicaSet Name 12 | } 13 | 14 | type Service struct { 15 | UID string 16 | Name string 17 | Namespace string 18 | Type string 19 | ClusterIP string 20 | ClusterIPs []string 21 | Ports []struct { 22 | Name string `json:"name"` 23 | Src int32 `json:"src"` 24 | Dest int32 `json:"dest"` 25 | Protocol string `json:"protocol"` 26 | } 27 | } 28 | 29 | type ReplicaSet struct { 30 | UID string // ReplicaSet UID 31 | Name string // ReplicaSet Name 32 | Namespace string // Namespace 33 | OwnerType string // Deployment or nil 34 | OwnerID string // Deployment UID 35 | OwnerName string // Deployment Name 36 | Replicas int32 // Number of replicas 37 | } 38 | 39 | type DaemonSet struct { 40 | UID string // DaemonSet UID 41 | Name string // DaemonSet Name 42 | Namespace string // Namespace 43 | } 44 | 45 | type StatefulSet struct { 46 | UID string // StatefulSet 
UID 47 | Name string // StatefulSet Name 48 | Namespace string // Namespace 49 | } 50 | 51 | type Deployment struct { 52 | UID string // Deployment UID 53 | Name string // Deployment Name 54 | Namespace string // Namespace 55 | Replicas int32 // Number of replicas 56 | } 57 | 58 | type Endpoints struct { 59 | UID string // Endpoints UID 60 | Name string // Endpoints Name 61 | Namespace string // Namespace 62 | Addresses []Address 63 | } 64 | 65 | type AddressIP struct { 66 | Type string `json:"type"` // pod or external 67 | ID string `json:"id"` // Pod UID or empty 68 | Name string `json:"name"` 69 | Namespace string `json:"namespace"` // Pod Namespace or empty 70 | IP string `json:"ip"` // Pod IP or external IP 71 | } 72 | 73 | type AddressPort struct { 74 | Port int32 `json:"port"` // Port number 75 | Protocol string `json:"protocol"` // TCP or UDP 76 | Name string `json:"name"` 77 | } 78 | 79 | // Subsets 80 | type Address struct { 81 | IPs []AddressIP `json:"ips"` 82 | Ports []AddressPort `json:"ports"` 83 | } 84 | 85 | type Container struct { 86 | Name string `json:"name"` 87 | Namespace string `json:"namespace"` 88 | PodUID string `json:"pod"` // Pod UID 89 | Image string `json:"image"` 90 | Ports []struct { 91 | Port int32 `json:"port"` 92 | Protocol string `json:"protocol"` 93 | } `json:"ports"` 94 | } 95 | 96 | type AliveConnection struct { 97 | CheckTime int64 // connection is alive at this time, ms 98 | FromIP string 99 | FromType string 100 | FromUID string 101 | FromPort uint16 102 | ToIP string 103 | ToType string 104 | ToUID string 105 | ToPort uint16 106 | } 107 | 108 | type DirectionalEvent interface { 109 | SetFromUID(string) 110 | SetFromIP(string) 111 | SetFromType(string) 112 | SetFromPort(uint16) 113 | 114 | SetToUID(string) 115 | SetToIP(string) 116 | SetToType(string) 117 | SetToPort(uint16) 118 | 119 | ReverseDirection() 120 | } 121 | 122 | type KafkaEvent struct { 123 | StartTime int64 124 | Latency uint64 // in ns 125 | FromIP string 126 | FromType string 127 | FromUID string 128 | FromPort uint16 129 | ToIP string 130 | ToType string 131 | ToUID string 132 | ToPort uint16 133 | Topic string 134 | Partition uint32 135 | Key string 136 | Value string 137 | Type string // PUBLISH or CONSUME 138 | Tls bool 139 | // dist tracing disabled by default temporarily 140 | // Tid uint32 141 | // Seq uint32 142 | } 143 | 144 | func (ke *KafkaEvent) SetFromUID(uid string) { 145 | ke.FromUID = uid 146 | } 147 | func (ke *KafkaEvent) SetFromIP(ip string) { 148 | ke.FromIP = ip 149 | } 150 | func (ke *KafkaEvent) SetFromType(typ string) { 151 | ke.FromType = typ 152 | } 153 | func (ke *KafkaEvent) SetFromPort(port uint16) { 154 | ke.FromPort = port 155 | } 156 | 157 | func (ke *KafkaEvent) SetToUID(uid string) { 158 | ke.ToUID = uid 159 | } 160 | func (ke *KafkaEvent) SetToIP(ip string) { 161 | ke.ToIP = ip 162 | } 163 | func (ke *KafkaEvent) SetToType(typ string) { 164 | ke.ToType = typ 165 | } 166 | func (ke *KafkaEvent) SetToPort(port uint16) { 167 | ke.ToPort = port 168 | } 169 | 170 | func (req *KafkaEvent) ReverseDirection() { 171 | req.FromIP, req.ToIP = req.ToIP, req.FromIP 172 | req.FromPort, req.ToPort = req.ToPort, req.FromPort 173 | req.FromUID, req.ToUID = req.ToUID, req.FromUID 174 | req.FromType, req.ToType = req.ToType, req.FromType 175 | } 176 | 177 | type Request struct { 178 | StartTime int64 179 | Latency uint64 // in ns 180 | FromIP string 181 | FromType string 182 | FromUID string 183 | FromPort uint16 184 | ToIP string 185 | ToType string 186 | ToUID string 
187 | ToPort uint16 188 | Protocol string 189 | Tls bool 190 | Completed bool 191 | StatusCode uint32 192 | FailReason string 193 | Method string 194 | Path string 195 | // dist tracing disabled by default temporarily 196 | // Tid uint32 197 | // Seq uint32 198 | } 199 | 200 | func (r *Request) SetFromUID(uid string) { 201 | r.FromUID = uid 202 | } 203 | func (r *Request) SetFromIP(ip string) { 204 | r.FromIP = ip 205 | } 206 | func (r *Request) SetFromType(typ string) { 207 | r.FromType = typ 208 | } 209 | func (r *Request) SetFromPort(port uint16) { 210 | r.FromPort = port 211 | } 212 | 213 | func (r *Request) SetToUID(uid string) { 214 | r.ToUID = uid 215 | } 216 | func (r *Request) SetToIP(ip string) { 217 | r.ToIP = ip 218 | } 219 | func (r *Request) SetToType(typ string) { 220 | r.ToType = typ 221 | } 222 | func (r *Request) SetToPort(port uint16) { 223 | r.ToPort = port 224 | } 225 | 226 | func (req *Request) ReverseDirection() { 227 | req.FromIP, req.ToIP = req.ToIP, req.FromIP 228 | req.FromPort, req.ToPort = req.ToPort, req.FromPort 229 | req.FromUID, req.ToUID = req.ToUID, req.FromUID 230 | req.FromType, req.ToType = req.ToType, req.FromType 231 | } 232 | 233 | type BackendResponse struct { 234 | Msg string `json:"msg"` 235 | Errors []struct { 236 | EventNum int `json:"event_num"` 237 | Event interface{} `json:"event"` 238 | Error string `json:"error"` 239 | } `json:"errors"` 240 | } 241 | 242 | type ReqBackendReponse struct { 243 | Msg string `json:"msg"` 244 | Errors []struct { 245 | EventNum int `json:"request_num"` 246 | Event interface{} `json:"request"` 247 | Error string `json:"errors"` 248 | } `json:"errors"` 249 | } 250 | -------------------------------------------------------------------------------- /datastore/payload.go: -------------------------------------------------------------------------------- 1 | package datastore 2 | 3 | type Metadata struct { 4 | MonitoringID string `json:"monitoring_id"` 5 | IdempotencyKey string `json:"idempotency_key"` 6 | NodeID string `json:"node_id"` 7 | AlazVersion string `json:"alaz_version"` 8 | } 9 | 10 | type HealthCheckPayload struct { 11 | Metadata Metadata `json:"metadata"` 12 | Info struct { 13 | TracingEnabled bool `json:"tracing"` 14 | MetricsEnabled bool `json:"metrics"` 15 | LogsEnabled bool `json:"logs"` 16 | NamespaceFilter string `json:"namespace_filter"` 17 | } `json:"alaz_info"` 18 | Telemetry struct { 19 | KernelVersion string `json:"kernel_version"` 20 | K8sVersion string `json:"k8s_version"` 21 | CloudProvider string `json:"cloud_provider"` 22 | } `json:"telemetry"` 23 | } 24 | 25 | type EventPayload struct { 26 | Metadata Metadata `json:"metadata"` 27 | Events []interface{} `json:"events"` 28 | } 29 | 30 | type PodEvent struct { 31 | UID string `json:"uid"` 32 | EventType string `json:"event_type"` 33 | Name string `json:"name"` 34 | Namespace string `json:"namespace"` 35 | IP string `json:"ip"` 36 | OwnerType string `json:"owner_type"` 37 | OwnerName string `json:"owner_name"` 38 | OwnerID string `json:"owner_id"` 39 | } 40 | 41 | type SvcEvent struct { 42 | UID string `json:"uid"` 43 | EventType string `json:"event_type"` 44 | Name string `json:"name"` 45 | Namespace string `json:"namespace"` 46 | Type string `json:"type"` 47 | ClusterIPs []string `json:"cluster_ips"` 48 | Ports []struct { 49 | Name string `json:"name"` 50 | Src int32 `json:"src"` 51 | Dest int32 `json:"dest"` 52 | Protocol string `json:"protocol"` 53 | } `json:"ports"` 54 | } 55 | 56 | type RsEvent struct { 57 | UID string `json:"uid"` 58 | 
EventType string `json:"event_type"` 59 | Name string `json:"name"` 60 | Namespace string `json:"namespace"` 61 | Replicas int32 `json:"replicas"` 62 | OwnerType string `json:"owner_type"` 63 | OwnerName string `json:"owner_name"` 64 | OwnerID string `json:"owner_id"` 65 | } 66 | 67 | type DsEvent struct { 68 | UID string `json:"uid"` 69 | EventType string `json:"event_type"` 70 | Name string `json:"name"` 71 | Namespace string `json:"namespace"` 72 | } 73 | type SsEvent struct { 74 | UID string `json:"uid"` 75 | EventType string `json:"event_type"` 76 | Name string `json:"name"` 77 | Namespace string `json:"namespace"` 78 | } 79 | 80 | type DepEvent struct { 81 | UID string `json:"uid"` 82 | EventType string `json:"event_type"` 83 | Name string `json:"name"` 84 | Namespace string `json:"namespace"` 85 | Replicas int32 `json:"replicas"` 86 | } 87 | 88 | type EpEvent struct { 89 | UID string `json:"uid"` 90 | EventType string `json:"event_type"` 91 | Name string `json:"name"` 92 | Namespace string `json:"namespace"` 93 | Addresses []Address `json:"addresses"` 94 | } 95 | 96 | type ContainerEvent struct { 97 | UID string `json:"uid"` 98 | EventType string `json:"event_type"` 99 | Name string `json:"name"` 100 | Namespace string `json:"namespace"` 101 | Pod string `json:"pod"` 102 | Image string `json:"image"` 103 | Ports []struct { 104 | Port int32 `json:"port"` 105 | Protocol string `json:"protocol"` 106 | } `json:"ports"` 107 | } 108 | 109 | // 0) StartTime 110 | // 1) Latency 111 | // 2) Source IP 112 | // 3) Source Type 113 | // 4) Source ID 114 | // 5) Source Port 115 | // 6) Destination IP 116 | // 7) Destination Type 117 | // 8) Destination ID 118 | // 9) Destination Port 119 | // 10) Protocol 120 | // 11) Response Status Code 121 | // 12) Fail Reason // TODO: not used yet 122 | // 13) Method 123 | // 14) Path 124 | // 15) Encrypted (bool) 125 | type ReqInfo [16]interface{} 126 | 127 | // dist tracing disabled 128 | // 16) Seq 129 | // 17) Tid 130 | 131 | type RequestsPayload struct { 132 | Metadata Metadata `json:"metadata"` 133 | Requests []*ReqInfo `json:"requests"` 134 | } 135 | 136 | // 0) CheckTime // connection is alive at that time 137 | // 1) Source IP 138 | // 2) Source Type 139 | // 3) Source ID 140 | // 4) Source Port 141 | // 5) Destination IP 142 | // 6) Destination Type 143 | // 7) Destination ID 144 | // 8) Destination Port 145 | type ConnInfo [9]interface{} 146 | 147 | type ConnInfoPayload struct { 148 | Metadata Metadata `json:"metadata"` 149 | Connections []*ConnInfo `json:"connections"` 150 | } 151 | 152 | // 0) Timestamp 153 | // 1) Tcp Seq Num 154 | // 2) Tid 155 | // 3) Ingress(true), Egress(false) 156 | type TraceInfo [4]interface{} 157 | 158 | type TracePayload struct { 159 | Metadata Metadata `json:"metadata"` 160 | Traces []*TraceInfo `json:"traffic"` 161 | } 162 | 163 | // 0) StartTime 164 | // 1) Latency 165 | // 2) Source IP 166 | // 3) Source Type 167 | // 4) Source ID 168 | // 5) Source Port 169 | // 6) Destination IP 170 | // 7) Destination Type 171 | // 8) Destination ID 172 | // 9) Destination Port 173 | // 10) Topic 174 | // 11) Partition 175 | // 12) Key 176 | // 13) Value 177 | // 14) Type 178 | // 15) Encrypted (bool) 179 | type KafkaEventInfo [16]interface{} 180 | 181 | // dist tracing disabled 182 | // 16) Seq 183 | // 17) Tid 184 | 185 | type KafkaEventInfoPayload struct { 186 | Metadata Metadata `json:"metadata"` 187 | KafkaEvents []*KafkaEventInfo `json:"kafka_events"` 188 | } 189 | 190 | func convertPodToPodEvent(pod Pod, eventType string) 
PodEvent { 191 | return PodEvent{ 192 | UID: pod.UID, 193 | EventType: eventType, 194 | Name: pod.Name, 195 | Namespace: pod.Namespace, 196 | IP: pod.IP, 197 | OwnerType: pod.OwnerType, 198 | OwnerName: pod.OwnerName, 199 | OwnerID: pod.OwnerID, 200 | } 201 | } 202 | 203 | func convertSvcToSvcEvent(service Service, eventType string) SvcEvent { 204 | return SvcEvent{ 205 | UID: service.UID, 206 | EventType: eventType, 207 | Name: service.Name, 208 | Namespace: service.Namespace, 209 | Type: service.Type, 210 | ClusterIPs: service.ClusterIPs, 211 | Ports: service.Ports, 212 | } 213 | } 214 | 215 | func convertRsToRsEvent(rs ReplicaSet, eventType string) RsEvent { 216 | return RsEvent{ 217 | UID: rs.UID, 218 | EventType: eventType, 219 | Name: rs.Name, 220 | Namespace: rs.Namespace, 221 | Replicas: rs.Replicas, 222 | OwnerType: rs.OwnerType, 223 | OwnerName: rs.OwnerName, 224 | OwnerID: rs.OwnerID, 225 | } 226 | } 227 | 228 | func convertDsToDsEvent(ds DaemonSet, eventType string) DsEvent { 229 | return DsEvent{ 230 | UID: ds.UID, 231 | EventType: eventType, 232 | Name: ds.Name, 233 | Namespace: ds.Namespace, 234 | } 235 | } 236 | 237 | func convertSsToSsEvent(ss StatefulSet, eventType string) SsEvent { 238 | return SsEvent{ 239 | UID: ss.UID, 240 | EventType: eventType, 241 | Name: ss.Name, 242 | Namespace: ss.Namespace, 243 | } 244 | } 245 | 246 | func convertDepToDepEvent(d Deployment, eventType string) DepEvent { 247 | return DepEvent{ 248 | UID: d.UID, 249 | EventType: eventType, 250 | Name: d.Name, 251 | Namespace: d.Namespace, 252 | Replicas: d.Replicas, 253 | } 254 | } 255 | 256 | func convertEpToEpEvent(ep Endpoints, eventType string) EpEvent { 257 | return EpEvent{ 258 | UID: ep.UID, 259 | EventType: eventType, 260 | Name: ep.Name, 261 | Namespace: ep.Namespace, 262 | Addresses: ep.Addresses, 263 | } 264 | } 265 | 266 | func convertContainerToContainerEvent(c Container, eventType string) ContainerEvent { 267 | return ContainerEvent{ 268 | EventType: eventType, 269 | Name: c.Name, 270 | Namespace: c.Namespace, 271 | Pod: c.PodUID, 272 | Image: c.Image, 273 | Ports: c.Ports, 274 | } 275 | } 276 | -------------------------------------------------------------------------------- /docs/syscalls.txt: -------------------------------------------------------------------------------- 1 | Use the following command for man pages for syscalls: 2 | > man 2 <syscall_name> 3 | 4 | // Libbpf docs 5 | https://libbpf.readthedocs.io/en/latest/ 6 | 7 | // In the libbpf headers, there are macros that are helpful while declaring funcs, getting params. 
Ex: BPF_KPROBE -------------------------------------------------------------------------------- /ebpf-builder/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu:22.04 2 | 3 | ENV LIBBPF_VERSION 1.2.2 4 | ENV GOLANG_VERSION 1.22.1 5 | 6 | # Install Clang and LLVM Strip 7 | RUN apt-get update && apt-get install -y clang-14 llvm && \ 8 | update-alternatives --install /usr/bin/clang clang /usr/bin/clang-14 100 && \ 9 | update-alternatives --install /usr/bin/llvm-strip llvm-strip /usr/bin/llvm-strip-14 100 10 | 11 | # Install Make 12 | RUN apt-get update && apt-get install -y make 13 | RUN apt-get install -y gcc-multilib 14 | 15 | # Install libbpf dependencies 16 | RUN apt-get update && apt-get install -y bison build-essential cmake flex git libelf-dev libssl-dev libudev-dev pkg-config wget 17 | 18 | # Install libbpf 19 | RUN wget --quiet https://github.com/libbpf/libbpf/archive/refs/tags/v${LIBBPF_VERSION}.tar.gz && \ 20 | tar -xzf v${LIBBPF_VERSION}.tar.gz && \ 21 | rm v${LIBBPF_VERSION}.tar.gz && \ 22 | cd libbpf-${LIBBPF_VERSION}/src && \ 23 | make && make install 24 | 25 | # Install Go 26 | RUN wget -q https://golang.org/dl/go${GOLANG_VERSION}.linux-amd64.tar.gz && \ 27 | tar -C /usr/local -xzf go${GOLANG_VERSION}.linux-amd64.tar.gz && \ 28 | rm go${GOLANG_VERSION}.linux-amd64.tar.gz 29 | 30 | ENV PATH="/usr/local/go/bin:${PATH}" 31 | 32 | # Set the working directory 33 | WORKDIR /app 34 | 35 | # Copy your application code to the container 36 | COPY . /app 37 | 38 | # Run your application 39 | CMD ["bash"] 40 | -------------------------------------------------------------------------------- /ebpf/bpf.go: -------------------------------------------------------------------------------- 1 | package ebpf 2 | 3 | import "context" 4 | 5 | type Program interface { 6 | Attach() // attach links to programs, in case error process must exit 7 | InitMaps() // initialize bpf map readers, must be called before Consume 8 | Consume(ctx context.Context, ch chan interface{}) // consume bpf events, publishes to chan provided 9 | Close() // release resources 10 | } 11 | -------------------------------------------------------------------------------- /ebpf/c/amqp.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | // AMQP is a binary protocol. Information is organised into "frames", of various types. Frames carry 3 | // protocol methods and other information. All frames have the same general format: 4 | // frame header, payload and frame end. The frame payload format depends on the frame type. 5 | 6 | // Within a single socket connection, there can be multiple independent threads of control, called "channels". 7 | // Each frame is numbered with a channel number. By interleaving their frames, different channels share the 8 | // connection. 9 | 10 | // The AMQP client and server negotiate the protocol. 11 | 12 | 13 | // https://www.rabbitmq.com/resources/specs/amqp0-9-1.pdf 14 | // 2.3.5 Frame Details 15 | 16 | // All frames consist of a header (7 octets), a payload of arbitrary size, and a 'frame-end' octet that detects 17 | // malformed frames 18 | 19 | 20 | // Each frame in AMQP consists of a header, payload, and frame-end marker. 21 | // The header contains frame-specific information, such as the frame type, channel number, and payload size. 22 | // The payload field, in turn, holds the actual data associated with the frame. 
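
For readers more comfortable in Go, the header-plus-frame-end validation that the probe performs further down can be sketched in userspace like this (an illustration of the layout described above, not code from this repository; parseFrame is a hypothetical helper):

package main

import (
	"encoding/binary"
	"fmt"
)

const frameEnd = 0xCE // every AMQP 0-9-1 frame terminates with this octet

// parseFrame validates the 7-octet header and the frame-end octet, then
// returns the frame type, channel number and payload bytes.
func parseFrame(buf []byte) (typ byte, channel uint16, payload []byte, err error) {
	if len(buf) < 8 {
		return 0, 0, nil, fmt.Errorf("short frame")
	}
	typ = buf[0]
	channel = binary.BigEndian.Uint16(buf[1:3])
	size := binary.BigEndian.Uint32(buf[3:7])
	if uint64(len(buf)) < 7+uint64(size)+1 {
		return 0, 0, nil, fmt.Errorf("truncated frame")
	}
	if buf[7+size] != frameEnd {
		return 0, 0, nil, fmt.Errorf("missing frame-end octet")
	}
	return typ, channel, buf[7 : 7+size], nil
}

func main() {
	// METHOD frame on channel 1: class-id 60 (Basic), method-id 40 (Publish).
	frame := []byte{0x01, 0x00, 0x01, 0x00, 0x00, 0x00, 0x04, 0x00, 0x3C, 0x00, 0x28, 0xCE}
	typ, ch, payload, err := parseFrame(frame)
	fmt.Println(typ, ch, payload, err) // 1 1 [0 60 0 40] <nil>
}
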
23 | 24 | 25 | // octet -> 1 byte 26 | 27 | // FRAME FORMAT 28 | // 0 1 3 7 size+7 size+8 29 | // +------+---------+--------- +-------------------+------+---------+ 30 | // | type | channel | size | payload | frame-end | 31 | // +------+---------+--------- +-------------------+------+---------+ 32 | // octet short long size octets octet 33 | // --------header------------- 34 | 35 | // FRAME PAYLOAD FORMAT FOR METHOD FRAMES 36 | // 0 2 4 37 | // +----------+----------+-----------+ 38 | // | class-id | method-id| arguments | 39 | // +----------+----------+-----------+ 40 | // short short ... 41 | 42 | 43 | // Frame types and values 44 | // 0x01: METHOD frame - Used to carry methods such as connection setup and channel operations. 45 | // 0x02: HEADER frame - Used to carry content header properties for a message. 46 | // 0x03: BODY frame - Used to carry message body content. (content frames) 47 | // 0x04: HEARTBEAT frame - Used for keep-alive and monitoring purposes. 48 | 49 | 50 | 51 | #define AMQP_FRAME_TYPE_METHOD 0x01 52 | #define AMQP_FRAME_TYPE_HEADER 0x02 53 | #define AMQP_FRAME_TYPE_CONTENT 0x03 54 | #define AMQP_FRAME_TYPE_HEARTBEAT 0x04 55 | 56 | #define AMQP_FRAME_END 0xCE 57 | 58 | #define AMQP_CLASS_CONN 10 // handles connection-related operations, such as establishing and terminating connections, authentication, and handling connection parameters 59 | #define AMQP_CLASS_CHANNEL 20 // including opening and closing channels, flow control, and channel-level exceptions 60 | #define AMQP_CLASS_EXCHANGE 40 // for managing exchanges, which are entities that receive messages from producers and route them to queues based on certain criteria 61 | #define AMQP_CLASS_QUEUE 50 // used for queue-related operations, such as declaring a queue, binding a queue to an exchange, and consuming a queue 62 | #define AMQP_CLASS_BASIC 60 // used for basic message-related operations, such as publishing messages, consuming messages, and handling acknowledgments 63 | 64 | // for rabbitmq methods 65 | #define METHOD_PUBLISH 1 66 | #define METHOD_DELIVER 2 67 | 68 | // Methods differ according to the class they belong to 69 | 70 | // Content frame comes after method frame 71 | // method(1) - content_header(2) - content_body(3) 72 | 73 | // Basic class methods 74 | #define AMQP_METHOD_PUBLISH 40 75 | #define AMQP_METHOD_DELIVER 60 // Deliver 76 | #define AMQP_METHOD_ACK 80 77 | #define AMQP_METHOD_REJECT 90 78 | 79 | static __always_inline 80 | int amqp_method_is(char *buf, __u64 buf_size, __u16 expected_method) { 81 | if (buf_size < 12) { 82 | return 0; 83 | } 84 | __u8 type = 0; 85 | bpf_read_into_from(type,buf); // read the frame type 86 | if (type != AMQP_FRAME_TYPE_METHOD) { 87 | return 0; 88 | } 89 | 90 | __u32 size = 0; 91 | bpf_read_into_from(size,buf+3); // read the frame size 92 | size = bpf_htonl(size); 93 | if (7 + size + 1 > buf_size) { // buf_size is smaller than the frame size 94 | return 0; 95 | } 96 | 97 | __u8 end = 0; 98 | bpf_read_into_from(end,buf+7+size); // read the frame end, which is the last byte of the frame 99 | if (end != AMQP_FRAME_END) { 100 | return 0; 101 | } 102 | 103 | // the frame is a valid method frame 104 | // check the class and method from the frame payload 105 | 106 | __u16 class = 0; 107 | bpf_read_into_from(class,buf+7); // read the class-id 108 | if (bpf_htons(class) != AMQP_CLASS_BASIC) { 109 | return 0; 110 | } 111 | 112 | __u16 method = 0; 113 | bpf_read_into_from(method,buf+9); 114 | if (bpf_htons(method) != expected_method) { 115 | return 0; 116 | } 117 | 118 
| return 1; 119 | } 120 | 121 | static __always_inline 122 | int is_rabbitmq_publish(char *buf, __u64 buf_size) { 123 | return amqp_method_is(buf, buf_size, AMQP_METHOD_PUBLISH); 124 | } 125 | 126 | static __always_inline 127 | int is_rabbitmq_consume(char *buf, __u64 buf_size) { 128 | return amqp_method_is(buf, buf_size, AMQP_METHOD_DELIVER); 129 | } 130 | -------------------------------------------------------------------------------- /ebpf/c/bpf.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | 3 | #include "../headers/bpf.h" 4 | #include "../headers/common.h" 5 | #include "../headers/tcp.h" 6 | #include "../headers/l7_req.h" 7 | 8 | 9 | // order is important 10 | #ifndef __BPF__H 11 | #define __BPF__H 12 | #include 13 | #include 14 | #include 15 | #include 16 | #endif 17 | 18 | #define FILTER_OUT_NON_CONTAINER 19 | // #define DIST_TRACING_ENABLED // disabled by default 20 | 21 | #include 22 | #include "../headers/pt_regs.h" 23 | #include 24 | 25 | #include "../headers/log.h" 26 | 27 | #include "macros.h" 28 | #include "struct.h" 29 | #include "map.h" 30 | 31 | #include "tcp.c" 32 | #include "proc.c" 33 | 34 | #include "http.c" 35 | #include "amqp.c" 36 | #include "postgres.c" 37 | #include "redis.c" 38 | #include "kafka.c" 39 | #include "mysql.c" 40 | #include "mongo.c" 41 | #include "openssl.c" 42 | #include "http2.c" 43 | #include "tcp_sock.c" 44 | #include "go_internal.h" 45 | #include "l7.c" 46 | 47 | char __license[] SEC("license") = "Dual MIT/GPL"; 48 | 49 | 50 | -------------------------------------------------------------------------------- /ebpf/c/bpf_bpfeb.o: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/getanteon/alaz/828b997f7b9e956e411e7480ed727f7d1594ade2/ebpf/c/bpf_bpfeb.o -------------------------------------------------------------------------------- /ebpf/c/bpf_bpfel.o: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/getanteon/alaz/828b997f7b9e956e411e7480ed727f7d1594ade2/ebpf/c/bpf_bpfel.o -------------------------------------------------------------------------------- /ebpf/c/generate.go: -------------------------------------------------------------------------------- 1 | package c 2 | 3 | import ( 4 | "github.com/ddosify/alaz/log" 5 | 6 | "github.com/cilium/ebpf/linux" 7 | "github.com/cilium/ebpf/rlimit" 8 | ) 9 | 10 | // $BPF_CLANG and $BPF_CFLAGS are set by the Makefile. 11 | //go:generate go run github.com/cilium/ebpf/cmd/bpf2go -cc $BPF_CLANG -cflags $BPF_CFLAGS bpf bpf.c -- -I../headers 12 | 13 | var BpfObjs bpfObjects 14 | 15 | func Load() { 16 | // Allow the current process to lock memory for eBPF resources. 17 | if err := rlimit.RemoveMemlock(); err != nil { 18 | log.Logger.Fatal().Err(err).Msg("failed to remove memlock limit") 19 | } 20 | 21 | // Load pre-compiled programs and maps into the kernel. 
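
Load runs once at startup; the agent's collectors are then expected to drive each eBPF program through the Program interface shown in ebpf/bpf.go above. A sketch of that intended call order (the run helper and its buffer size are hypothetical; the real wiring lives in the collector code):

package main

import "context"

// Program mirrors the interface declared in ebpf/bpf.go.
type Program interface {
	Attach()
	InitMaps()
	Consume(ctx context.Context, ch chan interface{})
	Close()
}

// run drives one program: attach probes, open map readers, then stream
// events until the context is cancelled, releasing resources at the end.
func run(ctx context.Context, p Program) <-chan interface{} {
	p.Attach()
	p.InitMaps()
	ch := make(chan interface{}, 1024)
	go p.Consume(ctx, ch)
	go func() {
		<-ctx.Done()
		p.Close()
	}()
	return ch
}

func main() {} // compile anchor; this file is illustration only
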
22 | BpfObjs = bpfObjects{} 23 | if err := loadBpfObjects(&BpfObjs, nil); err != nil { 24 | log.Logger.Fatal().Err(err).Msg("loading objects") 25 | } 26 | 27 | linux.FlushCaches() 28 | } 29 | -------------------------------------------------------------------------------- /ebpf/c/go_internal.h: -------------------------------------------------------------------------------- 1 | struct go_interface { 2 | __s64 type; 3 | void* ptr; 4 | }; 5 | 6 | #if defined(__TARGET_ARCH_x86) 7 | #define GO_PARAM1(x) ((x)->ax) 8 | #define GO_PARAM2(x) ((x)->bx) 9 | #define GO_PARAM3(x) ((x)->cx) 10 | #define GOROUTINE(x) ((x)->r14) 11 | #elif defined(__TARGET_ARCH_arm64) 12 | /* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */ 13 | #define GO_PARAM1(x) (((struct user_pt_regs *)(x))->regs[0]) 14 | #define GO_PARAM2(x) (((struct user_pt_regs *)(x))->regs[1]) 15 | #define GO_PARAM3(x) (((struct user_pt_regs *)(x))->regs[2]) 16 | #define GOROUTINE(x) (((struct user_pt_regs *)(x))->regs[28]) 17 | #endif -------------------------------------------------------------------------------- /ebpf/c/http.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | #define METHOD_UNKNOWN 0 3 | #define METHOD_GET 1 4 | #define METHOD_POST 2 5 | #define METHOD_PUT 3 6 | #define METHOD_PATCH 4 7 | #define METHOD_DELETE 5 8 | #define METHOD_HEAD 6 9 | #define METHOD_CONNECT 7 10 | #define METHOD_OPTIONS 8 11 | #define METHOD_TRACE 9 12 | 13 | #define MIN_METHOD_LEN 8 14 | #define MIN_RESP_LEN 12 15 | 16 | static __always_inline 17 | int parse_http_method(char *buf) { 18 | char buf_prefix[MIN_METHOD_LEN]; 19 | long r = bpf_probe_read(&buf_prefix, sizeof(buf_prefix), (void *)(buf)) ; 20 | 21 | if (r < 0) { 22 | return 0; 23 | } 24 | 25 | if (buf_prefix[0] == 'G' && buf_prefix[1] == 'E' && buf_prefix[2] == 'T') { 26 | return METHOD_GET; 27 | }else if(buf_prefix[0] == 'P' && buf_prefix[1] == 'O' && buf_prefix[2] == 'S' && buf_prefix[3] == 'T'){ 28 | return METHOD_POST; 29 | }else if(buf_prefix[0] == 'P' && buf_prefix[1] == 'U' && buf_prefix[2] == 'T'){ 30 | return METHOD_PUT; 31 | }else if(buf_prefix[0] == 'P' && buf_prefix[1] == 'A' && buf_prefix[2] == 'T' && buf_prefix[3] == 'C' && buf_prefix[4] == 'H'){ 32 | return METHOD_PATCH; 33 | }else if(buf_prefix[0] == 'D' && buf_prefix[1] == 'E' && buf_prefix[2] == 'L' && buf_prefix[3] == 'E' && buf_prefix[4] == 'T' && buf_prefix[5] == 'E'){ 34 | return METHOD_DELETE; 35 | }else if(buf_prefix[0] == 'H' && buf_prefix[1] == 'E' && buf_prefix[2] == 'A' && buf_prefix[3] == 'D'){ 36 | return METHOD_HEAD; 37 | }else if (buf_prefix[0] == 'C' && buf_prefix[1] == 'O' && buf_prefix[2] == 'N' && buf_prefix[3] == 'N' && buf_prefix[4] == 'E' && buf_prefix[5] == 'C' && buf_prefix[6] == 'T'){ 38 | return METHOD_CONNECT; 39 | }else if(buf_prefix[0] == 'O' && buf_prefix[1] == 'P' && buf_prefix[2] == 'T' && buf_prefix[3] == 'I' && buf_prefix[4] == 'O' && buf_prefix[5] == 'N' && buf_prefix[6] == 'S'){ 40 | return METHOD_OPTIONS; 41 | }else if(buf_prefix[0] == 'T' && buf_prefix[1] == 'R' && buf_prefix[2] == 'A' && buf_prefix[3] == 'C' && buf_prefix[4] == 'E'){ 42 | return METHOD_TRACE; 43 | } 44 | return -1; 45 | } 46 | 47 | static __always_inline 48 | int parse_http_status(char *buf) { 49 | 50 | char b[MIN_RESP_LEN]; 51 | long r = bpf_probe_read(&b, sizeof(b), (void *)(buf)) ; 52 | 53 | if (r < 0) { 54 | return 0; 55 | } 56 | 57 | // HTTP/1.1 200 OK 58 | if (b[0] != 'H' || b[1] != 'T' || b[2] != 'T' || b[3] != 'P' || b[4] != 
'/') { 59 | return -1; 60 | } 61 | if (b[5] < '0' || b[5] > '9') { 62 | return -1; 63 | } 64 | if (b[6] != '.') { 65 | return -1; 66 | } 67 | if (b[7] < '0' || b[7] > '9') { 68 | return -1; 69 | } 70 | if (b[8] != ' ') { 71 | return -1; 72 | } 73 | if (b[9] < '0' || b[9] > '9' || b[10] < '0' || b[10] > '9' || b[11] < '0' || b[11] > '9') { 74 | return -1; 75 | } 76 | return (b[9]-'0')*100 + (b[10]-'0')*10 + (b[11]-'0'); 77 | } 78 | -------------------------------------------------------------------------------- /ebpf/c/http2.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | #define CLIENT_FRAME 1 3 | #define SERVER_FRAME 2 4 | 5 | #define MAGIC_MESSAGE_LEN 24 6 | 7 | static __always_inline 8 | int is_http2_magic(char *buf) { 9 | char buf_prefix[MAGIC_MESSAGE_LEN]; 10 | long r = bpf_probe_read(&buf_prefix, sizeof(buf_prefix), (void *)(buf)) ; 11 | 12 | if (r < 0) { 13 | return 0; 14 | } 15 | 16 | /* 17 | PRI * HTTP/2.0 18 | 19 | SM 20 | */ 21 | const char packet_bytes[MAGIC_MESSAGE_LEN] = { 22 | 0x50, 0x52, 0x49, 0x20, 0x2a, 0x20, 0x48, 0x54, 23 | 0x54, 0x50, 0x2f, 0x32, 0x2e, 0x30, 0x0d, 0x0a, 24 | 0x0d, 0x0a, 0x53, 0x4d, 0x0d, 0x0a, 0x0d, 0x0a 25 | }; 26 | 27 | for (int i = 0; i < MAGIC_MESSAGE_LEN; i++) { 28 | if (buf_prefix[i] != packet_bytes[i]) { 29 | return 0; 30 | } 31 | } 32 | 33 | return 1; 34 | } 35 | 36 | static __always_inline 37 | int is_http2_magic_2(char *buf){ 38 | char buf_prefix[MAGIC_MESSAGE_LEN]; 39 | long r = bpf_probe_read(&buf_prefix, sizeof(buf_prefix), (void *)(buf)) ; 40 | 41 | if (r < 0) { 42 | return 0; 43 | } 44 | 45 | 46 | if (buf_prefix[0] == 'P' && buf_prefix[1] == 'R' && buf_prefix[2] == 'I' && buf_prefix[3] == ' ' && buf_prefix[4] == '*' && buf_prefix[5] == ' ' && buf_prefix[6] == 'H' && buf_prefix[7] == 'T' && buf_prefix[8] == 'T' && buf_prefix[9] == 'P' && buf_prefix[10] == '/' && buf_prefix[11] == '2' && buf_prefix[12] == '.' 
&& buf_prefix[13] == '0'){ 47 | return 1; 48 | } 49 | return 0; 50 | } 51 | 52 | 53 | static __always_inline 54 | int is_http2_frame(char *buf, __u64 size) { 55 | if (size < 9) { 56 | return 0; 57 | } 58 | 59 | // magic message is not a frame 60 | if (is_http2_magic_2(buf)){ 61 | return 1; 62 | } 63 | 64 | // try to parse frame 65 | 66 | // 3 bytes length 67 | // 1 byte type 68 | // 1 byte flags 69 | // 4 bytes stream id 70 | // 9 bytes total 71 | 72 | // #length bytes payload 73 | 74 | __u32 length; 75 | bpf_read_into_from(length,buf); 76 | length = bpf_htonl(length) >> 8; // slide off the last 8 bits 77 | 78 | __u8 type; 79 | bpf_read_into_from(type,buf+3); // 3 bytes in 80 | 81 | // frame types are 1 byte 82 | // 0x00 DATA 83 | // 0x01 HEADERS 84 | // 0x02 PRIORITY 85 | // 0x03 RST_STREAM 86 | // 0x04 SETTINGS 87 | // 0x05 PUSH_PROMISE 88 | // 0x06 PING 89 | // 0x07 GOAWAY 90 | // 0x08 WINDOW_UPDATE 91 | // 0x09 CONTINUATION 92 | 93 | // other frames can precede headers frames, so only check if its a valid frame type 94 | if (type > 0x09){ 95 | return 0; 96 | } 97 | 98 | __u32 stream_id; // 4 bytes 99 | bpf_read_into_from(stream_id,buf+5); 100 | stream_id = bpf_htonl(stream_id); 101 | 102 | // odd stream ids are client initiated 103 | // even stream ids are server initiated 104 | 105 | if (stream_id == 0) { // special stream for window updates, pings 106 | return 1; 107 | } 108 | 109 | // only track client initiated streams 110 | if (stream_id % 2 == 1) { 111 | return 1; 112 | } 113 | return 0; 114 | } -------------------------------------------------------------------------------- /ebpf/c/kafka.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | // https://kafka.apache.org/protocol.html 3 | 4 | // RequestOrResponse => Size (RequestMessage | ResponseMessage) 5 | // Size => int32 6 | 7 | 8 | // Request Header v0 => request_api_key request_api_version correlation_id 9 | // request_api_key => INT16 10 | // request_api_version => INT16 11 | // correlation_id => INT32 12 | // client_id => NULLABLE_STRING // added in v1 13 | 14 | // method will be decoded in user space 15 | #define METHOD_KAFKA_PRODUCE_REQUEST 1 16 | #define METHOD_KAFKA_FETCH_RESPONSE 2 17 | 18 | 19 | #define KAFKA_API_KEY_PRODUCE_API 0 20 | #define KAFKA_API_KEY_FETCH_API 1 21 | 22 | struct kafka_request_header { 23 | __s32 size; 24 | __s16 api_key; 25 | __s16 api_version; 26 | __s32 correlation_id; 27 | }; 28 | 29 | // Response Header v1 => correlation_id TAG_BUFFER 30 | // correlation_id => INT32 31 | 32 | struct kafka_response_header { 33 | __s32 size; 34 | __s32 correlation_id; 35 | }; 36 | 37 | static __always_inline 38 | int is_kafka_request_header(char *buf, __u64 buf_size, __s32 *request_id, __s16 *api_key, __s16 *api_version) { 39 | struct kafka_request_header h = {}; 40 | if (buf_size < sizeof(h)) { 41 | return 0; 42 | } 43 | 44 | if (bpf_probe_read(&h, sizeof(h), buf) < 0) { 45 | return 0; 46 | } 47 | 48 | h.size = bpf_htonl(h.size); 49 | 50 | // we parse only one message in one write syscall for now. 51 | // batch publish is not supported. 52 | if (h.size+4 != buf_size) { 53 | return 0; 54 | } 55 | 56 | h.api_key = bpf_htons(h.api_key); // determines message api, ProduceAPI, FetchAPI, etc. 57 | h.api_version = bpf_htons(h.api_version); // version of the API, v8, v9, etc. 
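
Every header field arrives big-endian, which is why each one is byte-swapped with bpf_htons/bpf_htonl around this point before the range checks that follow. The same heuristics can be written as a userspace Go sketch (illustrative; looksLikeKafkaRequest is a hypothetical helper mirroring the probe's validation, not the aggregator's actual decoder):

package main

import (
	"encoding/binary"
	"fmt"
)

// looksLikeKafkaRequest applies the probe's checks: the 4-byte size must
// cover the rest of the buffer exactly, the api_key must be in the
// published 0..74 range, and the correlation id must be positive.
func looksLikeKafkaRequest(buf []byte) (corrID int32, apiKey, apiVersion int16, ok bool) {
	if len(buf) < 12 {
		return 0, 0, 0, false
	}
	size := int32(binary.BigEndian.Uint32(buf[0:4]))
	if int(size)+4 != len(buf) {
		return 0, 0, 0, false
	}
	apiKey = int16(binary.BigEndian.Uint16(buf[4:6]))
	apiVersion = int16(binary.BigEndian.Uint16(buf[6:8]))
	corrID = int32(binary.BigEndian.Uint32(buf[8:12]))
	if corrID > 0 && apiKey >= 0 && apiKey <= 74 {
		return corrID, apiKey, apiVersion, true
	}
	return 0, 0, 0, false
}

func main() {
	// Produce (api_key 0) v9 request header, correlation id 7, empty body.
	buf := []byte{0, 0, 0, 8, 0, 0, 0, 9, 0, 0, 0, 7}
	id, key, ver, ok := looksLikeKafkaRequest(buf)
	fmt.Println(id, key, ver, ok) // 7 0 9 true
}
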
58 | h.correlation_id = bpf_htonl(h.correlation_id); 59 | if (h.correlation_id > 0 && (h.api_key >= 0 && h.api_key <= 74)) { // https://kafka.apache.org/protocol.html#protocol_api_keys 60 | *request_id = h.correlation_id; 61 | *api_key = h.api_key; 62 | *api_version = h.api_version; 63 | return 1; 64 | } 65 | return 0; 66 | } 67 | 68 | static __always_inline 69 | int is_kafka_response_header(char *buf, __s32 correlation_id) { 70 | struct kafka_response_header h = {}; 71 | if (bpf_probe_read(&h, sizeof(h), buf) < 0) { 72 | return 0; 73 | } 74 | // correlation_id match 75 | if (bpf_htonl(h.correlation_id) == correlation_id) { 76 | return 1; 77 | } 78 | return 0; 79 | } 80 | 81 | 82 | -------------------------------------------------------------------------------- /ebpf/c/loader.go: -------------------------------------------------------------------------------- 1 | package c 2 | -------------------------------------------------------------------------------- /ebpf/c/macros.h: -------------------------------------------------------------------------------- 1 | #define TASK_COMM_LEN 16 2 | #define MAX_FILENAME_LEN 127 3 | 4 | #define PROC_EXEC_EVENT 1 5 | #define PROC_EXIT_EVENT 2 6 | -------------------------------------------------------------------------------- /ebpf/c/map.h: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | 3 | // keeps open sockets 4 | // key: skaddr 5 | // value: sk_info 6 | // remove when connection is established or when socket is closed 7 | struct 8 | { 9 | __uint(type, BPF_MAP_TYPE_HASH); 10 | __uint(max_entries, 10240); 11 | __type(key, void *); 12 | __type(value, struct sk_info); 13 | } sock_map SEC(".maps"); 14 | 15 | 16 | // opening sockets, delete when connection is established or connection fails 17 | struct 18 | { 19 | __uint(type, BPF_MAP_TYPE_HASH); 20 | __uint(max_entries, 10240); 21 | __type(key, void *); 22 | __type(value, struct sk_info); 23 | } sock_map_temp SEC(".maps"); 24 | 25 | 26 | struct { 27 | __uint(type, BPF_MAP_TYPE_HASH); 28 | __type(key, __u32); 29 | __type(value, __u8); 30 | __uint(max_entries, 5000); 31 | } container_pids SEC(".maps"); 32 | 33 | // used for sending events to user space 34 | // EVENT_TCP_LISTEN, EVENT_TCP_LISTEN_CLOSED 35 | struct 36 | { 37 | __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); 38 | __uint(key_size, sizeof(int)); 39 | __uint(value_size, sizeof(int)); 40 | } tcp_listen_events SEC(".maps"); 41 | 42 | // used for sending events to user space 43 | // EVENT_TCP_ESTABLISHED, EVENT_TCP_CLOSED, EVENT_TCP_CONNECT_FAILED 44 | struct 45 | { 46 | __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); 47 | __uint(key_size, sizeof(int)); 48 | __uint(value_size, sizeof(int)); 49 | } tcp_connect_events SEC(".maps"); 50 | 51 | // keeps the pid and fd of the process that opened the socket 52 | struct 53 | { 54 | __uint(type, BPF_MAP_TYPE_HASH); 55 | __uint(key_size, sizeof(__u64)); 56 | __uint(value_size, sizeof(__u64)); 57 | __uint(max_entries, 10240); 58 | } fd_by_pid_tgid SEC(".maps"); 59 | -------------------------------------------------------------------------------- /ebpf/c/mongo.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | // https://www.mongodb.com/docs/manual/reference/mongodb-wire-protocol/ 3 | // Mongo Request Query 4 | // 4 bytes message len 5 | // 4 bytes request id 6 | // 4 bytes response to 7 | // 4 bytes opcode (2004 for Query) 8 | // 4 bytes query flags 9 | // fullCollectionName : ? 
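// (fullCollectionName is a null-terminated cstring of the form
// "db.collection", which is why no fixed byte length can be stated above)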
10 | // 4 bytes number to skip 11 | // 4 bytes number to return 12 | // 4 bytes Document Length 13 | // Elements 14 | 15 | // Extensible Message Format 16 | // 4 bytes len 17 | // 4 bytes request id 18 | // 4 bytes response to 19 | // 4 bytes opcode (2013 for extensible message format) 20 | // 4 bytes message flags 21 | // Section 22 | // 1 byte Kind (0 for body) 23 | // BodyDocument 24 | // 4 bytes document length 25 | // Elements 26 | // Section 27 | // Kind : Document Sequence (1) 28 | // SeqId: "documents" 29 | // DocumentSequence 30 | // Document 31 | // 4 bytes doc len 32 | 33 | // For response: 34 | // same as above 35 | 36 | #define MONGO_OP_COMPRESSED 2012 // Wraps other opcodes using compression 37 | #define MONGO_OP_MSG 2013 // Send a message using the standard format. Used for both client requests and database replies. 38 | 39 | // https://www.mongodb.com/docs/manual/reference/mongodb-wire-protocol/#standard-message-header 40 | struct mongo_header { 41 | __s32 length; // total message size, including this 42 | __s32 request_id; // identifier for this message 43 | __s32 response_to; // requestID from the original request (used in responses from the database) 44 | __s32 opcode; // message type 45 | }; 46 | 47 | struct mongo_header_wout_len { 48 | // __s32 length; // total message size, including this 49 | __s32 request_id; // identifier for this message 50 | __s32 response_to; // requestID from the original request (used in responses from the database) 51 | __s32 opcode; // message type 52 | }; 53 | 54 | static __always_inline 55 | int is_mongo_request(char *buf, __u64 buf_size) { 56 | struct mongo_header h = {}; 57 | if (bpf_probe_read(&h, sizeof(h), (void *)((char *)buf)) < 0) { 58 | return 0; 59 | } 60 | if (h.response_to == 0 && (h.opcode == MONGO_OP_MSG || h.opcode == MONGO_OP_COMPRESSED)) { 61 | bpf_printk("this is a mongo_request\n"); 62 | return 1; 63 | } 64 | return 0; 65 | } 66 | 67 | // mongo replies are read in 2 parts 68 | // [pid 286873] read(7, "\x2d\x00\x00\x00", 4) = 4 // these 4 bytes are length 69 | // [pid 286873] read(7, "\xe1\x0b\x00\x00 \x09\x00\x00\x00 \xdd\x07\x00\x00 // request_id - response_to - opcode 70 | // \x00\x00\x00\x00\x00\x18\x00\x00\x00\x10 71 | // \x6e\x00 72 | // \x01\x00\x00\x00\x01\x6f\x6b\x00"..., 41) = 41 73 | // (ok) 74 | static __always_inline 75 | int is_mongo_reply(char *buf, __u64 buf_size) { 76 | struct mongo_header_wout_len h = {}; 77 | if (bpf_probe_read(&h, sizeof(h), (void *)((char *)buf)) < 0) { 78 | bpf_printk("this is a mongo_reply_header_fail\n"); 79 | return 0; 80 | } 81 | if (h.response_to == 0) { 82 | bpf_printk("this is a mongo_reply_response_to0, - %d\n",h.opcode); 83 | return 0; 84 | } 85 | if (h.opcode == MONGO_OP_MSG || h.opcode == MONGO_OP_COMPRESSED) { 86 | bpf_printk("this is a mongo_reply\n"); 87 | return 1; 88 | } 89 | 90 | bpf_printk("this is a mongo_reply-fail - %d\n",h.opcode); 91 | return 0; 92 | } 93 | 94 | -------------------------------------------------------------------------------- /ebpf/c/mysql.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | // https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_command_phase.html 3 | 4 | // 01 00 00 00 01 5 | // ^^- command-byte 6 | // ^^---- sequence-id == 0 7 | 8 | // https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_com_query.html 9 | #define MYSQL_COM_QUERY 0x03 // Text Protocol 10 | 11 | // 
https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_command_phase_ps.html 12 | #define MYSQL_COM_STMT_PREPARE 0x16 // Creates a prepared statement for the passed query string. 13 | // The server returns a COM_STMT_PREPARE Response which contains a statement-id which is ised to identify the prepared statement. 14 | 15 | // https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_com_stmt_execute.html 16 | #define MYSQL_COM_STMT_EXECUTE 0x17 // COM_STMT_EXECUTE asks the server to execute a prepared statement as identified by statement_id. 17 | 18 | 19 | // https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_com_stmt_close.html 20 | #define MYSQL_COM_STMT_CLOSE 0x19 // COM_STMT_CLOSE deallocates a prepared statement. 21 | // No response packet is sent back to the client. 22 | 23 | 24 | #define MYSQL_RESPONSE_OK 0x00 25 | #define MYSQL_RESPONSE_EOF 0xfe 26 | #define MYSQL_RESPONSE_ERROR 0xff 27 | 28 | #define METHOD_UNKNOWN 0 29 | #define METHOD_MYSQL_TEXT_QUERY 1 30 | #define METHOD_MYSQL_PREPARE_STMT 2 31 | #define METHOD_MYSQL_EXEC_STMT 3 32 | #define METHOD_MYSQL_STMT_CLOSE 4 33 | 34 | 35 | #define MYSQL_STATUS_OK 1 36 | #define MYSQL_STATUS_FAILED 2 37 | 38 | static __always_inline 39 | int is_mysql_query(char *buf, __u64 buf_size, __u8 *request_type) { 40 | if (buf_size < 5) { 41 | return 0; 42 | } 43 | __u8 b[5]; // first 5 bytes, first 3 represents length, 4th is packet number, 5th is command type 44 | if (bpf_probe_read(&b, sizeof(b), (void *)((char *)buf)) < 0) { 45 | return 0; 46 | } 47 | int len = (int)b[0] | (int)b[1] << 8 | (int)b[2] << 16; 48 | // command byte is inside the packet 49 | if (len+4 != buf_size || b[3] != 0) { // packet number must be 0 50 | return 0; 51 | } 52 | 53 | if (b[4] == MYSQL_COM_QUERY || b[4] == MYSQL_COM_STMT_EXECUTE) { 54 | *request_type = b[4]; 55 | return 1; 56 | } 57 | 58 | if (b[4] == MYSQL_COM_STMT_CLOSE) { 59 | *request_type = MYSQL_COM_STMT_CLOSE; 60 | return 1; 61 | } 62 | 63 | if (b[4] == MYSQL_COM_STMT_PREPARE) { 64 | *request_type = MYSQL_COM_STMT_PREPARE; 65 | return 1; 66 | } 67 | return 0; 68 | } 69 | 70 | // __u32 *statement_id 71 | static __always_inline 72 | int is_mysql_response(char *buf, __u64 buf_size, __u8 request_type, __u32 *statement_id) { 73 | __u8 b[5]; // first 5 bytes, first 3 represents length, 4th is packet number, 5th is response code 74 | if (bpf_probe_read(&b, sizeof(b), (void *)((char *)buf)) < 0) { 75 | return 0; 76 | } 77 | if (b[3] <= 0) { // sequence must be > 0 78 | return 0; 79 | } 80 | int length = (int)b[0] | (int)b[1] << 8 | (int)b[2] << 16; 81 | 82 | if (length == 1 || b[4] == MYSQL_RESPONSE_EOF) { 83 | return MYSQL_STATUS_OK; 84 | } 85 | if (b[4] == MYSQL_RESPONSE_OK) { 86 | if (request_type == MYSQL_COM_STMT_PREPARE) { 87 | // 6-9th bytes returns statement id 88 | if (bpf_probe_read(statement_id, sizeof(*statement_id), (void *)((char *)buf+5)) < 0) { 89 | return 0; 90 | } 91 | } 92 | return MYSQL_STATUS_OK; 93 | } 94 | if (b[4] == MYSQL_RESPONSE_ERROR) { 95 | // *status = STATUS_FAILED; 96 | return MYSQL_STATUS_FAILED; 97 | } 98 | return 0; 99 | } -------------------------------------------------------------------------------- /ebpf/c/openssl.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | struct padding {}; 3 | typedef long (*padding_fn)(); 4 | 5 | 6 | //OpenSSL_1_0_2 7 | struct ssl_st_v1_0_2 { 8 | __s32 version; 9 | __s32 type; 10 | struct padding* method; // const SSL_METHOD *method; 11 | // ifndef OPENSSL_NO_BIO 
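// Note: only the leading fields up to rbio/wbio (and later the fd) are
// mirrored here; the rest of OpenSSL's real struct layout is irrelevant for
// extracting the fd and is hidden behind the opaque `padding` type above.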
12 | struct bio_st_v1* rbio; // used by SSL_read 13 | struct bio_st_v1* wbio; // used by SSL_write 14 | }; 15 | 16 | struct bio_st_v1_0_2 { 17 | struct padding* method; // BIO_METHOD *method; 18 | padding_fn callback; // long (*callback) (struct bio_st *, int, const char *, int, long, long); 19 | char* cb_arg; /* first argument for the callback */ 20 | int init; 21 | int shutdown; 22 | int flags; /* extra storage */ 23 | int retry_reason; 24 | int num; // fd 25 | }; 26 | 27 | 28 | //OpenSSL_1_1_1 29 | struct ssl_st_v1_1_1 { 30 | __s32 version; 31 | struct padding* method; // const SSL_METHOD *method; 32 | struct bio_st_v1_1_1* rbio; // used by SSL_read 33 | struct bio_st_v1_1_1* wbio; // used by SSL_write 34 | }; 35 | 36 | struct bio_st_v1_1_1 { 37 | struct padding* method; // const BIO_METHOD *method; 38 | padding_fn callback; // long (*callback) (struct bio_st *, int, const char *, int, long, long); 39 | padding_fn callback_ex; 40 | char* cb_arg; 41 | int init; 42 | int shutdown; 43 | int flags; 44 | int retry_reason; 45 | int num; // fd 46 | }; 47 | 48 | //openssl-3.0.0 49 | struct ssl_st_v3_0_0 { 50 | __s32 version; 51 | struct padding* method; // const SSL_METHOD *method; 52 | /* used by SSL_read */ 53 | struct bio_st_v3_0_0* rbio; 54 | /* used by SSL_write */ 55 | struct bio_st_v3_0_0* wbio; 56 | 57 | }; 58 | 59 | struct bio_st_v3_0 { 60 | struct padding* libctx; // OSSL_LIB_CTX *libctx; 61 | struct padding* method; // const BIO_METHOD *method; 62 | padding_fn callback; // BIO_callback_fn callback; 63 | padding_fn callback_ex; // BIO_callback_fn_ex callback_ex; 64 | char* cb_arg; 65 | int init; 66 | int shutdown; 67 | int flags; 68 | int retry_reason; 69 | int num; // fd 70 | // void *ptr; 71 | // struct bio_st *next_bio; /* used by filter BIOs */ 72 | // struct bio_st *prev_bio; /* used by filter BIOs */ 73 | // CRYPTO_REF_COUNT references; 74 | // uint64_t num_read; 75 | // uint64_t num_write; 76 | // CRYPTO_EX_DATA ex_data; 77 | // CRYPTO_RWLOCK *lock; 78 | }; 79 | 80 | // struct ssl_st { 81 | // /* 82 | // * protocol version (one of SSL2_VERSION, SSL3_VERSION, TLS1_VERSION, 83 | // * DTLS1_VERSION) 84 | // */ 85 | // int version; 86 | // /* SSLv3 */ 87 | // const SSL_METHOD *method; 88 | // /* 89 | // * There are 2 BIO's even though they are normally both the same. This 90 | // * is so data can be read and written to different handlers 91 | // */ 92 | // /* used by SSL_read */ 93 | // BIO *rbio; 94 | // /* used by SSL_write */ 95 | // BIO *wbio; 96 | // /* used during session-id reuse to concatenate messages */ 97 | // BIO *bbio; 98 | // /* 99 | // * This holds a variable that indicates what we were doing when a 0 or -1 100 | // * is returned. This is needed for non-blocking IO so we know what 101 | // * request needs re-doing when in SSL_accept or SSL_connect 102 | // */ 103 | // int rwstate; 104 | // int (*handshake_func) (SSL *); 105 | // /* 106 | // * Imagine that here's a boolean member "init" that is switched as soon 107 | // * as SSL_set_{accept/connect}_state is called for the first time, so 108 | // * that "state" and "handshake_func" are properly initialized. But as 109 | // * handshake_func is == 0 until then, we use this test instead of an 110 | // * "init" member. 111 | // */ 112 | // /* are we the server side? */ 113 | // int server; 114 | // /* 115 | // * Generate a new session or reuse an old one. 
116 | // * NB: For servers, the 'new' session may actually be a previously 117 | // * cached session or even the previous session unless 118 | // * SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION is set 119 | // */ 120 | // int new_session; 121 | // /* don't send shutdown packets */ 122 | // int quiet_shutdown; 123 | // /* we have shut things down, 0x01 sent, 0x02 for received */ 124 | // int shutdown; 125 | 126 | // ... 127 | // ... 128 | 129 | // } -------------------------------------------------------------------------------- /ebpf/c/postgres.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | // https://www.postgresql.org/docs/current/protocol.html 3 | // PostgreSQL uses a message-based protocol for communication between frontends and backends (clients and servers). 4 | // The protocol is supported over TCP/IP and also over Unix-domain sockets 5 | 6 | // In order to serve multiple clients efficiently, the server launches a new “backend” process for each client. 7 | 8 | // All communication is through a stream of messages. 9 | // The first byte of a message identifies the message type, and the next four bytes specify the length of the rest of the message (this length count includes itself, but not the message-type byte) 10 | // 1 byte of message type + 4 bytes of length + payload 11 | 12 | // In the extended-query protocol, execution of SQL commands is divided into multiple steps. 13 | // The state retained between steps is represented by two types of objects: prepared statements and portals 14 | 15 | // A prepared statement represents the result of parsing and semantic analysis of a textual query string. A prepared statement is not in itself ready to execute, because it might lack specific values for parameters. 16 | // A portal represents a ready-to-execute or already-partially-executed statement, with any missing parameter values filled in 17 | 18 | // 1) parse step, which creates a prepared statement from a textual query string 19 | // 2) bind step, which creates a portal (a prepared statement with parameter values filled in) from a prepared statement 20 | // 3) execute step, which executes a portal's query 21 | // In case of query returns rows, execute step maybe repeated multiple times 22 | 23 | // As of PostgreSQL 7.4 the only supported formats are “text” and “binary”. Clients can specify a format code. 24 | // binary representations for complex data types might change across server versions; the text format is usually the more portable choice 25 | 26 | // state of the connection: start-up, query, function call, COPY, and termination 27 | 28 | // The ReadyForQuery message is the same one that the backend will issue after each command cycle is completed. 
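// Example framing (illustrative): the simple query "SELECT 1;" travels as
//   'Q' | int32 length = 14 | "SELECT 1;\0"
// where the length counts its own 4 bytes plus the NUL-terminated query
// string, but not the leading 'Q' type byte.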
29 | 30 | // A simple query cycle is initiated by the frontend sending a Query message to the backend 31 | // The message includes an SQL command (or commands) expressed as a text string 32 | // The backend then sends one or more response messages depending on the contents of the query command string, and finally a ReadyForQuery response message 33 | // CommandComplete 34 | // RowDescription 35 | // DataRow 36 | // EmptyQueryResponse 37 | // ErrorResponse 38 | 39 | // SELECT - EXPLAIN - SHOW 40 | // RowDescription, zero or more DataRow messages, and then CommandComplete 41 | 42 | // a query string could contain several queries (separated by semicolons) 43 | 44 | // Simple and Extended Query Modes 45 | 46 | // In simple Query mode, the format of retrieved values is always text, except when the given command is a FETCH from a cursor declared with the BINARY option 47 | // multi-statement Query message in an implicit transaction block 48 | 49 | // In the extended protocol, the frontend first sends a Parse message, which contains a textual query string, 50 | // optionally some information about data types of parameter placeholders, and the name of a destination prepared-statement object 51 | // The response is either ParseComplete or ErrorResponse 52 | 53 | // The query string contained in a Parse message cannot include more than one SQL statement; if it does, the backend will throw an error 54 | // This restriction does not exist in the simple-query protocol, but it does exist in the extended protocol, because allowing prepared statements or portals to contain multiple commands would complicate the protocol unduly. 55 | 56 | // If successfully created, a named prepared-statement object lasts till the end of the current session, unless explicitly destroyed 57 | 58 | // unnamed statement 59 | 60 | // Named prepared statements must be explicitly closed before they can be redefined by another Parse message 61 | // Parse - Bind - Execute 62 | 63 | // The simple Query message is approximately equivalent to the series Parse, Bind, portal Describe, Execute, Close, Sync, 64 | // using the unnamed prepared statement and portal objects and no parameters. 65 | // One difference is that it will accept multiple SQL statements in the query string, automatically performing the bind/describe/execute sequence for each one in succession. 66 | // Another difference is that it will not return ParseComplete, BindComplete, CloseComplete, or NoData messages. 
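// Wire-level sketch of one extended-protocol round trip (message type bytes
// are per the spec; the particular exchange shown is illustrative):
//   client: Parse('P') -> Bind('B') -> Execute('E') -> Sync('S')
//   server: ParseComplete('1') -> BindComplete('2') -> DataRow('D')...
//           -> CommandComplete('C') -> ReadyForQuery('Z')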
67 | 68 | // Q(1 byte), length(4 bytes), query(length-4 bytes) 69 | #define POSTGRES_MESSAGE_SIMPLE_QUERY 'Q' // 'Q' + 4 bytes of length + query 70 | 71 | // C(1 byte), length(4 bytes), Byte1('S' to close a prepared statement; or 'P' to close a portal), name of the prepared statement or portal(length-5 bytes) 72 | #define POSTGRES_MESSAGE_CLOSE 'C' 73 | 74 | // X(1 byte), length(4 bytes) 75 | #define POSTGRES_MESSAGE_TERMINATE 'X' 76 | 77 | // C(1 byte), length(4 bytes), tag(length-4 bytes) 78 | #define POSTGRES_MESSAGE_COMMAND_COMPLETION 'C' 79 | 80 | // prepared statement 81 | #define POSTGRES_MESSAGE_PARSE 'P' // 'P' + 4 bytes of length + query 82 | #define POSTGRES_MESSAGE_BIND 'B' // 'B' + 4 bytes of length + portal/statement names + parameters 83 | 84 | 85 | #define METHOD_UNKNOWN 0 86 | #define METHOD_STATEMENT_CLOSE_OR_CONN_TERMINATE 1 87 | #define METHOD_SIMPLE_QUERY 2 88 | #define METHOD_EXTENDED_QUERY 3 89 | 90 | #define COMMAND_COMPLETE 1 91 | #define ERROR_RESPONSE 2 92 | // #define ROW_DESCRIPTION 4 93 | // #define DATA_ROW 5 94 | // #define EMPTY_QUERY_RESPONSE 7 95 | // #define NO_DATA 8 96 | // #define PORTAL_SUSPENDED 9 97 | // #define PARAMETER_STATUS 10 98 | // #define BACKEND_KEY_DATA 11 99 | // #define READY_FOR_QUERY 12 100 | 101 | // should be used on the client side 102 | // checks if the message is a postgresql Q, C, X message 103 | static __always_inline 104 | int parse_client_postgres_data(char *buf, int buf_size, __u8 *request_type) { 105 | if (buf_size < 1) { 106 | return 0; 107 | } 108 | char identifier; 109 | __u32 len; 110 | if (bpf_probe_read(&identifier, sizeof(identifier), (void *)((char *)buf)) < 0) { 111 | return 0; 112 | } 113 | 114 | if (bpf_probe_read(&len, sizeof(len), (void *)((char *)buf+1)) < 0) { 115 | return 0; 116 | } 117 | len = bpf_htonl(len); 118 | 119 | if (identifier == POSTGRES_MESSAGE_TERMINATE && len == 4) { 120 | *request_type = identifier; 121 | return 1; 122 | } 123 | 124 | // long queries can be split into multiple packets 125 | // therefore the specified length can exceed the buf_size 126 | // normally (len + 1 byte of identifier == buf_size) should be true 127 | 128 | // Simple Query Protocol 129 | if (identifier == POSTGRES_MESSAGE_SIMPLE_QUERY) { 130 | *request_type = identifier; 131 | return 1; 132 | } 133 | 134 | // Extended Query Protocol (Prepared Statement) 135 | // >P/D/S (Parse/Describe/Sync) creating a prepared statement 136 | // >B/E/S (Bind/Execute/Sync) executing a prepared statement 137 | if (identifier == POSTGRES_MESSAGE_PARSE || identifier == POSTGRES_MESSAGE_BIND) { 138 | // For fine-grained parsing, check for the trailing Sync message; HTTP/2 has a similar message starting with 'P' (PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n) 139 | // read last 5 bytes of the buffer 140 | char sync[5]; 141 | if (bpf_probe_read(&sync, sizeof(sync), (void *)((char *)buf+buf_size-5)) < 0) { 142 | return 0; 143 | } 144 | if (sync[0] == 'S' && sync[1] == 0 && sync[2] == 0 && sync[3] == 0 && sync[4] == 4) { 145 | *request_type = identifier; 146 | return 1; 147 | } 148 | } 149 | 150 | return 0; 151 | } 152 | 153 | static __always_inline 154 | __u32 parse_postgres_server_resp(char *buf, int buf_size) { 155 | char identifier; 156 | int len; 157 | if (bpf_probe_read(&identifier, sizeof(identifier), (void *)((char *)buf)) < 0) { 158 | return 0; 159 | } 160 | if (bpf_probe_read(&len, sizeof(len), (void *)((char *)buf+1)) < 0) { 161 | return 0; 162 | } 163 | len = bpf_htonl(len); 164 | 165 | if (len+1 > buf_size) { 166 | return 0; 167 | } 168 | 169 | // TODO: write a state machine to parse the response 170 | 
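// Such a state machine would walk the buffer message by message, e.g.
// (sketch only; a real version needs a verifier-friendly bounded loop):
//   __u32 off = 0;
//   while (off + 5 <= buf_size) {
//       /* read type byte at off, 4-byte big-endian len at off+1 */
//       /* remember whether a 'C' (CommandComplete) or 'E' (ErrorResponse) was seen */
//       off += 1 + len;
//   }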
171 | // '1' : ParseComplete 172 | // '2' : BindComplete 173 | // '3' : CloseComplete 174 | // 'T' : RowDescription 175 | // 'D' : DataRow 176 | // 'C' : CommandComplete 177 | // 'E' : ErrorResponse 178 | // 'I' : EmptyQueryResponse 179 | // 'N' : NoData 180 | // 'S' : PortalSuspended 181 | // 's' : ParameterStatus 182 | // 'K' : BackendKeyData 183 | // 'Z' : ReadyForQuery 184 | 185 | 186 | 187 | // if ((cmd == '1' || cmd == '2') && length == 4 && buf_size >= 10) { 188 | // if (bpf_probe_read(&cmd, sizeof(cmd), (void *)((char *)buf+5)) < 0) { 189 | // return 0; 190 | // } 191 | // if (bpf_probe_read(&length, sizeof(length), (void *)((char *)buf+5+1)) < 0) { 192 | // return 0; 193 | // } 194 | // } 195 | 196 | if (identifier == 'E') { 197 | return ERROR_RESPONSE; 198 | } 199 | 200 | // TODO: multiple pg messages can be in one packet, need to parse all of them and check if any of them is a command complete 201 | // assume C came if you see a T or D 202 | // when parsed C, it will have sql command in it (tag field, e.g. SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, etc.) 203 | if (identifier == 't' || identifier == 'T' || identifier == 'D' || identifier == 'C') { 204 | return COMMAND_COMPLETE; 205 | } 206 | 207 | return 0; 208 | } 209 | -------------------------------------------------------------------------------- /ebpf/c/proc.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | 3 | struct p_event{ 4 | __u32 pid; 5 | __u8 type; 6 | }; 7 | 8 | struct { 9 | __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); 10 | __type(key, __u32); 11 | __type(value, struct p_event); 12 | __uint(max_entries, 1); 13 | } proc_event_heap SEC(".maps"); 14 | 15 | struct { 16 | __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); 17 | __uint(key_size, sizeof(int)); 18 | __uint(value_size, sizeof(int)); 19 | } proc_events SEC(".maps"); 20 | 21 | SEC("tracepoint/sched/sched_process_exec") 22 | int sched_process_exec(struct trace_event_raw_sched_process_exec* ctx) 23 | { 24 | __u32 pid, tid; 25 | __u64 id = 0; 26 | 27 | /* get PID and TID of exiting thread/process */ 28 | id = bpf_get_current_pid_tgid(); 29 | pid = id >> 32; 30 | tid = (__u32)id; 31 | 32 | /* ignore thread exec */ 33 | if (pid != tid) 34 | return 0; 35 | 36 | int zero = 0; 37 | struct p_event *e = bpf_map_lookup_elem(&proc_event_heap, &zero); 38 | if (!e) { 39 | return 0; 40 | } 41 | 42 | e->pid = pid; 43 | e->type = PROC_EXEC_EVENT; 44 | 45 | bpf_perf_event_output(ctx, &proc_events, BPF_F_CURRENT_CPU, e, sizeof(*e)); 46 | return 0; 47 | } 48 | 49 | 50 | SEC("tracepoint/sched/sched_process_exit") 51 | int sched_process_exit(struct trace_event_raw_sched_process_exit* ctx) 52 | { 53 | __u32 pid, tid; 54 | __u64 id = 0; 55 | 56 | /* get PID and TID of exiting thread/process */ 57 | id = bpf_get_current_pid_tgid(); 58 | pid = id >> 32; 59 | tid = (__u32)id; 60 | 61 | #ifdef FILTER_OUT_NON_CONTAINER 62 | // try to remove pid from container_pids map(it may not exist, but it's ok) 63 | // since we add pids on sched_process_fork regardless of being process or thread 64 | // try to remove both pid and tid 65 | if (pid == tid){ // if it's a process, remove pid 66 | // process exiting 67 | bpf_map_delete_elem(&container_pids, &pid); 68 | }else{ 69 | // thread exiting 70 | bpf_map_delete_elem(&container_pids, &tid); 71 | } 72 | #endif 73 | 74 | /* ignore thread exits */ 75 | if (pid != tid) 76 | return 0; 77 | 78 | int zero = 0; 79 | struct p_event *e = bpf_map_lookup_elem(&proc_event_heap, &zero); 80 | if (!e) { 81 | return 
0; 82 | } 83 | 84 | e->pid = pid; 85 | e->type = PROC_EXIT_EVENT; 86 | bpf_perf_event_output(ctx, &proc_events, BPF_F_CURRENT_CPU, e, sizeof(*e)); 87 | return 0; 88 | } 89 | 90 | SEC("tracepoint/sched/sched_process_fork") 91 | int sched_process_fork(struct trace_event_raw_sched_process_fork* ctx) 92 | { 93 | #ifdef FILTER_OUT_NON_CONTAINER 94 | // check parent pid is in container 95 | // (ctx->pid can be a thread too, linux kernel treats threads and processes in the same way) 96 | // there is a spectrum between threads and processes in terms of sharing resources via flags. 97 | __u32 pid = ctx->pid; 98 | __u32 child_pid =ctx->child_pid; 99 | 100 | __u8 *is_container_pid = bpf_map_lookup_elem(&container_pids, &pid); 101 | if (!is_container_pid) 102 | return 0; 103 | 104 | unsigned char func_name[] = "sched_process_fork"; 105 | __u8 exists = 1; 106 | // write child_pid to container_pids map 107 | long res = bpf_map_update_elem(&container_pids, &child_pid, &exists, BPF_ANY); 108 | if (res < 0){ 109 | unsigned char log_msg[] = "failed forked task -- pid|child_pid|res"; 110 | log_to_userspace(ctx, DEBUG, func_name, log_msg, ctx->pid,ctx->child_pid, res); 111 | } 112 | #endif 113 | 114 | return 0; 115 | } -------------------------------------------------------------------------------- /ebpf/c/redis.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | // Redis serialization protocol (RESP) specification 3 | // https://redis.io/docs/reference/protocol-spec/ 4 | 5 | // A client sends the Redis server an array consisting of only bulk strings. 6 | // A Redis server replies to clients, sending any valid RESP data type as a reply. 7 | 8 | 9 | #define STATUS_SUCCESS 1 10 | #define STATUS_ERROR 2 11 | #define STATUS_UNKNOWN 3 12 | 13 | #define METHOD_REDIS_COMMAND 1 14 | #define METHOD_REDIS_PUSHED_EVENT 2 15 | #define METHOD_REDIS_PING 3 16 | 17 | 18 | static __always_inline 19 | int is_redis_ping(char *buf, __u64 buf_size) { 20 | // *1\r\n$4\r\nping\r\n 21 | if (buf_size < 14) { 22 | return 0; 23 | } 24 | char b[14]; 25 | if (bpf_probe_read(&b, sizeof(b), (void *)((char *)buf)) < 0) { 26 | return 0; 27 | } 28 | 29 | if (b[0] != '*' || b[1] != '1' || b[2] != '\r' || b[3] != '\n' || b[4] != '$' || b[5] != '4' || b[6] != '\r' || b[7] != '\n') { 30 | return 0; 31 | } 32 | 33 | if (b[8] != 'p' || b[9] != 'i' || b[10] != 'n' || b[11] != 'g' || b[12] != '\r' || b[13] != '\n') { 34 | return 0; 35 | } 36 | 37 | return STATUS_SUCCESS; 38 | } 39 | 40 | static __always_inline 41 | int is_redis_pong(char *buf, __u64 buf_size) { 42 | // *2\r\n$4\r\npong\r\n$0\r\n\r\n 43 | if (buf_size < 14) { 44 | return 0; 45 | } 46 | char b[14]; 47 | if (bpf_probe_read(&b, sizeof(b), (void *)((char *)buf)) < 0) { 48 | return 0; 49 | } 50 | 51 | if (b[0] != '*' || b[1] < '0' || b[1] > '9' || b[2] != '\r' || b[3] != '\n' || b[4] != '$' || b[5] != '4' || b[6] != '\r' || b[7] != '\n') { 52 | return 0; 53 | } 54 | 55 | if (b[8] != 'p' || b[9] != 'o' || b[10] != 'n' || b[11] != 'g' || b[12] != '\r' || b[13] != '\n') { 56 | return 0; 57 | } 58 | 59 | return STATUS_SUCCESS; 60 | } 61 | 62 | static __always_inline 63 | int is_redis_command(char *buf, __u64 buf_size) { 64 | //*3\r\n$7\r\nmessage\r\n$10\r\nmy_channel\r\n$13\r\nHello, World!\r\n 65 | if (buf_size < 11) { 66 | return 0; 67 | } 68 | char b[11]; 69 | if (bpf_probe_read(&b, sizeof(b), (void *)((char *)buf)) < 0) { 70 | return 0; 71 | } 72 | // Clients send commands to the Redis server as RESP arrays 73 | // * is the array 
prefix 74 | // the digit that follows '*' is the number of elements in the array 75 | // check if it is a RESP array 76 | if (b[0] != '*' || b[1] < '0' || b[1] > '9') { 77 | return 0; 78 | } 79 | // Check if command is not "message", message command is used for pub/sub by server to notify subscribers. 80 | // CRLF(\r\n) is the separator in RESP protocol 81 | if (b[2] == '\r' && b[3] == '\n') { 82 | if (b[4]=='$' && b[5] == '7' && b[6] == '\r' && b[7] == '\n' && b[8] == 'm' && b[9] == 'e' && b[10] == 's'){ 83 | return 0; 84 | } 85 | return 1; 86 | } 87 | 88 | // Array length can exceed 9, so check if the second byte is a digit 89 | if (b[2] >= '0' && b[2] <= '9' && b[3] == '\r' && b[4] == '\n') { 90 | if (b[5]=='$' && b[6] == '7' && b[7] == '\r' && b[8] == '\n' && b[9] == 'm' && b[10] == 'e'){ 91 | return 0; 92 | } 93 | return 1; 94 | } 95 | 96 | 97 | return 0; 98 | } 99 | 100 | static __always_inline 101 | __u32 is_redis_pushed_event(char *buf, __u64 buf_size){ 102 | char b[17]; 103 | if (buf_size < 17) { 104 | return 0; 105 | } 106 | if (bpf_probe_read(&b, sizeof(b), (void *)((char *)buf)) < 0) { 107 | return 0; 108 | } 109 | 110 | //*3\r\n$7\r\nmessage\r\n$10\r\nmy_channel\r\n$13\r\nHello, World!\r\n 111 | // message received from the Redis server 112 | 113 | // In RESP3 protocol, the first byte of the pushed event is '>' 114 | // whereas in RESP2 protocol, the first byte is '*' 115 | if ((b[0] != '>' && b[0] != '*') || b[1] < '0' || b[1] > '9') { 116 | return 0; 117 | } 118 | 119 | // CRLF(\r\n) is the separator in RESP protocol 120 | if (b[2] == '\r' && b[3] == '\n') { 121 | if (b[4]=='$' && b[5] == '7' && b[6] == '\r' && b[7] == '\n' && b[8] == 'm' && b[9] == 'e' && b[10] == 's' && b[11] == 's' && b[12] == 'a' && b[13] == 'g' && b[14] == 'e' && b[15] == '\r' && b[16] == '\n'){ 122 | return 1; 123 | }else{ 124 | return 0; 125 | } 126 | } 127 | 128 | // TODO: long messages ? 
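// Handling those would mean skipping a variable-width element count (e.g.
// "*12\r\n") before matching "$7\r\nmessage" - sketch (illustrative), as in
// the commented-out check below:
//   int i = 1;
//   while (i < 4 && b[i] >= '0' && b[i] <= '9') { i++; } // consume count digits
//   // then expect b[i] == '\r' && b[i+1] == '\n', followed by "$7\r\nmessage"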
129 | // // Array length can exceed 9, so check if the second byte is a digit 130 | // if (b[2] >= '0' && b[2] <= '9' && b[3] == '\r' && b[4] == '\n') { 131 | // return 1; 132 | // } 133 | 134 | return 0; 135 | } 136 | 137 | static __always_inline 138 | __u32 parse_redis_response(char *buf, __u64 buf_size) { 139 | char type; 140 | if (bpf_probe_read(&type, sizeof(type), (void *)((char *)buf)) < 0) { 141 | return STATUS_UNKNOWN; 142 | } 143 | char end[2]; // must end with \r\n 144 | 145 | if (bpf_probe_read(&end, sizeof(end), (void *)((char *)buf+buf_size-2)) < 0) { 146 | return 0; 147 | } 148 | 149 | if (end[0] != '\r' || end[1] != '\n') { 150 | return STATUS_UNKNOWN; 151 | } 152 | 153 | // Accepted since RESP2 154 | // Array | Integer | Bulk String | Simple String 155 | if (type == '*' || type == ':' || type == '$' || type == '+' 156 | ) { 157 | return STATUS_SUCCESS; 158 | } 159 | 160 | // https://redis.io/docs/latest/develop/reference/protocol-spec/#simple-errors 161 | // Accepted since RESP2 162 | // Error 163 | if (type == '-') { 164 | return STATUS_ERROR; 165 | } 166 | 167 | // Accepted since RESP3 168 | // Null | Boolean | Double | Big Numbers | Verbatim String | Maps | Set 169 | if (type == '_' || type == '#' || type == ',' || type =='(' || type == '=' || type == '%' || type == '~') { 170 | return STATUS_SUCCESS; 171 | } 172 | 173 | 174 | // Accepted since RESP3 175 | // Bulk Errors 176 | if (type == '!') { 177 | return STATUS_ERROR; 178 | } 179 | 180 | return STATUS_UNKNOWN; 181 | } 182 | -------------------------------------------------------------------------------- /ebpf/c/struct.h: -------------------------------------------------------------------------------- 1 | 2 | //go:build ignore 3 | struct tcp_event 4 | { 5 | __u64 fd; 6 | __u64 timestamp; 7 | __u32 type; 8 | __u32 pid; 9 | __u16 sport; 10 | __u16 dport; 11 | __u8 saddr[16]; 12 | __u8 daddr[16]; 13 | }; 14 | 15 | // pid and fd of socket 16 | struct sk_info 17 | { 18 | __u64 fd; 19 | __u32 pid; 20 | }; 21 | 22 | struct trace_event_raw_sched_process_exit { 23 | __u64 unused; 24 | char comm[TASK_COMM_LEN]; 25 | __u32 pid; 26 | }; 27 | 28 | struct trace_event_raw_sched_process_exec { 29 | __u64 unused; 30 | __u32 filename_unused; 31 | __u32 pid; 32 | }; 33 | 34 | struct trace_event_raw_sched_process_fork { 35 | __u64 unused; 36 | char parent_comm[TASK_COMM_LEN]; 37 | __u32 pid; 38 | char child_comm[TASK_COMM_LEN]; 39 | __u32 child_pid; 40 | }; -------------------------------------------------------------------------------- /ebpf/c/tcp.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | SEC("tracepoint/sock/inet_sock_set_state") 3 | int inet_sock_set_state(void *ctx) 4 | { 5 | // unsigned char func_name[] = "inet_sock_set_state"; 6 | __u64 timestamp = bpf_ktime_get_ns(); 7 | struct trace_event_raw_inet_sock_set_state args = {}; 8 | if (bpf_core_read(&args, sizeof(args), ctx) < 0) 9 | { 10 | return 0; 11 | } 12 | 13 | // if not tcp protocol, ignore 14 | if (BPF_CORE_READ(&args, protocol) != IPPROTO_TCP) 15 | { 16 | return 0; 17 | } 18 | 19 | // get pid 20 | __u64 id = bpf_get_current_pid_tgid(); 21 | __u32 pid = id >> 32; 22 | 23 | const void *skaddr; 24 | 25 | // if state transition is from BPF_TCP_CLOSE to BPF_TCP_SYN_SENT, 26 | // a new connection attempt 27 | 28 | int oldstate; 29 | int newstate; 30 | 31 | oldstate = BPF_CORE_READ(&args, oldstate); 32 | newstate = BPF_CORE_READ(&args, newstate); 33 | skaddr = BPF_CORE_READ(&args, skaddr); 34 | 35 | if (oldstate == 
BPF_TCP_CLOSE && newstate == BPF_TCP_SYN_SENT) 36 | { 37 | __u64 *fdp = bpf_map_lookup_elem(&fd_by_pid_tgid, &id); 38 | 39 | if (!fdp) 40 | { 41 | return 0; 42 | } 43 | 44 | struct sk_info i = {}; 45 | i.pid = pid; 46 | i.fd = *fdp; // file descriptor pointer 47 | 48 | // remove from fd_by_pid_tgid map, we are going to keep fdp 49 | bpf_map_delete_elem(&fd_by_pid_tgid, &id); 50 | bpf_map_update_elem(&sock_map_temp, &skaddr, &i, BPF_ANY); 51 | return 0; 52 | } 53 | 54 | __u64 fd = 0; 55 | __u32 type = 0; 56 | 57 | void *map = &tcp_connect_events; 58 | if (oldstate == BPF_TCP_SYN_SENT) 59 | { 60 | struct sk_info *i = bpf_map_lookup_elem(&sock_map_temp, &skaddr); 61 | if (!i) 62 | { 63 | return 0; 64 | } 65 | if (newstate == BPF_TCP_ESTABLISHED) 66 | { 67 | type = EVENT_TCP_ESTABLISHED; 68 | pid = i->pid; 69 | fd = i->fd; 70 | bpf_map_delete_elem(&sock_map_temp, &skaddr); 71 | 72 | // add to sock_map 73 | struct sk_info si = {}; 74 | si.pid = i->pid; 75 | si.fd = i->fd; 76 | bpf_map_update_elem(&sock_map, &skaddr, &si, BPF_ANY); 77 | } 78 | else if (newstate == BPF_TCP_CLOSE) 79 | { 80 | type = EVENT_TCP_CONNECT_FAILED; 81 | pid = i->pid; 82 | fd = i->fd; 83 | bpf_map_delete_elem(&sock_map_temp, &skaddr); 84 | } 85 | } 86 | 87 | if (oldstate == BPF_TCP_ESTABLISHED && 88 | (newstate == BPF_TCP_FIN_WAIT1 || newstate == BPF_TCP_CLOSE_WAIT)) 89 | { 90 | type = EVENT_TCP_CLOSED; 91 | 92 | struct sk_info *i = bpf_map_lookup_elem(&sock_map, &skaddr); 93 | if (!i) 94 | { 95 | return 0; 96 | } 97 | pid = i->pid; 98 | fd = i->fd; 99 | 100 | // delete from sock_map 101 | bpf_map_delete_elem(&sock_map, &skaddr); 102 | } 103 | if (oldstate == BPF_TCP_CLOSE && newstate == BPF_TCP_LISTEN) 104 | { 105 | type = EVENT_TCP_LISTEN; 106 | map = &tcp_listen_events; 107 | } 108 | if (oldstate == BPF_TCP_LISTEN && newstate == BPF_TCP_CLOSE) 109 | { 110 | type = EVENT_TCP_LISTEN_CLOSED; 111 | map = &tcp_listen_events; 112 | } 113 | 114 | if (type == 0) 115 | { 116 | return 0; 117 | } 118 | 119 | struct tcp_event e = {}; 120 | e.type = type; 121 | e.timestamp = timestamp; 122 | e.pid = pid; 123 | e.sport = BPF_CORE_READ(&args, sport); 124 | e.dport = BPF_CORE_READ(&args, dport); 125 | e.fd = fd; 126 | 127 | __builtin_memcpy(&e.saddr, &args.saddr, sizeof(e.saddr)); 128 | __builtin_memcpy(&e.daddr, &args.daddr, sizeof(e.saddr)); 129 | 130 | __u8 *val = bpf_map_lookup_elem(&container_pids, &e.pid); 131 | if (!val) 132 | { 133 | return 0; // not a container process, ignore 134 | } 135 | 136 | bpf_perf_event_output(ctx, map, BPF_F_CURRENT_CPU, &e, sizeof(e)); 137 | return 0; 138 | } 139 | 140 | // triggered before entering connect syscall 141 | SEC("tracepoint/syscalls/sys_enter_connect") 142 | int sys_enter_connect(void *ctx) 143 | { 144 | __u64 id = bpf_get_current_pid_tgid(); 145 | __u32 pid = id >> 32; 146 | 147 | __u8 *val = bpf_map_lookup_elem(&container_pids, &pid); 148 | if (!val) 149 | { 150 | return 0; // not a container process, ignore 151 | } 152 | 153 | struct trace_event_sys_enter_connect args = {}; 154 | if (bpf_core_read(&args, sizeof(args), ctx) < 0) 155 | { 156 | return 0; 157 | } 158 | bpf_map_update_elem(&fd_by_pid_tgid, &id, &args.fd, BPF_ANY); 159 | return 0; 160 | } 161 | 162 | SEC("tracepoint/syscalls/sys_exit_connect") 163 | int sys_exit_connect(void *ctx) 164 | { 165 | __u64 id = bpf_get_current_pid_tgid(); 166 | // __u32 pid = id >> 32; 167 | 168 | bpf_map_delete_elem(&fd_by_pid_tgid, &id); 169 | 170 | // __u8 *val = bpf_map_lookup_elem(&container_pids, &pid); 171 | // if (!val) 172 | // { 173 | 
// return 0; // not a container process, ignore 174 | // } 175 | 176 | return 0; 177 | } 178 | -------------------------------------------------------------------------------- /ebpf/c/tcp_sock.c: -------------------------------------------------------------------------------- 1 | //go:build ignore 2 | #define INGRESS 0 3 | #define EGRESS 1 4 | 5 | struct call_event { 6 | __u32 pid; 7 | __u32 tid; 8 | __u64 tx; // timestamp 9 | __u8 type; // INGRESS or EGRESS 10 | __u32 seq; // tcp sequence number 11 | }; 12 | 13 | struct { 14 | __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); 15 | __type(key, __u32); 16 | __type(value, struct call_event); 17 | __uint(max_entries, 1); 18 | } ingress_egress_heap SEC(".maps"); 19 | 20 | struct { 21 | __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); 22 | __uint(key_size, sizeof(int)); 23 | __uint(value_size, sizeof(int)); 24 | } ingress_egress_calls SEC(".maps"); 25 | 26 | struct file { 27 | void *private_data; 28 | }; 29 | 30 | struct fdtable { 31 | unsigned int max_fds; 32 | struct file **fd; /* current fd array, struct file * */ 33 | }; 34 | 35 | typedef struct { 36 | int counter; 37 | } atomic_t; 38 | 39 | struct files_struct { 40 | atomic_t count; // atomic_t count; 41 | struct fdtable *fdt; 42 | }; 43 | 44 | struct task_struct { 45 | __u32 pid; // equals to POSIX tid 46 | __u32 tgid; // equals to POSIX pid 47 | struct files_struct *files; 48 | }; 49 | 50 | struct socket { 51 | short int type; 52 | long unsigned int flags; 53 | struct file *file; 54 | struct sock *sk; 55 | const struct proto_ops *ops; 56 | }; 57 | struct tcp_sock { 58 | __u32 write_seq; 59 | __u32 copied_seq; 60 | }; 61 | 62 | typedef __u32 __bitwise __portpair; 63 | typedef __u64 __bitwise __addrpair; 64 | 65 | struct sock_common { 66 | union { 67 | __addrpair skc_addrpair; 68 | struct { 69 | __be32 skc_daddr; 70 | __be32 skc_rcv_saddr; 71 | }; 72 | }; 73 | union { 74 | unsigned int skc_hash; 75 | __u16 skc_u16hashes[2]; 76 | }; 77 | /* skc_dport && skc_num must be grouped as well */ 78 | union { 79 | __portpair skc_portpair; 80 | struct { 81 | __be16 skc_dport; 82 | __u16 skc_num; 83 | }; 84 | }; 85 | 86 | }; 87 | 88 | struct sock { 89 | struct sock_common __sk_common; 90 | // __be32 sk_rcv_saddr; // not inet_saddr 91 | // __be16 sk_num; // local port, not inet_sport 92 | // __be32 sk_daddr; 93 | // __be16 sk_dport; 94 | #define sk_rcv_saddr __sk_common.skc_rcv_saddr 95 | #define sk_daddr __sk_common.skc_daddr 96 | #define sk_num __sk_common.skc_num 97 | #define sk_dport __sk_common.skc_dport 98 | }; 99 | 100 | static __always_inline 101 | struct sock * get_sock(__u32 fd_num) { 102 | struct task_struct *task = (struct task_struct *)bpf_get_current_task(); 103 | struct file **fdarray = NULL; 104 | fdarray = BPF_CORE_READ(task, files, fdt, fd); 105 | 106 | if(fdarray == NULL){ 107 | return NULL; 108 | }else{ 109 | struct file *file = NULL; 110 | long r = bpf_probe_read_kernel(&file, sizeof(file), fdarray + fd_num); 111 | if(r <0){ 112 | return NULL; 113 | } 114 | 115 | void * private_data = NULL; 116 | private_data = BPF_CORE_READ(file, private_data); 117 | if(private_data){ 118 | struct socket *socket = private_data; 119 | short int socket_type = BPF_CORE_READ(socket,type); 120 | 121 | void * __file = BPF_CORE_READ(socket,file); 122 | 123 | if(socket_type == SOCK_STREAM && file == __file){ 124 | struct sock *sk = NULL; 125 | sk = BPF_CORE_READ(socket,sk); 126 | 127 | return sk; 128 | } 129 | return NULL; 130 | } 131 | return NULL; 132 | } 133 | return NULL; 134 | } 135 | 136 | static __always_inline 
137 | struct tcp_sock * get_tcp_sock(__u32 fd_num){ 138 | struct task_struct *task = (struct task_struct *)bpf_get_current_task(); 139 | // __u32 pid = BPF_CORE_READ(task, pid); 140 | // __u32 tgid = BPF_CORE_READ(task, tgid); 141 | 142 | // atomic_t count = BPF_CORE_READ(task, files, count); 143 | 144 | struct file **fdarray = NULL; 145 | fdarray = BPF_CORE_READ(task, files, fdt, fd); 146 | 147 | if(fdarray == NULL){ 148 | return 0; 149 | }else{ 150 | struct file *file = NULL; 151 | long r = bpf_probe_read_kernel(&file, sizeof(file), fdarray + fd_num); 152 | if(r <0){ 153 | return 0; 154 | } 155 | 156 | void * private_data = NULL; 157 | private_data = BPF_CORE_READ(file, private_data); 158 | 159 | if(private_data){ 160 | struct socket *socket = private_data; 161 | short int socket_type = BPF_CORE_READ(socket,type); 162 | 163 | void * __file = BPF_CORE_READ(socket,file); 164 | 165 | if(socket_type == SOCK_STREAM && file == __file ){ 166 | struct sock *sk = NULL; 167 | sk = BPF_CORE_READ(socket,sk); 168 | 169 | if(sk != NULL){ 170 | struct tcp_sock * __tcp_sock = (struct tcp_sock *)sk; 171 | return __tcp_sock; 172 | } 173 | } 174 | } 175 | } 176 | return NULL; 177 | } 178 | 179 | 180 | static __always_inline 181 | __u32 get_tcp_write_seq_from_fd(__u32 fd_num){ 182 | struct tcp_sock * __tcp_sock = (struct tcp_sock *) get_tcp_sock(fd_num); 183 | if(__tcp_sock == NULL){ 184 | return 0; 185 | } 186 | 187 | __u32 tcp_seq = 0; 188 | tcp_seq = BPF_CORE_READ(__tcp_sock,write_seq); 189 | return tcp_seq; 190 | } 191 | 192 | 193 | static __always_inline 194 | __u32 get_tcp_copied_seq_from_fd(__u32 fd_num){ 195 | struct tcp_sock * __tcp_sock = (struct tcp_sock *) get_tcp_sock(fd_num); 196 | if(__tcp_sock == NULL){ 197 | return 0; 198 | } 199 | 200 | __u32 tcp_seq = 0; 201 | tcp_seq = BPF_CORE_READ(__tcp_sock,copied_seq); 202 | 203 | return tcp_seq; 204 | } 205 | 206 | static __always_inline 207 | __u64 process_for_dist_trace_write(void* ctx, __u64 fd){ 208 | // unsigned char func_name[] = "process_for_dist_trace_write"; 209 | __u32 pid, tid; 210 | __u64 id = 0; 211 | 212 | /* get PID and TID of exiting thread/process */ 213 | id = bpf_get_current_pid_tgid(); 214 | pid = id >> 32; 215 | tid = (__u32)id; 216 | 217 | __u32 seq = get_tcp_write_seq_from_fd(fd); 218 | if(seq == 0){ 219 | return 0; 220 | } 221 | 222 | int zero = 0; 223 | struct call_event *e = bpf_map_lookup_elem(&ingress_egress_heap, &zero); 224 | if (!e) { 225 | return 0; 226 | } 227 | 228 | e->pid = pid; 229 | e->tid = tid; 230 | e->seq = seq; 231 | e->tx = bpf_ktime_get_ns(); 232 | e->type = EGRESS; 233 | 234 | __u8 *val = bpf_map_lookup_elem(&container_pids, &(e->pid)); 235 | if (!val) 236 | { 237 | // unsigned char log_msg[] = "filter out traffic egress event -- pid|fd|psize"; 238 | // log_to_userspace(ctx, DEBUG, func_name, log_msg, e->pid, 0, 0); 239 | return 0; // not a container process, ignore 240 | } 241 | bpf_perf_event_output(ctx, &ingress_egress_calls, BPF_F_CURRENT_CPU, e, sizeof(*e)); 242 | 243 | return seq; 244 | } 245 | 246 | static __always_inline 247 | void process_for_dist_trace_read(void* ctx, __u32 fd){ 248 | // unsigned char func_name[] = "process_for_dist_trace_read"; 249 | __u32 pid, tid; 250 | __u64 id = 0; 251 | 252 | /* get PID and TID of exiting thread/process */ 253 | id = bpf_get_current_pid_tgid(); 254 | pid = id >> 32; 255 | tid = (__u32)id; 256 | 257 | __u32 seq = get_tcp_copied_seq_from_fd(fd); 258 | if(seq == 0){ 259 | return; 260 | } 261 | 262 | int zero = 0; 263 | struct call_event *e = 
bpf_map_lookup_elem(&ingress_egress_heap, &zero); 264 | if (!e) { 265 | return; 266 | } 267 | 268 | e->pid = pid; 269 | e->tid = tid; 270 | e->seq = seq; 271 | e->tx = bpf_ktime_get_ns(); 272 | e->type = INGRESS; 273 | 274 | __u8 *val = bpf_map_lookup_elem(&container_pids, &(e->pid)); 275 | if (!val) 276 | { 277 | // unsigned char log_msg[] = "filter out traffic ingress event -- pid|fd|psize"; 278 | // log_to_userspace(ctx, DEBUG, func_name, log_msg, e->pid, 0, 0); 279 | return; // not a container process, ignore 280 | } 281 | bpf_perf_event_output(ctx, &ingress_egress_calls, BPF_F_CURRENT_CPU, e, sizeof(*e)); 282 | } 283 | -------------------------------------------------------------------------------- /ebpf/headers/common.h: -------------------------------------------------------------------------------- 1 | // This is a compact version of `vmlinux.h` to be used in the examples using C code. 2 | 3 | #pragma once 4 | 5 | typedef unsigned char __u8; 6 | typedef short int __s16; 7 | typedef short unsigned int __u16; 8 | typedef int __s32; 9 | typedef unsigned int __u32; 10 | typedef long long int __s64; 11 | typedef long long unsigned int __u64; 12 | typedef __u8 u8; 13 | typedef __s16 s16; 14 | typedef __u16 u16; 15 | typedef __s32 s32; 16 | typedef __u32 u32; 17 | typedef __s64 s64; 18 | typedef __u64 u64; 19 | typedef __u16 __le16; 20 | typedef __u16 __be16; 21 | typedef __u32 __be32; 22 | typedef __u64 __be64; 23 | typedef __u32 __wsum; 24 | 25 | 26 | /* 27 | * Helper macro to place programs, maps, license in 28 | * different sections in elf_bpf file. Section names 29 | * are interpreted by libbpf depending on the context (BPF programs, BPF maps, 30 | * extern variables, etc). 31 | * To allow use of SEC() with externs (e.g., for extern .maps declarations), 32 | * make sure __attribute__((unused)) doesn't trigger compilation warning. 
33 | */ 34 | #define SEC(name) \ 35 | _Pragma("GCC diagnostic push") \ 36 | _Pragma("GCC diagnostic ignored \"-Wignored-attributes\"") \ 37 | __attribute__((section(name), used)) \ 38 | _Pragma("GCC diagnostic pop") \ 39 | 40 | 41 | #define __uint(name, val) int (*name)[val] 42 | #define __type(name, val) typeof(val) *name 43 | #define __array(name, val) typeof(val) *name[] 44 | 45 | typedef signed char __s8; 46 | 47 | typedef unsigned char __u8; 48 | 49 | typedef short int __s16; 50 | 51 | typedef short unsigned int __u16; 52 | 53 | typedef int __s32; 54 | 55 | typedef unsigned int __u32; 56 | 57 | typedef long long int __s64; 58 | 59 | typedef long long unsigned int __u64; 60 | 61 | typedef __s8 s8; 62 | 63 | typedef __u8 u8; 64 | 65 | typedef __s16 s16; 66 | 67 | typedef __u16 u16; 68 | 69 | typedef __s32 s32; 70 | 71 | typedef __u32 u32; 72 | 73 | typedef __s64 s64; 74 | 75 | typedef __u64 u64; 76 | 77 | enum { 78 | false = 0, 79 | true = 1, 80 | }; 81 | 82 | enum { 83 | IPPROTO_IP = 0, 84 | IPPROTO_ICMP = 1, 85 | IPPROTO_IGMP = 2, 86 | IPPROTO_IPIP = 4, 87 | IPPROTO_TCP = 6, 88 | IPPROTO_EGP = 8, 89 | IPPROTO_PUP = 12, 90 | IPPROTO_UDP = 17, 91 | IPPROTO_IDP = 22, 92 | IPPROTO_TP = 29, 93 | IPPROTO_DCCP = 33, 94 | IPPROTO_IPV6 = 41, 95 | IPPROTO_RSVP = 46, 96 | IPPROTO_GRE = 47, 97 | IPPROTO_ESP = 50, 98 | IPPROTO_AH = 51, 99 | IPPROTO_MTP = 92, 100 | IPPROTO_BEETPH = 94, 101 | IPPROTO_ENCAP = 98, 102 | IPPROTO_PIM = 103, 103 | IPPROTO_COMP = 108, 104 | IPPROTO_SCTP = 132, 105 | IPPROTO_UDPLITE = 136, 106 | IPPROTO_MPLS = 137, 107 | IPPROTO_ETHERNET = 143, 108 | IPPROTO_RAW = 255, 109 | IPPROTO_MPTCP = 262, 110 | IPPROTO_MAX = 263, 111 | }; 112 | 113 | #define bpf_read_into_from(dst, src) \ 114 | ({ \ 115 | if (bpf_probe_read(&dst, sizeof(dst), src) < 0) { \ 116 | return 0; \ 117 | } \ 118 | }) 119 | 120 | #ifndef __VMLINUX_H__ 121 | #define __VMLINUX_H__ 122 | 123 | // #if defined(__TARGET_ARCH_x86) 124 | // #define bpf_target_x86 125 | // #define bpf_target_defined 126 | // #elif defined(__TARGET_ARCH_arm64) 127 | // #define bpf_target_arm64 128 | // #define bpf_target_defined 129 | // #else 130 | // #undef bpf_target_defined 131 | // #endif 132 | #endif /* __VMLINUX_H__ */ 133 | 134 | 135 | 136 | -------------------------------------------------------------------------------- /ebpf/headers/l7_req.h: -------------------------------------------------------------------------------- 1 | // struct trace_entry { 2 | // short unsigned int type; 3 | // unsigned char flags; 4 | // unsigned char preempt_count; 5 | // int pid; 6 | // }; 7 | 8 | struct trace_event_raw_sys_exit { 9 | struct trace_entry ent; 10 | long int id; 11 | long int ret; 12 | char __data[0]; 13 | }; 14 | 15 | 16 | struct trace_event_raw_sys_enter_read{ 17 | struct trace_entry ent; 18 | int __syscall_nr; 19 | unsigned long int fd; 20 | char * buf; 21 | __u64 count; 22 | }; 23 | 24 | struct trace_event_raw_sys_enter_recvfrom { 25 | struct trace_entry ent; 26 | __s32 __syscall_nr; 27 | __u64 fd; 28 | void * ubuf; 29 | __u64 size; 30 | __u64 flags; 31 | struct sockaddr * addr; 32 | __u64 addr_len; 33 | }; 34 | 35 | struct trace_event_raw_sys_exit_read { 36 | __u64 unused; 37 | __s32 id; 38 | __s64 ret; 39 | }; 40 | 41 | struct trace_event_raw_sys_exit_recvfrom { 42 | __u64 unused; 43 | __s32 id; 44 | __s64 ret; 45 | }; 46 | 47 | struct trace_event_raw_sys_exit_write { 48 | __u64 unused; 49 | __s32 id; 50 | __s64 ret; 51 | }; 52 | 53 | struct trace_event_raw_sys_exit_sendto { 54 | __u64 unused; 55 | __s32 id; 56 | __s64 ret; 
57 | }; 58 | 59 | 60 | struct trace_event_raw_sys_exit_writev { 61 | __u64 unused; 62 | __s32 id; 63 | __s64 ret; 64 | }; 65 | struct trace_event_raw_sys_enter_write { 66 | struct trace_entry ent; 67 | __s32 __syscall_nr; 68 | __u64 fd; 69 | char * buf; 70 | __u64 count; 71 | }; 72 | 73 | struct trace_event_raw_sys_enter_writev { 74 | struct trace_entry ent; 75 | __s32 __syscall_nr; 76 | __u64 fd; 77 | struct iovec * vec; // struct iovec * 78 | __u64 vlen; 79 | }; 80 | 81 | // TODO: remove unused fields ? 82 | struct trace_event_raw_sys_enter_sendto { 83 | struct trace_entry ent; 84 | __s32 __syscall_nr; 85 | __u64 fd; 86 | void * buff; 87 | __u64 len; // size_t ?? 88 | __u64 flags; 89 | struct sockaddr * addr; 90 | __u64 addr_len; 91 | }; 92 | 93 | 94 | struct user_msghdr { 95 | void *msg_name; 96 | int msg_namelen; 97 | struct iovec *msg_iov; 98 | __kernel_size_t msg_iovlen; 99 | void *msg_control; 100 | __kernel_size_t msg_controllen; 101 | unsigned int msg_flags; 102 | }; 103 | 104 | struct trace_event_raw_sys_enter_sendmsg { 105 | struct trace_entry ent; 106 | __s32 __syscall_nr; 107 | __u64 fd; 108 | struct user_msghdr * msg; 109 | __u64 flags; 110 | }; 111 | 112 | 113 | 114 | -------------------------------------------------------------------------------- /ebpf/headers/log.h: -------------------------------------------------------------------------------- 1 | // Log levels 2 | #define DEBUG 0 3 | #define INFO 1 4 | #define WARN 2 5 | #define ERROR 3 6 | 7 | #define LOG_MSG_SIZE 100 8 | 9 | // bpf_trace_printk() 10 | // %s, %d, and %c work 11 | // %pi6 for ipv6 address 12 | // $pks for kernel strings 13 | // can accept only up to 3 input arguments (bpf helpers can accept up to 5 in total) 14 | 15 | struct log_message { 16 | __u32 level; 17 | // specify the what are the arguments in log message 18 | // Args:[type, type, type] -- log message 19 | unsigned char log_msg[LOG_MSG_SIZE]; 20 | unsigned char func_name[LOG_MSG_SIZE]; 21 | __u32 pid; 22 | __u64 arg1; 23 | __u64 arg2; 24 | __u64 arg3; 25 | }; 26 | 27 | struct { 28 | __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); 29 | __uint(key_size, sizeof(int)); 30 | __uint(value_size, sizeof(int)); 31 | __uint(max_entries, 10240); 32 | } log_map SEC(".maps"); 33 | 34 | struct { 35 | __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); 36 | __type(key, __u32); 37 | __type(value, struct log_message); 38 | __uint(max_entries, 1); 39 | } log_heap SEC(".maps"); 40 | 41 | // use while development 42 | // struct log_message l = {}; 43 | // l.level = DEBUG; 44 | // BPF_SNPRINTF(l.payload, sizeof(l.payload),"process_enter_of_syscalls_write_sendto %d %s\n", 1, "cakir"); 45 | // log_to_trace_pipe(l.payload, sizeof(l.payload)); 46 | static __always_inline 47 | void log_to_trace_pipe(char *msg, __u32 size) { 48 | long res = bpf_trace_printk(msg, size); 49 | if(res < 0){ 50 | bpf_printk("bpf_trace_printk failed %d\n", res); 51 | } 52 | } 53 | 54 | static __always_inline 55 | void log_to_userspace(void *ctx, __u32 level, unsigned char *func_name, unsigned char * log_msg, __u64 arg1, __u64 arg2, __u64 arg3){ 56 | int zero = 0; 57 | struct log_message *l = bpf_map_lookup_elem(&log_heap, &zero); 58 | if (!l) { 59 | bpf_printk("log_to_userspace failed, %s %s\n",func_name, log_msg); 60 | return; 61 | } 62 | 63 | l->level = level; 64 | l->pid = bpf_get_current_pid_tgid() >> 32; 65 | l->arg1 = arg1; 66 | l->arg2 = arg2; 67 | l->arg3 = arg3; 68 | bpf_probe_read_str(&l->func_name, sizeof(l->func_name), func_name); 69 | bpf_probe_read_str(&l->log_msg, sizeof(l->log_msg), log_msg); 
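// emit the filled struct to user space via the log_map perf buffer; the
// numeric args are decoded there according to the "Args:[...]" convention
// documented above (a descriptive note, not present in the original source)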
70 | 71 | bpf_perf_event_output(ctx, &log_map, BPF_F_CURRENT_CPU, l, sizeof(*l)); 72 | } 73 | 74 | 75 | -------------------------------------------------------------------------------- /ebpf/headers/pt_regs.h: -------------------------------------------------------------------------------- 1 | #if defined(bpf_target_x86) 2 | struct pt_regs { 3 | long unsigned int r15; 4 | long unsigned int r14; 5 | long unsigned int r13; 6 | long unsigned int r12; 7 | long unsigned int bp; 8 | long unsigned int bx; 9 | long unsigned int r11; 10 | long unsigned int r10; 11 | long unsigned int r9; 12 | long unsigned int r8; 13 | long unsigned int ax; 14 | long unsigned int cx; 15 | long unsigned int dx; 16 | long unsigned int si; 17 | long unsigned int di; 18 | long unsigned int orig_ax; 19 | long unsigned int ip; 20 | long unsigned int cs; 21 | long unsigned int flags; 22 | long unsigned int sp; 23 | long unsigned int ss; 24 | }; 25 | #endif /* bpf_target_x86 */ 26 | 27 | 28 | #if defined(bpf_target_arm64) 29 | struct user_pt_regs { 30 | __u64 regs[31]; 31 | __u64 sp; 32 | __u64 pc; 33 | __u64 pstate; 34 | }; 35 | 36 | struct pt_regs { 37 | union { 38 | struct user_pt_regs user_regs; 39 | struct { 40 | u64 regs[31]; 41 | u64 sp; 42 | u64 pc; 43 | u64 pstate; 44 | }; 45 | }; 46 | u64 orig_x0; 47 | s32 syscallno; 48 | u32 unused2; 49 | u64 sdei_ttbr1; 50 | u64 pmr_save; 51 | u64 stackframe[2]; 52 | u64 lockdep_hardirqs; 53 | u64 exit_rcu; 54 | }; 55 | #endif /* bpf_target_arm64 */ 56 | -------------------------------------------------------------------------------- /ebpf/headers/tcp.h: -------------------------------------------------------------------------------- 1 | struct trace_entry { 2 | short unsigned int type; 3 | unsigned char flags; 4 | unsigned char preempt_count; 5 | int pid; 6 | }; 7 | struct trace_event_raw_inet_sock_set_state { 8 | struct trace_entry ent; 9 | const void *skaddr; 10 | int oldstate; 11 | int newstate; 12 | __u16 sport; 13 | __u16 dport; 14 | __u16 family; 15 | __u16 protocol; 16 | __u8 saddr[4]; 17 | __u8 daddr[4]; 18 | __u8 saddr_v6[16]; 19 | __u8 daddr_v6[16]; 20 | char __data[0]; 21 | }; 22 | 23 | typedef unsigned short int sa_family_t; 24 | 25 | struct sockaddrv 26 | { 27 | sa_family_t sa_family; 28 | char sa_data[14]; 29 | }; 30 | 31 | struct trace_event_sys_enter_connect 32 | { 33 | struct trace_entry ent; 34 | int __syscall_nr; 35 | long unsigned int fd; 36 | struct sockaddrv *uservaddr; 37 | long unsigned int addrlen; 38 | }; 39 | 40 | 41 | #define EVENT_TCP_ESTABLISHED 1 42 | #define EVENT_TCP_CONNECT_FAILED 2 43 | #define EVENT_TCP_LISTEN 3 44 | #define EVENT_TCP_LISTEN_CLOSED 4 45 | #define EVENT_TCP_CLOSED 5 46 | -------------------------------------------------------------------------------- /ebpf/proc/proc.go: -------------------------------------------------------------------------------- 1 | package proc 2 | 3 | import ( 4 | "context" 5 | "os" 6 | "unsafe" 7 | 8 | "github.com/ddosify/alaz/ebpf/c" 9 | "github.com/ddosify/alaz/log" 10 | 11 | "github.com/cilium/ebpf" 12 | "github.com/cilium/ebpf/link" 13 | "github.com/cilium/ebpf/perf" 14 | ) 15 | 16 | const ( 17 | BPF_EVENT_PROC_EXEC = iota + 1 18 | BPF_EVENT_PROC_EXIT 19 | ) 20 | 21 | const ( 22 | EVENT_PROC_EXEC = "EVENT_PROC_EXEC" 23 | EVENT_PROC_EXIT = "EVENT_PROC_EXIT" 24 | ) 25 | 26 | // Custom type for the enumeration 27 | type ProcEventConversion uint32 28 | 29 | // String representation of the enumeration values 30 | func (e ProcEventConversion) String() string { 31 | switch e { 32 | case 
BPF_EVENT_PROC_EXEC: 33 | return EVENT_PROC_EXEC 34 | case BPF_EVENT_PROC_EXIT: 35 | return EVENT_PROC_EXIT 36 | default: 37 | return "Unknown" 38 | } 39 | } 40 | 41 | type PEvent struct { 42 | Pid uint32 43 | Type_ uint8 44 | _ [3]byte 45 | } 46 | 47 | type ProcEvent struct { 48 | Pid uint32 49 | Type_ string 50 | } 51 | 52 | const PROC_EVENT = "proc_event" 53 | 54 | func (e ProcEvent) Type() string { 55 | return PROC_EVENT 56 | } 57 | 58 | type ProcProgConfig struct { 59 | ProcEventsMapSize uint32 // specified in terms of os page size 60 | } 61 | 62 | var defaultConfig *ProcProgConfig = &ProcProgConfig{ 63 | ProcEventsMapSize: 16, 64 | } 65 | 66 | type ProcProg struct { 67 | // links represent a program attached to a hook 68 | links map[string]link.Link // key : hook name 69 | ProcEvents *perf.Reader 70 | ProcEventsMapSize uint32 71 | ContainerPidMap *ebpf.Map // for filtering non-container pids on the node 72 | } 73 | 74 | func InitProcProg(conf *ProcProgConfig) *ProcProg { 75 | if conf == nil { 76 | conf = defaultConfig 77 | } 78 | 79 | return &ProcProg{ 80 | links: map[string]link.Link{}, 81 | ProcEventsMapSize: conf.ProcEventsMapSize, 82 | } 83 | } 84 | 85 | func (pp *ProcProg) Close() { 86 | for hookName, link := range pp.links { 87 | log.Logger.Info().Msgf("detach %s", hookName) 88 | link.Close() 89 | } 90 | } 91 | 92 | func (pp *ProcProg) Attach() { 93 | l, err := link.Tracepoint("sched", "sched_process_exit", c.BpfObjs.SchedProcessExit, nil) 94 | if err != nil { 95 | log.Logger.Fatal().Err(err).Msg("link sched_process_exit tracepoint") 96 | } 97 | pp.links["sched/sched_process_exit"] = l 98 | 99 | l1, err := link.Tracepoint("sched", "sched_process_exec", c.BpfObjs.SchedProcessExec, nil) 100 | if err != nil { 101 | log.Logger.Fatal().Err(err).Msg("link sched_process_exec tracepoint") 102 | } 103 | pp.links["sched/sched_process_exec"] = l1 104 | 105 | l2, err := link.Tracepoint("sched", "sched_process_fork", c.BpfObjs.SchedProcessFork, nil) 106 | if err != nil { 107 | log.Logger.Fatal().Err(err).Msg("link sched_process_fork tracepoint") 108 | } 109 | pp.links["sched/sched_process_fork"] = l2 110 | } 111 | 112 | func (pp *ProcProg) InitMaps() { 113 | var err error 114 | pp.ProcEvents, err = perf.NewReader(c.BpfObjs.ProcEvents, int(pp.ProcEventsMapSize)*os.Getpagesize()) 115 | if err != nil { 116 | log.Logger.Fatal().Err(err).Msg("error creating perf reader") 117 | } 118 | 119 | // Initialize the pid filter map from user space and populate 120 | // the map with the pids of the container processes 121 | } 122 | 123 | func (pp *ProcProg) Consume(ctx context.Context, ch chan interface{}) { 124 | for { 125 | read := func() { 126 | record, err := pp.ProcEvents.Read() 127 | if err != nil { 128 | log.Logger.Warn().Err(err).Msg("error reading from proc events map") 129 | } 130 | 131 | if record.RawSample == nil || len(record.RawSample) == 0 { 132 | log.Logger.Debug().Msgf("read sample proc-event nil or empty") 133 | return 134 | } 135 | 136 | bpfEvent := (*PEvent)(unsafe.Pointer(&record.RawSample[0])) 137 | 138 | go func() { 139 | ch <- &ProcEvent{ 140 | Pid: bpfEvent.Pid, 141 | Type_: ProcEventConversion(bpfEvent.Type_).String(), 142 | } 143 | }() 144 | } 145 | 146 | select { 147 | case <-ctx.Done(): 148 | return 149 | default: 150 | read() 151 | } 152 | } 153 | } 154 | -------------------------------------------------------------------------------- /ebpf/ssllib.go: -------------------------------------------------------------------------------- 1 | package ebpf 2 | 3 | import ( 4 | "fmt" 5 | "regexp" 6 | 
"strings" 7 | ) 8 | 9 | var libSSLRegex string = `.*libssl(?P\d)*-*.*\.so\.*(?P[0-9\.]+)*.*` 10 | var re *regexp.Regexp 11 | 12 | func init() { 13 | re = regexp.MustCompile(libSSLRegex) 14 | } 15 | 16 | type sslLib struct { 17 | path string 18 | version string 19 | } 20 | 21 | func parseSSLlib(text string) (map[string]*sslLib, error) { 22 | res := make(map[string]*sslLib) 23 | matches := re.FindAllStringSubmatch(text, -1) 24 | 25 | if matches == nil { 26 | return nil, fmt.Errorf("no ssl lib found") 27 | } 28 | 29 | for _, groups := range matches { 30 | match := groups[0] 31 | 32 | paramsMap := make(map[string]string) 33 | for i, name := range re.SubexpNames() { 34 | if i > 0 && i <= len(groups) { 35 | paramsMap[name] = groups[i] 36 | } 37 | } 38 | 39 | // paramsMap 40 | // k : AdjacentVersion or SuffixVersion 41 | // v : 1.0.2 or 3 ... 42 | 43 | var version string 44 | if paramsMap["AdjacentVersion"] != "" { 45 | version = paramsMap["AdjacentVersion"] 46 | } else if paramsMap["SuffixVersion"] != "" { 47 | version = paramsMap["SuffixVersion"] 48 | } else { 49 | continue 50 | } 51 | 52 | // add "v." prefix 53 | if version != "" { 54 | version = "v" + version 55 | } 56 | 57 | path := getPath(match) 58 | res[path] = &sslLib{ 59 | path: path, 60 | version: version, 61 | } 62 | } 63 | 64 | return res, nil 65 | } 66 | 67 | func getPath(mappingLine string) string { 68 | mappingLine = strings.TrimSpace(mappingLine) 69 | elems := strings.Split(mappingLine, " ") 70 | 71 | // edge case 72 | // /usr/lib64/libssl.so.1.0.2k (deleted) 73 | 74 | path := elems[len(elems)-1] 75 | 76 | if strings.Contains(path, "(deleted)") { 77 | path = elems[len(elems)-2] 78 | } 79 | 80 | return path 81 | } 82 | -------------------------------------------------------------------------------- /ebpf/ssllib_test.go: -------------------------------------------------------------------------------- 1 | package ebpf 2 | 3 | import "testing" 4 | 5 | func TestParseSSLLibRemoveDuplicates(t *testing.T) { 6 | // expected one lib, libssl3.so (remoeved duplicates) 7 | text := `7f96bb1cf000-7f96bb22b000 r-xp 00000000 103:01 2202604 /usr/lib64/libssl3.so 8 | 7f96bb22b000-7f96bb42a000 ---p 0005c000 103:01 2202604 /usr/lib64/libssl3.so 9 | 7f96bb42a000-7f96bb42e000 r--p 0005b000 103:01 2202604 /usr/lib64/libssl3.so 10 | 7f96bb42e000-7f96bb42f000 rw-p 0005f000 103:01 2202604 /usr/lib64/libssl3.so` 11 | 12 | libs, err := parseSSLlib(text) 13 | if err != nil { 14 | t.Fatal(err) 15 | } 16 | 17 | if len(libs) != 1 { 18 | t.Fatalf("expected 1 lib, got %d", len(libs)) 19 | } 20 | 21 | lib := libs["/usr/lib64/libssl3.so"] 22 | if lib.version != "3" { 23 | t.Fatalf("expected version 3, got %s", lib.version) 24 | } 25 | } 26 | 27 | func TestParseSSLLib(t *testing.T) { 28 | text := `/usr/lib/x86_64-linux-gnu/libssl.so.1.1 29 | /usr/lib64/libssl3.so 30 | /usr/lib64/libssl.so.1.0.2k 31 | /lib/libssl.so.1.1 32 | /usr/lib/x86_64-linux-gnu/libssl.so.1.1 33 | /lib/libssl.so.1.1 34 | /lib/libssl.so.3 35 | /usr/local/lib/python3.9/site-packages/psycopg2_binary.libs/libssl-0331cfe8.so.1.1 36 | /usr/lib/x86_64-linux-gnu/libssl.so.1.1 37 | /usr/lib64/libssl3.so 38 | /usr/lib64/libssl.so.1.0.2k (deleted) 39 | /usr/lib/x86_64-linux-gnu/libssl.so.1.1 40 | /usr/lib64/libssl3.so 41 | /usr/lib64/libssl.so.1.0.2k (deleted) 42 | /usr/lib64/libssl3.so 43 | /usr/lib64/libssl.so.1.0.2k 44 | /usr/local/lib/python3.9/site-packages/psycopg2_binary.libs/libssl-0331cfe8.so.1.1 45 | /usr/lib/x86_64-linux-gnu/libssl.so.1.1 46 | /usr/lib64/libssl3.so 47 | 
/usr/lib64/libssl.so.1.0.2k 48 | /usr/lib64/libssl3.so 49 | /usr/lib64/libssl.so.1.0.2k (deleted) 50 | /usr/lib64/libssl3.so 51 | /usr/lib64/libssl.so.1.0.2k 52 | /usr/lib64/libssl3.so 53 | /usr/lib64/libssl.so.1.0.2k 54 | /usr/lib64/libssl3.so 55 | /usr/lib64/libssl.so.1.0.2k (deleted) 56 | /usr/lib64/libssl.so.1.0.2k 57 | /usr/lib/x86_64-linux-gnu/libssl.so.1.1 58 | /usr/lib64/libssl3.so 59 | /usr/lib64/libssl.so.1.0.2k 60 | /usr/lib/x86_64-linux-gnu/libssl.so.3 61 | /usr/lib64/libssl3.so 62 | /usr/lib64/libssl.so.1.0.2k 63 | ` 64 | 65 | libs, err := parseSSLlib(text) 66 | 67 | if err != nil { 68 | t.Fatal(err) 69 | } 70 | 71 | // /usr/lib/x86_64-linux-gnu/libssl.so.1.1 72 | // /usr/lib64/libssl3.so 73 | // /usr/lib64/libssl.so.1.0.2k 74 | // /lib/libssl.so.1.1 75 | // /lib/libssl.so.3 76 | // /usr/local/lib/python3.9/site-packages/psycopg2_binary.libs/libssl-0331cfe8.so.1.1 77 | // /usr/lib/x86_64-linux-gnu/libssl.so.3 78 | 79 | if len(libs) != 7 { 80 | t.Fatalf("expected 7 libs, got %d", len(libs)) 81 | } 82 | 83 | lib := libs["/usr/lib/x86_64-linux-gnu/libssl.so.1.1"] 84 | if lib.version != "1.1" { 85 | t.Fatalf("expected version 1.1, got %s", lib.version) 86 | } 87 | 88 | lib = libs["/usr/lib64/libssl3.so"] 89 | if lib.version != "3" { 90 | t.Fatalf("expected version 3, got %s", lib.version) 91 | } 92 | 93 | lib = libs["/usr/lib64/libssl.so.1.0.2k"] 94 | if lib.version != "1.0.2" { 95 | t.Fatalf("expected version 1.0.2, got %s", lib.version) 96 | } 97 | 98 | lib = libs["/lib/libssl.so.1.1"] 99 | if lib.version != "1.1" { 100 | t.Fatalf("expected version 1.1, got %s", lib.version) 101 | } 102 | 103 | lib = libs["/lib/libssl.so.3"] 104 | if lib.version != "3" { 105 | t.Fatalf("expected version 3, got %s", lib.version) 106 | } 107 | 108 | lib = libs["/usr/local/lib/python3.9/site-packages/psycopg2_binary.libs/libssl-0331cfe8.so.1.1"] 109 | if lib.version != "1.1" { 110 | t.Fatalf("expected version 1.1, got %s", lib.version) 111 | } 112 | 113 | lib = libs["/usr/lib/x86_64-linux-gnu/libssl.so.3"] 114 | if lib.version != "3" { 115 | t.Fatalf("expected version 3, got %s", lib.version) 116 | } 117 | 118 | } 119 | -------------------------------------------------------------------------------- /ebpf/tcp_state/tcp.go: -------------------------------------------------------------------------------- 1 | package tcp_state 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "os" 7 | "time" 8 | "unsafe" 9 | 10 | "github.com/ddosify/alaz/ebpf/c" 11 | "github.com/ddosify/alaz/log" 12 | 13 | "github.com/cilium/ebpf" 14 | "github.com/cilium/ebpf/link" 15 | "github.com/cilium/ebpf/perf" 16 | ) 17 | 18 | // match with values in tcp_state.c 19 | const ( 20 | BPF_EVENT_TCP_ESTABLISHED = iota + 1 21 | BPF_EVENT_TCP_CONNECT_FAILED 22 | BPF_EVENT_TCP_LISTEN 23 | BPF_EVENT_TCP_LISTEN_CLOSED 24 | BPF_EVENT_TCP_CLOSED 25 | ) 26 | 27 | // for user space 28 | const ( 29 | EVENT_TCP_ESTABLISHED = "EVENT_TCP_ESTABLISHED" 30 | EVENT_TCP_CONNECT_FAILED = "EVENT_TCP_CONNECT_FAILED" 31 | EVENT_TCP_LISTEN = "EVENT_TCP_LISTEN" 32 | EVENT_TCP_LISTEN_CLOSED = "EVENT_TCP_LISTEN_CLOSED" 33 | EVENT_TCP_CLOSED = "EVENT_TCP_CLOSED" 34 | ) 35 | 36 | // Custom type for the enumeration 37 | type TcpStateConversion uint32 38 | 39 | // String representation of the enumeration values 40 | func (e TcpStateConversion) String() string { 41 | switch e { 42 | case BPF_EVENT_TCP_ESTABLISHED: 43 | return EVENT_TCP_ESTABLISHED 44 | case BPF_EVENT_TCP_CONNECT_FAILED: 45 | return EVENT_TCP_CONNECT_FAILED 46 | case BPF_EVENT_TCP_LISTEN: 47 | return 
EVENT_TCP_LISTEN 48 | case BPF_EVENT_TCP_LISTEN_CLOSED: 49 | return EVENT_TCP_LISTEN_CLOSED 50 | case BPF_EVENT_TCP_CLOSED: 51 | return EVENT_TCP_CLOSED 52 | default: 53 | return "Unknown" 54 | } 55 | } 56 | 57 | // $BPF_CLANG and $BPF_CFLAGS are set by the Makefile. 58 | // //go:generate go run github.com/cilium/ebpf/cmd/bpf2go -cc $BPF_CLANG -cflags $BPF_CFLAGS bpf tcp_sockets.c -- -I../headers 59 | 60 | const mapKey uint32 = 0 61 | 62 | // padding to match the kernel struct 63 | type BpfTcpEvent struct { 64 | Fd uint64 65 | Timestamp uint64 66 | Type uint32 67 | Pid uint32 68 | SPort uint16 69 | DPort uint16 70 | SAddr [16]byte 71 | DAddr [16]byte 72 | } 73 | 74 | // for user space 75 | type TcpConnectEvent struct { 76 | Fd uint64 77 | Timestamp uint64 78 | Type_ string 79 | Pid uint32 80 | SPort uint16 81 | DPort uint16 82 | SAddr string 83 | DAddr string 84 | } 85 | 86 | const TCP_CONNECT_EVENT = "tcp_connect_event" 87 | 88 | func (e TcpConnectEvent) Type() string { 89 | return TCP_CONNECT_EVENT 90 | } 91 | 92 | var TcpState *TcpStateProg 93 | 94 | type TcpStateConfig struct { 95 | BpfMapSize uint32 // specified in terms of os page size 96 | } 97 | 98 | var defaultConfig *TcpStateConfig = &TcpStateConfig{ 99 | BpfMapSize: 64, 100 | } 101 | 102 | func InitTcpStateProg(conf *TcpStateConfig) *TcpStateProg { 103 | if conf == nil { 104 | conf = defaultConfig 105 | } 106 | 107 | return &TcpStateProg{ 108 | links: map[string]link.Link{}, 109 | tcpConnectMapSize: conf.BpfMapSize, 110 | } 111 | } 112 | 113 | type TcpStateProg struct { 114 | // links represent a program attached to a hook 115 | links map[string]link.Link // key : hook name 116 | 117 | tcpConnectMapSize uint32 118 | tcpConnectEvents *perf.Reader 119 | 120 | ContainerPidMap *ebpf.Map // for filtering non-container pids on the node 121 | } 122 | 123 | func (tsp *TcpStateProg) Close() { 124 | for hookName, link := range tsp.links { 125 | log.Logger.Info().Msgf("detach %s", hookName) 126 | link.Close() 127 | } 128 | } 129 | 130 | func (tsp *TcpStateProg) Attach() { 131 | l, err := link.Tracepoint("sock", "inet_sock_set_state", c.BpfObjs.InetSockSetState, nil) 132 | if err != nil { 133 | log.Logger.Fatal().Err(err).Msg("link inet_sock_set_state tracepoint") 134 | } 135 | tsp.links["sock/inet_sock_set_state"] = l 136 | 137 | l1, err := link.Tracepoint("syscalls", "sys_enter_connect", c.BpfObjs.SysEnterConnect, nil) 138 | if err != nil { 139 | log.Logger.Fatal().Err(err).Msg("link sys_enter_connect tracepoint") 140 | } 141 | tsp.links["syscalls/sys_enter_connect"] = l1 142 | 143 | l2, err := link.Tracepoint("syscalls", "sys_exit_connect", c.BpfObjs.SysEnterConnect, nil) // NOTE: this attaches the enter-connect program to the exit tracepoint; if the generated objects expose a dedicated SysExitConnect program, that is likely what was intended here 144 | if err != nil { 145 | log.Logger.Fatal().Err(err).Msg("link sys_exit_connect tracepoint") 146 | } 147 | tsp.links["syscalls/sys_exit_connect"] = l2 148 | } 149 | 150 | func (tsp *TcpStateProg) InitMaps() { 151 | var err error 152 | tsp.tcpConnectEvents, err = perf.NewReader(c.BpfObjs.TcpConnectEvents, int(tsp.tcpConnectMapSize)*os.Getpagesize()) 153 | if err != nil { 154 | log.Logger.Fatal().Err(err).Msg("error creating perf event array reader") 155 | } 156 | 157 | tsp.ContainerPidMap = c.BpfObjs.ContainerPids 158 | } 159 | 160 | func (tsp *TcpStateProg) PopulateContainerPidsMap(newKeys, deletedKeys []uint32) error { 161 | errors := []error{} 162 | if len(deletedKeys) > 0 { 163 | log.Logger.Debug().Msgf("deleting %d keys from container pids map %v", len(deletedKeys), deletedKeys) 164 | count, err := tsp.ContainerPidMap.BatchDelete(deletedKeys, 
&ebpf.BatchOptions{}) 165 | if err != nil { 166 | log.Logger.Debug().Err(err).Msg("failed deleting entries from container pids map") 167 | // errors = append(errors, err) 168 | } else { 169 | log.Logger.Debug().Msgf("deleted %d entries from container pids map", count) 170 | } 171 | } 172 | 173 | if len(newKeys) > 0 { 174 | log.Logger.Debug().Msgf("adding %d new keys to container pids map %v", len(newKeys), newKeys) 175 | values := make([]uint8, len(newKeys)) 176 | for i := range values { 177 | values[i] = 1 178 | } 179 | 180 | count, err := tsp.ContainerPidMap.BatchUpdate(newKeys, values, &ebpf.BatchOptions{ 181 | ElemFlags: 0, 182 | Flags: 0, 183 | }) 184 | 185 | if err != nil { 186 | errors = append(errors, fmt.Errorf("failed adding to ebpf container pids map, %v", err)) 187 | } else { 188 | log.Logger.Debug().Msgf("updated %d entries in container pids map", count) 189 | } 190 | } 191 | 192 | if len(errors) > 0 { 193 | return fmt.Errorf("errors: %v", errors) 194 | } 195 | 196 | return nil 197 | } 198 | 199 | func findEndIndex(b [100]uint8) (endIndex int) { 200 | for i, v := range b { 201 | if v == 0 { 202 | return i 203 | } 204 | } 205 | return len(b) 206 | } 207 | 208 | // returns when the program is detached 209 | func (tsp *TcpStateProg) Consume(ctx context.Context, ch chan interface{}) { 210 | printTs := true 211 | for { 212 | read := func() { 213 | record, err := tsp.tcpConnectEvents.Read() 214 | if err != nil { 215 | log.Logger.Warn().Err(err).Msg("error reading from tcp connect event map") 216 | } 217 | 218 | if record.LostSamples != 0 { 219 | log.Logger.Warn().Msgf("lost samples tcp-connect %d", record.LostSamples) 220 | } 221 | 222 | if record.RawSample == nil || len(record.RawSample) == 0 { 223 | return 224 | } 225 | 226 | bpfEvent := (*BpfTcpEvent)(unsafe.Pointer(&record.RawSample[0])) 227 | 228 | if printTs { 229 | log.Logger.Info().Uint64("now", uint64(time.Now().UnixNano())).Msgf("first-bpf-timestamp: %d", bpfEvent.Timestamp) 230 | printTs = false 231 | } 232 | 233 | go func() { 234 | ch <- &TcpConnectEvent{ 235 | Pid: bpfEvent.Pid, 236 | Fd: bpfEvent.Fd, 237 | Timestamp: bpfEvent.Timestamp, 238 | Type_: TcpStateConversion(bpfEvent.Type).String(), 239 | SPort: bpfEvent.SPort, 240 | DPort: bpfEvent.DPort, 241 | SAddr: fmt.Sprintf("%d.%d.%d.%d", bpfEvent.SAddr[0], bpfEvent.SAddr[1], bpfEvent.SAddr[2], bpfEvent.SAddr[3]), 242 | DAddr: fmt.Sprintf("%d.%d.%d.%d", bpfEvent.DAddr[0], bpfEvent.DAddr[1], bpfEvent.DAddr[2], bpfEvent.DAddr[3]), 243 | } 244 | }() 245 | } 246 | select { 247 | case <-ctx.Done(): 248 | log.Logger.Info().Msg("stop consuming tcp events...") 249 | return 250 | default: 251 | read() 252 | } 253 | } 254 | } 255 | -------------------------------------------------------------------------------- /k8s/daemonset.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | func getOnAddDaemonSetFunc(ch chan interface{}) func(interface{}) { 4 | return func(obj interface{}) { 5 | ch <- K8sResourceMessage{ 6 | ResourceType: DAEMONSET, 7 | EventType: ADD, 8 | Object: obj, 9 | } 10 | } 11 | } 12 | 13 | func getOnUpdateDaemonSetFunc(ch chan interface{}) func(interface{}, interface{}) { 14 | return func(oldObj, newObj interface{}) { 15 | ch <- K8sResourceMessage{ 16 | ResourceType: DAEMONSET, 17 | EventType: UPDATE, 18 | Object: newObj, 19 | } 20 | } 21 | } 22 | 23 | func getOnDeleteDaemonSetFunc(ch chan interface{}) func(interface{}) { 24 | return func(obj interface{}) { 25 | ch <- K8sResourceMessage{ 26 | ResourceType: DAEMONSET, 
27 | EventType: DELETE, 28 | Object: obj, 29 | } 30 | } 31 | } 32 | -------------------------------------------------------------------------------- /k8s/deployment.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | func getOnAddDeploymentSetFunc(ch chan interface{}) func(interface{}) { 4 | return func(obj interface{}) { 5 | ch <- K8sResourceMessage{ 6 | ResourceType: DEPLOYMENT, 7 | EventType: ADD, 8 | Object: obj, 9 | } 10 | } 11 | } 12 | 13 | func getOnUpdateDeploymentSetFunc(ch chan interface{}) func(interface{}, interface{}) { 14 | return func(oldObj, newObj interface{}) { 15 | ch <- K8sResourceMessage{ 16 | ResourceType: DEPLOYMENT, 17 | EventType: UPDATE, 18 | Object: newObj, 19 | } 20 | } 21 | } 22 | 23 | func getOnDeleteDeploymentSetFunc(ch chan interface{}) func(interface{}) { 24 | return func(obj interface{}) { 25 | ch <- K8sResourceMessage{ 26 | ResourceType: DEPLOYMENT, 27 | EventType: DELETE, 28 | Object: obj, 29 | } 30 | } 31 | } 32 | -------------------------------------------------------------------------------- /k8s/endpoints.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | func getOnAddEndpointsSetFunc(ch chan interface{}) func(interface{}) { 4 | return func(obj interface{}) { 5 | ch <- K8sResourceMessage{ 6 | ResourceType: ENDPOINTS, 7 | EventType: ADD, 8 | Object: obj, 9 | } 10 | } 11 | } 12 | 13 | func getOnUpdateEndpointsSetFunc(ch chan interface{}) func(interface{}, interface{}) { 14 | return func(oldObj, newObj interface{}) { 15 | ch <- K8sResourceMessage{ 16 | ResourceType: ENDPOINTS, 17 | EventType: UPDATE, 18 | Object: newObj, 19 | } 20 | } 21 | } 22 | 23 | func getOnDeleteEndpointsSetFunc(ch chan interface{}) func(interface{}) { 24 | return func(obj interface{}) { 25 | ch <- K8sResourceMessage{ 26 | ResourceType: ENDPOINTS, 27 | EventType: DELETE, 28 | Object: obj, 29 | } 30 | } 31 | } 32 | -------------------------------------------------------------------------------- /k8s/informer.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | import ( 4 | "context" 5 | "flag" 6 | "fmt" 7 | "os" 8 | "path/filepath" 9 | "sync" 10 | "time" 11 | 12 | "github.com/ddosify/alaz/log" 13 | 14 | corev1 "k8s.io/api/core/v1" 15 | "k8s.io/apimachinery/pkg/util/runtime" 16 | "k8s.io/client-go/informers" 17 | appsv1 "k8s.io/client-go/informers/apps/v1" 18 | v1 "k8s.io/client-go/informers/core/v1" 19 | 20 | "k8s.io/client-go/kubernetes" 21 | "k8s.io/client-go/rest" 22 | "k8s.io/client-go/tools/cache" 23 | "k8s.io/client-go/tools/clientcmd" 24 | "k8s.io/client-go/util/homedir" 25 | ) 26 | 27 | type K8SResourceType string 28 | 29 | const ( 30 | SERVICE = "Service" 31 | POD = "Pod" 32 | REPLICASET = "ReplicaSet" 33 | DEPLOYMENT = "Deployment" 34 | ENDPOINTS = "Endpoints" 35 | CONTAINER = "Container" 36 | DAEMONSET = "DaemonSet" 37 | STATEFULSET = "StatefulSet" 38 | ) 39 | 40 | const ( 41 | ADD = "Add" 42 | UPDATE = "Update" 43 | DELETE = "Delete" 44 | ) 45 | 46 | var k8sVersion string 47 | var resyncPeriod time.Duration = 120 * time.Second 48 | 49 | type K8sCollector struct { 50 | ctx context.Context 51 | informersFactory informers.SharedInformerFactory 52 | watchers map[K8SResourceType]cache.SharedIndexInformer 53 | stopper chan struct{} // stop signal for the informers 54 | doneChan chan struct{} // done signal for k8sCollector 55 | // watchers 56 | podInformer v1.PodInformer 57 | serviceInformer 
v1.ServiceInformer 58 | replicasetInformer appsv1.ReplicaSetInformer 59 | deploymentInformer appsv1.DeploymentInformer 60 | endpointsInformer v1.EndpointsInformer 61 | daemonsetInformer appsv1.DaemonSetInformer 62 | statefulSetInformer appsv1.StatefulSetInformer 63 | 64 | Events chan interface{} 65 | } 66 | 67 | func (k *K8sCollector) Init(events chan interface{}) error { 68 | log.Logger.Info().Msg("k8sCollector initializing...") 69 | k.Events = events 70 | 71 | // Pod 72 | k.podInformer = k.informersFactory.Core().V1().Pods() 73 | k.watchers[POD] = k.podInformer.Informer() 74 | 75 | // Service 76 | k.serviceInformer = k.informersFactory.Core().V1().Services() 77 | k.watchers[SERVICE] = k.informersFactory.Core().V1().Services().Informer() 78 | 79 | // ReplicaSet 80 | k.replicasetInformer = k.informersFactory.Apps().V1().ReplicaSets() 81 | k.watchers[REPLICASET] = k.replicasetInformer.Informer() 82 | 83 | // Deployment 84 | k.deploymentInformer = k.informersFactory.Apps().V1().Deployments() 85 | k.watchers[DEPLOYMENT] = k.deploymentInformer.Informer() 86 | 87 | // Endpoints 88 | k.endpointsInformer = k.informersFactory.Core().V1().Endpoints() 89 | k.watchers[ENDPOINTS] = k.endpointsInformer.Informer() 90 | 91 | // DaemonSet 92 | k.daemonsetInformer = k.informersFactory.Apps().V1().DaemonSets() 93 | k.watchers[DAEMONSET] = k.daemonsetInformer.Informer() 94 | 95 | // StatefulSet 96 | k.statefulSetInformer = k.informersFactory.Apps().V1().StatefulSets() 97 | k.watchers[STATEFULSET] = k.statefulSetInformer.Informer() 98 | 99 | defer runtime.HandleCrash() 100 | 101 | // Add event handlers 102 | k.watchers[POD].AddEventHandler(cache.ResourceEventHandlerFuncs{ 103 | AddFunc: getOnAddPodFunc(k.Events), 104 | UpdateFunc: getOnUpdatePodFunc(k.Events), 105 | DeleteFunc: getOnDeletePodFunc(k.Events), 106 | }) 107 | 108 | k.watchers[SERVICE].AddEventHandler(cache.ResourceEventHandlerFuncs{ 109 | AddFunc: getOnAddServiceFunc(k.Events), 110 | UpdateFunc: getOnUpdateServiceFunc(k.Events), 111 | DeleteFunc: getOnDeleteServiceFunc(k.Events), 112 | }) 113 | 114 | k.watchers[REPLICASET].AddEventHandler(cache.ResourceEventHandlerFuncs{ 115 | AddFunc: getOnAddReplicaSetFunc(k.Events), 116 | UpdateFunc: getOnUpdateReplicaSetFunc(k.Events), 117 | DeleteFunc: getOnDeleteReplicaSetFunc(k.Events), 118 | }) 119 | 120 | k.watchers[DEPLOYMENT].AddEventHandler(cache.ResourceEventHandlerFuncs{ 121 | AddFunc: getOnAddDeploymentSetFunc(k.Events), 122 | UpdateFunc: getOnUpdateDeploymentSetFunc(k.Events), 123 | DeleteFunc: getOnDeleteDeploymentSetFunc(k.Events), 124 | }) 125 | 126 | k.watchers[ENDPOINTS].AddEventHandler(cache.ResourceEventHandlerFuncs{ 127 | AddFunc: getOnAddEndpointsSetFunc(k.Events), 128 | UpdateFunc: getOnUpdateEndpointsSetFunc(k.Events), 129 | DeleteFunc: getOnDeleteEndpointsSetFunc(k.Events), 130 | }) 131 | 132 | k.watchers[DAEMONSET].AddEventHandler(cache.ResourceEventHandlerFuncs{ 133 | AddFunc: getOnAddDaemonSetFunc(k.Events), 134 | UpdateFunc: getOnUpdateDaemonSetFunc(k.Events), 135 | DeleteFunc: getOnDeleteDaemonSetFunc(k.Events), 136 | }) 137 | 138 | k.watchers[STATEFULSET].AddEventHandler(cache.ResourceEventHandlerFuncs{ 139 | AddFunc: getOnAddStatefulSetFunc(k.Events), 140 | UpdateFunc: getOnUpdateStatefulSetFunc(k.Events), 141 | DeleteFunc: getOnDeleteStatefulSetFunc(k.Events), 142 | }) 143 | 144 | wg := sync.WaitGroup{} 145 | wg.Add(len(k.watchers)) 146 | for _, watcher := range k.watchers { 147 | go func(watcher cache.SharedIndexInformer) { 148 | watcher.Run(k.stopper) // it will return when 
stopper is closed 149 | wg.Done() 150 | }(watcher) 151 | } 152 | wg.Wait() 153 | log.Logger.Info().Msg("k8sCollector informers stopped") 154 | k.doneChan <- struct{}{} 155 | 156 | return nil 157 | } 158 | 159 | func (k *K8sCollector) Done() <-chan struct{} { 160 | return k.doneChan 161 | } 162 | 163 | func NewK8sCollector(parentCtx context.Context) (*K8sCollector, error) { 164 | ctx, _ := context.WithCancel(parentCtx) 165 | // get incluster kubeconfig 166 | var kubeconfig *string 167 | var kubeConfig *rest.Config 168 | 169 | if os.Getenv("IN_CLUSTER") == "false" { 170 | var err error 171 | if home := homedir.HomeDir(); home != "" { 172 | kubeconfig = flag.String("kubeconfig", filepath.Join(home, ".kube", "config"), "(optional) absolute path to the kubeconfig file") 173 | } else { 174 | kubeconfig = flag.String("kubeconfig", "", "absolute path to the kubeconfig file") 175 | } 176 | 177 | flag.Parse() 178 | 179 | kubeConfig, err = clientcmd.BuildConfigFromFlags("", *kubeconfig) 180 | if err != nil { 181 | panic(err) 182 | } 183 | } else { 184 | // in cluster config, default 185 | var err error 186 | kubeConfig, err = rest.InClusterConfig() 187 | if err != nil { 188 | return nil, fmt.Errorf("unable to get incluster kubeconfig: %w", err) 189 | } 190 | } 191 | 192 | clientset, err := kubernetes.NewForConfig(kubeConfig) 193 | if err != nil { 194 | return nil, fmt.Errorf("unable to create clientset: %w", err) 195 | } 196 | 197 | version, err := clientset.ServerVersion() 198 | if err != nil { 199 | return nil, fmt.Errorf("unable to get k8s server version: %w", err) 200 | } 201 | 202 | k8sVersion = version.String() 203 | 204 | factory := informers.NewSharedInformerFactory(clientset, resyncPeriod) 205 | 206 | collector := &K8sCollector{ 207 | ctx: ctx, 208 | stopper: make(chan struct{}), 209 | doneChan: make(chan struct{}), 210 | informersFactory: factory, 211 | watchers: map[K8SResourceType]cache.SharedIndexInformer{}, 212 | } 213 | 214 | go func(c *K8sCollector) { 215 | <-c.ctx.Done() // wait for context to be cancelled 216 | c.close() 217 | }(collector) 218 | 219 | return collector, nil 220 | } 221 | 222 | func (k *K8sCollector) GetK8sVersion() string { 223 | return k8sVersion 224 | } 225 | 226 | func (k *K8sCollector) close() { 227 | log.Logger.Info().Msg("k8sCollector closing...") 228 | close(k.stopper) // stop informers 229 | } 230 | 231 | type K8sNamespaceResources struct { 232 | Pods map[string]corev1.Pod `json:"pods"` // map[podName]Pod 233 | Services map[string]corev1.Service `json:"services"` // map[serviceName]Service 234 | } 235 | 236 | type K8sResourceMessage struct { 237 | ResourceType string `json:"type"` 238 | EventType string `json:"eventType"` 239 | Object interface{} `json:"object"` 240 | } 241 | -------------------------------------------------------------------------------- /k8s/pod.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | import ( 4 | corev1 "k8s.io/api/core/v1" 5 | ) 6 | 7 | type Container struct { 8 | Name string `json:"name"` 9 | Namespace string `json:"namespace"` 10 | PodUID string `json:"pod"` // Pod UID 11 | Image string `json:"image"` 12 | Ports []struct { 13 | Port int32 `json:"port"` 14 | Protocol string `json:"protocol"` 15 | } `json:"ports"` 16 | } 17 | 18 | func getContainers(pod *corev1.Pod) []*Container { 19 | containers := make([]*Container, 0) 20 | 21 | for _, container := range pod.Spec.Containers { 22 | ports := make([]struct { 23 | Port int32 "json:\"port\"" 24 | Protocol string 
"json:\"protocol\"" 25 | }, 0) 26 | 27 | for _, port := range container.Ports { 28 | ports = append(ports, struct { 29 | Port int32 "json:\"port\"" 30 | Protocol string "json:\"protocol\"" 31 | }{ 32 | Port: port.ContainerPort, 33 | Protocol: string(port.Protocol), 34 | }) 35 | } 36 | 37 | containers = append(containers, &Container{ 38 | Name: container.Name, 39 | Namespace: pod.Namespace, 40 | PodUID: string(pod.UID), 41 | Image: container.Image, 42 | Ports: ports, 43 | }) 44 | } 45 | return containers 46 | } 47 | 48 | func getOnAddPodFunc(ch chan interface{}) func(interface{}) { 49 | return func(obj interface{}) { 50 | pod := obj.(*corev1.Pod) 51 | containers := getContainers(pod) 52 | 53 | ch <- K8sResourceMessage{ 54 | ResourceType: POD, 55 | EventType: ADD, 56 | Object: obj, 57 | } 58 | 59 | for _, container := range containers { 60 | ch <- K8sResourceMessage{ 61 | ResourceType: CONTAINER, 62 | EventType: ADD, 63 | Object: container, 64 | } 65 | } 66 | } 67 | } 68 | 69 | func getOnUpdatePodFunc(ch chan interface{}) func(interface{}, interface{}) { 70 | return func(oldObj, newObj interface{}) { 71 | pod := newObj.(*corev1.Pod) 72 | 73 | containers := getContainers(pod) 74 | ch <- K8sResourceMessage{ 75 | ResourceType: POD, 76 | EventType: UPDATE, 77 | Object: newObj, 78 | } 79 | for _, container := range containers { 80 | ch <- K8sResourceMessage{ 81 | ResourceType: CONTAINER, 82 | EventType: UPDATE, 83 | Object: container, 84 | } 85 | } 86 | } 87 | } 88 | 89 | func getOnDeletePodFunc(ch chan interface{}) func(interface{}) { 90 | return func(obj interface{}) { 91 | ch <- K8sResourceMessage{ 92 | ResourceType: POD, 93 | EventType: DELETE, 94 | Object: obj, 95 | } 96 | 97 | // no need to delete containers, they will be deleted automatically 98 | } 99 | } 100 | -------------------------------------------------------------------------------- /k8s/replicaset.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | func getOnAddReplicaSetFunc(ch chan interface{}) func(interface{}) { 4 | return func(obj interface{}) { 5 | ch <- K8sResourceMessage{ 6 | ResourceType: REPLICASET, 7 | EventType: ADD, 8 | Object: obj, 9 | } 10 | } 11 | } 12 | 13 | func getOnUpdateReplicaSetFunc(ch chan interface{}) func(interface{}, interface{}) { 14 | return func(oldObj, newObj interface{}) { 15 | ch <- K8sResourceMessage{ 16 | ResourceType: REPLICASET, 17 | EventType: UPDATE, 18 | Object: newObj, 19 | } 20 | } 21 | } 22 | 23 | func getOnDeleteReplicaSetFunc(ch chan interface{}) func(interface{}) { 24 | return func(obj interface{}) { 25 | ch <- K8sResourceMessage{ 26 | ResourceType: REPLICASET, 27 | EventType: DELETE, 28 | Object: obj, 29 | } 30 | } 31 | } 32 | -------------------------------------------------------------------------------- /k8s/service.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | func getOnAddServiceFunc(ch chan interface{}) func(interface{}) { 4 | return func(obj interface{}) { 5 | ch <- K8sResourceMessage{ 6 | ResourceType: SERVICE, 7 | EventType: ADD, 8 | Object: obj, 9 | } 10 | } 11 | } 12 | 13 | func getOnUpdateServiceFunc(ch chan interface{}) func(interface{}, interface{}) { 14 | return func(oldObj, newObj interface{}) { 15 | ch <- K8sResourceMessage{ 16 | ResourceType: SERVICE, 17 | EventType: UPDATE, 18 | Object: newObj, 19 | } 20 | } 21 | } 22 | 23 | func getOnDeleteServiceFunc(ch chan interface{}) func(interface{}) { 24 | return func(obj interface{}) { 25 | ch <- 
K8sResourceMessage{ 26 | ResourceType: SERVICE, 27 | EventType: DELETE, 28 | Object: obj, 29 | } 30 | } 31 | } 32 | -------------------------------------------------------------------------------- /k8s/statefulset.go: -------------------------------------------------------------------------------- 1 | package k8s 2 | 3 | func getOnAddStatefulSetFunc(ch chan interface{}) func(interface{}) { 4 | return func(obj interface{}) { 5 | ch <- K8sResourceMessage{ 6 | ResourceType: STATEFULSET, 7 | EventType: ADD, 8 | Object: obj, 9 | } 10 | } 11 | } 12 | 13 | func getOnUpdateStatefulSetFunc(ch chan interface{}) func(interface{}, interface{}) { 14 | return func(oldObj, newObj interface{}) { 15 | ch <- K8sResourceMessage{ 16 | ResourceType: STATEFULSET, 17 | EventType: UPDATE, 18 | Object: newObj, 19 | } 20 | } 21 | } 22 | 23 | func getOnDeleteStatefulSetFunc(ch chan interface{}) func(interface{}) { 24 | return func(obj interface{}) { 25 | ch <- K8sResourceMessage{ 26 | ResourceType: STATEFULSET, 27 | EventType: DELETE, 28 | Object: obj, 29 | } 30 | } 31 | } 32 | -------------------------------------------------------------------------------- /log/logger.go: -------------------------------------------------------------------------------- 1 | package log 2 | 3 | import ( 4 | "os" 5 | "strconv" 6 | 7 | "github.com/rs/zerolog" 8 | ) 9 | 10 | type NoopLogger struct{} 11 | 12 | func (NoopLogger) Write(p []byte) (n int, err error) { 13 | return 0, nil 14 | } 15 | 16 | var ( 17 | // Logger is the global logger. 18 | Logger zerolog.Logger 19 | ) 20 | 21 | const ( 22 | LOG_CONTEXT = "log-context" // for hook 23 | ) 24 | 25 | func init() { 26 | // Get the desired log level from environment variables 27 | levelStr := os.Getenv("LOG_LEVEL") 28 | 29 | // Set a default log level if the environment variable is not set 30 | defaultLevel := zerolog.InfoLevel 31 | 32 | // Parse the log level from the environment variable 33 | level, err := strconv.Atoi(levelStr) 34 | if err == nil { 35 | defaultLevel = zerolog.Level(level) 36 | } 37 | 38 | // Set the global log level for Zerolog 39 | zerolog.SetGlobalLevel(defaultLevel) 40 | 41 | zerolog.TimeFieldFormat = zerolog.TimeFormatUnix 42 | 43 | if os.Getenv("DISABLE_LOGS") == "true" { 44 | Logger = zerolog.New(NoopLogger{}) 45 | } else { 46 | hook := &ContextFilterHook{ 47 | ContextKey: LOG_CONTEXT, 48 | ContextValue: os.Getenv("LOG_CONTEXT_KEY"), 49 | } 50 | 51 | Logger = zerolog.New(os.Stdout).With().Timestamp().Logger().Hook(hook) 52 | } 53 | } 54 | 55 | type ContextFilterHook struct { 56 | ContextKey string 57 | ContextValue string 58 | } 59 | 60 | func (cfh *ContextFilterHook) Run(e *zerolog.Event, level zerolog.Level, message string) { 61 | if os.Getenv("LOG_CONTEXT_KEY") == "" { 62 | // if not specified, no filtering 63 | return 64 | } 65 | val := e.GetCtx().Value(cfh.ContextKey) 66 | if val != nil { 67 | if val.(string) == cfh.ContextValue { 68 | e.Str(cfh.ContextKey, cfh.ContextValue) 69 | } else { 70 | e.Discard() 71 | } 72 | } else { 73 | e.Discard() 74 | } 75 | } 76 | -------------------------------------------------------------------------------- /logstreamer/caCert.go: -------------------------------------------------------------------------------- 1 | package logstreamer 2 | 3 | const CaCert = `-----BEGIN CERTIFICATE----- 4 | MIIFPDCCAySgAwIBAgIURR38PLDaEjEeUOAsSiOU9dnl4PkwDQYJKoZIhvcNAQEL 5 | BQAwGDEWMBQGA1UEAwwNZ2V0YW50ZW9uLmNvbTAeFw0yNDA0MTAyMDA5MjVaFw0z 6 | NDA0MDgyMDA5MjVaMBgxFjAUBgNVBAMMDWdldGFudGVvbi5jb20wggIiMA0GCSqG 7 | 
SIb3DQEBAQUAA4ICDwAwggIKAoICAQCof8ZFaiw6sBm2mZhyZnjDvvlLjCIwzw1m 8 | ZfpMkyyLyhCaxcR3HU6ORx7IZexGejH61I9a2VTfiCDDtQcNFBmu360N2U0l7Y8V 9 | x50mOlnWROPHKhZAWMPNuvweTbrfQiwIP/4JFi4s3zz/7nWOByyOVj1NTDO7kgK4 10 | 7vtcIRHFA5P3S2CaBbEk8x7G7of3BNQup8r+aFicZBVpkeWdXSVsPI+4dIC/TJs2 11 | uhO73MO7OHMZBh4jNakxk6hLadLMkRfRua7761CwXAZC51eJfHHa8XUfhoFR38yb 12 | HfV+AuYWSycdxwvmhqZ9i8ezh4JDB6k2J91s5rN6Qwcb1jbEdY4ymodJyF93edpc 13 | pt20h8KD6ECV7e+Du8/Vwr9PsRokEoNrhC05Sjd/mA1IcBP0GIuGREhjnTFzTloz 14 | kGvaE7kHvhePjVkRB1L57Dmwt1n7V/0stBoa9hO/zfW5ElgE3n6+TJsuVicsB2v/ 15 | frvPFYkB9xCFGjULV/A1ETKgXtAIMxWzllNMhnaIVQKmM1kKGMXkto11OHRykHPE 16 | qrSu4Qw69hyJOcUp7953XQJ1QbcNQ9EbpR8YwvC3/UBq35Y4FqJVeJSx6NnVsto2 17 | ie/idEMDBl+0drXmTfFq3+2la+dkxZLz5wS18q5zgJIKLCwBQVOGoJmBWitQPgzb 18 | T4S6qCIiBwIDAQABo34wfDAdBgNVHQ4EFgQU/iHxC1glm5SG7M3ZilpcA5wi4D8w 19 | HwYDVR0jBBgwFoAU/iHxC1glm5SG7M3ZilpcA5wi4D8wDwYDVR0TAQH/BAUwAwEB 20 | /zApBgNVHREEIjAggg1nZXRhbnRlb24uY29tgg8qLmdldGFudGVvbi5jb20wDQYJ 21 | KoZIhvcNAQELBQADggIBAJNnaWg4ycBzHn+9THSgwxtYeTaKzv1GWR2g5pjQMVg8 22 | CpMZWBUq5L8pl4tt9L9iqIFgqaOx/g2jeKkWO1iHluzRvAJURYufxcixnqiq5lBl 23 | MS045il5Fhxgumy0DaVPblElfKn2TCaVJEhbrHWePTJjErhh7mm4PMsVQMx3S5Lq 24 | BYxLEs7dsVld/nR5pYyuIUXF6KZpoJKW60m12OxiWPS4rS9PF7GzUzrNoJAdFYki 25 | KgGqRF09NSdG4jNAR+aIUSGvAsN6T1nmVuCMVdRBqMLb19rPssw30WGRzkYixYHn 26 | nFdyKQMs5AaKgc8ovRubFIhGVQ6bVtUseMoZgoouH8ESbEb0kXvr7oKE7SQYXD/W 27 | 6sPKzVrfUbCEAa+oJtB0HiZQd8wfs0TDdNjvURCEHLXvsfVuCBQxF5lmbxGqfvmc 28 | 2kLBGf+3rAWIi+eCnImhMMUYsCWZG1peDsKQzXWhBRX8TNdA9XQ24EBuUYVEVb5H 29 | K2vWn+Py+zmj7cslAp6fuBbIJEEfWBzWzKDRkEciJFd8fFNVHVT9Luk4/tfcsEnM 30 | GGitGWQIPZDfpdL6ff2gurf6CE91or368ah1Hd0hU8g2t4xnEQOG/4CXPCkty/sS 31 | YkexgXeqWTZ04UP/BHsqwWeQQreocuxc/4Jz4YQ8b7P2kRIehjVdUjgdaMy9JDLN 32 | -----END CERTIFICATE----- 33 | ` 34 | -------------------------------------------------------------------------------- /logstreamer/pool.go: -------------------------------------------------------------------------------- 1 | package logstreamer 2 | 3 | import ( 4 | "crypto/tls" 5 | "errors" 6 | "fmt" 7 | "net" 8 | "sync" 9 | "time" 10 | 11 | "github.com/ddosify/alaz/log" 12 | ) 13 | 14 | var ErrClosed = errors.New("pool is closed") 15 | 16 | type PoolConn struct { 17 | net.Conn 18 | mu sync.RWMutex 19 | c *channelPool 20 | unusable bool 21 | tls bool 22 | } 23 | 24 | func (p *PoolConn) isAlive() bool { 25 | var buf [1]byte 26 | p.SetReadDeadline(time.Now().Add(1 * time.Millisecond)) // Set a very short deadline 27 | _, err := p.Read(buf[:]) 28 | 29 | if err != nil { 30 | if e, ok := err.(net.Error); ok && e.Timeout() { 31 | // Timeout occurred, but connection is still alive 32 | return true 33 | } 34 | // Real error or EOF encountered, connection likely dead 35 | return false 36 | } 37 | 38 | if buf[0] == 'X' { 39 | // close 40 | return false 41 | } 42 | 43 | // Data received (unexpected in send only), process or ignore 44 | return true 45 | } 46 | 47 | // Close() puts the given connects back to the pool instead of closing it. 48 | func (p *PoolConn) Close() error { 49 | p.mu.RLock() 50 | defer p.mu.RUnlock() 51 | 52 | if p.unusable { 53 | if p.Conn != nil { 54 | log.Logger.Info().Msg("connection is unusable, closing it") 55 | if p.tls { 56 | return p.Conn.(*tls.Conn).Close() 57 | } else { 58 | return p.Conn.Close() 59 | } 60 | } 61 | return nil 62 | } 63 | return p.c.put(p) 64 | } 65 | 66 | // MarkUnusable() marks the connection not usable any more, to let the pool close it instead of returning it to pool. 
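// A typical call site (a sketch, not from the original source): after a
// failed write the caller does
//   conn.MarkUnusable()
//   conn.Close() // now closes the underlying net.Conn instead of re-pooling it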
67 | func (p *PoolConn) MarkUnusable() { 68 | p.mu.Lock() 69 | p.unusable = true 70 | p.mu.Unlock() 71 | } 72 | 73 | // wrapConn wraps a standard net.Conn in a PoolConn. 74 | func (c *channelPool) wrapConn(conn net.Conn) *PoolConn { 75 | p := &PoolConn{c: c} 76 | p.Conn = conn 77 | p.tls = c.tls 78 | return p 79 | } 80 | 81 | type channelPool struct { 82 | // storage for our net.Conn connections 83 | mu sync.RWMutex 84 | conns chan *PoolConn 85 | 86 | // net.Conn generator 87 | factory Factory 88 | tls bool 89 | } 90 | 91 | func (c *channelPool) getConnsAndFactory() (chan *PoolConn, Factory) { 92 | c.mu.RLock() 93 | conns := c.conns 94 | factory := c.factory 95 | c.mu.RUnlock() 96 | return conns, factory 97 | } 98 | 99 | // Get implements the Pool interface's Get() method. If there is no 100 | // connection available in the pool, a new connection will be created via the 101 | // Factory() method. 102 | func (c *channelPool) Get() (*PoolConn, error) { 103 | conns, factory := c.getConnsAndFactory() 104 | if conns == nil { 105 | return nil, ErrClosed 106 | } 107 | 108 | // wrap our connections with our custom net.Conn implementation (wrapConn 109 | // method) that puts the connection back to the pool if it's closed. 110 | select { 111 | case conn := <-conns: 112 | if conn == nil { 113 | return nil, fmt.Errorf("connection is nil") 114 | } 115 | if conn.unusable { 116 | log.Logger.Info().Msg("connection is unusable on Get, closing it") 117 | conn.Close() 118 | return nil, fmt.Errorf("connection is unusable") 119 | } 120 | 121 | if conn.isAlive() { 122 | return conn, nil 123 | } else { 124 | conn.MarkUnusable() 125 | conn.Close() 126 | return nil, fmt.Errorf("connection is dead") 127 | } 128 | default: 129 | conn, err := factory() 130 | if err != nil { 131 | return nil, err 132 | } 133 | log.Logger.Info().Msg("no connection available, created a new one") 134 | return c.wrapConn(conn), nil 135 | } 136 | } 137 | 138 | // put puts the connection back to the pool. If the pool is full or closed, 139 | // conn is simply closed. A nil conn will be rejected. 140 | func (c *channelPool) put(conn *PoolConn) error { 141 | if conn == nil { 142 | return errors.New("connection is nil. rejecting") 143 | } 144 | 145 | c.mu.RLock() 146 | defer c.mu.RUnlock() 147 | 148 | if c.conns == nil { 149 | // pool is closed; mark the conn unusable first so Close() releases the underlying net.Conn instead of calling put() again in an endless loop 150 | conn.MarkUnusable(); return conn.Close() 151 | } 152 | 153 | // put the resource back into the pool with a non-blocking send; if the 154 | // pool is full, the send cannot proceed and the default case closes the connection instead. 
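// (design note: using a buffered channel as the pool both bounds the number
// of idle connections to maxCap and keeps Get/put simple non-blocking
// operations on top of the channel's own synchronization)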
155 | select { 156 | case c.conns <- conn: 157 | // log.Logger.Info().Msg("putting connection back into the pool") 158 | return nil 159 | default: 160 | // pool is full; mark unusable so Close() closes the underlying conn 161 | log.Logger.Info().Msg("pool is full, closing passed connection") 162 | conn.MarkUnusable(); return conn.Close() 163 | } 164 | } 165 | 166 | func (c *channelPool) Close() { 167 | c.mu.Lock() 168 | conns := c.conns 169 | c.conns = nil 170 | c.factory = nil 171 | c.mu.Unlock() 172 | 173 | if conns == nil { 174 | return 175 | } 176 | 177 | close(conns) 178 | for conn := range conns { 179 | conn.MarkUnusable(); conn.Close() // release the underlying net.Conn; the pool itself is gone 180 | } 181 | } 182 | 183 | func (c *channelPool) Len() int { 184 | conns, _ := c.getConnsAndFactory() 185 | return len(conns) 186 | } 187 | 188 | func NewChannelPool(initialCap, maxCap int, factory Factory, isTls bool) (*channelPool, error) { 189 | if initialCap < 0 || maxCap <= 0 || initialCap > maxCap { 190 | return nil, errors.New("invalid capacity settings") 191 | } 192 | 193 | c := &channelPool{ 194 | conns: make(chan *PoolConn, maxCap), 195 | factory: factory, 196 | tls: isTls, 197 | } 198 | 199 | // create initial connections; if something goes wrong, 200 | // just close the pool and error out. 201 | for i := 0; i < initialCap; i++ { 202 | conn, err := factory() 203 | if err != nil { 204 | c.Close() 205 | return nil, fmt.Errorf("factory is not able to fill the pool: %s", err) 206 | } 207 | log.Logger.Debug().Msg("new connection created for log streaming") 208 | c.conns <- c.wrapConn(conn) 209 | } 210 | 211 | return c, nil 212 | } 213 | 214 | // Factory is a function to create new connections. 215 | type Factory func() (net.Conn, error) 216 | -------------------------------------------------------------------------------- /main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "os" 5 | "os/signal" 6 | "regexp" 7 | "runtime/debug" 8 | "strconv" 9 | "syscall" 10 | "time" 11 | 12 | "github.com/ddosify/alaz/aggregator" 13 | "github.com/ddosify/alaz/config" 14 | "github.com/ddosify/alaz/cri" 15 | "github.com/ddosify/alaz/datastore" 16 | "github.com/ddosify/alaz/ebpf" 17 | "github.com/ddosify/alaz/k8s" 18 | "github.com/ddosify/alaz/logstreamer" 19 | 20 | "context" 21 | 22 | "github.com/ddosify/alaz/log" 23 | 24 | "net/http" 25 | _ "net/http/pprof" 26 | ) 27 | 28 | func main() { 29 | debug.SetGCPercent(80) 30 | ctx, cancel := context.WithCancel(context.Background()) 31 | 32 | c := make(chan os.Signal, 1) 33 | signal.Notify(c, syscall.SIGINT, syscall.SIGTERM) // SIGKILL cannot be caught or handled, so listing it would have no effect 34 | go func() { 35 | <-c 36 | signal.Stop(c) 37 | cancel() 38 | }() 39 | 40 | var nsFilterRx *regexp.Regexp 41 | if os.Getenv("EXCLUDE_NAMESPACES") != "" { 42 | nsFilterRx = regexp.MustCompile(os.Getenv("EXCLUDE_NAMESPACES")) 43 | } 44 | 45 | stopAndWait := false 46 | var nsFilterStr string 47 | if nsFilterRx != nil { 48 | nsFilterStr = nsFilterRx.String() 49 | } 50 | 51 | var k8sCollector *k8s.K8sCollector 52 | kubeEvents := make(chan interface{}, 1000) 53 | var k8sVersion string 54 | 55 | var k8sCollectorEnabled bool = true 56 | k8sEnabled, err := strconv.ParseBool(os.Getenv("K8S_COLLECTOR_ENABLED")) 57 | if err == nil && !k8sEnabled { 58 | k8sCollectorEnabled = false 59 | } 60 | 61 | if k8sCollectorEnabled { 62 | // k8s collector 63 | var err error 64 | k8sCollector, err = k8s.NewK8sCollector(ctx) 65 | if err != nil { 66 | panic(err) 67 | } 68 | k8sVersion = k8sCollector.GetK8sVersion() 69 | go k8sCollector.Init(kubeEvents) 70 | } 71 | 72 | tracingEnabled, err := 
strconv.ParseBool(os.Getenv("TRACING_ENABLED")) 73 | metricsEnabled, _ := strconv.ParseBool(os.Getenv("METRICS_ENABLED")) 74 | // logsEnabled, _ := strconv.ParseBool(os.Getenv("LOGS_ENABLED")) 75 | 76 | // Temporarily disabled until the OTLP exporter's CPU performance problem is resolved 77 | // https://github.com/getanteon/alaz/tree/feat/logs-in-otlp 78 | // https://github.com/open-telemetry/opentelemetry-go/issues/5196 79 | logsEnabled := false 80 | 81 | // datastore backend 82 | dsBackend := datastore.NewBackendDS(ctx, config.BackendDSConfig{ 83 | Host: os.Getenv("BACKEND_HOST"), 84 | MetricsExport: metricsEnabled, 85 | GpuMetricsExport: metricsEnabled, 86 | MetricsExportInterval: 10, 87 | ReqBufferSize: 40000, // TODO: get from a conf file 88 | ConnBufferSize: 1000, // TODO: get from a conf file 89 | KafkaEventBufferSize: 2000, 90 | }) 91 | 92 | var ct *cri.CRITool 93 | ct, err = cri.NewCRITool(ctx) 94 | if err != nil { 95 | log.Logger.Error().Err(err).Msg("failed to create cri tool") 96 | } 97 | 98 | // deploy ebpf programs 99 | var ec *ebpf.EbpfCollector 100 | if tracingEnabled { 101 | ec = ebpf.NewEbpfCollector(ctx, ct) 102 | 103 | a := aggregator.NewAggregator(ctx, ct, kubeEvents, ec.EbpfEvents(), ec.EbpfProcEvents(), ec.EbpfTcpEvents(), ec.TlsAttachQueue(), dsBackend) 104 | a.Run() 105 | 106 | // a.AdvertiseDebugData() 107 | 108 | ec.Init() 109 | go ec.ListenEvents() 110 | } 111 | 112 | var ls *logstreamer.LogStreamer 113 | if logsEnabled { 114 | if ct != nil { 115 | go func() { 116 | backoff := 5 * time.Second 117 | for { 118 | // retry creating the LogStreamer with backoff; 119 | // it returns an error if a connection to the backend cannot be established 120 | log.Logger.Info().Msg("creating logstreamer") 121 | ls, err = logstreamer.NewLogStreamer(ctx, ct) 122 | if err != nil { 123 | log.Logger.Error().Err(err).Msg("failed to create logstreamer") 124 | select { 125 | case <-time.After(backoff): 126 | case <-ctx.Done(): 127 | return 128 | } 129 | backoff *= 2 130 | } else { 131 | log.Logger.Info().Msg("logstreamer successfully created") 132 | break 133 | } 134 | } 135 | 136 | err := ls.StreamLogs() 137 | if err != nil { 138 | log.Logger.Error().Err(err).Msg("failed to stream logs") 139 | } 140 | }() 141 | 142 | } else { 143 | log.Logger.Error().Msg("logs enabled but cri tool not available") 144 | } 145 | } 146 | 147 | dsBackend.Start() 148 | 149 | healthCh := dsBackend.SendHealthCheck(tracingEnabled, metricsEnabled, logsEnabled, nsFilterStr, k8sVersion) 150 | go func() { 151 | for msg := range healthCh { 152 | if msg == datastore.HealthCheckActionStop { 153 | stopAndWait = true 154 | cancel() 155 | break 156 | } 157 | } 158 | }() 159 | 160 | go http.ListenAndServe(":8181", nil) 161 | 162 | if k8sCollectorEnabled { 163 | <-k8sCollector.Done() 164 | log.Logger.Info().Msg("k8sCollector done") 165 | } 166 | 167 | if tracingEnabled { 168 | <-ec.Done() 169 | log.Logger.Info().Msg("ebpfCollector done") 170 | } 171 | 172 | if logsEnabled && ls != nil { 173 | <-ls.Done() 174 | log.Logger.Info().Msg("cri done") 175 | } 176 | 177 | if stopAndWait { 178 | log.Logger.Warn().Msg("Payment required. 
Alaz will restart itself after payment's been made.") 179 | for msg := range healthCh { 180 | if msg == datastore.HealthCheckActionOK { 181 | log.Logger.Info().Msg("Restarting alaz...") 182 | break 183 | } 184 | } 185 | } else { 186 | log.Logger.Info().Msg("alaz exiting...") 187 | } 188 | } 189 | -------------------------------------------------------------------------------- /resources/alaz.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: ServiceAccount 3 | metadata: 4 | name: alaz-serviceaccount 5 | namespace: anteon 6 | --- 7 | # For alaz to keep track of changes in cluster 8 | apiVersion: rbac.authorization.k8s.io/v1 9 | kind: ClusterRole 10 | metadata: 11 | name: alaz-role 12 | namespace: anteon 13 | rules: 14 | - apiGroups: 15 | - "*" 16 | resources: 17 | - pods 18 | - services 19 | - endpoints 20 | - replicasets 21 | - deployments 22 | - daemonsets 23 | - statefulsets 24 | verbs: 25 | - "get" 26 | - "list" 27 | - "watch" 28 | --- 29 | apiVersion: rbac.authorization.k8s.io/v1 30 | kind: ClusterRoleBinding 31 | metadata: 32 | name: alaz-role-binding 33 | namespace: anteon 34 | roleRef: 35 | apiGroup: rbac.authorization.k8s.io 36 | kind: ClusterRole 37 | name: alaz-role 38 | subjects: 39 | - kind: ServiceAccount 40 | name: alaz-serviceaccount 41 | namespace: anteon 42 | --- 43 | apiVersion: apps/v1 44 | kind: DaemonSet 45 | metadata: 46 | name: alaz-daemonset 47 | namespace: anteon 48 | spec: 49 | selector: 50 | matchLabels: 51 | app: alaz 52 | template: 53 | metadata: 54 | labels: 55 | app: alaz 56 | spec: 57 | hostPID: true 58 | containers: 59 | - env: 60 | - name: TRACING_ENABLED 61 | value: "true" 62 | - name: METRICS_ENABLED 63 | value: "true" 64 | - name: LOGS_ENABLED 65 | value: "false" 66 | - name: BACKEND_HOST 67 | value: https://api-alaz.getanteon.com:443 68 | - name: LOG_LEVEL 69 | value: "1" 70 | # - name: EXCLUDE_NAMESPACES 71 | # value: "^anteon.*" 72 | - name: MONITORING_ID 73 | value: 74 | - name: SEND_ALIVE_TCP_CONNECTIONS # Send undetected protocol connections (unknown connections) 75 | value: "false" 76 | - name: NODE_NAME 77 | valueFrom: 78 | fieldRef: 79 | apiVersion: v1 80 | fieldPath: spec.nodeName 81 | args: 82 | - --no-collector.wifi 83 | - --no-collector.hwmon 84 | - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/) 85 | - --collector.netclass.ignored-devices=^(veth.*)$ 86 | image: ddosify/alaz:v0.11.4 87 | imagePullPolicy: IfNotPresent 88 | name: alaz-pod 89 | ports: 90 | - containerPort: 8181 91 | protocol: TCP 92 | resources: 93 | limits: 94 | memory: 1Gi 95 | requests: 96 | cpu: "1" 97 | memory: 400Mi 98 | securityContext: 99 | privileged: true 100 | terminationMessagePath: /dev/termination-log 101 | terminationMessagePolicy: File 102 | # needed for linking ebpf trace programs 103 | volumeMounts: 104 | - mountPath: /sys/kernel/debug 105 | name: debugfs 106 | readOnly: false 107 | dnsPolicy: ClusterFirst 108 | restartPolicy: Always 109 | schedulerName: default-scheduler 110 | securityContext: {} 111 | serviceAccount: alaz-serviceaccount 112 | serviceAccountName: alaz-serviceaccount 113 | terminationGracePeriodSeconds: 30 114 | # needed for linking ebpf trace programs 115 | volumes: 116 | - name: debugfs 117 | hostPath: 118 | path: /sys/kernel/debug 119 | -------------------------------------------------------------------------------- /testconfig/config1.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "testDuration" : 15, 3 | "memProfInterval" : 5, 4 | "podCount": 100, 5 | "serviceCount" : 50, 6 | "edgeCount" : 20, 7 | "edgeRate" : 10000, 8 | "kubeEventsBufferSize" : 1000, 9 | "ebpfEventsBufferSize": 200000, 10 | "ebpfProcEventsBufferSize" : 100, 11 | "tlsAttachQueueBufferSize" : 10, 12 | "ebpfTcpEventsBufferSize" : 1000, 13 | "dsReqBufferSize" : 150000, 14 | "mockBackendMinLatency": 5, 15 | "mockBackendMaxLatency": 20 16 | } --------------------------------------------------------------------------------
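A minimal sketch (not part of the repository) of how a benchmark harness could decode testconfig/config1.json; the TestConfig struct and loadTestConfig helper are hypothetical names introduced here for illustration, but the fields mirror the JSON keys above:

package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// TestConfig mirrors the keys in testconfig/config1.json.
type TestConfig struct {
	TestDuration             int `json:"testDuration"`
	MemProfInterval          int `json:"memProfInterval"`
	PodCount                 int `json:"podCount"`
	ServiceCount             int `json:"serviceCount"`
	EdgeCount                int `json:"edgeCount"`
	EdgeRate                 int `json:"edgeRate"`
	KubeEventsBufferSize     int `json:"kubeEventsBufferSize"`
	EbpfEventsBufferSize     int `json:"ebpfEventsBufferSize"`
	EbpfProcEventsBufferSize int `json:"ebpfProcEventsBufferSize"`
	TlsAttachQueueBufferSize int `json:"tlsAttachQueueBufferSize"`
	EbpfTcpEventsBufferSize  int `json:"ebpfTcpEventsBufferSize"`
	DsReqBufferSize          int `json:"dsReqBufferSize"`
	MockBackendMinLatency    int `json:"mockBackendMinLatency"`
	MockBackendMaxLatency    int `json:"mockBackendMaxLatency"`
}

// loadTestConfig reads and parses a benchmark config file.
func loadTestConfig(path string) (*TestConfig, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config: %w", err)
	}
	var cfg TestConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	return &cfg, nil
}

func main() {
	cfg, err := loadTestConfig("testconfig/config1.json")
	if err != nil {
		panic(err)
	}
	fmt.Printf("benchmark: %d pods, %d services, edge rate %d/s\n",
		cfg.PodCount, cfg.ServiceCount, cfg.EdgeRate)
}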