├── .gitignore ├── README.md ├── auxiliary ├── DLL.go ├── Dockerfile ├── LRU.go ├── app.go ├── auxiliary_test.go ├── controller.go ├── go.mod ├── go.sum ├── main.go └── zookeeper.go ├── distributed-cache.png ├── docker-compose.yml ├── go.work.sum ├── load_test ├── loadTest.sh └── locust.py ├── master ├── Dockerfile ├── app.go ├── controller.go ├── go.mod ├── go.sum ├── hashRing.go ├── main.go ├── master_test.go ├── utils.go └── zookeeper.go ├── nginx └── conf.d │ └── nginx.conf ├── prometheus └── prometheus.yml └── zookeeper └── zoo.cfg /.gitignore: -------------------------------------------------------------------------------- 1 | # If you prefer the allow list template instead of the deny list, see community template: 2 | # https://github.com/github/gitignore/blob/main/community/Golang/Go.AllowList.gitignore 3 | # 4 | # Binaries for programs and plugins 5 | *.exe 6 | *.exe~ 7 | *.dll 8 | *.so 9 | *.dylib 10 | 11 | # Test binary, built with `go test -c` 12 | *.test 13 | 14 | # Output of the go coverage tool, specifically when used with LiteIDE 15 | *.out 16 | 17 | # Dependency directories (remove the comment below to include it) 18 | # vendor/ 19 | 20 | # Go workspace file 21 | go.work 22 | 23 | # cache file 24 | data 25 | 26 | #python cache 27 | __pycache__ 28 | 29 | #ignore metrics 30 | grafana -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | **Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* 4 | 5 | - [Distributed-Cache-System](#distributed-cache-system) 6 | - [Features](#features) 7 | - [Architecture](#architecture) 8 | - [Data flow](#data-flow) 9 | - [Recovery](#recovery) 10 | - [Usage](#usage) 11 | - [TODO](#todo) 12 | - [Configuration](#configuration) 13 | - [Contributing](#contributing) 14 | - [License](#license) 15 | 16 | 17 | 18 | # Distributed-Cache-System 19 | This implementation of the distributed cache system is an attempt to build a high-performance, scalable and fault-tolerant caching solution that improves the performance and efficiency of distributed systems. It utilizes a master-auxiliary architecture, where master servers select which auxiliary server to use when getting or putting key-value pairs. 20 | 21 | ![Architecture of Distributed Cache System](distributed-cache.png) 22 | ## Features 23 | 24 | - Scalable distributed cache system 25 | - Consistent hashing for efficient key distribution 26 | - Master server scaled out to multiple containers 27 | - Load balancing using Nginx 28 | - Auxiliary (aux) servers for caching data 29 | - Rebalance cache data when a node is added/removed or on catastrophic failure 30 | - Backup for catastrophic failure of all aux servers 31 | - Docker containerization for easy deployment 32 | - Metrics monitoring with Prometheus 33 | - Visualization with Grafana 34 | - LRU (Least Recently Used) caching algorithm implemented using a Doubly Linked List (DLL) and a Hashmap 35 | 36 | ## Architecture 37 | 38 | The Distributed Cache System consists of the following components: 39 | 40 | - **Master Server**: The master node acts as the central coordinator and is responsible for handling client requests. It receives key-value pairs from clients and determines the appropriate auxiliary server to store the data using a consistent hashing algorithm. The master node also handles the retrieval of data from auxiliary servers and forwards read requests accordingly.
41 | 42 | - **Auxiliary (Aux) Servers**: The auxiliary servers store the cached data. They are replicated instances deployed in a consistent hash ring to ensure efficient distribution and load balancing. Each auxiliary server is responsible for maintaining a local LRU (Least Recently Used) cache, implemented using a combination of a hashtable and a doubly linked list (DLL). This cache allows for fast access and eviction of less frequently used data. 43 | 44 | - **Load Balancing**: The load balancer, typically implemented using nginx, acts as an intermediary between the clients and the master node. It distributes incoming client requests across multiple instances of the master node, enabling horizontal scaling and improved availability. 45 | 46 | - **Metrics Monitoring**: Prometheus is integrated into both the master and auxiliary servers to collect the count and response time of GET and POST requests. 47 | 48 | - **Visualization**: Grafana is used to visualize the collected metrics from Prometheus, providing insightful dashboards and graphs for monitoring the cache system's performance. 49 | 50 | - **Docker**: The project utilizes Docker to containerize and deploy the master and auxiliary servers, making it easy to scale and manage the system. 51 | 52 | - **Load Test**: To ensure the system's reliability and performance, a Python script for load testing was developed. The script utilizes Locust, a popular load testing framework. It defines a set of tasks for load testing the cache system: the put task sends a POST request to the /data endpoint, randomly selecting a payload from a predefined list, and the get task sends a GET request to the /data/{key} endpoint, randomly selecting a key from a predefined list. With this script, multiple concurrent users can be simulated to measure the performance and scalability of the distributed cache system. 53 | 54 | 55 | 56 | ## Data flow 57 | 58 | 1. Client Interaction: Clients interact with the distributed cache system via the load balancer. 59 | 60 | 2. Load Balancer Routing: The load balancer receives the client requests and distributes them among the available instances of the master node using a round-robin policy. This ensures a balanced workload and improved performance. The configuration can be tweaked to increase the connection pool or the number of requests processed by each worker. 61 | 62 | 3. Master Node Processing: The master node receives the client requests and performs the necessary operations. For write requests (storing key-value pairs), the master node applies a consistent hashing algorithm to determine the appropriate auxiliary server for data storage. It then forwards the data to the selected auxiliary server for caching. For read requests, the master node identifies the auxiliary server holding the requested data and retrieves it from there. 63 | 64 | 4. Auxiliary Server Caching: The auxiliary servers receive data from the master node and store it in their local LRU cache. This cache allows for efficient data access and eviction based on usage patterns. 65 | 66 | 5. Response to Clients: Once the master node receives a response from the auxiliary server (in the case of read requests) or completes the necessary operations (in the case of write requests), it sends the response back to the client through the load balancer. Clients can then utilize the retrieved data or receive confirmation of a successful operation.
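For a concrete sense of this flow, here is a minimal sketch using curl. It assumes the stack is running via `docker-compose up` and that Nginx is reachable on `localhost:8080` (the same host the load-test script targets); the `/data` endpoints are the ones exposed by the master server.

```
# Store a key-value pair; the master picks an aux server via consistent hashing
curl -X POST http://localhost:8080/data \
  -H "Content-Type: application/json" \
  -d '{"key": "water", "value": "Nile"}'

# Read it back; the master resolves the same aux server from the hash ring
curl http://localhost:8080/data/water
```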
67 | 68 | ## Recovery 69 | - The health of the auxiliary servers is monitored by the master servers at a regular interval. If any of the auxiliary servers goes down or is respawned, the master knows about it and rebalances the key-value mappings using consistent hashing. 70 | 71 | - If one or more auxiliary nodes are shut down, their key-value mappings are sent to the master node, which rebalances them using consistent hashing. 72 | 73 | - Each auxiliary server backs up its data in its container volume every 10 seconds in case a catastrophic failure occurs. These backups are then used when the server is respawned. 74 | 75 | - If one or more auxiliary nodes are respawned, the key-value mappings from the corresponding nodes in the hash ring are rebalanced using consistent hashing. 76 | 77 | - When redistributing/remapping the key-value pairs, a copy is backed up in the shared volume of the master containers in case the whole system goes down and has to be quickly respawned. The backups can be used to salvage as much data as possible; when respawning, the backup is rebalanced to the corresponding auxiliary servers. 78 | 79 | 80 | ## Usage 81 | 82 | 1. Clone the repository: 83 | ``` 84 | git clone https://github.com/cruzelx/Distributed-Cache-System.git 85 | ``` 86 | 2. Build and run the Docker containers for the master and auxiliary servers: 87 | ``` 88 | docker-compose up --build 89 | ``` 90 | 3. Run load testing: 91 | ``` 92 | ./load_test/loadTest.sh 93 | ``` 94 | 95 | ## TODO 96 | - [ ] Support for expirable key-values 97 | - [ ] Local cache in Master server for better performance 98 | - [ ] Implement replicas for Auxiliary servers with leader selection (might need zookeeper) 99 | 100 | 101 | ## Configuration 102 | 103 | - To adjust the number of auxiliary servers, modify the `docker-compose.yml` file and add/remove auxiliary server instances as needed. 104 | 105 | - The cache system's behavior, such as cache size, eviction policies, and request timeouts, can be configured in the server codebase. 106 | 107 | ## Contributing 108 | 109 | Contributions to the Distributed Cache System are welcome! If you find any issues or have suggestions for improvements, please submit a GitHub issue or create a pull request. 110 | 111 | ## License 112 | 113 | This project is licensed under the [MIT License](LICENSE).
-------------------------------------------------------------------------------- /auxiliary/DLL.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | type Node struct { 4 | Previous *Node 5 | Next *Node 6 | Key string 7 | Value string 8 | } 9 | 10 | type DLL struct { 11 | Head *Node 12 | Tail *Node 13 | } 14 | 15 | func NewDLL() *DLL { 16 | return &DLL{} 17 | } 18 | 19 | func (dll *DLL) Prepend(node *Node) { 20 | if dll.Head == nil { 21 | dll.Head = node 22 | dll.Tail = node 23 | return 24 | } 25 | node.Next = dll.Head 26 | dll.Head.Previous = node 27 | dll.Head = node 28 | node.Previous = nil 29 | } 30 | 31 | func (dll *DLL) Append(node *Node) { 32 | if dll.Head == nil { 33 | dll.Head = node 34 | dll.Tail = node 35 | return 36 | } 37 | node.Previous = dll.Tail 38 | node.Next = nil 39 | dll.Tail.Next = node 40 | dll.Tail = node 41 | } 42 | 43 | func (dll *DLL) Remove(node *Node) { 44 | if node == nil { 45 | return 46 | } 47 | 48 | if node == dll.Head { 49 | dll.Head = node.Next 50 | } 51 | 52 | if node == dll.Tail { 53 | dll.Tail = node.Previous 54 | } 55 | 56 | if node.Previous != nil { 57 | node.Previous.Next = node.Next 58 | } 59 | 60 | if node.Next != nil { 61 | node.Next.Previous = node.Previous 62 | } 63 | 64 | node.Previous = nil 65 | node.Next = nil 66 | } 67 | -------------------------------------------------------------------------------- /auxiliary/Dockerfile: -------------------------------------------------------------------------------- 1 | # Base image builder 2 | FROM golang:1.19-buster as builder 3 | 4 | RUN mkdir -p /data 5 | 6 | WORKDIR /app 7 | 8 | RUN useradd -c "Auxiliary Server" -u 1001 aux 9 | 10 | COPY go.mod go.sum ./ 11 | 12 | RUN go mod download 13 | 14 | COPY . . 15 | 16 | RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-linkmode external -extldflags -static" -tags netgo -o /auxiliary 17 | 18 | 19 | #Aux image 20 | FROM scratch 21 | 22 | COPY --from=builder /etc/passwd /etc/passwd 23 | COPY --from=builder /auxiliary /auxiliary 24 | 25 | # RUN chmod +x /auxiliary 26 | USER aux 27 | 28 | CMD [ "/auxiliary" ] -------------------------------------------------------------------------------- /auxiliary/LRU.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "encoding/gob" 5 | "fmt" 6 | "log" 7 | "os" 8 | ) 9 | 10 | type LRU struct { 11 | capacity int 12 | bucket map[string]*Node 13 | dll *DLL 14 | filepath string 15 | } 16 | 17 | func NewLRU(capacity int, filepath string) *LRU { 18 | return &LRU{ 19 | capacity: capacity, 20 | bucket: make(map[string]*Node, capacity), 21 | dll: NewDLL(), 22 | filepath: filepath, 23 | } 24 | } 25 | 26 | func (lru *LRU) Get(key string) (string, error) { 27 | if node, ok := lru.bucket[key]; ok { 28 | lru.dll.Remove(node) 29 | lru.dll.Prepend(node) 30 | return node.Value, nil 31 | } 32 | return "", fmt.Errorf("value for the key %s not found", key) 33 | } 34 | func (lru *LRU) Put(key string, value string) { 35 | if node, ok := lru.bucket[key]; ok { 36 | node.Value = value 37 | lru.dll.Remove(node) 38 | lru.dll.Prepend(node) 39 | } else { 40 | if len(lru.bucket) >= lru.capacity { 41 | delete(lru.bucket, lru.dll.Tail.Key) 42 | lru.dll.Remove(lru.dll.Tail) 43 | } 44 | newNode := &Node{Key: key, Value: value} 45 | lru.bucket[key] = newNode 46 | lru.dll.Prepend(newNode) 47 | } 48 | 49 | } 50 | 51 | func (lru *LRU) EraseCache() { 52 | lru.dll = NewDLL() 53 | lru.bucket = make(map[string]*Node, lru.capacity) 54 | } 55 |
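// saveToDisk walks the DLL, flattens the entries into a map[string]string and
// gob-encodes them to lru.filepath, truncating any previous snapshot. Because a
// map carries no ordering, recency information is lost: loadFromDisk rebuilds
// the list in whatever order the map iterates, so the eviction order is
// effectively reset after a restart.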
56 | func (lru *LRU) saveToDisk() (bool, error) { 57 | 58 | file, err := os.OpenFile(lru.filepath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0777) 59 | if err != nil { 60 | return false, err 61 | } 62 | defer file.Close() 63 | 64 | if len(lru.bucket) == 0 { 65 | return true, nil 66 | } 67 | 68 | curr := lru.dll.Head 69 | 70 | temp := make(map[string]string) 71 | 72 | for curr != nil { 73 | temp[curr.Key] = curr.Value 74 | curr = curr.Next 75 | } 76 | 77 | encode := gob.NewEncoder(file) 78 | if err := encode.Encode(temp); err != nil { 79 | return false, err 80 | } 81 | log.Println("saved at: ", lru.filepath) 82 | return true, nil 83 | } 84 | 85 | func (lru *LRU) loadFromDisk() (bool, error) { 86 | file, err := os.Open(lru.filepath) 87 | if err != nil { 88 | return os.IsNotExist(err), err 89 | } 90 | defer file.Close() 91 | 92 | decode := gob.NewDecoder(file) 93 | 94 | temp := make(map[string]string) 95 | 96 | if err := decode.Decode(&temp); err != nil { 97 | return false, err 98 | } 99 | 100 | lru.dll = NewDLL() 101 | 102 | for k, v := range temp { 103 | node := &Node{Key: k, Value: v} 104 | lru.dll.Append(node) 105 | lru.bucket[k] = node 106 | } 107 | return true, err 108 | 109 | } 110 | -------------------------------------------------------------------------------- /auxiliary/app.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "log" 7 | "net/http" 8 | "os" 9 | "os/signal" 10 | "syscall" 11 | "time" 12 | 13 | "github.com/gorilla/handlers" 14 | "github.com/gorilla/mux" 15 | "github.com/prometheus/client_golang/prometheus/promhttp" 16 | ) 17 | 18 | func Start() { 19 | 20 | // Load server configuration form env 21 | port := os.Getenv("PORT") 22 | serverId := os.Getenv("ID") 23 | 24 | // Path of the file that stores the cache for persistence 25 | filepath := "/data/" + serverId + "-" + "data.dat" 26 | 27 | aux := NewAuxiliary(3, filepath) 28 | 29 | // Check if the cache file already exists and load the data in LRU cache 30 | if ok, err := aux.LRU.loadFromDisk(); !ok { 31 | log.Println("error loading from disk: ", err) 32 | } else { 33 | log.Println("cache loaded from the disk") 34 | } 35 | 36 | r := mux.NewRouter() 37 | r.Use(mux.CORSMethodMiddleware(r)) 38 | 39 | // Handlers 40 | r.HandleFunc("/data", aux.Put).Methods("POST") 41 | r.HandleFunc("/data/{key}", aux.Get).Methods("GET") 42 | 43 | // Send all key-val mappings 44 | r.HandleFunc("/mappings", aux.Mappings).Methods("GET") 45 | 46 | // Empty the cache 47 | r.HandleFunc("/erase", aux.Erase).Methods("DELETE") 48 | 49 | // Monitor health to check alive status 50 | r.HandleFunc("/health", aux.Health).Methods("GET") 51 | 52 | // Instrumentation 53 | r.Handle("/metrics", promhttp.Handler()) 54 | 55 | loggedHandler := handlers.LoggingHandler(os.Stdout, r) 56 | srv := http.Server{ 57 | Addr: fmt.Sprintf(":%s", port), 58 | WriteTimeout: time.Second * 15, 59 | ReadTimeout: time.Second * 15, 60 | IdleTimeout: time.Second * 60, 61 | Handler: loggedHandler, 62 | } 63 | 64 | errChan := make(chan error) 65 | 66 | go func() { 67 | if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed { 68 | errChan <- err 69 | } 70 | }() 71 | 72 | log.Println("aux is listening on the port ", port) 73 | 74 | // Save cache to disk every 10 sec for persistent storage 75 | go func() { 76 | for { 77 | time.Sleep(time.Second * 10) 78 | log.Println("saving cache to disk...") 79 | 80 | if ok, err := aux.LRU.saveToDisk(); !ok { 81 | log.Printf("failed saving cache 
to disk: %s\n", err.Error()) 82 | } 83 | } 84 | }() 85 | 86 | // Listen to termination signals 87 | shutdown := make(chan os.Signal, 1) 88 | signal.Notify(shutdown, os.Interrupt, syscall.SIGTERM, syscall.SIGINT, syscall.SIGQUIT) 89 | 90 | defer func() { 91 | close(errChan) 92 | close(shutdown) 93 | }() 94 | 95 | select { 96 | case err := <-errChan: 97 | log.Printf("error: %s\n", err.Error()) 98 | case signal := <-shutdown: 99 | 100 | log.Printf("received signal: %v\n", signal) 101 | log.Println("shutting down...") 102 | log.Println("saving cache to disk...") 103 | 104 | if ok, err := aux.LRU.saveToDisk(); !ok { 105 | log.Printf("failed to save the cache to disk: %s", err.Error()) 106 | } 107 | 108 | // send mappings to master before shutting down 109 | aux.SendMappings() 110 | 111 | ctx, cancel := context.WithTimeout(context.Background(), time.Second*5) 112 | defer cancel() 113 | 114 | if err := srv.Shutdown(ctx); err != nil { 115 | log.Printf("error: %s\n", err.Error()) 116 | } 117 | } 118 | 119 | } 120 | -------------------------------------------------------------------------------- /auxiliary/auxiliary_test.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "os" 5 | "testing" 6 | ) 7 | 8 | func TestDLL_Prepend(t *testing.T) { 9 | dll := NewDLL() 10 | 11 | node1 := &Node{Key: "alex", Value: "bhattarai"} 12 | node2 := &Node{Key: "ramesh", Value: "pokharel"} 13 | 14 | dll.Prepend(node1) 15 | if dll.Head != node1 || dll.Tail != node1 { 16 | t.Error("Prepend: Unexpected head and tail node") 17 | } 18 | 19 | dll.Prepend(node2) 20 | if dll.Head != node2 || dll.Tail != node1 { 21 | t.Error("Prepend: Unexpected head and tail node") 22 | } 23 | 24 | } 25 | 26 | func TestDLL_Append(t *testing.T) { 27 | dll := NewDLL() 28 | 29 | node1 := &Node{Key: "alex", Value: "bhattarai"} 30 | node2 := &Node{Key: "ramesh", Value: "pokharel"} 31 | 32 | dll.Append(node1) 33 | if dll.Head != node1 || dll.Tail != node1 { 34 | t.Error("Append: Unexpected head and tail node") 35 | } 36 | 37 | dll.Append(node2) 38 | if dll.Head != node1 || dll.Tail != node2 { 39 | t.Error("Append: Unexpected head and tail node") 40 | } 41 | } 42 | 43 | func TestDLL_Remove(t *testing.T) { 44 | dll := NewDLL() 45 | 46 | node1 := &Node{Key: "alex", Value: "bhattarai"} 47 | node2 := &Node{Key: "ramesh", Value: "pokharel"} 48 | 49 | dll.Append(node1) 50 | dll.Append(node2) 51 | 52 | dll.Remove(node1) 53 | if dll.Head != node2 || dll.Tail != node2 { 54 | t.Error("Remove: Unexpected head and tail node") 55 | } 56 | 57 | dll.Remove(node2) 58 | if dll.Head != nil || dll.Tail != nil { 59 | t.Error("Remove: Unexpected head and tail node") 60 | } 61 | 62 | } 63 | 64 | func TestLRU_Get(t *testing.T) { 65 | lru := NewLRU(3, "") 66 | 67 | lru.Put("Name", "Alex") 68 | lru.Put("Age", "25") 69 | lru.Put("Country", "NP") 70 | 71 | val, err := lru.Get("Age") 72 | if err != nil { 73 | t.Errorf("Failed to get value for existing key %s: %v", "Age", err) 74 | } 75 | 76 | if val != "25" { 77 | t.Errorf("Unexpected value for key %s: got %s wanted %s", "Age", val, "25") 78 | } 79 | 80 | _, err = lru.Get("Town") 81 | if err == nil { 82 | t.Errorf("Expected error for non existent key %s", "Town") 83 | } 84 | 85 | expectedError := "value for the key Town not found" 86 | if err.Error() != expectedError { 87 | t.Errorf("Unexpected error message: wanted %s, got %s", expectedError, err.Error()) 88 | } 89 | 90 | expectedOrder := []string{"Age", "Country", "Name"} 91 | 92 | curr := lru.dll.Head 93 
| for _, key := range expectedOrder { 94 | if key != curr.Key { 95 | t.Errorf("Unexpected key order in LRU Cache: wanted %s, got %s", key, curr.Key) 96 | } 97 | curr = curr.Next 98 | } 99 | 100 | } 101 | 102 | func TestLRU_Put(t *testing.T) { 103 | lru := NewLRU(3, "") 104 | 105 | lru.Put("Name", "Alex") 106 | lru.Put("Age", "25") 107 | lru.Put("Country", "NP") 108 | 109 | val1, err1 := lru.Get("Name") 110 | val2, err2 := lru.Get("Age") 111 | val3, err3 := lru.Get("Country") 112 | 113 | if err1 != nil || err2 != nil || err3 != nil { 114 | t.Error("Failed to get values for one or more keys") 115 | } 116 | if val1 != "Alex" || val2 != "25" || val3 != "NP" { 117 | t.Errorf("Unexpected values for the keys: wanted %s,%s,%s; got %s,%s,%s", "Alex", "25", "NP", val1, val2, val3) 118 | } 119 | 120 | lru.Put("Wallet", "Bitcoin") 121 | 122 | _, err := lru.Get("Name") 123 | if err == nil { 124 | t.Errorf("Expected error while getting evicted key %s ", "Name") 125 | } 126 | 127 | val, err := lru.Get("Wallet") 128 | 129 | if err != nil { 130 | t.Errorf("Failed to get value for key %s: err %v", "Wallet", err) 131 | } 132 | 133 | if val != "Bitcoin" { 134 | t.Errorf("Unexpected value for the key %s: wanted %s, got %s", "Wallet", "Bitcoin", val) 135 | } 136 | 137 | } 138 | 139 | func TestLRU_SaveAndLoadFromDisk(t *testing.T) { 140 | filepath := "test.dat" 141 | 142 | lru := NewLRU(3, filepath) 143 | 144 | lru.Put("Name", "Alex") 145 | lru.Put("Age", "25") 146 | lru.Put("Country", "NP") 147 | 148 | if ok, err := lru.saveToDisk(); !ok { 149 | t.Errorf("Failed to save to disk: err %v", err) 150 | } 151 | 152 | newlru := NewLRU(3, filepath) 153 | if ok, err := newlru.loadFromDisk(); !ok { 154 | t.Errorf("Failed to read from disk: err %v", err) 155 | } 156 | 157 | value, err := newlru.Get("Name") 158 | if err != nil { 159 | t.Errorf("Failed to get value for key %s: err %v", "Name", err) 160 | } 161 | 162 | if value != "Alex" { 163 | t.Errorf("Unexpected value for the key %s : wanted %s, got %s", "Name", "Alex", value) 164 | } 165 | 166 | err = os.Remove(filepath) 167 | if err != nil { 168 | t.Errorf("Failed to remove test file: err %v", err) 169 | } 170 | 171 | } 172 | 173 | func Benchmark_LRUPut(b *testing.B) { 174 | lru := NewLRU(3, "") 175 | 176 | for i := 0; i < b.N; i++ { 177 | lru.Put("Alex", "Name") 178 | } 179 | } 180 | 181 | func Benchmark_LRUGet(b *testing.B) { 182 | lru := NewLRU(3, "") 183 | lru.Put("Alex", "Name") 184 | 185 | for i := 0; i < b.N; i++ { 186 | lru.Get("Alex") 187 | } 188 | } 189 | 190 | func TestMain(m *testing.M) { 191 | m.Run() 192 | } 193 | -------------------------------------------------------------------------------- /auxiliary/controller.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "bytes" 5 | "encoding/json" 6 | "fmt" 7 | "log" 8 | "net/http" 9 | "os" 10 | "time" 11 | 12 | "github.com/gorilla/mux" 13 | "github.com/prometheus/client_golang/prometheus" 14 | ) 15 | 16 | type Auxiliary struct { 17 | LRU *LRU 18 | requests *prometheus.CounterVec 19 | responseTime *prometheus.HistogramVec 20 | } 21 | 22 | type KeyVal struct { 23 | Key string `json:key` 24 | Value string `json:value` 25 | } 26 | 27 | func NewAuxiliary(bucketSize int, filepath string) *Auxiliary { 28 | requests := prometheus.NewCounterVec( 29 | prometheus.CounterOpts{ 30 | Name: "auxiliary_request_total", 31 | Help: "Total number of requests to the auxiliary node", 32 | }, []string{"method"}, 33 | ) 34 | 35 | responseTime := 
prometheus.NewHistogramVec( 36 | prometheus.HistogramOpts{ 37 | Name: "auxiliary_response_time_seconds", 38 | Help: "Distribution of the response time processed by the auxiliary server", 39 | Buckets: prometheus.ExponentialBuckets(0.001, 2, 16), 40 | }, 41 | []string{"method"}, 42 | ) 43 | 44 | prometheus.MustRegister(requests, responseTime) 45 | 46 | return &Auxiliary{ 47 | LRU: NewLRU(bucketSize, filepath), 48 | requests: requests, 49 | responseTime: responseTime, 50 | } 51 | 52 | } 53 | 54 | func (aux *Auxiliary) Put(w http.ResponseWriter, r *http.Request) { 55 | startTime := time.Now() 56 | 57 | var kv KeyVal 58 | 59 | if err := json.NewDecoder(r.Body).Decode(&kv); err != nil { 60 | w.WriteHeader(http.StatusBadRequest) 61 | return 62 | } 63 | 64 | aux.LRU.Put(kv.Key, kv.Value) 65 | 66 | elapsedTime := time.Since(startTime).Seconds() 67 | aux.requests.WithLabelValues(r.Method).Inc() 68 | aux.responseTime.WithLabelValues(r.Method).Observe(elapsedTime) 69 | w.WriteHeader(http.StatusOK) 70 | 71 | } 72 | 73 | func (aux *Auxiliary) Get(w http.ResponseWriter, r *http.Request) { 74 | startTime := time.Now() 75 | 76 | vars := mux.Vars(r) 77 | key := vars["key"] 78 | 79 | val, err := aux.LRU.Get(key) 80 | 81 | if err != nil { 82 | http.Error(w, err.Error(), http.StatusNotFound) 83 | return 84 | } 85 | 86 | elapsedTime := time.Since(startTime).Seconds() 87 | aux.requests.WithLabelValues(r.Method).Inc() 88 | aux.responseTime.WithLabelValues(r.Method).Observe(elapsedTime) 89 | w.Header().Set("Content-Type", "application/json") 90 | json.NewEncoder(w).Encode(KeyVal{Key: key, Value: val}) 91 | } 92 | 93 | func (aux *Auxiliary) Mappings(w http.ResponseWriter, r *http.Request) { 94 | 95 | keyvals := map[string]string{} 96 | curr := aux.LRU.dll.Head 97 | 98 | for curr != nil { 99 | keyvals[curr.Key] = curr.Value 100 | curr = curr.Next 101 | } 102 | 103 | w.Header().Set("Content-Type", "application/json") 104 | json.NewEncoder(w).Encode(keyvals) 105 | } 106 | 107 | func (aux *Auxiliary) Erase(w http.ResponseWriter, r *http.Request) { 108 | aux.LRU.EraseCache() 109 | w.WriteHeader(http.StatusOK) 110 | } 111 | 112 | func (aux *Auxiliary) Health(w http.ResponseWriter, r *http.Request) { 113 | w.WriteHeader(http.StatusOK) 114 | } 115 | 116 | func (aux *Auxiliary) SendMappings() { 117 | 118 | keyvals := map[string]string{} 119 | curr := aux.LRU.dll.Head 120 | 121 | for curr != nil { 122 | keyvals[curr.Key] = curr.Value 123 | curr = curr.Next 124 | } 125 | 126 | postBody, err := json.Marshal(keyvals) 127 | if err != nil { 128 | log.Printf("failed to parse key-val pairs: %v\n", err) 129 | return 130 | } 131 | 132 | log.Println("sending mappings to master server...") 133 | 134 | client := &http.Client{} 135 | masterServer := os.Getenv("MASTER_SERVER") 136 | req, err := http.NewRequest("POST", fmt.Sprintf("http://%s/rebalance-dead-aux", masterServer), bytes.NewBuffer(postBody)) 137 | if err != nil { 138 | log.Println("failed to create the request") 139 | return 140 | } 141 | 142 | port := os.Getenv("PORT") 143 | serverId := os.Getenv("ID") 144 | 145 | auxServer := fmt.Sprintf("%s:%s", serverId, port) 146 | 147 | req.Header.Set("Content-Type", "application/json") 148 | req.Header.Set("aux-server", auxServer) 149 | 150 | resp, err := client.Do(req) 151 | if err != nil { 152 | log.Printf("failed to send mappings to master server: %s", err.Error()) 153 | return 154 | } 155 | defer resp.Body.Close() 156 | 157 | log.Println("mappings sent!!!") 158 | 159 | } 160 | 
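// For reference, the request issued by SendMappings looks roughly like this
// (illustrative sketch, not captured output):
//
//	POST http://<MASTER_SERVER>/rebalance-dead-aux
//	aux-server: <ID>:<PORT>
//	Content-Type: application/json
//
//	{"some-key": "some-value", ...}
//
// The master reads the aux-server header to drop this node from its hash ring
// and then rebalances the posted key-value pairs across the remaining aux
// servers (see RebalanceDeadAuxServer in master/controller.go).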
-------------------------------------------------------------------------------- /auxiliary/go.mod: -------------------------------------------------------------------------------- 1 | module auxiliary 2 | 3 | go 1.19 4 | 5 | require ( 6 | github.com/JoinVerse/xid v0.0.0-20171120095953-afa60c3e0e47 7 | github.com/gorilla/handlers v1.5.1 8 | github.com/gorilla/mux v1.8.0 9 | github.com/prometheus/client_golang v1.16.0 10 | github.com/samuel/go-zookeeper v0.0.0-20201211165307-7117e9ea2414 11 | ) 12 | 13 | require ( 14 | github.com/beorn7/perks v1.0.1 // indirect 15 | github.com/cespare/xxhash/v2 v2.2.0 // indirect 16 | github.com/felixge/httpsnoop v1.0.1 // indirect 17 | github.com/golang/protobuf v1.5.3 // indirect 18 | github.com/matttproud/golang_protobuf_extensions v1.0.4 // indirect 19 | github.com/prometheus/client_model v0.3.0 // indirect 20 | github.com/prometheus/common v0.42.0 // indirect 21 | github.com/prometheus/procfs v0.10.1 // indirect 22 | github.com/stretchr/testify v1.8.4 // indirect 23 | golang.org/x/sys v0.8.0 // indirect 24 | google.golang.org/protobuf v1.30.0 // indirect 25 | ) 26 | -------------------------------------------------------------------------------- /auxiliary/go.sum: -------------------------------------------------------------------------------- 1 | github.com/JoinVerse/xid v0.0.0-20171120095953-afa60c3e0e47 h1:E8wN7I+yDDNUEXKhwXvd/QElNj54F0dSDpKQA00G5us= 2 | github.com/JoinVerse/xid v0.0.0-20171120095953-afa60c3e0e47/go.mod h1:wqRuu28xfulaAXrAsIEVz0R8f08LHBdVGnWSyU2UJQM= 3 | github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM= 4 | github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw= 5 | github.com/cespare/xxhash/v2 v2.2.0 h1:DC2CZ1Ep5Y4k3ZQ899DldepgrayRUGE6BBZ/cd9Cj44= 6 | github.com/cespare/xxhash/v2 v2.2.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= 7 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 8 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 9 | github.com/felixge/httpsnoop v1.0.1 h1:lvB5Jl89CsZtGIWuTcDM1E/vkVs49/Ml7JJe07l8SPQ= 10 | github.com/felixge/httpsnoop v1.0.1/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U= 11 | github.com/go-logfmt/logfmt v0.5.1/go.mod h1:WYhtIu8zTZfxdn5+rREduYbwxfcBr/Vr6KEVveWlfTs= 12 | github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 13 | github.com/golang/protobuf v1.3.5/go.mod h1:6O5/vntMXwX2lRkT1hjjk0nAC1IDOTvTlVgjlRvqsdk= 14 | github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk= 15 | github.com/golang/protobuf v1.5.3 h1:KhyjKVUg7Usr/dYsdSqoFveMYd5ko72D+zANwlG1mmg= 16 | github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY= 17 | github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= 18 | github.com/gorilla/handlers v1.5.1 h1:9lRY6j8DEeeBT10CvO9hGW0gmky0BprnvDI5vfhUHH4= 19 | github.com/gorilla/handlers v1.5.1/go.mod h1:t8XrUpc4KVXb7HGyJ4/cEnwQiaxrX/hz1Zv/4g96P1Q= 20 | github.com/gorilla/mux v1.8.0 h1:i40aqfkR1h2SlN9hojwV5ZA91wcXFOvkdNIeFDP5koI= 21 | github.com/gorilla/mux v1.8.0/go.mod h1:DVbg23sWSpFRCP0SfiEN6jmj59UnW/n46BH5rLB71So= 22 | github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX5e0EB2j4= 23 | github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo= 24 | github.com/julienschmidt/httprouter v1.3.0/go.mod h1:JR6WtHb+2LUe8TCKY3cZOxFyyO8IZAc4RVcycCCAKdM= 25 
| github.com/matttproud/golang_protobuf_extensions v1.0.4 h1:mmDVorXM7PCGKw94cs5zkfA9PSy5pEvNWRP0ET0TIVo= 26 | github.com/matttproud/golang_protobuf_extensions v1.0.4/go.mod h1:BSXmuO+STAnVfrANrmjBb36TMTDstsz7MSK+HVaYKv4= 27 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 28 | github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk= 29 | github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U= 30 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 31 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 32 | github.com/prometheus/client_golang v1.16.0 h1:yk/hx9hDbrGHovbci4BY+pRMfSuuat626eFsHb7tmT8= 33 | github.com/prometheus/client_golang v1.16.0/go.mod h1:Zsulrv/L9oM40tJ7T815tM89lFEugiJ9HzIqaAx4LKc= 34 | github.com/prometheus/client_model v0.3.0 h1:UBgGFHqYdG/TPFD1B1ogZywDqEkwp3fBMvqdiQ7Xew4= 35 | github.com/prometheus/client_model v0.3.0/go.mod h1:LDGWKZIo7rky3hgvBe+caln+Dr3dPggB5dvjtD7w9+w= 36 | github.com/prometheus/common v0.42.0 h1:EKsfXEYo4JpWMHH5cg+KOUWeuJSov1Id8zGR8eeI1YM= 37 | github.com/prometheus/common v0.42.0/go.mod h1:xBwqVerjNdUDjgODMpudtOMwlOwf2SaTr1yjz4b7Zbc= 38 | github.com/prometheus/procfs v0.10.1 h1:kYK1Va/YMlutzCGazswoHKo//tZVlFpKYh+PymziUAg= 39 | github.com/prometheus/procfs v0.10.1/go.mod h1:nwNm2aOCAYw8uTR/9bWRREkZFxAUcWzPHWJq+XBB/FM= 40 | github.com/samuel/go-zookeeper v0.0.0-20201211165307-7117e9ea2414 h1:AJNDS0kP60X8wwWFvbLPwDuojxubj9pbfK7pjHw0vKg= 41 | github.com/samuel/go-zookeeper v0.0.0-20201211165307-7117e9ea2414/go.mod h1:gi+0XIa01GRL2eRQVjQkKGqKF3SF9vZR/HnPullcV2E= 42 | github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk= 43 | github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo= 44 | golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 45 | golang.org/x/sys v0.8.0 h1:EBmGv8NaZBZTWvrbjNoL6HVt+IVy3QDQpJs7VRIw3tU= 46 | golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 47 | golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 48 | google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw= 49 | google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc= 50 | google.golang.org/protobuf v1.30.0 h1:kPPoIgf3TsEvrm0PFe15JQ+570QVxYzEvvHqChK+cng= 51 | google.golang.org/protobuf v1.30.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I= 52 | gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ= 53 | gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= 54 | gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 55 | -------------------------------------------------------------------------------- /auxiliary/main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | func main() { 4 | Start() 5 | } 6 | -------------------------------------------------------------------------------- /auxiliary/zookeeper.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "os" 7 | "strings" 8 | "time" 9 | 10 | "github.com/samuel/go-zookeeper/zk" 11 | ) 12 | 
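// Zookeeper is a thin wrapper around a go-zookeeper connection. An auxiliary
// server registers itself as a znode under /auxiliaries (named <ID>:<PORT>) so
// that cluster membership can be tracked. Note that the znodes are created
// without the ephemeral flag, so they are not removed automatically when a
// server dies; WatchChildren is likewise still a work in progress (see the
// TODO below).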
13 | type Zookeeper struct { 14 | conn *zk.Conn 15 | } 16 | 17 | func NewManager() *Zookeeper { 18 | servers := os.Getenv("ZOO_SERVERS") 19 | zooServers := strings.Split(servers, ",") 20 | 21 | time.Sleep(time.Second * 10) 22 | 23 | conn, err := connectToZookeeper(zooServers, time.Second*5) 24 | if err != nil { 25 | log.Fatalln("Failed to connect to zookeeper server(s): ", err) 26 | } 27 | return &Zookeeper{ 28 | conn: conn, 29 | } 30 | } 31 | 32 | func connectToZookeeper(servers []string, time time.Duration) (*zk.Conn, error) { 33 | conn, _, err := zk.Connect(servers, time) 34 | if err != nil { 35 | return nil, err 36 | } 37 | 38 | return conn, err 39 | } 40 | 41 | func (z *Zookeeper) CreateAuxiliaryNode() error { 42 | 43 | auxRoot := "/auxiliaries" 44 | 45 | host := os.Getenv("ID") + ":" + os.Getenv("PORT") 46 | auxPath := auxRoot + "/" + host 47 | 48 | exists, _, err := z.conn.Exists(auxRoot) 49 | if err != nil { 50 | return fmt.Errorf("failed to check if root aux znode exists: %v", err) 51 | } 52 | 53 | if !exists { 54 | if _, err := z.conn.Create(auxRoot, []byte{}, int32(0), zk.WorldACL(zk.PermAll)); err != nil { 55 | return fmt.Errorf("failed to create root aux znode: %v", err) 56 | } 57 | } 58 | 59 | exists, _, err = z.conn.Exists(auxPath) 60 | if err != nil { 61 | return fmt.Errorf("failed to check if aux node exists: %v", err) 62 | } 63 | 64 | if !exists { 65 | if _, err := z.conn.Create(auxPath, []byte{}, int32(0), zk.WorldACL(zk.PermAll)); err != nil { 66 | return fmt.Errorf("failed to create the aux node: %v", err) 67 | } 68 | 69 | } else { 70 | log.Println("Aux node already exists") 71 | } 72 | 73 | return nil 74 | } 75 | 76 | func (z *Zookeeper) GetData(path string) ([]byte, error) { 77 | data, _, err := z.conn.Get(path) 78 | return data, err 79 | } 80 | 81 | func (z *Zookeeper) SetData(path string, data []byte) error { 82 | _, err := z.conn.Set(path, data, -1) 83 | return err 84 | } 85 | 86 | func (z *Zookeeper) DeleteNode(path string) error { 87 | return z.conn.Delete(path, -1) 88 | } 89 | 90 | // TODO: Need to work on this 91 | func (z *Zookeeper) WatchChildren(path string, changeHandler func(children []string)) error { 92 | children, _, childCh, err := z.conn.ChildrenW(path) 93 | if err != nil { 94 | return err 95 | } 96 | 97 | go func() { 98 | for { 99 | select { 100 | case <-childCh: 101 | changeHandler(children) 102 | log.Printf("Children updated for path %s: %v", path, children) 103 | } 104 | } 105 | }() 106 | 107 | log.Printf("Children for path %s: %v", path, children) 108 | 109 | changeHandler(children) 110 | return nil 111 | } 112 | 113 | func (z *Zookeeper) Close() { 114 | z.conn.Close() 115 | } 116 | -------------------------------------------------------------------------------- /distributed-cache.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cruzelx/Distributed-Cache-System/HEAD/distributed-cache.png -------------------------------------------------------------------------------- /docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: "3" 2 | 3 | services: 4 | master: 5 | build: 6 | context: "./master" 7 | scale: 3 8 | # ports: 9 | # - 8000:8000 10 | depends_on: 11 | - zoo 12 | - aux1 13 | - aux2 14 | - aux3 15 | environment: 16 | - AUX_SERVERS=aux1:3001,aux2:3002,aux3:3003 17 | - ZOO_SERVERS=zoo:2181 18 | - PORT=8000 19 | volumes: 20 | - ./master:/app 21 | - ./data:/data 22 | networks: 23 | - cache-network 24 | 25 | nginx: 26 | 
image: "nginx" 27 | depends_on: 28 | - master 29 | ports: 30 | - 8080:3000 31 | volumes: 32 | - ./nginx/conf.d:/etc/nginx/ 33 | networks: 34 | - cache-network 35 | 36 | 37 | aux1: 38 | build: 39 | context: "./auxiliary" 40 | restart: unless-stopped 41 | environment: 42 | - PORT=3001 43 | - MASTER_SERVER=master:8000 44 | - ID=aux1 45 | - ZOO_SERVERS=zoo:2181 46 | ports: 47 | - 9001:3001 48 | depends_on: 49 | - zoo 50 | volumes: 51 | - ./auxiliary:/app 52 | - ./data:/data 53 | networks: 54 | - cache-network 55 | 56 | 57 | aux2: 58 | build: 59 | context: "./auxiliary" 60 | restart: unless-stopped 61 | environment: 62 | - PORT=3002 63 | - MASTER_SERVER=master:8000 64 | - ID=aux2 65 | - ZOO_SERVERS=zoo:2181 66 | ports: 67 | - 9002:3002 68 | depends_on: 69 | - zoo 70 | volumes: 71 | - ./auxiliary:/app 72 | - ./data:/data 73 | networks: 74 | - cache-network 75 | 76 | 77 | aux3: 78 | build: 79 | context: "./auxiliary" 80 | restart: unless-stopped 81 | environment: 82 | - PORT=3003 83 | - MASTER_SERVER=master:8000 84 | - ID=aux3 85 | - ZOO_SERVERS=zoo:2181 86 | ports: 87 | - 9003:3003 88 | depends_on: 89 | - zoo 90 | volumes: 91 | - ./auxiliary:/app 92 | - ./data:/data 93 | networks: 94 | - cache-network 95 | 96 | # Cluster Management 97 | 98 | zoo: 99 | image: zookeeper 100 | restart: always 101 | ports: 102 | - 2181:2181 103 | - 2888:2888 104 | - 3888:3888 105 | networks: 106 | - cache-network 107 | volumes: 108 | - ./zookeeper/zoo.cfg:/conf/zoo.cfg 109 | - ./data/zookeeper:/data 110 | environment: 111 | - ZOO_MY_ID=1 112 | - ZOO_SERVERS=server.1=zoo:2888:3888;2181 113 | 114 | # Metrics 115 | prometheus: 116 | image: prom/prometheus 117 | ports: 118 | - 9090:9090 119 | volumes: 120 | - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml 121 | networks: 122 | - cache-network 123 | 124 | grafana: 125 | image: grafana/grafana 126 | ports: 127 | - 4000:3000 128 | environment: 129 | - GF_INSTALL_PLUGINS=grafana-piechart-panel 130 | volumes: 131 | - ./grafana:/var/lib/grafana 132 | networks: 133 | - cache-network 134 | 135 | networks: 136 | cache-network: -------------------------------------------------------------------------------- /go.work.sum: -------------------------------------------------------------------------------- 1 | github.com/benbjohnson/clock v1.1.0 h1:Q92kusRqC1XV2MjkWETPvjJVqKetz1OzxZB7mHJLju8= 2 | github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= 3 | github.com/pkg/errors v0.8.1 h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I= 4 | github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= 5 | github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo= 6 | go.uber.org/goleak v1.1.11 h1:wy28qYRKZgnJTxGxvye5/wgWr1EKjmUDGYox5mGlRlI= 7 | golang.org/x/mod v0.11.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= 8 | golang.org/x/net v0.7.0 h1:rJrUqqhjsgNp7KqAIc25s9pZnjU7TUcSY7HcVZjdn1g= 9 | golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs= 10 | golang.org/x/sync v0.2.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 11 | golang.org/x/text v0.7.0 h1:4BRB4x83lYWy72KwLD/qYDuTu7q9PjSagHvijDw7cLo= 12 | golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8= 13 | google.golang.org/appengine v1.6.7 h1:FZR1q0exgwxzPzp/aF+VccGrSfxfPpkBqjIIEq3ru6c= 14 | google.golang.org/appengine v1.6.7/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc= 15 | -------------------------------------------------------------------------------- 
/load_test/loadTest.sh: -------------------------------------------------------------------------------- 1 | locust -f locust.py --host http://localhost:8080 -------------------------------------------------------------------------------- /load_test/locust.py: -------------------------------------------------------------------------------- 1 | from locust import HttpUser, TaskSet, task, between 2 | import random 3 | import json 4 | 5 | 6 | class CacheBehaviour(TaskSet): 7 | @task 8 | def put(self): 9 | headers = {"Content-Type": "application/json"} 10 | payloads = [{"key": "water", "value": "Nile"}, 11 | {"key": "Interest", "value": "Dance"}, 12 | {"key": "Escape", "value": "World"}] 13 | 14 | self.client.post("/data", data=json.dumps(random.choice(payloads)), headers=headers) 15 | 16 | @task 17 | def get(self): 18 | self.client.get( 19 | "/data/"+random.choice(["water", "Interest", "Escape"])) 20 | 21 | 22 | class CacheLoadTest(HttpUser): 23 | tasks = [CacheBehaviour] 24 | wait_time = between(1, 2) 25 | -------------------------------------------------------------------------------- /master/Dockerfile: -------------------------------------------------------------------------------- 1 | # Base image builder 2 | FROM golang:1.19-buster as builder 3 | 4 | RUN mkdir -p /data 5 | 6 | WORKDIR /app 7 | 8 | RUN useradd -c "Master Server" -u 1001 master 9 | 10 | COPY go.mod go.sum ./ 11 | 12 | RUN go mod download 13 | 14 | COPY . . 15 | 16 | RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-linkmode external -extldflags -static" -tags netgo -o /master 17 | 18 | 19 | # Master image 20 | FROM scratch 21 | 22 | COPY --from=builder /etc/passwd /etc/passwd 23 | COPY --from=builder /master /master 24 | 25 | # RUN chmod +x /master 26 | USER master 27 | 28 | EXPOSE 8000 29 | 30 | CMD ["/master"] -------------------------------------------------------------------------------- /master/app.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "log" 7 | "net/http" 8 | "os" 9 | "os/signal" 10 | "strings" 11 | "syscall" 12 | "time" 13 | 14 | "github.com/gorilla/handlers" 15 | "github.com/gorilla/mux" 16 | "github.com/prometheus/client_golang/prometheus/promhttp" 17 | ) 18 | 19 | func Start() { 20 | // get aux server host and port from env 21 | servers := os.Getenv("AUX_SERVERS") 22 | auxServers := strings.Split(servers, ",") 23 | 24 | m := NewMaster() 25 | // add env aux host:port to hashring 26 | for _, auxServer := range auxServers { 27 | m.hashring.AddNode(auxServer) 28 | } 29 | 30 | // Restore and rebalance if backup file exists 31 | go m.RestoreCacheFromDisk() 32 | 33 | r := mux.NewRouter() 34 | r.Use(mux.CORSMethodMiddleware(r)) 35 | 36 | // Handlers 37 | r.HandleFunc("/data", m.Put).Methods("POST") 38 | r.HandleFunc("/data/{key}", m.Get).Methods("GET") 39 | 40 | // Rebalance when an aux server is shutting down 41 | r.HandleFunc("/rebalance-dead-aux", m.RebalanceDeadAuxServer).Methods("POST") 42 | 43 | // Instrumentation 44 | r.Handle("/metrics", promhttp.Handler()) 45 | 46 | loggedHandler := handlers.LoggingHandler(os.Stdout, r) 47 | 48 | port := os.Getenv("PORT") 49 | srv := http.Server{ 50 | Addr: fmt.Sprintf(":%s", port), 51 | WriteTimeout: time.Second * 15, 52 | ReadTimeout: time.Second * 15, 53 | IdleTimeout: time.Second * 60, 54 | Handler: loggedHandler, 55 | } 56 | 57 | errChan := make(chan error) 58 | 59 | go func() { 60 | if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed { 61 | errChan <- err 62
| } 63 | }() 64 | 65 | // Listen to termination signals 66 | sigChan := make(chan os.Signal, 1) 67 | signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM) 68 | 69 | log.Printf("Server is listening on the port %s\n", port) 70 | 71 | healthChan := make(chan interface{}) 72 | go m.HealthCheck(time.Second*5, healthChan) 73 | 74 | defer func() { 75 | close(errChan) 76 | close(sigChan) 77 | close(healthChan) 78 | }() 79 | 80 | select { 81 | case err := <-errChan: 82 | log.Printf("error: %s\n", err.Error()) 83 | healthChan <- struct{}{} 84 | 85 | case <-sigChan: 86 | log.Println("shutting down...") 87 | 88 | healthChan <- struct{}{} 89 | 90 | ctx, cancel := context.WithTimeout(context.Background(), time.Second*5) 91 | defer cancel() 92 | 93 | if err := srv.Shutdown(ctx); err != nil { 94 | log.Printf("error: %s\n", err) 95 | } 96 | } 97 | 98 | } 99 | -------------------------------------------------------------------------------- /master/controller.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "bytes" 5 | "encoding/gob" 6 | "encoding/json" 7 | "fmt" 8 | "io" 9 | "log" 10 | "net/http" 11 | "os" 12 | "sync" 13 | "time" 14 | 15 | "github.com/gorilla/mux" 16 | "github.com/prometheus/client_golang/prometheus" 17 | ) 18 | 19 | const ( 20 | backupFilePath = "/data/backupCache.dat" 21 | ) 22 | 23 | type Master struct { 24 | hashring *HashRing 25 | client *http.Client 26 | requests *prometheus.CounterVec 27 | responseTime *prometheus.HistogramVec 28 | filepath string 29 | auxServers []string 30 | activeAuxServers map[string]bool 31 | } 32 | 33 | func NewMaster() *Master { 34 | transport := &http.Transport{ 35 | MaxIdleConns: 100, 36 | MaxIdleConnsPerHost: 100, 37 | } 38 | client := &http.Client{Transport: transport} 39 | 40 | requests := prometheus.NewCounterVec( 41 | prometheus.CounterOpts{ 42 | Name: "master_request_total", 43 | Help: "Total number of requests to the master node", 44 | }, []string{"method"}, 45 | ) 46 | 47 | responseTime := prometheus.NewHistogramVec( 48 | prometheus.HistogramOpts{ 49 | Name: "master_response_time_seconds", 50 | Help: "Distribution of the response time processed by the master server", 51 | Buckets: prometheus.ExponentialBuckets(0.001, 2, 16), 52 | }, 53 | []string{"method"}, 54 | ) 55 | 56 | prometheus.MustRegister(requests, responseTime) 57 | 58 | return &Master{ 59 | client: client, 60 | hashring: NewHashRing(3), 61 | requests: requests, 62 | responseTime: responseTime, 63 | filepath: backupFilePath, 64 | auxServers: getAuxServers(), 65 | activeAuxServers: make(map[string]bool), 66 | } 67 | } 68 | 69 | type KeyVal struct { 70 | Key string `json:key` 71 | Value string `json:value` 72 | } 73 | 74 | func (m *Master) Put(w http.ResponseWriter, r *http.Request) { 75 | startTime := time.Now() 76 | 77 | var kv KeyVal 78 | 79 | if err := json.NewDecoder(r.Body).Decode(&kv); err != nil { 80 | http.Error(w, "Bad Request", http.StatusBadRequest) 81 | return 82 | } 83 | 84 | node, err := m.hashring.GetNode(kv.Key) 85 | fmt.Println("Node: ", node) 86 | 87 | if err != nil { 88 | http.Error(w, "Internal Server Error", http.StatusInternalServerError) 89 | return 90 | } 91 | 92 | postBody, err := json.Marshal(kv) 93 | if err != nil { 94 | http.Error(w, "Internal Server Error", http.StatusInternalServerError) 95 | return 96 | } 97 | 98 | resp, err := m.client.Post(fmt.Sprintf("http://%s/data", node), "application/json", bytes.NewBuffer(postBody)) 99 | if err != nil { 100 | http.Error(w, "Internal Server Error", 
http.StatusInternalServerError) 101 | return 102 | } 103 | defer resp.Body.Close() 104 | 105 | w.WriteHeader(resp.StatusCode) 106 | elapsedTime := time.Since(startTime).Seconds() 107 | m.requests.WithLabelValues(r.Method).Inc() 108 | m.responseTime.WithLabelValues(r.Method).Observe(elapsedTime) 109 | } 110 | 111 | func (m *Master) Get(w http.ResponseWriter, r *http.Request) { 112 | startTime := time.Now() 113 | 114 | vars := mux.Vars(r) 115 | key, ok := vars["key"] 116 | 117 | if !ok { 118 | w.WriteHeader(http.StatusBadRequest) 119 | return 120 | } 121 | 122 | node, err := m.hashring.GetNode(key) 123 | if err != nil { 124 | http.Error(w, "Internal Server Error", http.StatusInternalServerError) 125 | return 126 | } 127 | 128 | resp, err := m.client.Get(fmt.Sprintf("http://%s/data/%s", node, key)) 129 | if err != nil { 130 | http.Error(w, "Internal Server Error", http.StatusInternalServerError) 131 | return 132 | } 133 | 134 | defer resp.Body.Close() 135 | 136 | w.Header().Set("Content-Type", "application/json") 137 | w.WriteHeader(resp.StatusCode) 138 | 139 | _, err = io.Copy(w, resp.Body) 140 | if err != nil { 141 | http.Error(w, "Internal Server Error", http.StatusInternalServerError) 142 | return 143 | } 144 | 145 | elapsedTime := time.Since(startTime).Seconds() 146 | m.requests.WithLabelValues(r.Method).Inc() 147 | m.responseTime.WithLabelValues(r.Method).Observe(elapsedTime) 148 | } 149 | 150 | func (m *Master) rebalance(keyvals map[string]string) { 151 | startTime := time.Now() 152 | 153 | keyvalChan := make(chan KeyVal) 154 | 155 | var wg sync.WaitGroup 156 | numOfWorkers := 16 157 | 158 | if len(keyvals) < 16 { 159 | numOfWorkers = len(keyvals) 160 | } 161 | 162 | log.Printf("number of workers used for remapping: %d", numOfWorkers) 163 | 164 | for i := 0; i < numOfWorkers; i++ { 165 | wg.Add(1) 166 | 167 | go func(workID int) { 168 | defer wg.Done() 169 | 170 | for keyval := range keyvalChan { 171 | node, err := m.hashring.GetNode(keyval.Key) 172 | if err != nil { 173 | log.Printf("failed to remap key %s to server %s", keyval.Key, node) 174 | continue 175 | } 176 | 177 | postBody, err := json.Marshal(keyval) 178 | if err != nil { 179 | log.Printf("failed to parse key-value pair: %v \n", err) 180 | continue 181 | } 182 | 183 | resp, err := m.client.Post(fmt.Sprintf("http://%s/data", node), "application/json", bytes.NewBuffer(postBody)) 184 | if err != nil { 185 | log.Printf("failed to send key-value pair to aux server %s", node) 186 | continue 187 | } 188 | defer resp.Body.Close() 189 | } 190 | }(i) 191 | } 192 | 193 | for k, v := range keyvals { 194 | keyvalChan <- KeyVal{Key: k, Value: v} 195 | } 196 | 197 | defer close(keyvalChan) 198 | elapsedTime := time.Since(startTime).Seconds() 199 | 200 | log.Printf("mapped keys in %v sec(s)\n", elapsedTime) 201 | 202 | } 203 | 204 | func (m *Master) backupCacheToDisk(keyvals map[string]string) error { 205 | _, err := os.Stat(m.filepath) 206 | if err != nil { 207 | if os.IsNotExist(err) { 208 | file, err := os.Create(m.filepath) 209 | if err != nil { 210 | return fmt.Errorf("failed to create backup file: %v", err) 211 | } 212 | 213 | encode := gob.NewEncoder(file) 214 | if err := encode.Encode(keyvals); err != nil { 215 | return fmt.Errorf("failed to encode key-val pairs to file %s: %v", m.filepath, err) 216 | 217 | } 218 | log.Printf("saved mappings to backup file %s", m.filepath) 219 | return nil 220 | } 221 | return fmt.Errorf("error reading stats of file %s: %v", m.filepath, err) 222 | } 223 | 224 | file, err := os.OpenFile(m.filepath, 
os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0600) 225 | if err != nil { 226 | return fmt.Errorf("failed to open file %s: %v", m.filepath, err) 227 | 228 | } 229 | defer file.Close() 230 | 231 | encode := gob.NewEncoder(file) 232 | if err := encode.Encode(&keyvals); err != nil { 233 | return fmt.Errorf("failed to decode key-val pairs to file %s: %v", m.filepath, err) 234 | 235 | } 236 | log.Printf("saved mappings to backup file %s", m.filepath) 237 | return nil 238 | } 239 | 240 | func (m *Master) RestoreCacheFromDisk() error { 241 | file, err := os.Open(m.filepath) 242 | if err != nil { 243 | return fmt.Errorf("failed to open file %s: %v", m.filepath, err) 244 | } 245 | 246 | var mappings map[string]string 247 | 248 | decode := gob.NewDecoder(file) 249 | if err := decode.Decode(&mappings); err != nil { 250 | return fmt.Errorf("failed to decode mappings from file %s: %v", m.filepath, err) 251 | } 252 | 253 | m.rebalance(mappings) 254 | 255 | return nil 256 | } 257 | 258 | // Aux sends mappings to this route before dying 259 | func (m *Master) RebalanceDeadAuxServer(w http.ResponseWriter, r *http.Request) { 260 | var auxMappings map[string]string 261 | 262 | auxServer := r.Header.Get("aux-server") 263 | 264 | body, err := io.ReadAll(r.Body) 265 | if err != nil { 266 | http.Error(w, "Bad request", http.StatusBadRequest) 267 | return 268 | } 269 | 270 | err = json.Unmarshal(body, &auxMappings) 271 | if err != nil { 272 | http.Error(w, "Bad request", http.StatusBadRequest) 273 | return 274 | } 275 | 276 | log.Printf("Remapping %d keys from server %s", len(auxMappings), auxServer) 277 | // Broadcast aux node to be removed 278 | m.hashring.RemoveNode(auxServer) 279 | // rebalance all key-vals to the live aux servers 280 | m.rebalance(auxMappings) 281 | 282 | // Save to backup cache file while remapping 283 | // This helps in spinning up 284 | go func(filepath string, mappings map[string]string) { 285 | if err := m.backupCacheToDisk(auxMappings); err != nil { 286 | log.Println(err) 287 | return 288 | } 289 | }(m.filepath, auxMappings) 290 | 291 | } 292 | 293 | func (m *Master) checkAuxServerHealth(auxServer string) bool { 294 | resp, err := m.client.Get(fmt.Sprintf("http://%s/health", auxServer)) 295 | if err != nil { 296 | log.Printf("failed to connect to aux server %s: %v", auxServer, err) 297 | return false 298 | } 299 | defer resp.Body.Close() 300 | 301 | return true 302 | } 303 | 304 | func (m *Master) handleDeadAuxServer(deadAux string) { 305 | if val, ok := m.activeAuxServers[deadAux]; ok && val { 306 | m.hashring.RemoveNode(deadAux) 307 | } 308 | m.activeAuxServers[deadAux] = false 309 | log.Printf("heart of %s has stopped beating... 
", deadAux) 310 | } 311 | 312 | func (m *Master) getDistinctNodesToRebalance(node string) []string { 313 | distinctNodes := make(map[string]bool) 314 | 315 | for i := 0; i < m.hashring.replica; i++ { 316 | replicaNode := fmt.Sprintf("%s:%d", node, i) 317 | mappedNode, err := m.hashring.GetNode(replicaNode) 318 | 319 | if err != nil { 320 | log.Println(err) 321 | continue 322 | } 323 | 324 | distinctNodes[mappedNode] = true 325 | } 326 | 327 | result := make([]string, 0, len(distinctNodes)) 328 | for k := range distinctNodes { 329 | result = append(result, k) 330 | } 331 | return result 332 | 333 | } 334 | 335 | func (m *Master) handleAliveAuxServer(aliveAux string) { 336 | if val, ok := m.activeAuxServers[aliveAux]; ok && !val { 337 | 338 | distinctNodesToRebalance := m.getDistinctNodesToRebalance(aliveAux) 339 | 340 | for _, node := range distinctNodesToRebalance { 341 | go func(node string) { 342 | 343 | resp, err := m.client.Get(fmt.Sprintf("http://%s/mappings", node)) 344 | if err != nil { 345 | log.Printf("failed to get mappings from aux server %s: %v", node, err) 346 | return 347 | } 348 | 349 | data, err := io.ReadAll(resp.Body) 350 | if err != nil { 351 | log.Printf("failed to read mappings from the response: %v", err) 352 | return 353 | } 354 | 355 | var mappings map[string]string 356 | if err := json.Unmarshal(data, &mappings); err != nil { 357 | log.Printf("failed to parse the response body: %v", err) 358 | return 359 | } 360 | 361 | m.hashring.AddNode(aliveAux) 362 | 363 | m.rebalance(mappings) 364 | 365 | defer resp.Body.Close() 366 | 367 | }(node) 368 | } 369 | } 370 | m.activeAuxServers[aliveAux] = true 371 | log.Printf("heart of %s is beating... ", aliveAux) 372 | } 373 | 374 | // Checks the heartbeat of aux server every {duration} seconds 375 | func (m *Master) HealthCheck(duration time.Duration, stop <-chan interface{}) { 376 | log.Printf("checking health of aux servers... 
%v", m.auxServers) 377 | 378 | deadAuxChan := make(chan string) 379 | aliveAuxChan := make(chan string) 380 | 381 | for _, aux := range m.auxServers { 382 | 383 | go func(aux string) { 384 | 385 | for { 386 | select { 387 | case <-stop: 388 | return 389 | 390 | default: 391 | if !m.checkAuxServerHealth(aux) { 392 | deadAuxChan <- aux 393 | } else { 394 | aliveAuxChan <- aux 395 | } 396 | time.Sleep(duration) 397 | } 398 | } 399 | }(aux) 400 | } 401 | 402 | defer func() { 403 | log.Printf("Exiting from health check...") 404 | close(deadAuxChan) 405 | close(aliveAuxChan) 406 | }() 407 | 408 | for { 409 | 410 | select { 411 | case deadAux := <-deadAuxChan: 412 | m.handleDeadAuxServer(deadAux) 413 | 414 | case aliveAux := <-aliveAuxChan: 415 | m.handleAliveAuxServer(aliveAux) 416 | 417 | case <-stop: 418 | log.Println("exiting health check...") 419 | return 420 | } 421 | } 422 | 423 | } 424 | -------------------------------------------------------------------------------- /master/go.mod: -------------------------------------------------------------------------------- 1 | module master 2 | 3 | go 1.19 4 | 5 | require ( 6 | github.com/gorilla/handlers v1.5.1 7 | github.com/gorilla/mux v1.8.0 8 | github.com/prometheus/client_golang v1.16.0 9 | golang.org/x/exp v0.0.0-20230713183714-613f0c0eb8a1 10 | ) 11 | 12 | require ( 13 | github.com/davecgh/go-spew v1.1.1 // indirect 14 | github.com/pmezard/go-difflib v1.0.0 // indirect 15 | gopkg.in/yaml.v3 v3.0.1 // indirect 16 | ) 17 | 18 | require ( 19 | github.com/beorn7/perks v1.0.1 // indirect 20 | github.com/cespare/xxhash/v2 v2.2.0 // indirect 21 | github.com/felixge/httpsnoop v1.0.1 // indirect 22 | github.com/golang/protobuf v1.5.3 // indirect 23 | github.com/matttproud/golang_protobuf_extensions v1.0.4 // indirect 24 | github.com/prometheus/client_model v0.3.0 // indirect 25 | github.com/prometheus/common v0.42.0 // indirect 26 | github.com/prometheus/procfs v0.10.1 // indirect 27 | github.com/samuel/go-zookeeper v0.0.0-20201211165307-7117e9ea2414 28 | github.com/stretchr/testify v1.8.4 29 | golang.org/x/sys v0.8.0 // indirect 30 | google.golang.org/protobuf v1.30.0 // indirect 31 | ) 32 | -------------------------------------------------------------------------------- /master/go.sum: -------------------------------------------------------------------------------- 1 | github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM= 2 | github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw= 3 | github.com/cespare/xxhash/v2 v2.2.0 h1:DC2CZ1Ep5Y4k3ZQ899DldepgrayRUGE6BBZ/cd9Cj44= 4 | github.com/cespare/xxhash/v2 v2.2.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= 5 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 6 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 7 | github.com/felixge/httpsnoop v1.0.1 h1:lvB5Jl89CsZtGIWuTcDM1E/vkVs49/Ml7JJe07l8SPQ= 8 | github.com/felixge/httpsnoop v1.0.1/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U= 9 | github.com/go-logfmt/logfmt v0.5.1/go.mod h1:WYhtIu8zTZfxdn5+rREduYbwxfcBr/Vr6KEVveWlfTs= 10 | github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 11 | github.com/golang/protobuf v1.3.5/go.mod h1:6O5/vntMXwX2lRkT1hjjk0nAC1IDOTvTlVgjlRvqsdk= 12 | github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk= 13 | github.com/golang/protobuf v1.5.3 h1:KhyjKVUg7Usr/dYsdSqoFveMYd5ko72D+zANwlG1mmg= 14 | 
github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY= 15 | github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= 16 | github.com/gorilla/handlers v1.5.1 h1:9lRY6j8DEeeBT10CvO9hGW0gmky0BprnvDI5vfhUHH4= 17 | github.com/gorilla/handlers v1.5.1/go.mod h1:t8XrUpc4KVXb7HGyJ4/cEnwQiaxrX/hz1Zv/4g96P1Q= 18 | github.com/gorilla/mux v1.8.0 h1:i40aqfkR1h2SlN9hojwV5ZA91wcXFOvkdNIeFDP5koI= 19 | github.com/gorilla/mux v1.8.0/go.mod h1:DVbg23sWSpFRCP0SfiEN6jmj59UnW/n46BH5rLB71So= 20 | github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX5e0EB2j4= 21 | github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo= 22 | github.com/julienschmidt/httprouter v1.3.0/go.mod h1:JR6WtHb+2LUe8TCKY3cZOxFyyO8IZAc4RVcycCCAKdM= 23 | github.com/matttproud/golang_protobuf_extensions v1.0.4 h1:mmDVorXM7PCGKw94cs5zkfA9PSy5pEvNWRP0ET0TIVo= 24 | github.com/matttproud/golang_protobuf_extensions v1.0.4/go.mod h1:BSXmuO+STAnVfrANrmjBb36TMTDstsz7MSK+HVaYKv4= 25 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 26 | github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk= 27 | github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U= 28 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 29 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 30 | github.com/prometheus/client_golang v1.16.0 h1:yk/hx9hDbrGHovbci4BY+pRMfSuuat626eFsHb7tmT8= 31 | github.com/prometheus/client_golang v1.16.0/go.mod h1:Zsulrv/L9oM40tJ7T815tM89lFEugiJ9HzIqaAx4LKc= 32 | github.com/prometheus/client_model v0.3.0 h1:UBgGFHqYdG/TPFD1B1ogZywDqEkwp3fBMvqdiQ7Xew4= 33 | github.com/prometheus/client_model v0.3.0/go.mod h1:LDGWKZIo7rky3hgvBe+caln+Dr3dPggB5dvjtD7w9+w= 34 | github.com/prometheus/common v0.42.0 h1:EKsfXEYo4JpWMHH5cg+KOUWeuJSov1Id8zGR8eeI1YM= 35 | github.com/prometheus/common v0.42.0/go.mod h1:xBwqVerjNdUDjgODMpudtOMwlOwf2SaTr1yjz4b7Zbc= 36 | github.com/prometheus/procfs v0.10.1 h1:kYK1Va/YMlutzCGazswoHKo//tZVlFpKYh+PymziUAg= 37 | github.com/prometheus/procfs v0.10.1/go.mod h1:nwNm2aOCAYw8uTR/9bWRREkZFxAUcWzPHWJq+XBB/FM= 38 | github.com/samuel/go-zookeeper v0.0.0-20201211165307-7117e9ea2414 h1:AJNDS0kP60X8wwWFvbLPwDuojxubj9pbfK7pjHw0vKg= 39 | github.com/samuel/go-zookeeper v0.0.0-20201211165307-7117e9ea2414/go.mod h1:gi+0XIa01GRL2eRQVjQkKGqKF3SF9vZR/HnPullcV2E= 40 | github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk= 41 | github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo= 42 | golang.org/x/exp v0.0.0-20230713183714-613f0c0eb8a1 h1:MGwJjxBy0HJshjDNfLsYO8xppfqWlA5ZT9OhtUUhTNw= 43 | golang.org/x/exp v0.0.0-20230713183714-613f0c0eb8a1/go.mod h1:FXUEEKJgO7OQYeo8N01OfiKP8RXMtf6e8aTskBGqWdc= 44 | golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 45 | golang.org/x/sys v0.8.0 h1:EBmGv8NaZBZTWvrbjNoL6HVt+IVy3QDQpJs7VRIw3tU= 46 | golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 47 | golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 48 | google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw= 49 | google.golang.org/protobuf v1.26.0/go.mod 
h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc= 50 | google.golang.org/protobuf v1.30.0 h1:kPPoIgf3TsEvrm0PFe15JQ+570QVxYzEvvHqChK+cng= 51 | google.golang.org/protobuf v1.30.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I= 52 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 53 | gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ= 54 | gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= 55 | gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 56 | -------------------------------------------------------------------------------- /master/hashRing.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "fmt" 5 | "hash/crc32" 6 | "sort" 7 | "sync" 8 | ) 9 | 10 | type HashRing struct { 11 | mutex sync.Mutex 12 | sortedHash []uint32 13 | hashmap map[uint32]string 14 | replica int 15 | } 16 | 17 | func NewHashRing(replica int) *HashRing { 18 | return &HashRing{ 19 | sortedHash: []uint32{}, 20 | hashmap: make(map[uint32]string), 21 | replica: replica, 22 | } 23 | } 24 | 25 | // AddNode places {replica} virtual nodes for the server on the ring and keeps the hash list sorted. 26 | func (hr *HashRing) AddNode(node string) { 27 | hr.mutex.Lock() 28 | defer hr.mutex.Unlock() 29 | 30 | for i := 0; i < hr.replica; i++ { 31 | replicaKey := fmt.Sprintf("%s:%d", node, i) 32 | hash := crc32.ChecksumIEEE([]byte(replicaKey)) 33 | hr.hashmap[hash] = node 34 | hr.sortedHash = append(hr.sortedHash, hash) 35 | } 36 | 37 | sort.Slice(hr.sortedHash, func(i, j int) bool { 38 | return hr.sortedHash[i] < hr.sortedHash[j] 39 | }) 40 | } 41 | 42 | // RemoveNode drops the server's virtual nodes from both the hashmap and the sorted list. 43 | func (hr *HashRing) RemoveNode(node string) { 44 | hr.mutex.Lock() 45 | defer hr.mutex.Unlock() 46 | 47 | var modifiedSortedHash []uint32 48 | 49 | for _, hash := range hr.sortedHash { 50 | if hr.hashmap[hash] != node { 51 | modifiedSortedHash = append(modifiedSortedHash, hash) 52 | } else { 53 | delete(hr.hashmap, hash) 54 | } 55 | } 56 | hr.sortedHash = modifiedSortedHash 57 | } 58 | 59 | // GetNode returns the server owning the first virtual node clockwise from the key's hash. 60 | func (hr *HashRing) GetNode(key string) (string, error) { 61 | hr.mutex.Lock() 62 | defer hr.mutex.Unlock() 63 | 64 | // Guard the empty ring: indexing sortedHash below would otherwise panic. 65 | if len(hr.sortedHash) == 0 { 66 | return "", fmt.Errorf("hash ring is empty, no node for key %s", key) 67 | } 68 | 69 | hash := crc32.ChecksumIEEE([]byte(key)) 70 | index := sort.Search(len(hr.sortedHash), func(i int) bool { 71 | return hr.sortedHash[i] >= hash 72 | }) 73 | // Wrap around to the start of the ring. 74 | if index == len(hr.sortedHash) { 75 | index = 0 76 | } 77 | 78 | if node, ok := hr.hashmap[hr.sortedHash[index]]; ok { 79 | return node, nil 80 | } 81 | return "", fmt.Errorf("node not found for the key %s", key) 82 | } 83 | -------------------------------------------------------------------------------- /master/main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | func main() { 4 | Start() 5 | } 6 | -------------------------------------------------------------------------------- /master/master_test.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "fmt" 5 | "hash/crc32" 6 | "testing" 7 | "time" 8 | 9 | "github.com/stretchr/testify/assert" 10 | ) 11 | 12 | func TestHashRing_AddNode(t *testing.T) { 13 | hr := NewHashRing(3) 14 | 15 | hr.AddNode("aux1") 16 | hr.AddNode("aux2") 17 | hr.AddNode("aux3") 18 | 19 | expectedHashLength := hr.replica * 3 20 | 21 | if expectedHashLength != len(hr.sortedHash) { 22 | t.Errorf("Unexpected number of hashes: got %v wanted %v", len(hr.sortedHash), expectedHashLength) 23 | } 24 | 25 | keys := []string{"key1", "key2", "key3"} 26 | 27 | for _, k := range keys { 28 | node, err := hr.GetNode(k) 29 | if err != nil { 30 | t.Errorf("Failed to get node for key %s", k) 31 | } 32 | if node != "aux1" && node != "aux2" && node != "aux3" { 33 | t.Errorf("Unexpected node mapping for key %s : got %s", k, node) 34 | } 35 | } 36 | } 37 | 38 | func TestHashRing_RemoveNode(t *testing.T) { 39 | hr := NewHashRing(3) 40 | 41 | hr.AddNode("aux1") 42 | hr.AddNode("aux2") 43 | hr.AddNode("aux3") 44 | 45 | nodeToRemove := "aux3" 46 | hashes := []uint32{} 47 | 48 | for i := 0; i < hr.replica; i++ { 49 | hash := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s:%d", nodeToRemove, i))) 50 | hashes = append(hashes, hash) 51 | } 52 | 53 | hr.RemoveNode("aux3") 54 | 55 | for _, h := range hashes { 56 | assert.NotContains(t, hr.sortedHash, h, "Removed node hash still exists in sortedHash") 57 | } 58 | } 59 | 60 | func TestHashRing_GetNode(t *testing.T) { 61 | hr := NewHashRing(3) 62 | 63 | hr.AddNode("aux1:3001") 64 | hr.AddNode("aux2:3002") 65 | hr.AddNode("aux3:3003") 66 | 67 | testKey := "some-key" 68 | _, err := hr.GetNode(testKey) 69 | if err != nil { 70 | t.Errorf("Failed to get node for the key %s : err %v", testKey, err) 71 | } 72 | 73 | testKeyNode := map[string]string{ 74 | "water": "aux1:3001", 75 | "Interest": "aux2:3002", 76 | "Escape": "aux3:3003", 77 | } 78 | 79 | for k, n := range testKeyNode { 80 | node, err := hr.GetNode(k) 81 | if err != nil { 82 | t.Errorf("Failed to get node for the key %s : err %v", k, err) 83 | } 84 | if node != n { 85 | t.Errorf("Unexpected node mapping for key %s : got %s wanted %s", k, node, n) 86 | } 87 | } 88 | } 89 | 90 | func TestHashRing_Performance(t *testing.T) { 91 | hr := NewHashRing(10) 92 | 93 | hr.AddNode("aux1:3001") 94 | hr.AddNode("aux2:3002") 95 | hr.AddNode("aux3:3003") 96 | 97 | count := map[string]int{ 98 | "aux1:3001": 0, 99 | "aux2:3002": 0, 100 | "aux3:3003": 0, 101 | } 102 | 103 | numOfKeys := 10_000_000 // ten million lookups 104 | keys := make([]string, numOfKeys) 105 | 106 | for i := 0; i < numOfKeys; i++ { 107 | keys[i] = fmt.Sprintf("key%d", i) 108 | } 109 | 110 | // First pass records the key distribution across the three servers. 111 | for _, key := range keys { 112 | node, err := hr.GetNode(key) 113 | if err != nil { 114 | t.Errorf("Failed to get node for the key %s : err %v", key, err) 115 | } 116 | count[node] += 1 117 | } 118 | 119 | // Second pass measures raw lookup time. 120 | startTime := time.Now() 121 | for _, key := range keys { 122 | _, err := hr.GetNode(key) 123 | if err != nil { 124 | t.Errorf("Failed to get node for the key %s : err %v", key, err) 125 | } 126 | } 127 | elapsedTime := time.Since(startTime) 128 | 129 | t.Logf("Performance test: GetNode for %d keys", numOfKeys) 130 | t.Logf("Elapsed time: %s", elapsedTime) 131 | t.Logf("Average time per key: %s", elapsedTime/time.Duration(numOfKeys)) 132 | t.Logf("Distribution: aux1:3001 => %.2f%% aux2:3002 => %.2f%% aux3:3003 => %.2f%% ", 133 | float64(count["aux1:3001"])*100.0/float64(numOfKeys), 134 | float64(count["aux2:3002"])*100.0/float64(numOfKeys), 135 | float64(count["aux3:3003"])*100.0/float64(numOfKeys)) 136 | } 137 | 138 | func Benchmark_HashRing_GetNode(b *testing.B) { 139 | hr := NewHashRing(3) 140 | 141 | hr.AddNode("aux1:3001") 142 | hr.AddNode("aux2:3002") 143 | hr.AddNode("aux3:3003") 144 | 145 | for i := 0; i < b.N; i++ { 146 | hr.GetNode("test-key") 147 | } 148 | } 149 | 150 | func Benchmark_HashRing_AddNode(b *testing.B) { 151 | hr := NewHashRing(3) 152 | 153 | // Note: the ring gains three virtual nodes per iteration, so later iterations sort an ever-larger slice. 154 | for i := 0; i < b.N; i++ { 155 | hr.AddNode("aux1:3001") 156 | } 157 | } 158 | 159 | func TestMain(m *testing.M) { 160 | m.Run() 161 | } 162 | --------------------------------------------------------------------------------
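Alongside these tests, a compact sketch of how the ring is consulted at request time. This is illustrative only: `ExampleHashRing` is a hypothetical function that assumes this package's `HashRing` API (from hashRing.go above) and an `fmt` import; the server addresses mirror the ones used in the tests.

```go
// Hypothetical example; assumes this package's HashRing and the fmt import.
func ExampleHashRing() {
	ring := NewHashRing(10) // ten virtual nodes per server, as in the performance test
	ring.AddNode("aux1:3001")
	ring.AddNode("aux2:3002")
	ring.AddNode("aux3:3003")

	// Every key deterministically maps to one server.
	node, _ := ring.GetNode("user:42")
	fmt.Println(node) // whichever server owns user:42's ring segment

	// Removing a server only remaps the keys that hashed to its virtual
	// nodes; all other keys keep their existing owner.
	ring.RemoveNode(node)
}
```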
/master/utils.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "os" 5 | "strings" 6 | ) 7 | 8 | // getAuxServers reads the comma-separated AUX_SERVERS environment variable. 9 | func getAuxServers() []string { 10 | servers := os.Getenv("AUX_SERVERS") 11 | // Guard the unset case: strings.Split("", ",") would return [""], not an empty slice. 12 | if servers == "" { 13 | return nil 14 | } 15 | return strings.Split(servers, ",") 16 | } 17 | --------------------------------------------------------------------------------
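The variable is expected to hold a plain comma-separated host:port list. A standalone illustration (not meant to be compiled into this package; the value is hypothetical, though the addresses match the ports used elsewhere in this repo):

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// Hypothetical value; docker-compose would normally inject this.
	os.Setenv("AUX_SERVERS", "aux1:3001,aux2:3002,aux3:3003")
	fmt.Println(getAuxServers()) // [aux1:3001 aux2:3002 aux3:3003]
}
```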
/master/zookeeper.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "os" 7 | "sort" 8 | "strings" 9 | "time" 10 | 11 | "github.com/samuel/go-zookeeper/zk" 12 | ) 13 | 14 | type Zookeeper struct { 15 | conn *zk.Conn 16 | removeAuxParentZnode string 17 | removeAuxEphemeralZnode string 18 | } 19 | 20 | func NewManager() *Zookeeper { 21 | servers := os.Getenv("ZOO_SERVERS") 22 | zooServers := strings.Split(servers, ",") 23 | 24 | // Crude startup grace period: give the zookeeper container time to come up before connecting. 25 | time.Sleep(time.Second * 10) 26 | 27 | conn, err := connectToZookeeper(zooServers, time.Second*5) 28 | if err != nil { 29 | log.Fatalln("Failed to connect to zookeeper server(s): ", err) 30 | } 31 | return &Zookeeper{ 32 | conn: conn, 33 | removeAuxParentZnode: "/remove-auxes", 34 | } 35 | } 36 | 37 | func connectToZookeeper(servers []string, timeout time.Duration) (*zk.Conn, error) { 38 | log.Printf("zookeeper servers: %s\n", servers) 39 | 40 | conn, _, err := zk.Connect(servers, timeout) 41 | if err != nil { 42 | return nil, err 43 | } 44 | 45 | return conn, nil 46 | } 47 | 48 | func (z *Zookeeper) CreateMasterNode() error { 49 | // get hostname 50 | host, err := os.Hostname() 51 | if err != nil { 52 | log.Fatalln(err) 53 | } 54 | 55 | masterPath := "/masters/master-" + host 56 | 57 | exists, _, err := z.conn.Exists("/masters") 58 | if err != nil { 59 | return fmt.Errorf("failed to check if parent master znode exists: %v", err) 60 | } 61 | 62 | if !exists { 63 | if _, err := z.conn.Create("/masters", []byte{}, 0, zk.WorldACL(zk.PermAll)); err != nil { 64 | return fmt.Errorf("failed to create parent master znode: %v", err) 65 | } 66 | } 67 | 68 | exists, _, err = z.conn.Exists(masterPath) 69 | if err != nil { 70 | return fmt.Errorf("failed to check if master znode exists: %v", err) 71 | } 72 | 73 | if !exists { 74 | flag := int32(0) 75 | acl := zk.WorldACL(zk.PermAll) 76 | 77 | if _, err := z.conn.Create(masterPath, []byte{}, flag, acl); err != nil { 78 | return fmt.Errorf("failed to create the master znode: %v", err) 79 | } 80 | } else { 81 | log.Printf("Master znode already exists: %s\n", masterPath) 82 | } 83 | 84 | return nil 85 | } 86 | 87 | func (z *Zookeeper) GetData(path string) ([]byte, error) { 88 | data, _, err := z.conn.Get(path) 89 | return data, err 90 | } 91 | 92 | func (z *Zookeeper) SetData(path string, data []byte) error { 93 | _, err := z.conn.Set(path, data, -1) 94 | return err 95 | } 96 | 97 | func (z *Zookeeper) DeleteNode(path string) error { 98 | return z.conn.Delete(path, -1) 99 | } 100 | 101 | // WatchChildren invokes changeHandler with the current children of path, then again on every change. 102 | func (z *Zookeeper) WatchChildren(path string, changeHandler func(children []string)) error { 103 | children, _, childCh, err := z.conn.ChildrenW(path) 104 | if err != nil { 105 | return err 106 | } 107 | 108 | go func() { 109 | for { 110 | // A zookeeper watch fires only once, so it must be re-registered after every event. 111 | <-childCh 112 | updated, _, ch, err := z.conn.ChildrenW(path) 113 | if err != nil { 114 | log.Printf("failed to re-watch children of %s: %v", path, err) 115 | return 116 | } 117 | childCh = ch 118 | changeHandler(updated) 119 | log.Printf("Children updated for path %s: %v", path, updated) 120 | } 121 | }() 122 | 123 | log.Printf("Children for path %s: %v", path, children) 124 | 125 | changeHandler(children) 126 | return nil 127 | } 128 | 129 | func (z *Zookeeper) Close() { 130 | z.conn.Close() 131 | } 132 | 133 | func (z *Zookeeper) InitRemoveAuxEphemeralZnode() error { 134 | lockPath := z.removeAuxParentZnode 135 | 136 | exists, _, err := z.conn.Exists(lockPath) 137 | if err != nil { 138 | return fmt.Errorf("failed to check if master lock path exists: %v", err) 139 | } 140 | 141 | if !exists { 142 | _, err := z.conn.Create(lockPath, []byte{}, int32(0), zk.WorldACL(zk.PermAll)) 143 | if err != nil { 144 | return fmt.Errorf("failed to create lock path %s: %v", lockPath, err) 145 | } 146 | } 147 | 148 | ephemeralLockPath, err := z.conn.Create(lockPath+"/lock-", []byte{}, zk.FlagEphemeral|zk.FlagSequence, zk.WorldACL(zk.PermAll)) 149 | if err != nil { 150 | return fmt.Errorf("failed to create sequential lock node: %v", err) 151 | } 152 | 153 | z.removeAuxEphemeralZnode = ephemeralLockPath 154 | return nil 155 | } 156 | 157 | func (z *Zookeeper) WatchOverAuxServers(handler func(auxes []string), stop <-chan zk.Event) { 158 | auxRoot := "/auxiliaries" 159 | for { 160 | _, _, childCh, err := z.conn.ChildrenW(auxRoot) 161 | if err != nil { 162 | log.Printf("failed to watch over children of aux servers root znode %s: %v", auxRoot, err) 163 | return 164 | } 165 | 166 | select { 167 | case event := <-childCh: 168 | if event.Type == zk.EventNodeChildrenChanged { 169 | auxes, _, err := z.conn.Children(auxRoot) 170 | if err != nil { 171 | log.Printf("failed to get children of aux servers root znode %s: %v", auxRoot, err) 172 | } 173 | 174 | if len(auxes) != 0 { 175 | handler(auxes) 176 | } 177 | } 178 | case <-stop: 179 | log.Printf("Stopping watch over data of aux servers root znode %s", auxRoot) 180 | return 181 | } 182 | } 183 | } 184 | 185 | func (z *Zookeeper) WatchRemoveAuxEphemeralData(handler func(node string), stop <-chan zk.Event) { 186 | for { 187 | _, _, dataCh, err := z.conn.GetW(z.removeAuxEphemeralZnode) 188 | if err != nil { 189 | log.Printf("failed to get data from znode %s: %v", z.removeAuxEphemeralZnode, err) 190 | return 191 | } 192 | 193 | select { 194 | case event := <-dataCh: 195 | if event.Type == zk.EventNodeDataChanged { 196 | data, _, err := z.conn.Get(z.removeAuxEphemeralZnode) 197 | if err != nil { 198 | log.Printf("failed to get data from znode %s: %v", z.removeAuxEphemeralZnode, err) 199 | continue 200 | } 201 | log.Printf("data changed on ephemeral znode %s: %s", z.removeAuxEphemeralZnode, string(data)) 202 | 203 | if string(data) != "" { 204 | handler(string(data)) 205 | } 206 | } 207 | case <-stop: 208 | log.Printf("Stopping watch over data of znode %s", z.removeAuxEphemeralZnode) 209 | return 210 | } 211 | } 212 | } 213 | 214 | func (z *Zookeeper) BroadcastRemoveAuxData(auxServer string) error { 215 | children, _, err := z.conn.Children(z.removeAuxParentZnode) 216 | if err != nil { 217 | return fmt.Errorf("failed to get children of the path %s: %v", z.removeAuxParentZnode, err) 218 | } 219 | 220 | if len(children) == 0 { 221 | return nil 222 | } 223 | 224 | log.Printf("Children of parent znode %s: %v", z.removeAuxParentZnode, children) 225 | 226 | splitted := strings.Split(z.removeAuxEphemeralZnode, "/") 227 | thisEphemeralNode := splitted[len(splitted)-1] 228 | 229 | for i, ephemeralZnode := range children { 230 | // Skip this master's own znode; every other master is told about the removed aux. 231 | if thisEphemeralNode == ephemeralZnode { 232 | continue 233 | } 234 | path := z.removeAuxParentZnode + "/" + ephemeralZnode 235 | log.Printf("ephemeral znode %d: %s", i, path) 236 | 237 | if _, err := z.conn.Set(path, []byte(auxServer), -1); err != nil { 238 | log.Printf("failed to set data to znode %s: %v", path, err) 239 | } 240 | } 241 | return nil 242 | } 243 | 244 | // LockAndRelease implements the standard zookeeper lock recipe: create a sequential 245 | // ephemeral znode, wait until it is the lowest one, run the callback, then clean up. 246 | func (z *Zookeeper) LockAndRelease(callback func(value string), param string) error { 247 | lockPath := z.removeAuxParentZnode 248 | 249 | exists, _, err := z.conn.Exists(lockPath) 250 | if err != nil { 251 | return fmt.Errorf("failed to check if master lock path exists: %v", err) 252 | } 253 | 254 | if !exists { 255 | _, err := z.conn.Create(lockPath, []byte{}, 0, zk.WorldACL(zk.PermAll)) 256 | if err != nil { 257 | return fmt.Errorf("failed to create lock path %s: %v", lockPath, err) 258 | } 259 | } 260 | 261 | // Acquire lock 262 | lockNode, err := z.conn.Create(lockPath+"/lock-", []byte{}, zk.FlagEphemeral|zk.FlagSequence, zk.WorldACL(zk.PermAll)) 263 | if err != nil { 264 | return fmt.Errorf("failed to create sequential lock node: %v", err) 265 | } 266 | 267 | for { 268 | children, _, err := z.conn.Children(lockPath) 269 | if err != nil { 270 | return fmt.Errorf("failed to get children of the path %s: %v", lockPath, err) 271 | } 272 | 273 | log.Printf("master lock children: %v", children) 274 | 275 | sort.Strings(children) 276 | if lockNode == lockPath+"/"+children[0] { 277 | // The lowest sequence number holds the lock. 278 | callback(param) 279 | log.Printf("Removed aux %s from master", param) 280 | break 281 | } 282 | 283 | // Otherwise watch the znode immediately below ours and retry when it changes. 284 | lockNodeIndex := sort.SearchStrings(children, lockNode[len(lockPath)+1:]) 285 | waitNode := children[lockNodeIndex-1] 286 | _, _, waitChan, err := z.conn.GetW(lockPath + "/" + waitNode) 287 | if err != nil { 288 | return fmt.Errorf("failed to watch over %s: %v", lockPath+"/"+waitNode, err) 289 | } 290 | <-waitChan 291 | } 292 | 293 | err = z.conn.Delete(lockNode, -1) 294 | if err != nil { 295 | return fmt.Errorf("failed to delete lock node %s: %v", lockNode, err) 296 | } 297 | 298 | return nil 299 | } 300 | --------------------------------------------------------------------------------
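For orientation, a hedged sketch of how a master instance might use this lock recipe when evicting a dead aux server. The `removeAuxWithLock` wrapper and its callback body are hypothetical (the real call site lives in controller.go/app.go); only `LockAndRelease` and `m.hashring.RemoveNode` come from this repo, and the snippet assumes the package's `log` import.

```go
// Hypothetical caller; only one master across the fleet runs the callback at a time.
func removeAuxWithLock(z *Zookeeper, m *Master, deadAux string) {
	err := z.LockAndRelease(func(aux string) {
		// Runs only while this master holds the lowest sequential znode,
		// so the ring mutation happens exactly once across all masters.
		m.hashring.RemoveNode(aux)
	}, deadAux)
	if err != nil {
		log.Printf("failed to remove %s under lock: %v", deadAux, err)
	}
}
```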
/nginx/conf.d/nginx.conf: -------------------------------------------------------------------------------- 1 | worker_processes auto; 2 | 3 | events { 4 | worker_connections 2048; 5 | } 6 | 7 | http { 8 | keepalive_timeout 75s; 9 | keepalive_requests 1000; 10 | 11 | proxy_http_version 1.1; 12 | proxy_set_header Connection ""; 13 | 14 | server { 15 | listen 3000; 16 | location / { 17 | proxy_pass http://master:8000/; 18 | } 19 | } 20 | } -------------------------------------------------------------------------------- /prometheus/prometheus.yml: -------------------------------------------------------------------------------- 1 | global: 2 | scrape_interval: "10s" 3 | scrape_timeout: "10s" 4 | evaluation_interval: "10s" 5 | 6 | scrape_configs: 7 | - job_name: "master" 8 | static_configs: 9 | - targets: 10 | - "master:8000" 11 | - job_name: "aux1" 12 | static_configs: 13 | - targets: 14 | - "aux1:3001" 15 | - job_name: "aux2" 16 | static_configs: 17 | - targets: 18 | - "aux2:3002" 19 | - job_name: "aux3" 20 | static_configs: 21 | - targets: 22 | - "aux3:3003"
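These scrape targets assume each Go service exposes a Prometheus endpoint on its service port. A minimal sketch of that wiring with client_golang (assumed to mirror what the servers' app.go files do; the metric name, label, and handler here are illustrative, not taken from the repo):

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Illustrative counter; the real services track GET/POST counts and response times.
var requests = promauto.NewCounterVec(
	prometheus.CounterOpts{Name: "cache_requests_total", Help: "Requests by method."},
	[]string{"method"},
)

func main() {
	http.Handle("/metrics", promhttp.Handler()) // the endpoint Prometheus scrapes
	http.HandleFunc("/data", func(w http.ResponseWriter, r *http.Request) {
		requests.WithLabelValues(r.Method).Inc()
	})
	http.ListenAndServe(":8000", nil)
}
```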
-------------------------------------------------------------------------------- /zookeeper/zoo.cfg: -------------------------------------------------------------------------------- 1 | tickTime=2000 2 | dataDir=/data 3 | clientPort=2181 --------------------------------------------------------------------------------
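Putting the pieces together, a hedged end-to-end smoke test against a running docker-compose stack. It assumes the /data and /data/{key} endpoints described in the README and a JSON body of the form {key, value}, which is an assumption about the controllers' payload shape rather than something taken from the code; port 3000 is the nginx entry point from nginx.conf.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	base := "http://localhost:3000" // nginx entry point from nginx.conf

	// Store a key (the payload shape is an assumption, not taken from the code).
	body := bytes.NewBufferString(`{"key": "greeting", "value": "hello"}`)
	if _, err := http.Post(base+"/data", "application/json", body); err != nil {
		panic(err)
	}

	// Read it back through whichever aux server owns it on the hash ring.
	resp, err := http.Get(base + "/data/greeting")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```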