├── README.md
├── docker
│   ├── Dockerfile
│   ├── custom-entrypoint
│   ├── elasticsearch.yml
│   └── log4j2.properties
└── kubernetes
    ├── dev-namespace.yml
    ├── elasticsearch.yml
    ├── es-discovery-svc.yml
    ├── es-ia-svc.yml
    ├── es-lb-svc.yml
    └── gce-standard-sc.yml
/README.md:
--------------------------------------------------------------------------------
# Elasticsearch Deployment on Kubernetes


This is a step-by-step guide for deploying Elasticsearch on top of Kubernetes using the content provided in this repository. The provisioned setup has been tested on Google Container Engine, using Google's standard persistent disk as the persistent storage for the Elasticsearch nodes, with the custom Elasticsearch Docker image hosted on Docker Hub.

The Docker image has been uploaded to https://hub.docker.com/r/anchormen/elasticsearch-kubernetes/ and is referenced from the Kubernetes StatefulSet object definition as ```anchormen/elasticsearch-kubernetes:5.6.0```.

### Deployment Instructions
1. You need a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with it
2. clone this repository
3. ```cd kubernetes```
4. create a development namespace ```kubectl create -f dev-namespace.yml```
5. permanently save the namespace for all subsequent kubectl commands in that context ```kubectl config set-context $(kubectl config current-context) --namespace=development```; otherwise include ```--namespace=development``` in all subsequent ```kubectl``` commands
6. create the dynamic-provisioning storage class ```kubectl create -f gce-standard-sc.yml```
   **note:** storage classes span multiple namespaces (i.e. the same storage class can be used across all the namespaces within a single project)
7. create the StatefulSet headless service ```kubectl create -f es-discovery-svc.yml```
8. create the cluster-IP service ```kubectl create -f es-ia-svc.yml```
9. create the load-balancer service ```kubectl create -f es-lb-svc.yml```
10. create the StatefulSet ```kubectl create -f elasticsearch.yml```

### Verification
1. verify the StatefulSet has been created and all the pods are alive ```kubectl get statefulset -w```
   - use ```-w``` to watch changes that occur on the StatefulSet
   - keep watching until the current number of pods is ```3```
   - once the current number of pods is 3, we have 3 Elasticsearch pods up and running with the names ```elasticsearch-$i``` where $i={0,1,2}

2. to check the logs of a given pod use ```kubectl logs elasticsearch-$i```
   - use ```-f``` to keep following the logs of the given pod

3. the StatefulSet being created successfully is itself verification that the headless service was created properly and is properly referenced by the Elasticsearch nodes

4. verify the cluster-IP and load-balancer services are created: ```kubectl get svc```
   - the cluster-IP service is used for internal access
   - the load-balancer service is used for external access, through the provided EXTERNAL-IP
--------------------------------------------------------------------------------
/docker/Dockerfile:
--------------------------------------------------------------------------------
# Dockerfile building the custom Elasticsearch image
# The target image is used for deployment on Kubernetes
# Solves the issue presented in: https://github.com/kubernetes/kubernetes/issues/3595
# Applies the solution presented in:
# https://github.com/kubernetes/kubernetes/issues/3595#issuecomment-287692878 [Kubernetes side]
# https://github.com/kubernetes/kubernetes/issues/3595#issuecomment-288451522 [this image]
# This image is being extended:
# ------------------------------
# https://github.com/elastic/elasticsearch-docker/blob/master/templates/Dockerfile.j2


FROM docker.elastic.co/elasticsearch/elasticsearch:5.6.0

WORKDIR /usr/share/elasticsearch

USER root

# copying custom-entrypoint and configuration (elasticsearch.yml, log4j2.properties)
# to their respective directories in /usr/share/elasticsearch (already the WORKDIR)
COPY custom-entrypoint bin/
COPY elasticsearch.yml config/
COPY log4j2.properties config/

# ensuring the "elasticsearch" user has appropriate access to the configuration and custom-entrypoint
# make sure custom-entrypoint is executable
RUN chown elasticsearch:elasticsearch config/elasticsearch.yml config/log4j2.properties bin/custom-entrypoint && \
    chmod 0750 bin/custom-entrypoint

# start by running the custom entrypoint (as root)
CMD ["/bin/bash", "bin/custom-entrypoint"]
--------------------------------------------------------------------------------
/docker/custom-entrypoint:
--------------------------------------------------------------------------------
#!/bin/bash
# This is expected to run as root for setting the ulimits

set -e
##################################################################################
# ensure increased ulimits - for nofile - for the Elasticsearch containers
# the limit on the number of files that a single process can have open at a time (default is 1024)
ulimit -n 65536

# ensure increased ulimits - for nproc - for the Elasticsearch containers
# the limit on the number of processes that Elasticsearch can create
# 2048 is the minimum needed to pass the Linux checks (default is 50)
# https://www.elastic.co/guide/en/elasticsearch/reference/current/max-number-threads-check.html
ulimit -u 2048

# swapping needs to be disabled for performance and node stability
# in the Elasticsearch config we are using: [bootstrap.memory_lock=true]
# this additionally requires the "memlock: true" ulimit, specifically set for each container
# -l: max locked memory
ulimit -l unlimited

# running the command to start Elasticsearch
# passing all inputs of this entrypoint script to the es-docker startup script
# NOTE: this entrypoint script runs as root, but executes the es-docker
# startup script as the elasticsearch user, passing all the root environment
# variables to the elasticsearch user
su elasticsearch bin/es-docker "$@"
--------------------------------------------------------------------------------
/docker/elasticsearch.yml:
--------------------------------------------------------------------------------
# attaching the namespace to the cluster.name to differentiate clusters
# ex. elasticsearch-acceptance, elasticsearch-production, elasticsearch-monitoring
cluster.name: "elasticsearch-${NAMESPACE}"

# we provide a node.name that is POD_NAME-NAMESPACE
# ex. elasticsearch-0-acceptance, elasticsearch-1-acceptance, elasticsearch-2-acceptance
node.name: "${POD_NAME}-${NAMESPACE}"

network.host: ${POD_IP}

# A hostname that resolves to multiple IP addresses will try all resolved addresses
# we provide the name of the headless service,
# which resolves to the IP addresses of all the live attached pods
# alternatively we can directly reference the hostnames of the pods
discovery.zen.ping.unicast.hosts: es-discovery-svc

# minimum_master_nodes needs to be explicitly set when bound on a public IP
# set to 1 to allow single-node clusters
# more info: https://github.com/elastic/elasticsearch/pull/17288
discovery.zen.minimum_master_nodes: 2

bootstrap.memory_lock: true

#-------------------------------------------------------------------------------------
# RECOVERY: https://www.elastic.co/guide/en/elasticsearch/guide/current/important-configuration-changes.html
# settings to avoid the excessive shard swapping that can occur on cluster restarts
#-------------------------------------------------------------------------------------
# how many nodes must be present to consider the cluster functional;
# prevents Elasticsearch from starting recovery until these nodes are available
gateway.recover_after_nodes: 2

# how many nodes are expected in the cluster
gateway.expected_nodes: 3

# how long to wait after [gateway.recover_after_nodes] is reached before starting the recovery process (if applicable)
gateway.recover_after_time: 5m
#-------------------------------------------------------------------------------------

# The following settings control the fault detection process, using the discovery.zen.fd prefix:
# How often a node gets pinged. Defaults to 1s.
discovery.zen.fd.ping_interval: 1s

# How long to wait for a ping response. Defaults to 30s.
discovery.zen.fd.ping_timeout: 10s

# How many ping failures / timeouts cause a node to be considered failed. Defaults to 3.
discovery.zen.fd.ping_retries: 2
--------------------------------------------------------------------------------
/docker/log4j2.properties:
--------------------------------------------------------------------------------
status = error

appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%m%n

rootLogger.level = info
rootLogger.appenderRef.console.ref = console
--------------------------------------------------------------------------------
/kubernetes/dev-namespace.yml:
--------------------------------------------------------------------------------
kind: Namespace
apiVersion: v1
metadata:
  name: development
  labels:
    name: development
--------------------------------------------------------------------------------
/kubernetes/elasticsearch.yml:
--------------------------------------------------------------------------------
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch
  labels:
    app: elasticsearch
spec:
  # the headless service that governs this StatefulSet,
  # responsible for the network identity of the set.
  serviceName: es-discovery-svc
  replicas: 3
  # Template is the object that describes the pod that will be created
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      securityContext:
        # allows read/write access for mounted volumes
        # by users that belong to a group with gid: 1000
        fsGroup: 1000
      initContainers:
      # init container for setting the mmap count limit
      - name: sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: elasticsearch
        securityContext:
          # applying the fix in: https://github.com/kubernetes/kubernetes/issues/3595#issuecomment-287692878
          # https://docs.docker.com/engine/reference/run/#operator-exclusive-options
          capabilities:
            add:
            # lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2))
            - IPC_LOCK
            # override resource limits
            - SYS_RESOURCE
        image: anchormen/elasticsearch-kubernetes:5.6.0
        imagePullPolicy: Always
        ports:
        - containerPort: 9300
          name: transport
          protocol: TCP
        - containerPort: 9200
          name: http
          protocol: TCP
        env:
        # environment variables to be directly referenced from the configuration
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        # Elasticsearch heap size (to be adjusted based on need)
        - name: "ES_JAVA_OPTS"
          value: "-Xms2g -Xmx2g"
        # mounting the persistent volume on the data directory
        volumeMounts:
        - name: es-data-vc
          mountPath: /usr/share/elasticsearch/data
  # The StatefulSet guarantees that a given [POD] network identity will
  # always map to the same storage identity
  volumeClaimTemplates:
  - metadata:
      name: es-data-vc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          # Elasticsearch mounted data directory size (to be adjusted based on need)
          storage: 20Gi
      storageClassName: gce-standard-sc
      # no LabelSelector defined
      # claims can specify a label selector to further filter the set of volumes
      # currently, a PVC with a non-empty selector cannot have a PV dynamically provisioned for it
      # no volumeName is provided
--------------------------------------------------------------------------------
/kubernetes/es-discovery-svc.yml:
--------------------------------------------------------------------------------
apiVersion: v1
kind: Service
metadata:
  name: es-discovery-svc
  labels:
    app: es-discovery-svc
spec:
  # the set of Pods targeted by this Service is determined by the Label Selector
  selector:
    app: elasticsearch
  # exposing the Elasticsearch transport port (only)
  # this service will be used by the es-nodes for discovery;
  # communication between es-nodes happens through
  # the transport port (9300)
  ports:
  - protocol: TCP
    # port exposed by the service (service reachable at)
    port: 9300
    # port exposed by the Pod(s) the service abstracts (pod reachable at)
    # can be a string representing the name of the port @the pod (ex. transport)
    targetPort: 9300
    name: transport
  # specifying this is a headless service by providing clusterIP "None"
  clusterIP: None
--------------------------------------------------------------------------------
/kubernetes/es-ia-svc.yml:
--------------------------------------------------------------------------------
apiVersion: v1
kind: Service
metadata:
  name: es-ia-svc
  labels:
    app: es-ia-svc
spec:
  selector:
    app: elasticsearch
  ports:
  - name: http
    port: 9200
    protocol: TCP
  - name: transport
    port: 9300
    protocol: TCP
--------------------------------------------------------------------------------
/kubernetes/es-lb-svc.yml:
--------------------------------------------------------------------------------
apiVersion: v1
kind: Service
metadata:
  name: es-lb-svc
  labels:
    app: es-lb-svc
spec:
  type: LoadBalancer
  selector:
    app: elasticsearch
  ports:
  - name: http
    port: 9200
    protocol: TCP
  - name: transport
    port: 9300
    protocol: TCP
--------------------------------------------------------------------------------
/kubernetes/gce-standard-sc.yml:
--------------------------------------------------------------------------------
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: gce-standard-sc
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  zone: europe-west1-b
--------------------------------------------------------------------------------
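
A note on the `discovery.zen.minimum_master_nodes: 2` setting in `docker/elasticsearch.yml`: for Elasticsearch 5.x zen discovery, the split-brain-safe value is the quorum of master-eligible nodes, `(N / 2) + 1`. With the StatefulSet's 3 replicas this gives 2, which is what the config uses. A minimal sketch of the calculation (the `quorum` helper is illustrative, not part of this repository):

```shell
# quorum: split-brain-safe minimum_master_nodes for N master-eligible nodes
# formula: (N / 2) + 1, using integer division
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 3   # -> 2, matching minimum_master_nodes: 2 for this 3-replica StatefulSet
quorum 5   # -> 3; scaling to 5 replicas would require raising the setting to 3
```

If you change `replicas` in `kubernetes/elasticsearch.yml`, recompute this value and rebuild the image (or override the setting), since a too-low value risks split-brain and a too-high value prevents master election after node loss.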