├── single-account-single-cluster-multi-env
│   ├── .gitignore
│   ├── 20.iam-roles-for-eks
│   │   ├── README.md
│   │   ├── backend.tf
│   │   ├── providers.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── main.tf
│   │   └── locals.tf
│   ├── 40.observability
│   │   ├── 40.aws-native-observability
│   │   │   ├── README.md
│   │   │   ├── outputs.tf
│   │   │   ├── backend.tf
│   │   │   ├── remote_state.tf
│   │   │   ├── locals.tf
│   │   │   ├── variables.tf
│   │   │   ├── main.tf
│   │   │   └── providers.tf
│   │   └── 45.aws-oss-observability
│   │       ├── outputs.tf
│   │       ├── prometheus.tf
│   │       ├── backend.tf
│   │       ├── data.tf
│   │       ├── grafana-operator-manifests
│   │       │   ├── infrastructure
│   │       │   │   ├── amg-grafana.yaml
│   │       │   │   └── amp-datasource.yaml
│   │       │   └── eks
│   │       │       └── apiserver-dashboards.yaml
│   │       ├── remote_state.tf
│   │       ├── adot_collector.tf
│   │       ├── locals.tf
│   │       ├── providers.tf
│   │       ├── observability_addons.tf
│   │       ├── managed_scraper.tf
│   │       ├── grafana.tf
│   │       ├── variables.tf
│   │       ├── aws_prometheus_scraper_configuration
│   │       ├── grafana_operator.tf
│   │       └── README.md
│   ├── 30.eks
│   │   ├── 30.cluster
│   │   │   ├── backend.tf
│   │   │   ├── remote_state.tf
│   │   │   ├── variables.tf
│   │   │   ├── outputs.tf
│   │   │   ├── locals.tf
│   │   │   ├── karpenter
│   │   │   │   └── default.yaml
│   │   │   ├── providers.tf
│   │   │   ├── auto-mode
│   │   │   │   └── default.yaml
│   │   │   ├── karpenter.tf
│   │   │   ├── README.md
│   │   │   └── main.tf
│   │   └── 35.addons
│   │       ├── backend.tf
│   │       ├── remote_state.tf
│   │       ├── variables.tf
│   │       ├── locals.tf
│   │       ├── app_example
│   │       │   └── app_manifest.yaml
│   │       ├── values.yaml
│   │       ├── outputs.tf
│   │       ├── providers.tf
│   │       ├── main.tf
│   │       └── README.md
│   ├── 10.networking
│   │   ├── backend.tf
│   │   ├── providers.tf
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   ├── README.md
│   │   └── main.tf
│   ├── img
│   │   ├── amg-dashboard.png
│   │   ├── amg-dashboards-panel.png
│   │   ├── amg-cluster-dashboard.png
│   │   └── grafana-admin-identity-center.png
│   ├── eks_workload_ready_cluster_reference_architecture_NEW.jpg
│   ├── automated-provisioning-of-application-ready-amazon-eks-clusters.png
│   ├── 00.global
│   │   └── vars
│   │       ├── dev.tfvars
│   │       ├── prod.tfvars
│   │       ├── automode.tfvars
│   │       └── example.tfvars
│   ├── Makefile
│   └── README.md
├── .pre-commit-config.yaml
├── CODE_OF_CONDUCT.md
├── .github
│   └── ISSUE_TEMPLATE
│       └── feature_request.md
├── LICENSE
├── .gitignore
├── CONTRIBUTING.md
└── README.md

/single-account-single-cluster-multi-env/.gitignore:
--------------------------------------------------------------------------------
tf-logs
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/20.iam-roles-for-eks/README.md:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/README.md:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/outputs.tf:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/outputs.tf:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/30.cluster/backend.tf:
--------------------------------------------------------------------------------
# Configure remote state to use S3 and DynamoDB
terraform {
  backend "s3" {
    key = "eks/terraform.tfstate"
  }
}
--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.88.0
    hooks:
      - id: terraform_docs
        args:
          - --args="--lockfile=false"
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/35.addons/backend.tf:
--------------------------------------------------------------------------------
# Configure remote state to use S3 and DynamoDB
terraform {
  backend "s3" {
    key = "eks-addons/terraform.tfstate"
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/10.networking/backend.tf:
--------------------------------------------------------------------------------
# Configure remote state to use S3 and DynamoDB
terraform {
  backend "s3" {
    key = "networking/vpc/terraform.tfstate"
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/20.iam-roles-for-eks/backend.tf:
--------------------------------------------------------------------------------
# Configure remote state to use S3 and DynamoDB
terraform {
  backend "s3" {
    key = "iam/roles/terraform.tfstate"
  }
}
--------------------------------------------------------------------------------
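All of these backend definitions are deliberately partial: only the state key is fixed in code, while the bucket, region, and lock table are supplied at init time (the repository's Makefile presumably wires this up via the global-backend-config file referenced in the root .gitignore). A minimal sketch of such an init call — the lock-table name is an assumption; the bucket follows the tfstate-<account-id> convention the remote_state lookups below rely on:

# Hypothetical partial backend configuration; adjust names to your account.
terraform init \
  -backend-config="bucket=tfstate-111122223333" \
  -backend-config="region=us-east-1" \
  -backend-config="dynamodb_table=tfstate-lock"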
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/prometheus.tf:
--------------------------------------------------------------------------------
resource "aws_prometheus_workspace" "this" {
  count = var.observability_configuration.aws_oss_tooling ? 1 : 0
  alias = local.name
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/backend.tf:
--------------------------------------------------------------------------------
# Configure remote state to use S3 and DynamoDB
terraform {
  backend "s3" {
    key = "observability/aws-native/infra/terraform.tfstate"
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/backend.tf:
--------------------------------------------------------------------------------
# Configure remote state to use S3 and DynamoDB
terraform {
  backend "s3" {
    key = "observability/aws-oss/infra/terraform.tfstate"
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/img/amg-dashboard.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters/HEAD/single-account-single-cluster-multi-env/img/amg-dashboard.png
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/img/amg-dashboards-panel.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters/HEAD/single-account-single-cluster-multi-env/img/amg-dashboards-panel.png
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/img/amg-cluster-dashboard.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters/HEAD/single-account-single-cluster-multi-env/img/amg-cluster-dashboard.png
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/img/grafana-admin-identity-center.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters/HEAD/single-account-single-cluster-multi-env/img/grafana-admin-identity-center.png
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/20.iam-roles-for-eks/providers.tf:
--------------------------------------------------------------------------------
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40"
    }
  }
}
provider "aws" {
  default_tags {
    tags = local.tags
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/eks_workload_ready_cluster_reference_architecture_NEW.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters/HEAD/single-account-single-cluster-multi-env/eks_workload_ready_cluster_reference_architecture_NEW.jpg
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/automated-provisioning-of-application-ready-amazon-eks-clusters.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters/HEAD/single-account-single-cluster-multi-env/automated-provisioning-of-application-ready-amazon-eks-clusters.png
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/10.networking/providers.tf:
--------------------------------------------------------------------------------
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40"
    }
  }
}

provider "aws" {
  default_tags {
    tags = local.tags
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/35.addons/remote_state.tf:
--------------------------------------------------------------------------------
data "terraform_remote_state" "eks" {
  backend   = "s3"
  workspace = terraform.workspace
  config = {
    bucket = "tfstate-${data.aws_caller_identity.current.account_id}"
    key    = "eks/terraform.tfstate"
    region = local.tfstate_region
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/20.iam-roles-for-eks/variables.tf:
--------------------------------------------------------------------------------
variable "tags" {
  description = "Tags to apply to resources"
  type        = map(string)
  default     = {}
}

variable "shared_config" {
  description = "Shared configuration across all modules/folders"
  type        = map(any)
  default     = {}
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/remote_state.tf:
--------------------------------------------------------------------------------
data "terraform_remote_state" "eks" {
  backend   = "s3"
  workspace = terraform.workspace
  config = {
    bucket = "tfstate-${data.aws_caller_identity.current.account_id}"
    key    = "eks/terraform.tfstate"
    region = local.tfstate_region
  }
}
--------------------------------------------------------------------------------
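Every stack resolves its dependencies through workspace-scoped remote state lookups like the one above, so an environment is the combination of a Terraform workspace and the matching tfvars file under 00.global/vars. A sketch of provisioning one stack for the dev environment, assuming the backend has been initialized as shown earlier (the Makefile presumably sequences this across the numbered folders):

cd single-account-single-cluster-multi-env/10.networking
terraform workspace select dev || terraform workspace new dev
terraform apply -var-file=../00.global/vars/dev.tfvars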
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/data.tf:
--------------------------------------------------------------------------------
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}
data "aws_partition" "current" {}

data "aws_eks_cluster_auth" "this" {
  name = local.eks_cluster_name
}

data "aws_eks_cluster" "this" {
  name = local.eks_cluster_name
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/locals.tf:
--------------------------------------------------------------------------------
locals {
  region         = data.aws_region.current.id
  tfstate_region = try(var.tfstate_region, local.region)

  tags = merge(
    var.tags,
    {
      "Environment" : terraform.workspace
      "provisioned-by" : "aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters"
    }
  )
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/grafana-operator-manifests/infrastructure/amg-grafana.yaml:
--------------------------------------------------------------------------------
apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
  name: external-grafana
  namespace: grafana-operator
  labels:
    dashboards: "external-grafana"
    amg-id: ${AMG_ID}
spec:
  external:
    url: ${AMG_ENDPOINT_URL}
    apiKey:
      name: grafana-admin-credentials
      key: GF_SECURITY_ADMIN_APIKEY
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/10.networking/outputs.tf:
--------------------------------------------------------------------------------
output "vpc_id" {
  description = "The ID of the VPC"
  value       = module.vpc.vpc_id
}

output "private_subnet_ids" {
  description = "List of IDs of private subnets"
  value       = module.vpc.private_subnets
}

output "public_subnet_ids" {
  description = "List of IDs of public subnets"
  value       = module.vpc.public_subnets
}

output "intra_subnet_ids" {
  description = "List of IDs of intra subnets"
  value       = module.vpc.intra_subnets
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/20.iam-roles-for-eks/outputs.tf:
--------------------------------------------------------------------------------
# Output IAM roles map with names and ARNs
output "iam_roles_map" {
  value = {
    for role_name, role_config in aws_iam_role.iam_roles : role_name => role_config.arn
  }
}

output "iam_roles_aws_auth_list" {
  value = [
    for r in {
      for role_name, role_obj in local.iam_roles : role_name => {
        "rolearn"  = aws_iam_role.iam_roles[role_name].arn
        "username" = "system:node:{{${aws_iam_role.iam_roles[role_name].name}}}"
        "groups"   = ["system:masters"]
      }
    }
    : r]
}
--------------------------------------------------------------------------------
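iam_roles_map is the hand-off point to EKS Cluster Access Management: the cluster stack can read it from remote state and turn each role into an access entry. A minimal sketch of that wiring — the resource names and the choice of AmazonEKSAdminPolicy are assumptions for illustration, not code from 30.eks/30.cluster:

resource "aws_eks_access_entry" "eks_admin" {
  # Assumption: module.eks and the iam remote state are available as in 30.eks/30.cluster.
  cluster_name  = module.eks.cluster_name
  principal_arn = data.terraform_remote_state.iam.outputs.iam_roles_map["EKSAdmin"]
  type          = "STANDARD"
}

resource "aws_eks_access_policy_association" "eks_admin" {
  cluster_name  = module.eks.cluster_name
  principal_arn = data.terraform_remote_state.iam.outputs.iam_roles_map["EKSAdmin"]
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAdminPolicy"

  access_scope {
    type = "cluster"
  }
}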
"tfstate-${data.aws_caller_identity.current.account_id}" 7 | key = "networking/vpc/terraform.tfstate" 8 | region = local.tfstate_region 9 | } 10 | } 11 | 12 | data "terraform_remote_state" "iam" { 13 | backend = "s3" 14 | workspace = terraform.workspace 15 | config = { 16 | bucket = "tfstate-${data.aws_caller_identity.current.account_id}" 17 | key = "iam/roles/terraform.tfstate" 18 | region = local.tfstate_region 19 | } 20 | } 21 | 22 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 21 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/variables.tf: -------------------------------------------------------------------------------- 1 | variable "tfstate_region" { 2 | description = "region where the terraform state is stored" 3 | type = string 4 | default = null 5 | } 6 | 7 | variable "tags" { 8 | description = "Tags to apply to resources" 9 | type = map(string) 10 | default = {} 11 | } 12 | 13 | variable "observability_configuration" { 14 | description = "observability configuration variable" 15 | type = object({ 16 | aws_oss_tooling = optional(bool, true) // AMP & AMG 17 | aws_native_tooling = optional(bool, false) // CW 18 | aws_oss_tooling_config = optional(map(any), {}) 19 | }) 20 | } 21 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/20.iam-roles-for-eks/main.tf: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | data "aws_caller_identity" "current" {} 6 | 7 | 8 | 9 | # Create IAM roles and attach policies 10 | resource "aws_iam_role" "iam_roles" { 11 | for_each = local.iam_roles 12 | 13 | name = "${local.name}-${each.value.role_name}" 14 | assume_role_policy = jsonencode({ 15 | Version = "2012-10-17", 16 | Statement = [ 17 | { 18 | Action = "sts:AssumeRole", 19 | Effect = "Allow", 20 | Principal = { 21 | AWS : "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root", // TODO: consider specific trust policy for those users 22 | }, 23 | }, 24 | ], 25 | }) 26 | } 27 | 28 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/00.global/vars/dev.tfvars: -------------------------------------------------------------------------------- 1 | # Dev environment variables 2 | vpc_cidr = "10.1.0.0/16" 3 | 4 | # custom tags to apply to all resources 5 | tags = { 6 | } 7 | 8 | shared_config = { 9 | resources_prefix = "wre" // WRE = Workload Ready EKS 10 | } 11 | 12 | cluster_config = { 13 | kubernetes_version = "1.34" 14 | private_eks_cluster = false 15 | } 16 | 
# Observability variables
observability_configuration = {
  aws_oss_tooling    = false
  aws_native_tooling = false
  aws_oss_tooling_config = {
    enable_managed_collector = false
    enable_adot_collector    = false
    prometheus_name          = "prom"
    enable_grafana_operator  = true
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/grafana-operator-manifests/infrastructure/amp-datasource.yaml:
--------------------------------------------------------------------------------
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: grafanadatasource-amp
  namespace: grafana-operator
spec:
  instanceSelector:
    matchLabels:
      dashboards: "external-grafana"
  datasource:
    name: amp-${ENVIRONMENT}
    type: prometheus
    access: proxy
    url: ${AMP_ENDPOINT_URL}
    isDefault: true
    jsonData:
      'tlsSkipVerify': false
      'timeInterval': "5s"
      'sigV4Auth': true
      'sigV4AuthType': "ec2_iam_role"
      'sigV4Region': ${AMG_AWS_REGION}
    editable: true
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/00.global/vars/prod.tfvars:
--------------------------------------------------------------------------------
# Prod environment variables
vpc_cidr = "10.0.0.0/16"

tags = {
  provisioned-by = "aws-samples/terraform-workloads-ready-eks-accelerator"
}

shared_config = {
  resources_prefix = "wre" // WRE = Workload Ready EKS
}

cluster_config = {
  kubernetes_version  = "1.33"
  private_eks_cluster = false
}

# Observability variables
observability_configuration = {
  aws_oss_tooling    = true
  aws_native_tooling = false
  aws_oss_tooling_config = {
    enable_managed_collector = true
    enable_adot_collector    = false
    prometheus_name          = "prom"
    enable_grafana_operator  = true
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/35.addons/variables.tf:
--------------------------------------------------------------------------------
variable "tfstate_region" {
  description = "region where the terraform state is stored"
  type        = string
  default     = null
}

variable "tags" {
  description = "Tags to apply to resources"
  type        = map(string)
  default     = {}
}

variable "cluster_config" {
  description = "cluster configurations such as version, public/private API endpoint, and more"
  type        = any
  default     = {}
}

variable "observability_configuration" {
  description = "observability configuration variable"
  type = object({
    aws_oss_tooling        = optional(bool, true)  // AMP & AMG
    aws_native_tooling     = optional(bool, false) // CW
    aws_oss_tooling_config = optional(map(any), {})
  })
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/30.cluster/variables.tf:
--------------------------------------------------------------------------------
variable "tfstate_region" {
  description = "region where the terraform state is stored"
  type        = string
  default     = null
}

variable "kms_key_admin_roles" {
  description = "list of role ARNs to add to the KMS policy"
  type        = list(string)
  default     = []
}

variable "tags" {
  description = "Tags to apply to resources"
  type        = map(string)
  default     = {}
}

variable "cluster_config" {
  description = "cluster configurations such as version, public/private API endpoint, and more"
  type        = any
  default     = {}
}

variable "shared_config" {
  description = "Shared configuration across all modules/folders"
  type        = map(any)
  default     = {}
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/35.addons/locals.tf:
--------------------------------------------------------------------------------
locals {
  region         = data.aws_region.current.id
  tfstate_region = try(var.tfstate_region, local.region)
  eks_auto_mode  = try(var.cluster_config.eks_auto_mode, false)

  capabilities = {
    loadbalancing = try(var.cluster_config.capabilities.loadbalancing, !local.eks_auto_mode, true)
    gitops        = try(var.cluster_config.capabilities.gitops, true)
  }

  critical_addons_tolerations = {
    tolerations = [
      {
        key      = "CriticalAddonsOnly",
        operator = "Exists",
        effect   = "NoSchedule"
      }
    ]
  }

  tags = merge(
    var.tags,
    {
      "Environment" : terraform.workspace
      "provisioned-by" : "aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters"
    }
  )
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/remote_state.tf:
--------------------------------------------------------------------------------
data "terraform_remote_state" "vpc" {
  backend   = "s3"
  workspace = terraform.workspace
  config = {
    bucket = "tfstate-${data.aws_caller_identity.current.account_id}"
    key    = "networking/vpc/terraform.tfstate"
    region = local.tfstate_region
  }
}

data "terraform_remote_state" "eks" {
  backend   = "s3"
  workspace = terraform.workspace
  config = {
    bucket = "tfstate-${data.aws_caller_identity.current.account_id}"
    key    = "eks/terraform.tfstate"
    region = local.tfstate_region
  }
}

data "terraform_remote_state" "eks_addons" {
  backend   = "s3"
  workspace = terraform.workspace
  config = {
    bucket = "tfstate-${data.aws_caller_identity.current.account_id}"
    key    = "eks-addons/terraform.tfstate"
    region = local.tfstate_region
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/00.global/vars/automode.tfvars:
--------------------------------------------------------------------------------
# Example tfvars for deploying a Workload Ready EKS Cluster with EKS Auto Mode enabled

# IPv4 CIDR for the cluster VPC
vpc_cidr = "10.2.0.0/16"

# custom tags to apply to all resources
tags = {
}

shared_config = {
  resources_prefix = "wre-eks" // WRE = Workload Ready EKS
}

cluster_config = {
  kubernetes_version  = "1.34"
  eks_auto_mode       = true // When set to true, all other self-managed add-ons are set to false
  private_eks_cluster = false
  capabilities = {
    gitops = false
  }
}

# Observability variables
observability_configuration = {
  aws_oss_tooling    = false
  aws_native_tooling = true
  aws_oss_tooling_config = {
    enable_managed_collector = false
    enable_adot_collector    = false
    prometheus_name          = "prom"
    enable_grafana_operator  = false
  }
}
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT No Attribution

Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/35.addons/app_example/app_manifest.yaml:
--------------------------------------------------------------------------------
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: guestbook
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/argoproj/argocd-example-apps.git
    path: helm-guestbook
    targetRevision: HEAD
    helm:
      values: |
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: "karpenter.sh/nodepool"
                      operator: In
                      values:
                        - "default"
  destination:
    server: "https://kubernetes.default.svc"
    namespace: ${ENV} # Make sure this is correctly replaced with the actual namespace you intend to use
  syncPolicy:
    automated:
      prune: false
      selfHeal: true
    syncOptions:
      - CreateNamespace=false
      - Validate=true
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/20.iam-roles-for-eks/locals.tf:
--------------------------------------------------------------------------------
locals {
  tags = merge(
    var.tags,
    {
      "Environment" : terraform.workspace
      "provisioned-by" : "aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters"
    }
  )
  name = "${var.shared_config.resources_prefix}-${terraform.workspace}"
  # The below IAM roles represent the default Kubernetes user-facing roles as documented in
  # https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles
  # and as supported by Amazon EKS Cluster Access Management
  iam_roles = {
    # cluster admin resources with wildcard permissions to any cluster resources
    EKSClusterAdmin = {
      role_name         = "EKSClusterAdmin"
      attached_policies = []
    },
    EKSAdmin = {
      role_name         = "EKSAdmin"
      attached_policies = []
    },
    EKSEdit = {
      role_name         = "EKSEdit"
      attached_policies = []
    },
    EKSView = {
      role_name         = "EKSView"
      attached_policies = []
    },
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/30.cluster/outputs.tf:
--------------------------------------------------------------------------------
output "configure_kubectl" {
  description = "Configure kubectl: make sure you're logged in with the correct AWS profile and run the following command to update your kubeconfig"
  value       = <<-EOT
    export KUBECONFIG="/tmp/${module.eks.cluster_name}"
    aws eks --region ${local.region} update-kubeconfig --name ${module.eks.cluster_name}
  EOT
}

output "cluster_name" {
  description = "The EKS Cluster name"
  value       = module.eks.cluster_name
}

output "cluster_certificate_authority_data" {
  value = module.eks.cluster_certificate_authority_data
}

output "cluster_endpoint" {
  value = module.eks.cluster_endpoint
}

output "kubernetes_version" {
  description = "The EKS Cluster version"
  value       = module.eks.cluster_version
}

output "oidc_provider_arn" {
  description = "The OIDC Provider ARN"
  value       = module.eks.oidc_provider_arn
}

output "control_plane_subnet_ids" {
  description = "The Control Plane Subnet IDs"
  value       = local.control_plane_subnet_ids
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/adot_collector.tf:
--------------------------------------------------------------------------------
# ADOT is a specific addon for observability, and is therefore managed in this part of the repo structure (not in the addons folder).
# Its dependencies, however (such as the cert-manager addon), are managed in the addons folder, as other capabilities might need them.

data "aws_eks_addon_version" "adot" {
  count = (
    var.observability_configuration.aws_oss_tooling
    && var.observability_configuration.aws_oss_tooling_config.enable_adot_collector
  ) ? 1 : 0

  addon_name         = "adot"
  kubernetes_version = data.aws_eks_cluster.this.version
  most_recent        = true
}

resource "aws_eks_addon" "adot" {
  count = (
    var.observability_configuration.aws_oss_tooling
    && var.observability_configuration.aws_oss_tooling_config.enable_adot_collector
  ) ? 1 : 0

  cluster_name                = data.aws_eks_cluster.this.name
  addon_name                  = "adot"
  addon_version               = data.aws_eks_addon_version.adot[0].version
  resolve_conflicts_on_update = "OVERWRITE"
  resolve_conflicts_on_create = "OVERWRITE"
  preserve                    = true

  configuration_values = "{\"collector\": {}}"
}
--------------------------------------------------------------------------------
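Because most_recent resolves the addon version dynamically, the selected version can move between plans. To inspect which ADOT versions are currently available for a given cluster version, the standard CLI call is (the version value is illustrative):

aws eks describe-addon-versions --addon-name adot --kubernetes-version 1.34 \
  --query 'addons[].addonVersions[].addonVersion'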
/.gitignore:
--------------------------------------------------------------------------------
# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Crash log files
crash.log
crash.*.log

# Exclude all .tfvars files, which are likely to contain sensitive data, such as
# passwords, private keys, and other secrets. These should not be part of version
# control as they are data points which are potentially sensitive and subject
# to change depending on the environment.
*.tfvars
*.tfvars.json
!dev.tfvars
!prod.tfvars
!automode.tfvars
!example.tfvars
!base-env.tfvars

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Include override files you do wish to add to version control using negated pattern
# !example_override.tf

# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
# example: *tfplan*

# Ignore CLI configuration files
.terraformrc
terraform.rc

### Additional custom files
.terraform.lock.hcl
global-backend-config
!global-backend-config.tmpl
*-init.log
*-plan.log
*-apply.log
*-destroy.log

# General
.DS_Store
.idea/
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/10.networking/variables.tf:
--------------------------------------------------------------------------------
variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "num_azs" {
  description = "Number of Availability Zones"
  type        = number
  default     = 3
}

variable "private_subnets_cidr_prefix" {
  description = "CIDR prefix for the private subnets"
  type        = number
  default     = 20
}

variable "public_subnets_cidr_prefix" {
  description = "CIDR prefix for the public subnets"
  type        = number
  default     = 24
}

variable "control_plane_subnets_cidr_prefix" {
  description = "CIDR prefix for the control plane subnets"
  type        = number
  default     = 28
}

variable "tags" {
  description = "Additional tags"
  type        = map(string)
  default     = {}
}

variable "cluster_config" {
  description = "cluster configurations such as version, public/private API endpoint, and more"
  type        = any
  default     = {}
}

variable "shared_config" {
  description = "Shared configuration across all modules/folders"
  type        = map(any)
  default     = {}
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/locals.tf:
--------------------------------------------------------------------------------
locals {
  name = "${var.shared_config.resources_prefix}-${terraform.workspace}"

  region         = data.aws_region.current.name
  sso_region     = try(var.observability_configuration.aws_oss_tooling_config.sso_region, local.region)
  tfstate_region = try(var.tfstate_region, local.region)

  eks_cluster_endpoint = data.aws_eks_cluster.this.endpoint
  eks_cluster_name     = data.terraform_remote_state.eks.outputs.cluster_name

  grafana_workspace_name                    = local.name
  grafana_workspace_description             = "Amazon Managed Grafana workspace for ${local.grafana_workspace_name}"
  grafana_workspace_api_expiration_days     = 30
  grafana_workspace_api_expiration_seconds  = 60 * 60 * 24 * local.grafana_workspace_api_expiration_days

  critical_addons_tolerations = {
    tolerations = [
      {
        key      = "CriticalAddonsOnly",
        operator = "Exists",
        effect   = "NoSchedule"
      }
    ]
  }

  tags = merge(
    var.tags,
    {
      "Environment" : terraform.workspace
      "provisioned-by" : "aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters"
    }
  )
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/providers.tf:
--------------------------------------------------------------------------------
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.22"
    }
    kubectl = {
      source  = "alekc/kubectl"
      version = "~> 2.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.4"
    }
    time = {
      source  = "hashicorp/time"
      version = "0.10.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  default_tags {
    tags = local.tags
  }
}

provider "kubernetes" {
  host                   = local.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

provider "helm" {
  kubernetes {
    host                   = local.eks_cluster_endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/00.global/vars/example.tfvars:
--------------------------------------------------------------------------------
vpc_cidr = "10.5.0.0/16"

# custom tags to apply to all resources
tags = {
}

shared_config = {
  resources_prefix = "wre" // WRE = Workload Ready EKS
}

cluster_config = {
  kubernetes_version  = "1.33"
  eks_auto_mode       = true // When enabled, all capabilities default to false unless specified otherwise
  private_eks_cluster = false
  create_mng_system   = false // CriticalAddons MNG NodeGroup
  // When eks_auto_mode = true, those are false by default
  // Except: gitops
  capabilities = {
    kube_proxy    = false // kube-proxy
    networking    = false // VPC CNI
    coredns       = false // CoreDNS
    identity      = false // Pod Identity
    autoscaling   = false // Karpenter
    blockstorage  = false // EBS CSI Driver
    loadbalancing = false // LB Controller
    gitops        = false // ArgoCD
  }
}

# Observability variables
observability_configuration = {
  aws_oss_tooling    = false
  aws_native_tooling = true
  aws_oss_tooling_config = {
    sso_region               = "us-east-1" // IAM Identity Center (formerly SSO) region
    enable_managed_collector = true
    enable_adot_collector    = false
    prometheus_name          = "prom"
    enable_grafana_operator  = true
  }
}
--------------------------------------------------------------------------------
"prometheus_node_exporter" { 2 | count = var.observability_configuration.aws_oss_tooling ? 1 : 0 3 | chart = var.ne_config.helm_chart_name 4 | create_namespace = var.ne_config.create_namespace 5 | namespace = var.ne_config.k8s_namespace 6 | name = var.ne_config.helm_release_name 7 | version = var.ne_config.helm_chart_version 8 | repository = var.ne_config.helm_repo_url 9 | 10 | dynamic "set" { 11 | for_each = var.ne_config.helm_settings 12 | content { 13 | name = set.key 14 | value = set.value 15 | } 16 | } 17 | } 18 | 19 | resource "helm_release" "kube_state_metrics" { 20 | count = var.observability_configuration.aws_oss_tooling ? 1 : 0 21 | chart = var.ksm_config.helm_chart_name 22 | create_namespace = var.ksm_config.create_namespace 23 | namespace = var.ksm_config.k8s_namespace 24 | name = var.ksm_config.helm_release_name 25 | version = var.ksm_config.helm_chart_version 26 | repository = var.ksm_config.helm_repo_url 27 | values = [ 28 | yamlencode(local.critical_addons_tolerations) 29 | ] 30 | 31 | dynamic "set" { 32 | for_each = var.ksm_config.helm_settings 33 | content { 34 | name = set.key 35 | value = set.value 36 | } 37 | } 38 | } 39 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/30.eks/35.addons/values.yaml: -------------------------------------------------------------------------------- 1 | redis-ha: 2 | enabled: false 3 | controller: 4 | replicas: 1 5 | server: 6 | replicas: 1 7 | ingress: 8 | enabled: true 9 | ingressClassName: alb 10 | hosts: 11 | - ${FQDN} 12 | annotations: 13 | kubernetes.io/ingress.class: alb 14 | alb.ingress.kubernetes.io/backend-protocol: HTTPS 15 | alb.ingress.kubernetes.io/group.name: ${LB_NAME} 16 | alb.ingress.kubernetes.io/healthcheck-interval-seconds: "30" 17 | alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]' 18 | alb.ingress.kubernetes.io/load-balancer-attributes: routing.http2.enabled=true 19 | alb.ingress.kubernetes.io/load-balancer-name: ${LB_NAME} 20 | alb.ingress.kubernetes.io/scheme: internet-facing 21 | alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-FS-1-2-2019-08 22 | alb.ingress.kubernetes.io/tags: "env=${ENV},terraform=true" 23 | alb.ingress.kubernetes.io/target-type: ip 24 | ingressGrpc: 25 | enabled: true 26 | isAWSALB: true 27 | awsALB: 28 | serviceType: ClusterIP 29 | backendProtocolVersion: GRPC 30 | hosts: 31 | - ${FQDN} 32 | repoServer: 33 | replicas: 1 34 | applicationSet: 35 | replicas: 1 36 | global: 37 | affinity: 38 | nodeAffinity: 39 | requiredDuringSchedulingIgnoredDuringExecution: 40 | nodeSelectorTerms: 41 | - matchExpressions: 42 | - key: role 43 | operator: In 44 | values: 45 | - core -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/grafana-operator-manifests/eks/apiserver-dashboards.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: grafana.integreatly.org/v1beta1 2 | kind: GrafanaDashboard 3 | metadata: 4 | name: apiserver-basic-grafanadashboard 5 | namespace: grafana-operator 6 | spec: 7 | folder: "Observability Accelerator Dashboards" 8 | instanceSelector: 9 | matchLabels: 10 | dashboards: "external-grafana" 11 | url: https://raw.githubusercontent.com/aws-observability/aws-observability-accelerator/v0.2.0/artifacts/grafana-dashboards/eks/apiserver/apiserver-basic.json 12 | --- 13 | apiVersion: grafana.integreatly.org/v1beta1 14 | kind: GrafanaDashboard 15 | 
/single-account-single-cluster-multi-env/30.eks/35.addons/values.yaml:
--------------------------------------------------------------------------------
redis-ha:
  enabled: false
controller:
  replicas: 1
server:
  replicas: 1
  ingress:
    enabled: true
    ingressClassName: alb
    hosts:
      - ${FQDN}
    annotations:
      kubernetes.io/ingress.class: alb
      alb.ingress.kubernetes.io/backend-protocol: HTTPS
      alb.ingress.kubernetes.io/group.name: ${LB_NAME}
      alb.ingress.kubernetes.io/healthcheck-interval-seconds: "30"
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/load-balancer-attributes: routing.http2.enabled=true
      alb.ingress.kubernetes.io/load-balancer-name: ${LB_NAME}
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-FS-1-2-2019-08
      alb.ingress.kubernetes.io/tags: "env=${ENV},terraform=true"
      alb.ingress.kubernetes.io/target-type: ip
  ingressGrpc:
    enabled: true
    isAWSALB: true
    awsALB:
      serviceType: ClusterIP
      backendProtocolVersion: GRPC
    hosts:
      - ${FQDN}
repoServer:
  replicas: 1
applicationSet:
  replicas: 1
global:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: role
                operator: In
                values:
                  - core
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/grafana-operator-manifests/eks/apiserver-dashboards.yaml:
--------------------------------------------------------------------------------
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: apiserver-basic-grafanadashboard
  namespace: grafana-operator
spec:
  folder: "Observability Accelerator Dashboards"
  instanceSelector:
    matchLabels:
      dashboards: "external-grafana"
  url: https://raw.githubusercontent.com/aws-observability/aws-observability-accelerator/v0.2.0/artifacts/grafana-dashboards/eks/apiserver/apiserver-basic.json
---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: apiserver-advanced-grafanadashboard
  namespace: grafana-operator
spec:
  folder: "Observability Accelerator Dashboards"
  instanceSelector:
    matchLabels:
      dashboards: "external-grafana"
  url: https://raw.githubusercontent.com/aws-observability/aws-observability-accelerator/v0.2.0/artifacts/grafana-dashboards/eks/apiserver/apiserver-advanced.json
---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: apiserver-troubleshooting-grafanadashboard
  namespace: grafana-operator
spec:
  folder: "Observability Accelerator Dashboards"
  instanceSelector:
    matchLabels:
      dashboards: "external-grafana"
  url: https://raw.githubusercontent.com/aws-observability/aws-observability-accelerator/v0.2.0/artifacts/grafana-dashboards/eks/apiserver/apiserver-troubleshooting.json
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/30.cluster/locals.tf:
--------------------------------------------------------------------------------
locals {
  cluster_name    = "${var.shared_config.resources_prefix}-${terraform.workspace}"
  region          = data.aws_region.current.id
  tfstate_region  = try(var.tfstate_region, local.region)
  cluster_version = var.cluster_config.kubernetes_version
  eks_auto_mode   = try(var.cluster_config.eks_auto_mode, false)

  private_subnet_ids       = data.terraform_remote_state.vpc.outputs.private_subnet_ids
  control_plane_subnet_ids = try(var.cluster_config.use_intra_subnets, true) ? data.terraform_remote_state.vpc.outputs.intra_subnet_ids : local.private_subnet_ids

  capabilities = {
    kube_proxy   = try(var.cluster_config.capabilities.kube_proxy, !local.eks_auto_mode, true)
    networking   = try(var.cluster_config.capabilities.networking, !local.eks_auto_mode, true)
    coredns      = try(var.cluster_config.capabilities.coredns, !local.eks_auto_mode, true)
    identity     = try(var.cluster_config.capabilities.identity, !local.eks_auto_mode, true)
    autoscaling  = try(var.cluster_config.capabilities.autoscaling, !local.eks_auto_mode, true)
    blockstorage = try(var.cluster_config.capabilities.blockstorage, !local.eks_auto_mode, true)
  }

  create_mng_system = try(var.cluster_config.create_mng_system, !local.eks_auto_mode, true)

  critical_addons_tolerations = {
    tolerations = [
      {
        key      = "CriticalAddonsOnly",
        operator = "Exists",
        effect   = "NoSchedule"
      }
    ]
  }

  tags = merge(
    var.tags,
    {
      "Environment" : terraform.workspace
      "provisioned-by" : "aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters"
    }
  )
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/main.tf:
--------------------------------------------------------------------------------
data "aws_region" "current" {}

data "aws_caller_identity" "current" {}

################################################################################
# CW EKS Addon
################################################################################
module "aws_cloudwatch_observability_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 5.44"
  count   = var.observability_configuration.aws_native_tooling ? 1 : 0

  role_name = "${data.terraform_remote_state.eks.outputs.cluster_name}-cw-ci"

  role_policy_arns = {
    policy = "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
  }

  oidc_providers = {
    cluster = {
      provider_arn               = data.terraform_remote_state.eks.outputs.oidc_provider_arn
      namespace_service_accounts = ["amazon-cloudwatch:cloudwatch-agent"]
    }
  }
}

module "aws_cloudwatch_observability" {
  source  = "aws-ia/eks-blueprints-addons/aws"
  version = "~> 1.21.0"
  count   = var.observability_configuration.aws_native_tooling ? 1 : 0

  cluster_name      = data.terraform_remote_state.eks.outputs.cluster_name
  cluster_endpoint  = data.terraform_remote_state.eks.outputs.cluster_endpoint
  cluster_version   = data.terraform_remote_state.eks.outputs.kubernetes_version
  oidc_provider_arn = data.terraform_remote_state.eks.outputs.oidc_provider_arn

  create_kubernetes_resources = true
  eks_addons = {
    amazon-cloudwatch-observability = {
      most_recent              = true
      service_account_role_arn = module.aws_cloudwatch_observability_irsa[0].iam_role_arn
    }
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/30.cluster/karpenter/default.yaml:
--------------------------------------------------------------------------------
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    budgets:
      - nodes: 10%
    consolidateAfter: 30s
    consolidationPolicy: WhenEmptyOrUnderutilized
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["4"]
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: [nano, micro, small, medium]
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - alias: al2023@latest
  instanceStorePolicy: RAID0
  role: ${role}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${cluster_name}
        Name: ${cluster_name}-private-*
        Environment: ${environment}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${cluster_name}
        Environment: ${environment}
  tags:
    karpenter.sh/discovery: ${cluster_name}
    Environment: ${environment}
    provisioned-by: "aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters"
--------------------------------------------------------------------------------
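The ${role}, ${cluster_name}, and ${environment} placeholders above are Terraform template variables rather than Karpenter syntax: karpenter.tf in this stack (not shown here) presumably renders the manifest with templatefile() and applies each document with the kubectl provider. A sketch of that pattern, with the role reference an assumption:

locals {
  karpenter_default_manifests = compact([
    for doc in split("---", templatefile("${path.module}/karpenter/default.yaml", {
      role         = module.karpenter.node_iam_role_name # assumption: exported by a Karpenter module
      cluster_name = module.eks.cluster_name
      environment  = terraform.workspace
    })) : trimspace(doc)
  ])
}

resource "kubectl_manifest" "karpenter_default" {
  count     = length(local.karpenter_default_manifests)
  yaml_body = local.karpenter_default_manifests[count.index]
}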
/single-account-single-cluster-multi-env/30.eks/35.addons/outputs.tf:
--------------------------------------------------------------------------------
locals {
  argocd_terminal_setup = <<-EOT
    export KUBECONFIG="/tmp/${data.terraform_remote_state.eks.outputs.cluster_name}"
    aws eks --region ${local.region} update-kubeconfig --name ${data.terraform_remote_state.eks.outputs.cluster_name}
    export ARGOCD_OPTS="--port-forward --port-forward-namespace argocd --grpc-web"
    kubectl config set-context --current --namespace argocd
    argocd login --port-forward --username admin --password $(argocd admin initial-password | head -1)
    echo "ArgoCD Username: admin"
    echo "ArgoCD Password: $(kubectl get secrets argocd-initial-admin-secret -n argocd --template="{{index .data.password | base64decode}}")"
    echo Port Forward: http://localhost:8080
    kubectl port-forward -n argocd svc/argo-cd-argocd-server 8080:80
  EOT
  argocd_access = <<-EOT
    export KUBECONFIG="/tmp/${data.terraform_remote_state.eks.outputs.cluster_name}"
    aws eks --region ${local.region} update-kubeconfig --name ${data.terraform_remote_state.eks.outputs.cluster_name}
    echo "ArgoCD Username: admin"
    echo "ArgoCD Password: $(kubectl get secrets argocd-initial-admin-secret -n argocd --template="{{index .data.password | base64decode}}")"
    echo "ArgoCD URL: https://$(kubectl get svc -n argocd argo-cd-argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')"
  EOT
}

output "configure_argocd" {
  description = "ArgoCD Terminal Setup"
  value       = try(var.cluster_config.capabilities.gitops, true) ? local.argocd_terminal_setup : null
}

output "access_argocd" {
  description = "ArgoCD Access"
  value       = try(var.cluster_config.capabilities.gitops, true) ? local.argocd_access : null
}

output "external_secrets_addon_output" {
  description = "external-secrets addon output values"
  value       = try(var.observability_configuration.aws_oss_tooling, false) ? module.eks_blueprints_addons.external_secrets : null
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/30.eks/35.addons/providers.tf:
--------------------------------------------------------------------------------
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.22"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.7"
    }
    kubectl = {
      source  = "alekc/kubectl"
      version = "~> 2.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.0"
    }
  }
}

provider "helm" {
  kubernetes {
    host                   = data.terraform_remote_state.eks.outputs.cluster_endpoint
    cluster_ca_certificate = base64decode(data.terraform_remote_state.eks.outputs.cluster_certificate_authority_data)

    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      # This requires the awscli to be installed locally where Terraform is executed
      args = ["eks", "get-token", "--cluster-name", data.terraform_remote_state.eks.outputs.cluster_name, "--region", local.region]
    }
  }
}

provider "kubernetes" {
  host                   = data.terraform_remote_state.eks.outputs.cluster_endpoint
  cluster_ca_certificate = base64decode(data.terraform_remote_state.eks.outputs.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", data.terraform_remote_state.eks.outputs.cluster_name, "--region", local.region]
  }
}

provider "aws" {
  default_tags {
    tags = local.tags
  }
}

provider "kubectl" {
  apply_retry_count      = 5
  host                   = data.terraform_remote_state.eks.outputs.cluster_endpoint
  cluster_ca_certificate = base64decode(data.terraform_remote_state.eks.outputs.cluster_certificate_authority_data)
  load_config_file       = false

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", data.terraform_remote_state.eks.outputs.cluster_name, "--region", local.region]
  }
}
--------------------------------------------------------------------------------
/single-account-single-cluster-multi-env/40.observability/40.aws-native-observability/providers.tf:
--------------------------------------------------------------------------------
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.22"
    }
    kubectl = {
      source  = "alekc/kubectl"
      version = "~> 2.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.7"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.0"
    }
  }
}

provider "helm" {
  kubernetes {
    host                   = data.terraform_remote_state.eks.outputs.cluster_endpoint
    cluster_ca_certificate = base64decode(data.terraform_remote_state.eks.outputs.cluster_certificate_authority_data)

    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      # This requires the awscli to be installed locally where Terraform is executed
      args = ["eks", "get-token", "--cluster-name", data.terraform_remote_state.eks.outputs.cluster_name, "--region", local.region]
    }
  }
}

provider "kubernetes" {
  host                   = data.terraform_remote_state.eks.outputs.cluster_endpoint
  cluster_ca_certificate = base64decode(data.terraform_remote_state.eks.outputs.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", data.terraform_remote_state.eks.outputs.cluster_name, "--region", local.region]
  }
}

provider "aws" {
  default_tags {
    tags = local.tags
  }
}

provider "kubectl" {
  apply_retry_count      = 5
  host                   = data.terraform_remote_state.eks.outputs.cluster_endpoint
  cluster_ca_certificate = base64decode(data.terraform_remote_state.eks.outputs.cluster_certificate_authority_data)
  load_config_file       = false

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", data.terraform_remote_state.eks.outputs.cluster_name, "--region", local.region]
  }
}
--------------------------------------------------------------------------------
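The managed_scraper.tf file that follows sets up the in-cluster RBAC for the AMP managed collector; the collector itself is an aws_prometheus_scraper resource pointed at the cluster and the workspace from prometheus.tf. A minimal sketch of that resource — every reference name here is an assumption for illustration, and the repository's actual configuration may differ:

resource "aws_prometheus_scraper" "this" {
  count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_managed_collector ? 1 : 0

  source {
    eks {
      cluster_arn = data.aws_eks_cluster.this.arn
      subnet_ids  = data.terraform_remote_state.vpc.outputs.private_subnet_ids
    }
  }

  destination {
    amp {
      workspace_arn = aws_prometheus_workspace.this[0].arn
    }
  }

  # Assumption: the scrape config is kept in the sibling file of the same name.
  scrape_configuration = file("${path.module}/aws_prometheus_scraper_configuration")
}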
-------------------------------------------------------------------------------- 1 | # The configurations in this file are for AMP Managed Collector (https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-collector-how-to.html) 2 | # and will be enabled only if var.observability_configuration.aws_oss_tooling_config.enable_managed_collector is set to true 3 | 4 | ### managed collector 5 | 6 | # per docs on https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-collector-how-to.html#AMP-collector-create 7 | resource "kubectl_manifest" "amp_scraper_clusterrole" { 8 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_managed_collector ? 1 : 0 9 | yaml_body = < 4 | ## Requirements 5 | 6 | | Name | Version | 7 | |------|---------| 8 | | [terraform](#requirement\_terraform) | >= 1.0.0 | 9 | | [aws](#requirement\_aws) | >= 5.40.0 | 10 | 11 | ## Providers 12 | 13 | | Name | Version | 14 | |------|---------| 15 | | [aws](#provider\_aws) | >= 5.40.0 | 16 | 17 | ## Modules 18 | 19 | | Name | Source | Version | 20 | |------|--------|---------| 21 | | [endpoints](#module\_endpoints) | terraform-aws-modules/vpc/aws//modules/vpc-endpoints | 5.5.3 | 22 | | [subnets](#module\_subnets) | hashicorp/subnets/cidr | 1.0.0 | 23 | | [vpc](#module\_vpc) | terraform-aws-modules/vpc/aws | 5.5.2 | 24 | 25 | ## Resources 26 | 27 | | Name | Type | 28 | |------|------| 29 | | [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source | 30 | 31 | ## Inputs 32 | 33 | | Name | Description | Type | Default | Required | 34 | |------|-------------|------|---------|:--------:| 35 | | [cluster\_config](#input\_cluster\_config) | cluster configurations such as version, public/private API endpoint, and more | `map(string)` | `{}` | no | 36 | | [control\_plane\_subnets\_cidr\_prefix](#input\_control\_plane\_subnets\_cidr\_prefix) | CIDR prefix for the control plane subnets | `number` | `28` | no | 37 | | [num\_azs](#input\_num\_azs) | Number of Availability Zones | `number` | `3` | no | 38 | | [private\_subnets\_cidr\_prefix](#input\_private\_subnets\_cidr\_prefix) | CIDR prefix for the private subnets | `number` | `20` | no | 39 | | [public\_subnets\_cidr\_prefix](#input\_public\_subnets\_cidr\_prefix) | CIDR prefix for the public subnets | `number` | `24` | no | 40 | | [shared\_config](#input\_shared\_config) | Shared configuration across all modules/folders | `map(any)` | `{}` | no | 41 | | [tags](#input\_tags) | Additional tags | `map(string)` | `{}` | no | 42 | | [vpc\_cidr](#input\_vpc\_cidr) | CIDR block for the VPC | `string` | `"10.0.0.0/16"` | no | 43 | 44 | ## Outputs 45 | 46 | | Name | Description | 47 | |------|-------------| 48 | | [intra\_subnet\_ids](#output\_intra\_subnet\_ids) | List of IDs of intra subnets | 49 | | [private\_subnet\_ids](#output\_private\_subnet\_ids) | List of IDs of private subnets | 50 | | [public\_subnet\_ids](#output\_public\_subnet\_ids) | List of IDs of public subnets | 51 | | [vpc\_id](#output\_vpc\_id) | The ID of the VPC | 52 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/30.eks/30.cluster/auto-mode/default.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | apiVersion: karpenter.sh/v1 3 | kind: NodePool 4 | metadata: 5 | name: general-purpose 6 | spec: 7 | disruption: 8 | budgets: 9 | - nodes: 10% 
10 | consolidateAfter: 30s 11 | consolidationPolicy: WhenEmptyOrUnderutilized 12 | template: 13 | metadata: {} 14 | spec: 15 | expireAfter: 336h 16 | nodeClassRef: 17 | group: eks.amazonaws.com 18 | kind: NodeClass 19 | name: general 20 | requirements: 21 | - key: karpenter.sh/capacity-type 22 | operator: In 23 | values: 24 | - on-demand 25 | - key: eks.amazonaws.com/instance-category 26 | operator: In 27 | values: 28 | - c 29 | - m 30 | - r 31 | - key: eks.amazonaws.com/instance-generation 32 | operator: Gt 33 | values: 34 | - "4" 35 | - key: kubernetes.io/arch 36 | operator: In 37 | values: 38 | - amd64 39 | - key: kubernetes.io/os 40 | operator: In 41 | values: 42 | - linux 43 | terminationGracePeriod: 24h0m0s 44 | --- 45 | apiVersion: karpenter.sh/v1 46 | kind: NodePool 47 | metadata: 48 | name: system 49 | spec: 50 | disruption: 51 | budgets: 52 | - nodes: 10% 53 | consolidateAfter: 30s 54 | consolidationPolicy: WhenEmptyOrUnderutilized 55 | template: 56 | metadata: {} 57 | spec: 58 | expireAfter: 336h 59 | nodeClassRef: 60 | group: eks.amazonaws.com 61 | kind: NodeClass 62 | name: general 63 | requirements: 64 | - key: karpenter.sh/capacity-type 65 | operator: In 66 | values: 67 | - on-demand 68 | - key: eks.amazonaws.com/instance-category 69 | operator: In 70 | values: 71 | - c 72 | - m 73 | - r 74 | - key: eks.amazonaws.com/instance-generation 75 | operator: Gt 76 | values: 77 | - "4" 78 | - key: kubernetes.io/arch 79 | operator: In 80 | values: 81 | - amd64 82 | - arm64 83 | - key: kubernetes.io/os 84 | operator: In 85 | values: 86 | - linux 87 | taints: 88 | - effect: NoSchedule 89 | key: CriticalAddonsOnly 90 | terminationGracePeriod: 24h0m0s 91 | --- 92 | apiVersion: eks.amazonaws.com/v1 93 | kind: NodeClass 94 | metadata: 95 | name: general 96 | spec: 97 | ephemeralStorage: 98 | iops: 3000 99 | size: 80Gi 100 | throughput: 125 101 | networkPolicy: DefaultAllow 102 | networkPolicyEventLogs: Disabled 103 | role: ${role} 104 | securityGroupSelectorTerms: 105 | - id: ${cluster_security_group_id} 106 | snatPolicy: Random 107 | subnetSelectorTerms: 108 | - tags: 109 | Name: ${cluster_name}-private-* 110 | Environment: ${environment} 111 | tags: 112 | Environment: ${environment} 113 | provisioned-by: "aws-solutions-library-samples/guidance-for-automated-provisioning-of-application-ready-amazon-eks-clusters" -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. 
Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *main* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information, see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/grafana.tf: -------------------------------------------------------------------------------- 1 | # Data block to fetch the SSO admin instance. Once you enable IAM Identity Center (SSO) from the console, you need a data block like this to fetch the instance in your code. 2 | provider "aws" { 3 | region = local.sso_region 4 | alias = "sso" 5 | default_tags { 6 | tags = local.tags 7 | } 8 | } 9 | 10 | data "aws_ssoadmin_instances" "current" { 11 | provider = aws.sso 12 | } 13 | 14 | module "managed_grafana" { 15 | count = var.observability_configuration.aws_oss_tooling ?
1 : 0 16 | source = "terraform-aws-modules/managed-service-grafana/aws" 17 | version = "2.1.1" 18 | 19 | name = local.grafana_workspace_name 20 | associate_license = false 21 | description = local.grafana_workspace_description 22 | account_access_type = "CURRENT_ACCOUNT" 23 | authentication_providers = ["AWS_SSO"] 24 | permission_type = "SERVICE_MANAGED" 25 | data_sources = ["CLOUDWATCH", "PROMETHEUS", "XRAY"] 26 | notification_destinations = ["SNS"] 27 | stack_set_name = local.grafana_workspace_name 28 | 29 | configuration = jsonencode({ 30 | unifiedAlerting = { 31 | enabled = true 32 | }, 33 | plugins = { 34 | pluginAdminEnabled = false 35 | } 36 | }) 37 | 38 | grafana_version = "9.4" 39 | 40 | # Workspace IAM role 41 | create_iam_role = true 42 | iam_role_name = local.grafana_workspace_name 43 | use_iam_role_name_prefix = true 44 | iam_role_description = local.grafana_workspace_description 45 | iam_role_path = "/grafana/" 46 | iam_role_force_detach_policies = true 47 | iam_role_max_session_duration = 7200 48 | iam_role_tags = local.tags 49 | 50 | # Role associations 51 | # Ref: https://github.com/aws/aws-sdk/issues/25 52 | # Ref: https://github.com/hashicorp/terraform-provider-aws/issues/18812 53 | # WARNING: https://github.com/hashicorp/terraform-provider-aws/issues/24166 54 | role_associations = { 55 | "ADMIN" = { 56 | "group_ids" = [aws_identitystore_group.group[count.index].group_id] 57 | } 58 | } 59 | 60 | tags = local.tags 61 | } 62 | 63 | # ############################## Users,Group,Group's Membership ######################################### 64 | 65 | resource "aws_identitystore_user" "user" { 66 | provider = aws.sso 67 | count = var.observability_configuration.aws_oss_tooling ? 1 : 0 68 | identity_store_id = tolist(data.aws_ssoadmin_instances.current.identity_store_ids)[0] 69 | 70 | display_name = "Grafana Admin for ${terraform.workspace} env" 71 | user_name = "grafana-admin-${terraform.workspace}" 72 | 73 | 74 | name { 75 | family_name = "Admin" 76 | given_name = "Grafana" 77 | } 78 | 79 | emails { 80 | value = "${terraform.workspace}-${var.grafana_admin_email}" 81 | } 82 | } 83 | 84 | resource "aws_identitystore_group" "group" { 85 | provider = aws.sso 86 | count = var.observability_configuration.aws_oss_tooling ? 1 : 0 87 | identity_store_id = tolist(data.aws_ssoadmin_instances.current.identity_store_ids)[0] 88 | display_name = "grafana-admins-${terraform.workspace}" 89 | description = "Grafana Administrators for ${terraform.workspace} env" 90 | } 91 | 92 | resource "aws_identitystore_group_membership" "group_membership" { 93 | provider = aws.sso 94 | count = var.observability_configuration.aws_oss_tooling ? 
1 : 0 95 | identity_store_id = tolist(data.aws_ssoadmin_instances.current.identity_store_ids)[0] 96 | group_id = aws_identitystore_group.group[count.index].group_id 97 | member_id = aws_identitystore_user.user[count.index].user_id 98 | } 99 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/variables.tf: -------------------------------------------------------------------------------- 1 | variable "tfstate_region" { 2 | description = "region where the terraform state is stored" 3 | type = string 4 | default = null 5 | } 6 | 7 | variable "observability_configuration" { 8 | description = "observability configuration variable" 9 | type = object({ 10 | aws_oss_tooling = optional(bool, true) // AMP & AMG 11 | aws_native_tooling = optional(bool, false) // CW 12 | aws_oss_tooling_config = optional(map(any), {}) 13 | }) 14 | default = { 15 | aws_oss_tooling = true 16 | aws_native_tooling = false 17 | aws_oss_tooling_config = { 18 | enable_managed_collector = true 19 | enable_self_managed_collectors = false 20 | prometheus_name = "prom" 21 | enable_grafana_operator = true 22 | } 23 | } 24 | } 25 | 26 | variable "go_config" { 27 | description = "Grafana Operator configuration" 28 | type = object({ 29 | create_namespace = optional(bool, true) 30 | helm_chart = optional(string, "oci://ghcr.io/grafana-operator/helm-charts/grafana-operator") 31 | helm_name = optional(string, "grafana-operator") 32 | k8s_namespace = optional(string, "grafana-operator") 33 | helm_release_name = optional(string, "grafana-operator") 34 | helm_chart_version = optional(string, "v5.5.2") 35 | }) 36 | 37 | default = { 38 | create_namespace = true 39 | helm_chart = "oci://ghcr.io/grafana-operator/helm-charts/grafana-operator" 40 | helm_name = "grafana-operator" 41 | k8s_namespace = "grafana-operator" 42 | helm_release_name = "grafana-operator" 43 | helm_chart_version = "v5.5.2" 44 | } 45 | } 46 | 47 | variable "prometheus_name" { 48 | description = "Amazon Managed Service for Prometheus Name" 49 | type = string 50 | default = "prom" 51 | } 52 | 53 | 54 | variable "tags" { 55 | description = "Additional tags" 56 | type = map(string) 57 | default = {} 58 | } 59 | 60 | variable "grafana_admin_email" { 61 | description = "default email for the grafana-admin user" 62 | type = string 63 | default = "email@example.com" 64 | } 65 | 66 | variable "ne_config" { 67 | description = "Node exporter configuration" 68 | type = object({ 69 | create_namespace = optional(bool, true) 70 | k8s_namespace = optional(string, "prometheus-node-exporter") 71 | helm_chart_name = optional(string, "prometheus-node-exporter") 72 | helm_chart_version = optional(string, "4.24.0") 73 | helm_release_name = optional(string, "prometheus-node-exporter") 74 | helm_repo_url = optional(string, "https://prometheus-community.github.io/helm-charts") 75 | helm_settings = optional(map(string), {}) 76 | helm_values = optional(map(any), {}) 77 | 78 | scrape_interval = optional(string, "60s") 79 | scrape_timeout = optional(string, "60s") 80 | }) 81 | 82 | default = {} 83 | nullable = false 84 | } 85 | 86 | 87 | variable "ksm_config" { 88 | description = "Kube State metrics configuration" 89 | type = object({ 90 | create_namespace = optional(bool, true) 91 | k8s_namespace = optional(string, "kube-system") 92 | helm_chart_name = optional(string, "kube-state-metrics") 93 | helm_chart_version = optional(string, "5.16.1") 94 | helm_release_name = optional(string, 
"kube-state-metrics") 95 | helm_repo_url = optional(string, "https://prometheus-community.github.io/helm-charts") 96 | helm_settings = optional(map(string), {}) 97 | helm_values = optional(map(any), {}) 98 | 99 | scrape_interval = optional(string, "60s") 100 | scrape_timeout = optional(string, "15s") 101 | }) 102 | 103 | default = {} 104 | nullable = false 105 | } 106 | 107 | variable "shared_config" { 108 | description = "Shared configuration across all modules/folders" 109 | type = map(any) 110 | default = {} 111 | } -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/30.eks/35.addons/main.tf: -------------------------------------------------------------------------------- 1 | data "aws_region" "current" {} 2 | 3 | data "aws_caller_identity" "current" {} 4 | 5 | module "eks_blueprints_addons" { 6 | source = "aws-ia/eks-blueprints-addons/aws" 7 | version = "~> 1.21.0" 8 | 9 | cluster_name = data.terraform_remote_state.eks.outputs.cluster_name 10 | cluster_endpoint = data.terraform_remote_state.eks.outputs.cluster_endpoint 11 | cluster_version = data.terraform_remote_state.eks.outputs.kubernetes_version 12 | oidc_provider_arn = data.terraform_remote_state.eks.outputs.oidc_provider_arn 13 | 14 | create_kubernetes_resources = true 15 | 16 | # common addons deployed with EKS Blueprints Addons 17 | enable_aws_load_balancer_controller = local.capabilities.loadbalancing 18 | aws_load_balancer_controller = { 19 | values = [yamlencode(local.critical_addons_tolerations)] 20 | } 21 | 22 | # external-secrets is being used AMG for grafana auth 23 | enable_external_secrets = try(var.observability_configuration.aws_oss_tooling, false) 24 | external_secrets = { 25 | values = [ 26 | yamlencode({ 27 | tolerations = [local.critical_addons_tolerations.tolerations[0]] 28 | webhook = { 29 | tolerations = [local.critical_addons_tolerations.tolerations[0]] 30 | } 31 | certController = { 32 | tolerations = [local.critical_addons_tolerations.tolerations[0]] 33 | } 34 | }) 35 | ] 36 | } 37 | 38 | # cert-manager as a dependency for ADOT addon 39 | enable_cert_manager = try( 40 | var.observability_configuration.aws_oss_tooling 41 | && var.observability_configuration.aws_oss_tooling_config.enable_adot_collector, 42 | false) 43 | cert_manager = { 44 | values = [ 45 | yamlencode({ 46 | tolerations = [local.critical_addons_tolerations.tolerations[0]] 47 | webhook = { 48 | tolerations = [local.critical_addons_tolerations.tolerations[0]] 49 | } 50 | cainjector = { 51 | tolerations = [local.critical_addons_tolerations.tolerations[0]] 52 | } 53 | }) 54 | ] 55 | } 56 | 57 | # FluentBit 58 | enable_aws_for_fluentbit = try( 59 | var.observability_configuration.aws_oss_tooling 60 | && !var.observability_configuration.aws_oss_tooling_config.enable_adot_collector 61 | , false) 62 | aws_for_fluentbit = { 63 | values = [ 64 | yamlencode({ "tolerations" : [{ "operator" : "Exists" }] }) 65 | ] 66 | } 67 | aws_for_fluentbit_cw_log_group = { 68 | name = "/aws/eks/${data.terraform_remote_state.eks.outputs.cluster_name}/aws-fluentbit-logs" 69 | use_name_prefix = false 70 | create = true 71 | } 72 | 73 | # GitOps 74 | enable_argocd = local.capabilities.gitops 75 | argocd = { 76 | enabled = true 77 | # The following settings are required to be set to true to ensure the 78 | # argocd application is deployed 79 | create_argocd_application = true 80 | create_kubernetes_resources = true 81 | enable_argocd = true 82 | argocd_namespace = "argocd" 83 | } 84 | } 85 | 86 | 87 | 
resource "null_resource" "clean_up_argocd_resources" { 88 | count = try(var.cluster_config.capabilities.gitops, true) ? 1 : 0 89 | triggers = { 90 | argocd = module.eks_blueprints_addons.argocd.name 91 | eks_cluster_name = data.terraform_remote_state.eks.outputs.cluster_name 92 | } 93 | provisioner "local-exec" { 94 | command = <<-EOT 95 | kubeconfig=/tmp/tf.clean_up_argocd.kubeconfig.yaml 96 | aws eks update-kubeconfig --name ${self.triggers.eks_cluster_name} --kubeconfig $kubeconfig 97 | rm -f /tmp/tf.clean_up_argocd_resources.err.log 98 | kubectl --kubeconfig $kubeconfig get Application -A -o name | xargs -I {} kubectl --kubeconfig $kubeconfig -n argocd patch -p '{"metadata":{"finalizers":null}}' --type=merge {} 2> /tmp/tf.clean_up_argocd_resources.err.log || true 99 | kubectl --kubeconfig $kubeconfig get appprojects -A -o name | xargs -I {} kubectl --kubeconfig $kubeconfig -n argocd patch -p '{"metadata":{"finalizers":null}}' --type=merge {} 2> /tmp/tf.clean_up_argocd_resources.err.log || true 100 | rm -f $kubeconfig 101 | EOT 102 | interpreter = ["bash", "-c"] 103 | when = destroy 104 | } 105 | } 106 | 107 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/30.eks/30.cluster/karpenter.tf: -------------------------------------------------------------------------------- 1 | data "aws_ecrpublic_authorization_token" "token" { 2 | provider = aws.virginia 3 | } 4 | 5 | # Add the Karpenter discovery tag only to the cluster primary security group 6 | # by default if using the eks module tags, it will tag all resources with this tag, which is not needed. 7 | resource "aws_ec2_tag" "cluster_primary_security_group" { 8 | count = local.capabilities.autoscaling ? 1 : 0 9 | resource_id = module.eks.cluster_primary_security_group_id 10 | key = "karpenter.sh/discovery" 11 | value = local.cluster_name 12 | } 13 | 14 | ################################################################################ 15 | # Karpenter 16 | ################################################################################ 17 | module "karpenter" { 18 | source = "terraform-aws-modules/eks/aws//modules/karpenter" 19 | version = "~> 20.36.0" 20 | 21 | create = local.capabilities.autoscaling 22 | 23 | cluster_name = module.eks.cluster_name 24 | 25 | enable_v1_permissions = true 26 | 27 | enable_pod_identity = true 28 | create_pod_identity_association = true 29 | 30 | # Used to attach additional IAM policies to the Karpenter node IAM role 31 | node_iam_role_additional_policies = { 32 | AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" 33 | } 34 | 35 | iam_role_name = "KarpenterController-${module.eks.cluster_name}" 36 | iam_role_use_name_prefix = false 37 | 38 | node_iam_role_name = "KarpenterNode-${module.eks.cluster_name}" 39 | node_iam_role_use_name_prefix = false 40 | 41 | tags = local.tags 42 | 43 | depends_on = [ 44 | module.eks 45 | ] 46 | } 47 | 48 | ################################################################################ 49 | # Karpenter Helm chart deployment 50 | ################################################################################ 51 | resource "helm_release" "karpenter" { 52 | count = local.capabilities.autoscaling ? 
1 : 0 53 | 54 | namespace = "kube-system" 55 | name = "karpenter" 56 | repository = "oci://public.ecr.aws/karpenter" 57 | repository_username = data.aws_ecrpublic_authorization_token.token.user_name 58 | repository_password = data.aws_ecrpublic_authorization_token.token.password 59 | chart = "karpenter" 60 | version = "1.5.0" 61 | wait = false 62 | 63 | values = [ 64 | yamlencode({ 65 | tolerations = local.critical_addons_tolerations.tolerations, 66 | dnsPolicy : "Default", 67 | settings = { 68 | clusterName : module.eks.cluster_name 69 | clusterEndpoint : module.eks.cluster_endpoint 70 | interruptionQueue : module.karpenter.queue_name 71 | }, 72 | controller = { 73 | resources = { 74 | requests = { 75 | cpu = "1" 76 | memory = "1Gi" 77 | }, 78 | limits = { 79 | cpu = "1" 80 | memory = "1Gi" 81 | } 82 | } 83 | } 84 | }) 85 | ] 86 | 87 | depends_on = [ 88 | module.karpenter 89 | ] 90 | } 91 | ################################################################################ 92 | # Karpenter default NodePool & NodeClass 93 | ################################################################################ 94 | data "kubectl_path_documents" "karpenter_manifests" { 95 | count = local.capabilities.autoscaling ? 1 : 0 96 | pattern = "${path.module}/karpenter/*.yaml" 97 | vars = { 98 | role = module.karpenter.node_iam_role_name 99 | cluster_name = local.cluster_name 100 | environment = terraform.workspace 101 | } 102 | depends_on = [ 103 | helm_release.karpenter[0] 104 | ] 105 | } 106 | 107 | # workaround terraform issue with attributes that cannot be determined ahead because of module dependencies 108 | # https://github.com/gavinbunney/terraform-provider-kubectl/issues/58 109 | data "kubectl_path_documents" "karpenter_manifests_dummy" { 110 | count = local.capabilities.autoscaling ? 1 : 0 111 | pattern = "${path.module}/karpenter/*.yaml" 112 | vars = { 113 | role = "" 114 | cluster_name = "" 115 | environment = terraform.workspace 116 | } 117 | } 118 | 119 | resource "kubectl_manifest" "karpenter_manifests" { 120 | count = local.capabilities.autoscaling ? 
length(data.kubectl_path_documents.karpenter_manifests_dummy[0].documents) : 0 121 | yaml_body = element(data.kubectl_path_documents.karpenter_manifests[0].documents, count.index) 122 | } 123 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/aws_prometheus_scraper_configuration: -------------------------------------------------------------------------------- 1 | # https://github.com/prometheus/prometheus/blob/release-2.50/documentation/examples/prometheus-kubernetes.yml 2 | global: 3 | scrape_interval: 30s 4 | external_labels: 5 | clusterArn: ${CLUSTER_ARN} 6 | cluster: ${CLUSTER_NAME} 7 | scrape_configs: 8 | # pod metrics 9 | - job_name: pod_exporter 10 | kubernetes_sd_configs: 11 | - role: pod 12 | # container metrics 13 | - job_name: cadvisor 14 | scheme: https 15 | authorization: 16 | credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token 17 | kubernetes_sd_configs: 18 | - role: node 19 | relabel_configs: 20 | - action: labelmap 21 | regex: __meta_kubernetes_node_label_(.+) 22 | - replacement: kubernetes.default.svc:443 23 | target_label: __address__ 24 | - source_labels: [__meta_kubernetes_node_name] 25 | regex: (.+) 26 | target_label: __metrics_path__ 27 | replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor 28 | 29 | # kube proxy metrics 30 | - job_name: kube-proxy 31 | honor_labels: true 32 | kubernetes_sd_configs: 33 | - role: pod 34 | relabel_configs: 35 | - action: keep 36 | source_labels: 37 | - __meta_kubernetes_namespace 38 | - __meta_kubernetes_pod_name 39 | separator: '/' 40 | regex: 'kube-system/kube-proxy.+' 41 | - source_labels: 42 | - __address__ 43 | action: replace 44 | target_label: __address__ 45 | regex: (.+?)(\\:\\d+)? 
46 | replacement: $1:10249 47 | 48 | # kubernetes kubelet 49 | - job_name: 'kubelet' 50 | scheme: https 51 | authorization: 52 | credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token 53 | kubernetes_sd_configs: 54 | - role: node 55 | relabel_configs: 56 | - action: labelmap 57 | regex: __meta_kubernetes_node_label_(.+) 58 | - target_label: __address__ 59 | replacement: kubernetes.default.svc:443 60 | - source_labels: [__meta_kubernetes_node_name] 61 | regex: (.+) 62 | target_label: __metrics_path__ 63 | replacement: /api/v1/nodes/$1/proxy/metrics 64 | 65 | # kubernetes apiservers metrics 66 | - job_name: 'kubernetes-apiservers' 67 | scheme: https 68 | tls_config: 69 | ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt 70 | insecure_skip_verify: true 71 | bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token 72 | kubernetes_sd_configs: 73 | - role: endpoints 74 | relabel_configs: 75 | - source_labels: 76 | [ 77 | __meta_kubernetes_namespace, 78 | __meta_kubernetes_service_name, 79 | __meta_kubernetes_endpoint_port_name, 80 | ] 81 | action: keep 82 | regex: default;kubernetes;https 83 | metric_relabel_configs: 84 | - action: keep 85 | source_labels: [__name__] 86 | - source_labels: [__name__] 87 | regex: etcd_request_duration_seconds_bucket # 20K 88 | action: drop 89 | - source_labels: [__name__] 90 | regex: apiserver_request_duration_seconds_bucket # 15K 91 | action: drop 92 | - source_labels: [__name__] 93 | regex: apiserver_request_sli_duration_seconds_bucket # 15K 94 | action: drop 95 | - source_labels: [__name__] 96 | regex: apiserver_response_sizes_bucket # 5K 97 | action: drop 98 | - source_labels: [__name__] 99 | regex: apiserver_storage_list_duration_seconds_bucket # 3K 100 | action: drop 101 | 102 | - job_name: 'kube-state-metrics' 103 | kubernetes_sd_configs: 104 | - role: pod 105 | relabel_configs: 106 | # Select kube-state-metrics only from port 8080 107 | - source_labels: [__meta_kubernetes_pod_container_name,__meta_kubernetes_pod_container_port_number] 108 | separator: ; 109 | regex: (kube-state-metrics);8080 110 | replacement: $1 111 | action: keep 112 | - action: labelmap 113 | regex: __meta_kubernetes_pod_label_(.+) 114 | - source_labels: [__meta_kubernetes_namespace] 115 | action: replace 116 | target_label: kubernetes_namespace 117 | - source_labels: [__meta_kubernetes_pod_name] 118 | action: replace 119 | target_label: kubernetes_pod_name 120 | 121 | - job_name: 'node-exporter' 122 | kubernetes_sd_configs: 123 | - role: endpoints 124 | # ec2_sd_configs: 125 | relabel_configs: 126 | - source_labels: [ __address__ ] 127 | action: keep 128 | regex: '.*:9100$' 129 | - action: replace 130 | source_labels: [__meta_kubernetes_endpoint_node_name] 131 | target_label: nodename 132 | 133 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/Makefile: -------------------------------------------------------------------------------- 1 | .ONESHELL: 2 | SHELL = bash 3 | .SHELLFLAGS = -o pipefail -c 4 | 5 | ENVIRONMENT ?= dev 6 | AWS_REGION ?= $(shell aws configure get region) 7 | AWS_ACCOUNT_ID ?= $(shell aws sts get-caller-identity --output json | jq -r '.Account') 8 | TFSTATE_S3_BUCKET ?= "tfstate-$(AWS_ACCOUNT_ID)" 9 | TFSTATE_REGION ?= $(AWS_REGION) 10 | TFSTATE_DDB_TABLE ?= "tfstate-lock" 11 | VAR_FILE := $(CURDIR)/00.global/vars/$(ENVIRONMENT).tfvars 12 | 13 | TF_VAR_tfstate_region := $(TFSTATE_REGION) 14 | export TF_VAR_tfstate_region 15 | 16 | MODULES = $(shell find .
-type f -name "backend.tf" -exec dirname {} \; | sort -u ) 17 | 18 | ifeq ($(AUTO_APPROVE), true) 19 | TF_AUTO_APPROVE := "-auto-approve" 20 | else 21 | TF_AUTO_APPROVE := "" 22 | endif 23 | 24 | define execute_terraform 25 | set -o pipefail; \ 26 | terraform -chdir=$(1) $(2) $(3) \ 27 | -input=false \ 28 | -lock=true \ 29 | -var-file=$(VAR_FILE) \ 30 | 2>&1 | tee -a tf-logs/$(notdir $(1))-$(2).log; \ 31 | exit_code=$$?; \ 32 | if [ $$exit_code -ne 0 ]; then \ 33 | echo "Terraform $(2) failed for module $(1)"; \ 34 | exit $$exit_code; \ 35 | fi 36 | endef 37 | 38 | .PHONY: print-modules clean check-env bootstrap init-all plan-all apply-all destroy-all init refresh plan apply destroy 39 | 40 | print-modules: 41 | @for m in $(MODULES); do echo $$m; done 42 | 43 | clean: 44 | @find . -type d -name ".terraform" -prune -exec rm -rf {} \; 45 | @find . -type f -name ".terraform.lock.hcl" -prune -exec rm -f {} \; 46 | 47 | check-env: 48 | @if [ -z $(AWS_REGION) ]; then \ 49 | echo "AWS_REGION was not set."; \ 50 | exit 1; \ 51 | fi 52 | @if [ -z $(TFSTATE_REGION) ]; then \ 53 | echo "TFSTATE_REGION was not set."; \ 54 | exit 1; \ 55 | fi 56 | @if [ ! -f $(VAR_FILE) ]; then \ 57 | echo "VAR_FILE: $(VAR_FILE) does not exist."; exit 1; \ 58 | fi 59 | 60 | @mkdir -p tf-logs 61 | 62 | bootstrap: check-env 63 | @echo "Bootstrapping Terraform: S3 Bucket: $(TFSTATE_S3_BUCKET), DynamoDB Table: $(TFSTATE_DDB_TABLE)" 64 | @echo "Checking if S3 Bucket $(TFSTATE_S3_BUCKET) exists" 65 | @if ! aws s3api head-bucket --region $(TFSTATE_REGION) --bucket $(TFSTATE_S3_BUCKET) > /dev/null 2>&1; then \ 66 | echo "S3 Bucket $(TFSTATE_S3_BUCKET) does not exist, creating..."; \ 67 | aws s3 mb --region $(TFSTATE_REGION) s3://$(TFSTATE_S3_BUCKET) > /dev/null 2>&1; \ 68 | aws s3api put-bucket-ownership-controls --region $(TFSTATE_REGION) --bucket $(TFSTATE_S3_BUCKET) --ownership-controls Rules="[{ObjectOwnership=BucketOwnerPreferred}]" > /dev/null 2>&1; \ 69 | aws s3api put-bucket-acl --region $(TFSTATE_REGION) --bucket $(TFSTATE_S3_BUCKET) --acl private > /dev/null 2>&1; \ 70 | aws s3api put-public-access-block --bucket $(TFSTATE_S3_BUCKET) --public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true" > /dev/null 2>&1; \ 71 | aws s3api put-bucket-versioning --region $(TFSTATE_REGION) --bucket $(TFSTATE_S3_BUCKET) --versioning-configuration Status=Enabled > /dev/null 2>&1; \ 72 | echo "Created S3 Bucket $(TFSTATE_S3_BUCKET)."; \ 73 | else \ 74 | echo "S3 Bucket $(TFSTATE_S3_BUCKET) exists."; \ 75 | fi 76 | @echo "Checking if DynamoDB table $(TFSTATE_DDB_TABLE) exists" 77 | @if !
aws dynamodb describe-table --region $(TFSTATE_REGION) --table-name $(TFSTATE_DDB_TABLE) > /dev/null 2>&1 ; then \ 78 | echo "DynamoDB table $(TFSTATE_DDB_TABLE) does not exist, creating..."; \ 79 | aws dynamodb create-table \ 80 | --region $(TFSTATE_REGION) \ 81 | --table-name $(TFSTATE_DDB_TABLE) \ 82 | --attribute-definitions AttributeName=LockID,AttributeType=S \ 83 | --key-schema AttributeName=LockID,KeyType=HASH \ 84 | --billing-mode PAY_PER_REQUEST > /dev/null 2>&1 ; \ 85 | echo "Created DynamoDB table $(TFSTATE_DDB_TABLE)."; \ 86 | else \ 87 | echo "DynamoDB table $(TFSTATE_DDB_TABLE) exists."; \ 88 | fi 89 | 90 | init-all: 91 | @for m in $(MODULES); do \ 92 | $(MAKE) init MODULE=$$m || exit 1; \ 93 | done 94 | 95 | plan-all: 96 | @for m in $(MODULES); do \ 97 | $(MAKE) plan MODULE=$$m || exit 1; \ 98 | done 99 | 100 | apply-all: 101 | @for m in $(MODULES); do \ 102 | $(MAKE) apply MODULE=$$m || exit 1; \ 103 | done 104 | 105 | destroy-all: MODULES := $(shell find . -type f -name "backend.tf" -exec dirname {} \; | sort -r ) 106 | destroy-all: 107 | @for m in $(MODULES); do \ 108 | $(MAKE) destroy MODULE=$$m || exit 1; \ 109 | done 110 | 111 | 112 | init: check-env 113 | @if [ -z $(MODULE) ]; then \ 114 | echo "MODULE was not set."; \ 115 | exit 1; \ 116 | fi 117 | @rm -rf $(MODULE)/.terraform/*.tfstate 118 | @echo ENVIRONMENT=$(ENVIRONMENT) MODULE=$(MODULE) terraform::init 119 | @terraform -chdir=$(MODULE) init \ 120 | -input=false \ 121 | -upgrade \ 122 | -backend=true \ 123 | -backend-config="region=$(TFSTATE_REGION)" \ 124 | -backend-config="bucket=$(TFSTATE_S3_BUCKET)" \ 125 | -backend-config="dynamodb_table=$(TFSTATE_DDB_TABLE)" \ 126 | 2>&1 | tee -a tf-logs/$(notdir $(MODULE))-init.log 127 | 128 | tf-select-ws: 129 | @if [ -z $(MODULE) ]; then \ 130 | echo "MODULE was not set."; \ 131 | exit 1; \ 132 | fi 133 | @echo ENVIRONMENT=$(ENVIRONMENT) MODULE=$(MODULE) Switching to Terraform workspace: $(ENVIRONMENT) 134 | @terraform -chdir=$(MODULE) workspace select -or-create=true $(ENVIRONMENT) 135 | 136 | refresh: init tf-select-ws 137 | @echo ENVIRONMENT=$(ENVIRONMENT) MODULE=$(MODULE) terraform::refresh 138 | @$(call execute_terraform,$(MODULE),refresh) 139 | 140 | plan: init tf-select-ws 141 | @echo ENVIRONMENT=$(ENVIRONMENT) MODULE=$(MODULE) terraform::plan 142 | @$(call execute_terraform,$(MODULE),plan) 143 | 144 | apply: init tf-select-ws 145 | @echo ENVIRONMENT=$(ENVIRONMENT) MODULE=$(MODULE) terraform::apply 146 | @$(call execute_terraform,$(MODULE),apply,"-auto-approve") 147 | 148 | destroy: init tf-select-ws 149 | @echo ENVIRONMENT=$(ENVIRONMENT) MODULE=$(MODULE) terraform::destroy 150 | @$(call execute_terraform,$(MODULE),destroy,$(TF_AUTO_APPROVE)) 151 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/30.eks/35.addons/README.md: -------------------------------------------------------------------------------- 1 | # EKS Blueprints addons 2 | 3 | This part of this pattern deploys and configures the relevant Kubernetes addons. It uses the [Amazon EKS Blueprints addons](https://aws-ia.github.io/terraform-aws-eks-blueprints-addons/main/) to deploy them.
A subset of the addons deployed in this folder includes: 4 | 5 | * [Cert-Manager](https://aws-ia.github.io/terraform-aws-eks-blueprints-addons/main/addons/cert-manager/) 6 | * [External Secrets](https://aws-ia.github.io/terraform-aws-eks-blueprints-addons/main/addons/external-secrets/) 7 | * [AWS for FluentBit](https://aws-ia.github.io/terraform-aws-eks-blueprints-addons/main/addons/aws-for-fluentbit/) 8 | * [ArgoCD](https://aws-ia.github.io/terraform-aws-eks-blueprints-addons/main/addons/argocd/) for GitOps tooling 9 | * and more... 10 | 11 | ## Prerequisites 12 | ### helm 13 | 14 | To install Helm, the package manager for Kubernetes, download the Helm binary for your operating system from the Helm releases page. On Linux and macOS, run `curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 && chmod 700 get_helm.sh && ./get_helm.sh` in a terminal. Windows users should download the `.zip` file, extract it, and move `helm.exe` to a directory on their `PATH`. Verify the installation with `helm version`. Next, add the stable repository with `helm repo add stable https://charts.helm.sh/stable` and update your repositories with `helm repo update`. This installs Helm and prepares it for managing Kubernetes applications by updating the local list of charts to match the latest versions in your repositories. For additional details and configuration, refer to the Helm documentation. 15 | 16 | 17 | ## Using the services deployed in this part 18 | 19 | 20 | ## Architecture Decisions 21 | 22 | ### All relevant addons should be configured based on the use-case and not based on flags 23 | 24 | #### Context 25 | 26 | One of the purposes of this project is to provide a cluster that is ready for deploying applications into it. Therefore, instead of enabling/disabling addons based on individual flags, we enable the relevant addons based on a use-case. 27 | 28 | #### Decision 29 | 30 | For every use-case/configuration that this project collects as it grows, we will enable a group of addons for a specific use-case or purpose. For example, instead of allowing users to enable observability addons one by one, we group them together under a variable called `observability_configuration.aws_oss_tooling`, and this automatically configures all the addons and the relevant AWS services needed to enable observability in the cluster. 31 | 32 | #### Consequences 33 | 34 | Grouping addon deployment based on requirements simplifies the deployment process. We might have to enable deploying addons one-by-one as an "escape hatch", but we will do that based on feedback collected along the way.
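As a concrete illustration of this decision, several addon toggles in this folder's `main.tf` key off the single `observability_configuration` use-case flag rather than individual per-addon flags (trimmed excerpt):

```
# one use-case flag drives a group of addons
enable_external_secrets = try(var.observability_configuration.aws_oss_tooling, false)

enable_cert_manager = try(
  var.observability_configuration.aws_oss_tooling
  && var.observability_configuration.aws_oss_tooling_config.enable_adot_collector,
  false)

enable_aws_for_fluentbit = try(
  var.observability_configuration.aws_oss_tooling
  && !var.observability_configuration.aws_oss_tooling_config.enable_adot_collector,
  false)
```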
35 | 36 | 37 | ## Deploy it 38 | 39 | To deploy this folder's resources to a specific environment, use the following commands: 40 | 41 | ``` 42 | terraform init --backend-config=../../00.global/global-backend-config 43 | terraform workspace new <ENVIRONMENT> 44 | terraform workspace select <ENVIRONMENT> 45 | terraform apply -var-file="../../00.global/vars/dev.tfvars" 46 | ``` 47 | 48 | 49 | ## Troubleshooting 50 | 51 | 52 | ## Terraform docs 53 | 54 | ## Requirements 55 | 56 | | Name | Version | 57 | |------|---------| 58 | | [terraform](#requirement\_terraform) | >= 1.0 | 59 | | [aws](#requirement\_aws) | ~> 5.40 | 60 | | [helm](#requirement\_helm) | ~> 2.7 | 61 | | [kubectl](#requirement\_kubectl) | ~> 2.0 | 62 | | [kubernetes](#requirement\_kubernetes) | ~> 2.22 | 63 | | [null](#requirement\_null) | ~> 3.0 | 64 | 65 | ## Providers 66 | 67 | | Name | Version | 68 | |------|---------| 69 | | [aws](#provider\_aws) | ~> 5.40 | 70 | | [null](#provider\_null) | ~> 3.0 | 71 | | [terraform](#provider\_terraform) | n/a | 72 | 73 | ## Modules 74 | 75 | | Name | Source | Version | 76 | |------|--------|---------| 77 | | [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | ~> 1.21.0 | 78 | 79 | ## Resources 80 | 81 | | Name | Type | 82 | |------|------| 83 | | [null_resource.clean_up_argocd_resources](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource | 84 | | [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source | 85 | | [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source | 86 | | [terraform_remote_state.eks](https://registry.terraform.io/providers/hashicorp/terraform/latest/docs/data-sources/remote_state) | data source | 87 | 88 | ## Inputs 89 | 90 | | Name | Description | Type | Default | Required | 91 | |------|-------------|------|---------|:--------:| 92 | | [observability\_configuration](#input\_observability\_configuration) | observability configuration variable | <pre>object({<br>    aws_oss_tooling        = optional(bool, true) // AMP & AMG<br>    aws_native_tooling     = optional(bool, false) // CW<br>    aws_oss_tooling_config = optional(map(any), {})<br>  })</pre>
| n/a | yes | 93 | | [tags](#input\_tags) | Tags to apply to resources | `map(string)` | `{}` | no | 94 | 95 | ## Outputs 96 | 97 | | Name | Description | 98 | |------|-------------| 99 | | [access\_argocd](#output\_access\_argocd) | ArgoCD Access | 100 | | [configure\_argocd](#output\_configure\_argocd) | Terminal Setup | 101 | | [external\_secrets\_addon\_output](#output\_external\_secrets\_addon\_output) | external-secrets addon output values | 102 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/40.observability/45.aws-oss-observability/grafana_operator.tf: -------------------------------------------------------------------------------- 1 | resource "helm_release" "grafana_operator" { 2 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_grafana_operator ? 1 : 0 3 | chart = var.go_config.helm_chart 4 | name = var.go_config.helm_name 5 | namespace = var.go_config.k8s_namespace 6 | version = var.go_config.helm_chart_version 7 | create_namespace = var.go_config.create_namespace 8 | values = [ 9 | yamlencode(local.critical_addons_tolerations) 10 | ] 11 | max_history = 3 12 | } 13 | 14 | locals { 15 | cluster_secretstore_name = "aws-parameter-store" 16 | cluster_secretstore_sa = "external-secrets-sa" // this is currently const - need to dynamically get it from the cluster 17 | esop_secret_name = "external-secrets" 18 | target_secret_name = "grafana-admin-credentials" 19 | } 20 | 21 | #--------------------------------------------------------------- 22 | # External Secrets Operator - Secret 23 | #--------------------------------------------------------------- 24 | 25 | resource "aws_kms_key" "secrets" { 26 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_grafana_operator ? 1 : 0 27 | enable_key_rotation = true 28 | } 29 | 30 | # handle grafana api key expiration 31 | # https://github.com/hashicorp/terraform-provider-aws/issues/27043#issuecomment-1614947274 32 | resource "time_rotating" "this" { 33 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_grafana_operator ? 1 : 0 34 | rotation_days = local.grafana_workspace_api_expiration_days 35 | } 36 | 37 | resource "time_static" "this" { 38 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_grafana_operator ? 1 : 0 39 | rfc3339 = time_rotating.this[count.index].rfc3339 40 | } 41 | 42 | resource "aws_grafana_workspace_api_key" "this" { 43 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_grafana_operator ? 1 : 0 44 | key_name = "eks-monitoring-grafana-admin-key" 45 | key_role = "ADMIN" 46 | seconds_to_live = local.grafana_workspace_api_expiration_seconds // TODO: mechanism to rotate expired key 47 | workspace_id = module.managed_grafana[count.index].workspace_id 48 | 49 | lifecycle { 50 | replace_triggered_by = [ 51 | time_static.this 52 | ] 53 | } 54 | } 55 | 56 | resource "aws_ssm_parameter" "secret" { 57 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_grafana_operator ? 
1 : 0 58 | name = "/eks-accelerator/${terraform.workspace}/grafana-api-key" 59 | description = "SSM Secret to store grafana API Key" 60 | type = "SecureString" 61 | value = jsonencode({ 62 | GF_SECURITY_ADMIN_APIKEY = tostring(aws_grafana_workspace_api_key.this[count.index].key) 63 | key_id = aws_kms_key.secrets[count.index].id 64 | }) 65 | } 66 | 67 | 68 | #--------------------------------------------------------------- 69 | # External Secrets Operator - Cluster Secret Store 70 | #--------------------------------------------------------------- 71 | resource "kubectl_manifest" "cluster_secretstore" { 72 | count = var.observability_configuration.aws_oss_tooling && var.observability_configuration.aws_oss_tooling_config.enable_grafana_operator ? 1 : 0 73 | yaml_body = < 95 | ## Requirements 96 | 97 | | Name | Version | 98 | |------|---------| 99 | | [terraform](#requirement\_terraform) | >= 1.0 | 100 | | [aws](#requirement\_aws) | >= 5.40.0 | 101 | | [helm](#requirement\_helm) | >= 2.7 | 102 | | [kubectl](#requirement\_kubectl) | >= 2.0.3 | 103 | | [kubernetes](#requirement\_kubernetes) | 2.22.0 | 104 | | [null](#requirement\_null) | >= 3.0 | 105 | 106 | ## Providers 107 | 108 | | Name | Version | 109 | |------|---------| 110 | | [aws](#provider\_aws) | >= 5.40.0 | 111 | | [aws.virginia](#provider\_aws.virginia) | >= 5.40.0 | 112 | | [kubectl](#provider\_kubectl) | >= 2.0.3 | 113 | | [kubernetes](#provider\_kubernetes) | 2.22.0 | 114 | | [null](#provider\_null) | >= 3.0 | 115 | | [terraform](#provider\_terraform) | n/a | 116 | 117 | ## Modules 118 | 119 | | Name | Source | Version | 120 | |------|--------|---------| 121 | | [ebs\_csi\_driver\_irsa](#module\_ebs\_csi\_driver\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.43 | 122 | | [eks](#module\_eks) | terraform-aws-modules/eks/aws | 20.24.0 | 123 | | [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | ~> 1.16.2 | 124 | 125 | ## Resources 126 | 127 | | Name | Type | 128 | |------|------| 129 | | [aws_ec2_tag.cluster_primary_security_group](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ec2_tag) | resource | 130 | | [aws_eks_access_entry.karpenter_node](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_access_entry) | resource | 131 | | [kubectl_manifest.karpenter_manifests](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 132 | | [kubernetes_annotations.gp2](https://registry.terraform.io/providers/hashicorp/kubernetes/2.22.0/docs/resources/annotations) | resource | 133 | | [kubernetes_storage_class_v1.gp3](https://registry.terraform.io/providers/hashicorp/kubernetes/2.22.0/docs/resources/storage_class_v1) | resource | 134 | | [null_resource.update-kubeconfig](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource | 135 | | [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source | 136 | | [aws_ecrpublic_authorization_token.token](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ecrpublic_authorization_token) | data source | 137 | | [aws_iam_session_context.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_session_context) | data source | 138 | | [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | 
data source | 139 | | [kubectl_path_documents.karpenter_manifests](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/data-sources/path_documents) | data source | 140 | | [terraform_remote_state.iam](https://registry.terraform.io/providers/hashicorp/terraform/latest/docs/data-sources/remote_state) | data source | 141 | | [terraform_remote_state.vpc](https://registry.terraform.io/providers/hashicorp/terraform/latest/docs/data-sources/remote_state) | data source | 142 | 143 | ## Inputs 144 | 145 | | Name | Description | Type | Default | Required | 146 | |------|-------------|------|---------|:--------:| 147 | | [cluster\_config](#input\_cluster\_config) | cluster configurations such as version, public/private API endpoint, and more | `map(string)` | `{}` | no | 148 | | [kms\_key\_admin\_roles](#input\_kms\_key\_admin\_roles) | list of role ARNs to add to the KMS policy | `list(string)` | `[]` | no | 149 | | [shared\_config](#input\_shared\_config) | Shared configuration across all modules/folders | `map(any)` | `{}` | no | 150 | | [tags](#input\_tags) | Tags to apply to resources | `map(string)` | `{}` | no | 151 | 152 | ## Outputs 153 | 154 | | Name | Description | 155 | |------|-------------| 156 | | [cluster\_certificate\_authority\_data](#output\_cluster\_certificate\_authority\_data) | n/a | 157 | | [cluster\_endpoint](#output\_cluster\_endpoint) | n/a | 158 | | [cluster\_name](#output\_cluster\_name) | The EKS Cluster name | 159 | | [configure\_kubectl](#output\_configure\_kubectl) | Configure kubectl: make sure you're logged in with the correct AWS profile and run the following command to update your kubeconfig | 160 | | [kubernetes\_version](#output\_kubernetes\_version) | The EKS Cluster version | 161 | | [oidc\_provider\_arn](#output\_oidc\_provider\_arn) | The OIDC Provider ARN | 162 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/README.md: -------------------------------------------------------------------------------- 1 | # Single Cluster / Single Environment / Single Account 2 | 3 | ## Table of contents 4 | * [Design Considerations](#design-considerations) 5 | * [Architecture diagram](#architecture-diagram) 6 | * [Capabilities deployed](#capabilities-deployed-in-this-reference-implementation) 7 | * [Deploying the pattern](#deploying-the-pattern) 8 | * [Architecture Decisions](#architecture-decisions) 9 | 10 | ## Design considerations 11 | 12 | This reference implementation is designed to deploy a single Amazon EKS cluster per environment in a single account. It is designed for customers who require a simple ready-to-use cluster, configured with a set of opinionated (yet configurable to some extent) tooling deployed alongside the cluster itself.
The ideal customer profile includes: 13 | * Customers who are early on their containerization/Kubernetes journey, and are looking for a simplified deployment to run their applications 14 | * Limited resources to manage the cluster and its configurations 15 | * Application/s that can be deployed in a single cluster 16 | * A business unit within the organization that needs to deploy a multi-environment cluster for its specific workloads 17 | 18 | ## Architecture Diagram 19 | 20 | ![architecture diagram](https://lucid.app/publicSegments/view/cca79846-a08c-4f72-84e3-df524efc409f/image.png) 21 | 22 | ## Capabilities deployed in this reference implementation 23 | 24 | This pattern deploys the following resources per environment in a single account: 25 | 26 | * Terraform remote state and locking mechanism for collaboration - this deploys the Amazon S3 bucket and Amazon DynamoDB table required to manage the Terraform remote state backend configuration. 27 | 28 | * Network configuration - the base Amazon VPC configuration needed for the Amazon EKS cluster. As an example, this includes provisioning Amazon VPC Endpoints to [reduce cost and increase security](https://aws.amazon.com/blogs/architecture/reduce-cost-and-increase-security-with-amazon-vpc-endpoints/). 29 | See networking [README](./10.networking/README.md) for detailed configuration 30 | 31 | * Access Management capabilities - provisioning a set of default [user-facing roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles) that are used to access the Amazon EKS cluster 32 | See "IAM Roles for EKS" [README](./20.iam-roles-for-eks/README.md) for detailed configuration 33 | * Amazon EKS Cluster - configured with a set of defaults (described in the EKS Cluster README) alongside a baseline set of Amazon EKS add-ons needed for minimal functionality (including Karpenter for node provisioning). 34 | See EKS Cluster [README](./30.eks/30.cluster/README.md) for detailed configuration 35 | * EKS Auto Mode - Enabling EKS Auto Mode provides fully automated cluster management for compute, storage, and networking. 36 | 37 | * EKS Blueprints addons - The intent of this folder is to provision the relevant addons, based on the enabled capabilities configured for this reference implementation. 38 | See EKS Blueprints addons [README](./30.eks/35.addons/README.md) for detailed configuration 39 | 40 | * Observability capabilities - based on the observability configuration, this part deploys the relevant AWS services and addons into your cluster with a ready-to-use base to observe applications deployed to the cluster. Currently it supports the following configurations: 41 | * AWS OSS observability services stack (see AWS OSS [README](./40.observability/45.aws-oss-observability/README.md) folder for detailed configuration) which includes: 42 | * Amazon Managed Service for Prometheus - a serverless, Prometheus-compatible monitoring service for container metrics. 43 | * Amazon Managed Grafana - a data visualization service for your operational metrics, logs, and traces 44 | * AWS managed scraper for AMP that scrapes metrics from your Amazon EKS cluster directly into an AMP workspace 45 | 46 | 47 | ## Configurable variables 48 | 49 | This pattern uses a global configurable variable file per environment to allow you to customize environment-specific objects for the different environments. The variables are documented in the [`00.global/vars/example.tfvars`](./00.global/vars/example.tfvars) configuration file.
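For illustration, here is a trimmed sketch of such a per-environment variable file. The keys mirror the `observability_configuration` variable defined in this repository; the values shown are just one possible combination:

```
observability_configuration = {
  aws_oss_tooling    = true  // AMP & AMG
  aws_native_tooling = false // CloudWatch
  aws_oss_tooling_config = {
    enable_managed_collector = true
    enable_grafana_operator  = true
  }
}
```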
50 | 51 | Instead of configuring flags for provisioning individual resources, this pattern uses use-case-driven flags that result in a complete configuration for a collection of deployments, services, and configurations. For example, setting `observability_configuration.aws_oss_tooling = true` results in provisioning the relevant AWS resources (such as AMP and AMG) as well as the configuration that connects them together. 52 | 53 | ## Deploying the pattern 54 | 55 | This pattern relies on multiple Terraform configurations that reside in separate folders (such as: networking, iam-roles-for-eks, eks, addons, and observability). 56 | Each folder that holds a Terraform configuration also has a `backend.tf` Terraform configuration file used to indicate the backend S3 key prefix. 57 | 58 | Before deploying the whole cluster configuration, this pattern uses Amazon S3 and Amazon DynamoDB to store the Terraform state of all resources across environments, and provides a locking mechanism. 59 | The deployment of the S3 bucket and the DynamoDB table is configured in the [`Makefile`](./Makefile) under the `bootstrap` stage. 60 | 61 | Default S3 bucket name: `tfstate-<account-id>` 62 | Default DynamoDB table name: `tfstate-lock` 63 | 64 | Several environment variables must be set before running any target: 65 | - `AWS_REGION` - Specifies the AWS region to run on 66 | - `AWS_PROFILE` - The AWS configuration profile to be used. 67 | 68 | ### Makefile 69 | We are using a [`Makefile`](./Makefile) that is designed to automate various tasks related to managing the Terraform infrastructure. 70 | It provides several targets (or commands) to handle different aspects of the Terraform workflow. 71 | 72 | **Targets**: 73 | - `bootstrap`: Responsible for setting up the backend (S3 bucket and DynamoDB table) for state management. 74 | - `init-all`: Initializes all Terraform modules by calling the `init` target for each module. 75 | - `plan-all`: Runs the `plan` target for all Terraform modules. 76 | - `apply-all`: Runs the `apply` target for all Terraform modules. 77 | - `plan`: Runs the `terraform plan` command for a specific module. 78 | - `apply`: Runs the `terraform apply` command for a specific module. 79 | - `destroy`: Runs the `terraform destroy` command for a specific module. 80 | 81 | The `-all` Makefile targets iterate over all of the folders that hold a Terraform configuration with a `backend.tf` file, and deploy them one by one for the specified target environment. 82 | 83 | **Variables**: 84 | - `ENVIRONMENT`: Specifies the environment (e.g., dev, prod) for which the Terraform configuration will be applied. It defaults to `dev` if not set. 85 | - `MODULE`: Specifies a specific module to run. 86 | 87 | ### Step 1 - deploy the Terraform resources for state management (once for all environments) 88 | ```bash 89 | export AWS_REGION=us-east-1 90 | export AWS_PROFILE=default 91 | make bootstrap 92 | ``` 93 | 94 | ### Step 2 - configure environments for deployment 95 | 96 | In this step you should define the environments you want to deploy to (dev, staging, prod, etc.), as well as the overall configuration variables for every environment. 97 | 98 | To define the environments you want to deploy, ensure that the folder [`00.global/vars/`](00.global/vars) contains a file for every environment, named exactly after the `ENVIRONMENT` value we will use with our Makefile.
99 | As a starting point, this reference implementation includes a general baseline file named [`example.tfvars`](./00.global/vars/example.tfvars) as well as files for the `dev` and `prod` environments. 100 | 101 | ### Step 3 - Deploy environment 102 | In this step you will deploy the environment based on the configuration you defined per environment in the previous step. 103 | You can then trigger the `apply-all` Makefile target and it will provision an environment based on that per-environment configuration. 104 | 105 | ```bash 106 | export AWS_REGION=us-east-1 107 | export AWS_PROFILE=default 108 | make ENVIRONMENT=dev apply-all 109 | ``` 110 | 111 | #### Deploy a specific module 112 | To deploy a specific module, specify a MODULE variable before the Makefile target. 113 | 114 | ```bash 115 | export AWS_REGION=us-east-1 116 | export AWS_PROFILE=default 117 | make ENVIRONMENT=dev MODULE=10.networking apply 118 | ``` 119 | 120 | ### Step 4 - Cleanup - destroy 121 | Be cautious when using destroy to clean up resources. 122 | By default, the `-auto-approve` flag is disabled for destroy; to enable it, use the `AUTO_APPROVE=true` variable. 123 | 124 | ``` 125 | export AWS_REGION=us-east-1 126 | export AWS_PROFILE=default 127 | make ENVIRONMENT=dev MODULE=10.networking destroy AUTO_APPROVE=true 128 | ``` 129 | 130 | #### Destroy ALL 131 | This will run Terraform destroy on all modules in reverse order. 132 | 133 | ``` 134 | export AWS_REGION=us-east-1 135 | export AWS_PROFILE=default 136 | make ENVIRONMENT=dev destroy-all AUTO_APPROVE=true 137 | ``` 138 | 139 | ## Architecture Decisions 140 | 141 | ### Global variable file per environment 142 | 143 | #### Context 144 | 145 | This pattern can deploy the same configuration to multiple environments. There's a need to customize environment-specific configurations to support gradual updates or per-environment customization. 146 | 147 | #### Decision 148 | 149 | This pattern standardizes on a shared Terraform variable file per environment, which is passed on the CLI (see the Makefile, which wraps the CLI commands) and used throughout the multiple folder configurations. 150 | 151 | #### Consequences 152 | 153 | This decision helps us share the variables across the different folders, and standardize on variable naming and values. 154 | 155 | ### Storing environment-specific state using Terraform Workspaces 156 | 157 | 158 | #### Context 159 | 160 | To keep the IaC code DRY, the state of the different resources needs to be kept in the context of an environment, so that configurations in other folders will be able to access the outputs of the right environment. 161 | 162 | #### Decision 163 | 164 | This pattern uses Terraform Workspaces to store and retrieve environment-specific state from the S3 bucket being used as the remote state backend. Per HashiCorp's recommendations on ["Multi-Environment Deployment"](https://developer.hashicorp.com/terraform/tutorials/automation/automate-terraform#multi-environment-deployment), it's encouraged to use Terraform workspaces to manage multiple environments: "Where possible, it's recommended to use a single backend configuration for all environments and use the terraform workspace command to switch between workspaces" 165 | 166 | #### Consequences 167 | 168 | The use of Terraform workspaces allows us to use the same IaC code and backend configuration, without changing it per environment.
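In practice, the per-folder flow the Makefile automates looks roughly like the sketch below (the same commands appear in the observability README's "Deploy it" section later in this repository; `dev` stands in for any environment name):

```bash
# Initialize against the shared S3 backend configuration
terraform init --backend-config=../../00.global/global-backend-config

# Switch to (or create) the workspace for the target environment
terraform workspace select dev || terraform workspace new dev

# Apply with that environment's shared variable file
terraform apply -var-file="../../00.global/vars/dev.tfvars"
```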
As this project's Makefile wraps the relevant workspace commands, if users choose to write their own CLI automation, they'll need to handle workspace switching before applying per-environment configuration. 169 | 170 | ### Configuring Instance Store Disks (NVMe) with RAID0 171 | 172 | #### Context 173 | 174 | We want to leverage instance store volumes for improved disk I/O performance, especially for the containerd root volume which holds container images and ephemeral storage used by containers. 175 | 176 | #### Decision 177 | 178 | Instance store volumes are configured in a RAID0 array using Karpenter's [instanceStorePolicy](https://karpenter.sh/docs/concepts/nodeclasses/#specinstancestorepolicy) in the default EC2NodeClass. This configuration will be applied to the EC2 instance types that offer NVMe instance store volumes. 179 | 180 | #### Consequences 181 | 182 | Implementing RAID0 with instance store volumes will significantly boost I/O performance for containerd and ephemeral storage, potentially improving container start-up times and overall application responsiveness. 183 | This configuration maximizes the use of included instance store volumes, reducing EBS dependency and reducing costs. 184 | -------------------------------------------------------------------------------- /single-account-single-cluster-multi-env/30.eks/30.cluster/main.tf: -------------------------------------------------------------------------------- 1 | data "aws_region" "current" {} 2 | data "aws_caller_identity" "current" {} 3 | # data "aws_availability_zones" "available" {} 4 | data "aws_iam_session_context" "current" { 5 | # This data source provides information on the IAM source role of an STS assumed role 6 | # For non-role ARNs, this data source simply passes the ARN through as the issuer ARN 7 | # Ref https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2327#issuecomment-1355581682 8 | # Ref https://github.com/hashicorp/terraform-provider-aws/issues/28381 9 | arn = data.aws_caller_identity.current.arn 10 | } 11 | 12 | 13 | ################################################################################ 14 | # EKS Cluster 15 | ################################################################################ 16 | #tfsec:ignore:aws-eks-enable-control-plane-logging 17 | module "eks" { 18 | source = "terraform-aws-modules/eks/aws" 19 | version = "~> 20.36.0" 20 | 21 | cluster_name = local.cluster_name 22 | cluster_version = local.cluster_version 23 | cluster_endpoint_public_access = try(!var.cluster_config.private_eks_cluster, false) 24 | 25 | create_iam_role = try(var.cluster_config.create_iam_role, true) 26 | iam_role_arn = try(var.cluster_config.cluster_iam_role_arn, null) 27 | iam_role_use_name_prefix = false 28 | 29 | node_iam_role_use_name_prefix = false 30 | 31 | cluster_enabled_log_types = ["audit", "api", "authenticator", "controllerManager", "scheduler"] 32 | 33 | vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id 34 | subnet_ids = local.private_subnet_ids 35 | control_plane_subnet_ids = local.control_plane_subnet_ids 36 | 37 | # Combine root account, current user/role and additional roles to be able to access the cluster KMS key - required for terraform updates 38 | kms_key_administrators = distinct(concat([ 39 | "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"], 40 | var.kms_key_admin_roles, 41 | [data.aws_iam_session_context.current.issuer_arn] 42 | )) 43 | 44 | enable_cluster_creator_admin_permissions = true 45 | authentication_mode =
try(var.cluster_config.authentication_mode, "API") 46 | 47 | bootstrap_self_managed_addons = false 48 | 49 | cluster_compute_config = local.eks_auto_mode ? { enabled = local.eks_auto_mode } : {} 50 | 51 | # We're using the EKS module to provision only EKS managed addons 52 | # The reason for that is to use the `before_compute` parameter, which allows an addon to be deployed before compute is available for it to run on 53 | cluster_addons = merge( 54 | local.capabilities.networking ? { 55 | vpc-cni = { 56 | # Specify that the VPC CNI addon should be deployed before compute to ensure 57 | # the addon is configured before data plane compute resources are created 58 | before_compute = true 59 | most_recent = true # To ensure access to the latest settings provided 60 | preserve = false 61 | configuration_values = jsonencode({ 62 | env = { 63 | ENABLE_PREFIX_DELEGATION = "false" 64 | WARM_ENI_TARGET = "0" 65 | MINIMUM_IP_TARGET = "10" 66 | WARM_IP_TARGET = "5" 67 | } 68 | }) 69 | } 70 | } : {}, 71 | local.capabilities.kube_proxy ? { 72 | kube-proxy = { 73 | before_compute = true 74 | most_recent = true 75 | preserve = false 76 | } 77 | } : {}, 78 | local.capabilities.coredns ? { 79 | coredns = { 80 | resolve_conflicts_on_create = "OVERWRITE" 81 | resolve_conflicts_on_update = "PRESERVE" 82 | preserve = false 83 | most_recent = true 84 | configuration_values = jsonencode( 85 | { 86 | replicaCount : 2, 87 | tolerations : [local.critical_addons_tolerations.tolerations[0]] 88 | } 89 | ) 90 | } 91 | } : {}, 92 | local.capabilities.identity ? { 93 | eks-pod-identity-agent = { 94 | most_recent = true 95 | preserve = false 96 | } 97 | } : {}, 98 | local.capabilities.blockstorage ? { 99 | aws-ebs-csi-driver = { 100 | service_account_role_arn = module.ebs_csi_driver_irsa[0].iam_role_arn 101 | preserve = false 102 | } 103 | } : {} 104 | ) 105 | 106 | access_entries = { 107 | EKSClusterAdmin = { 108 | kubernetes_groups = [] 109 | principal_arn = data.terraform_remote_state.iam.outputs.iam_roles_map["EKSClusterAdmin"] 110 | 111 | policy_associations = { 112 | single = { 113 | policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy" 114 | access_scope = { 115 | type = "cluster" 116 | } 117 | } 118 | } 119 | } 120 | EKSAdmin = { 121 | kubernetes_groups = [] 122 | principal_arn = data.terraform_remote_state.iam.outputs.iam_roles_map["EKSAdmin"] 123 | 124 | policy_associations = { 125 | single = { 126 | policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAdminPolicy" 127 | access_scope = { 128 | type = "cluster" 129 | } 130 | } 131 | } 132 | } 133 | EKSEdit = { 134 | kubernetes_groups = [] 135 | principal_arn = data.terraform_remote_state.iam.outputs.iam_roles_map["EKSEdit"] 136 | policy_associations = { 137 | single = { 138 | policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSEditPolicy" 139 | access_scope = { 140 | namespaces = ["default"] 141 | type = "namespace" 142 | } 143 | } 144 | } 145 | } 146 | EKSView = { 147 | kubernetes_groups = [] 148 | principal_arn = data.terraform_remote_state.iam.outputs.iam_roles_map["EKSView"] 149 | 150 | policy_associations = { 151 | single = { 152 | policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy" 153 | access_scope = { 154 | namespaces = ["default"] 155 | type = "namespace" 156 | } 157 | } 158 | } 159 | } 160 | 161 | } 162 | 163 | # Fargate profiles use the cluster primary security group so these are not utilized 164 | create_cluster_security_group = false 165 | create_node_security_group = false 166 | 167
| # managed node group for base EKS addons such as Karpenter 168 | eks_managed_node_group_defaults = { 169 | instance_types = ["m6g.large"] 170 | ami_type = "AL2023_ARM_64_STANDARD" 171 | iam_role_additional_policies = { 172 | SSM = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" 173 | } 174 | } 175 | eks_managed_node_groups = { 176 | "${local.cluster_name}-criticaladdons" = { 177 | create = local.create_mng_system 178 | iam_role_use_name_prefix = false 179 | subnet_ids = data.terraform_remote_state.vpc.outputs.private_subnet_ids 180 | max_size = 8 181 | desired_size = 2 182 | min_size = 2 183 | 184 | labels = { 185 | "role" : "system" 186 | } 187 | 188 | taints = { 189 | critical_addons = { 190 | key = "CriticalAddonsOnly" 191 | effect = "NO_SCHEDULE" 192 | } 193 | } 194 | } 195 | } 196 | tags = local.tags 197 | } 198 | 199 | resource "null_resource" "update-kubeconfig" { 200 | depends_on = [module.eks] 201 | triggers = { 202 | always_run = timestamp() 203 | region = local.region 204 | cluster_name = module.eks.cluster_name 205 | } 206 | provisioner "local-exec" { 207 | 208 | command = "aws eks --region ${self.triggers.region} update-kubeconfig --name ${self.triggers.cluster_name}" 209 | 210 | interpreter = ["bash", "-c"] 211 | # when = destroy 212 | } 213 | lifecycle { 214 | ignore_changes = [ 215 | # Ignore changes so it won't be applied every run 216 | # This is simply to simplify the access for whoever test this solution 217 | id, 218 | triggers 219 | ] 220 | } 221 | } 222 | 223 | ################################################################################ 224 | # EBS CSI Driver 225 | ################################################################################ 226 | module "ebs_csi_driver_irsa" { 227 | count = local.capabilities.blockstorage ? 1 : 0 228 | source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks" 229 | version = "~> 5.43" 230 | 231 | role_name = "${local.cluster_name}-ebs-csi" 232 | 233 | role_policy_arns = { 234 | policy = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy" 235 | } 236 | 237 | oidc_providers = { 238 | cluster = { 239 | provider_arn = module.eks.oidc_provider_arn 240 | namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"] 241 | } 242 | } 243 | } 244 | 245 | ################################################################################ 246 | # EKS Auto Mode Node role access entry 247 | ################################################################################ 248 | resource "aws_eks_access_entry" "automode_node" { 249 | count = local.eks_auto_mode ? 1 : 0 250 | cluster_name = module.eks.cluster_name 251 | principal_arn = module.eks.node_iam_role_arn 252 | type = "EC2" 253 | } 254 | 255 | resource "aws_eks_access_policy_association" "automode_node" { 256 | count = local.eks_auto_mode ? 1 : 0 257 | cluster_name = module.eks.cluster_name 258 | access_scope { 259 | type = "cluster" 260 | } 261 | policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAutoNodePolicy" 262 | principal_arn = module.eks.node_iam_role_arn 263 | } 264 | 265 | ################################################################################ 266 | # EKS Auto Mode default NodePools & NodeClass 267 | ################################################################################ 268 | data "kubectl_path_documents" "automode_manifests" { 269 | count = local.eks_auto_mode ? 
1 : 0 270 | pattern = "${path.module}/auto-mode/*.yaml" 271 | vars = { 272 | role = module.eks.node_iam_role_name 273 | cluster_name = local.cluster_name 274 | cluster_security_group_id = module.eks.cluster_primary_security_group_id 275 | environment = terraform.workspace 276 | } 277 | depends_on = [ 278 | module.eks 279 | ] 280 | } 281 | 282 | # workaround terraform issue with attributes that cannot be determined ahead because of module dependencies 283 | # https://github.com/gavinbunney/terraform-provider-kubectl/issues/58 284 | data "kubectl_path_documents" "automode_manifests_dummy" { 285 | count = local.eks_auto_mode ? 1 : 0 286 | pattern = "${path.module}/auto-mode/*.yaml" 287 | vars = { 288 | role = "" 289 | cluster_name = "" 290 | cluster_security_group_id = "" 291 | environment = terraform.workspace 292 | } 293 | } 294 | 295 | resource "kubectl_manifest" "automode_manifests" { 296 | count = local.eks_auto_mode ? length(data.kubectl_path_documents.automode_manifests_dummy[0].documents) : 0 297 | yaml_body = element(data.kubectl_path_documents.automode_manifests[0].documents, count.index) 298 | } 299 | 300 | ################################################################################ 301 | # Storage Classes 302 | ################################################################################ 303 | resource "kubernetes_annotations" "gp2" { 304 | count = local.capabilities.blockstorage || local.eks_auto_mode ? 1 : 0 305 | api_version = "storage.k8s.io/v1" 306 | kind = "StorageClass" 307 | force = "true" 308 | metadata { 309 | name = "gp2" 310 | } 311 | annotations = { 312 | "storageclass.kubernetes.io/is-default-class" = "false" 313 | } 314 | depends_on = [ 315 | module.eks 316 | ] 317 | } 318 | 319 | resource "kubernetes_storage_class_v1" "gp3" { 320 | count = local.capabilities.blockstorage ? 1 : 0 321 | metadata { 322 | name = "gp3" 323 | annotations = { 324 | "storageclass.kubernetes.io/is-default-class" = "true" # make gp3 the default storage class 325 | } 326 | } 327 | storage_provisioner = "ebs.csi.aws.com" 328 | allow_volume_expansion = true 329 | reclaim_policy = "Delete" 330 | volume_binding_mode = "WaitForFirstConsumer" 331 | parameters = { 332 | encrypted = true 333 | fsType = "ext4" 334 | type = "gp3" 335 | } 336 | depends_on = [ 337 | module.eks 338 | ] 339 | } 340 | 341 | ################################################################################ 342 | # EKS Auto Mode Storage Class 343 | ################################################################################ 344 | resource "kubernetes_storage_class_v1" "automode" { 345 | count = local.eks_auto_mode ? 1 : 0 346 | metadata { 347 | name = "auto-ebs-sc" 348 | annotations = { 349 | "storageclass.kubernetes.io/is-default-class" = "true" 350 | } 351 | } 352 | storage_provisioner = "ebs.csi.eks.amazonaws.com" 353 | volume_binding_mode = "WaitForFirstConsumer" 354 | parameters = { 355 | encrypted = true 356 | type = "gp3" 357 | } 358 | depends_on = [ 359 | module.eks 360 | ] 361 | } 362 | 363 | ################################################################################ 364 | # EKS Auto Mode Ingress 365 | ################################################################################ 366 | resource "kubectl_manifest" "automode_ingressclass_params" { 367 | count = local.eks_auto_mode ? 
1 : 0 368 | yaml_body = </aws-fluentbit-logs` 37 | 38 | ## Architecture Decisions 39 | 40 | ### Using AWS Managed Services for OSS Observability tooling 41 | 42 | #### Context 43 | 44 | This pattern runs a single Amazon EKS cluster per environment in a single account. One of the reasons for that is to simplify the operational processes for managing the cluster and its infrastructure (including connected services, addons, etc.). 45 | 46 | #### Decision 47 | 48 | To reduce the need to manage observability tooling and infrastructure, this pattern uses Amazon Managed Service for Prometheus (AMP) and Amazon Managed Grafana (AMG) as the observability metrics database and visualization service. Amazon CloudWatch is used as the logging visualization tool, alongside a deployed aws-fluent-bit addon. Instead of an in-cluster Prometheus [kube-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack), or a self-managed [ADOT Collector](https://aws-otel.github.io/docs/getting-started/adot-eks-add-on), this pattern uses the recently launched [AWS Managed Collector](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-collector-how-to.html) for AMP. 49 | 50 | #### Consequences 51 | 52 | A consequence of this decision, as of the time of writing this README, is that there is a 1:1 mapping between an EKS cluster and an AMP workspace, meaning that each cluster will have its own AMP workspace. This shouldn't be a concern, as the design consideration of this pattern is to create isolated per-environment deployments of the cluster, its supporting services, and addons. 53 | 54 | A challenging consequence of this decision is that the AWS Managed Collector currently uses the previous way of authenticating to an EKS cluster - the `aws-auth` [ConfigMap entry](https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html) - and not the new [EKS Cluster Access Management](https://docs.aws.amazon.com/eks/latest/userguide/access-entries.html) feature of managing access to the EKS cluster using Amazon EKS APIs. Because of that, the AWS Managed Collector role needs to be added with `eksctl`, which is the best tool available to date for managing `aws-auth` ConfigMap entries. 55 | > **_NOTE:_** keep an eye on the output of this Terraform folder, which includes the `eksctl` command to create the entry for the AWS Managed Scraper IAM Role 56 | 57 | ### Grafana-Operator to configure AMG workspace 58 | 59 | #### Context 60 | 61 | To provide a complete solution in this pattern, the observability resources need to be configured as part of the deployment of this pattern. This includes connecting the AMP workspace as a data source for the AMG workspace, deploying a set of predefined dashboards as a starting point, and a process to deploy new dashboards in the same way the "basic"/predefined dashboards were deployed. 62 | 63 | #### Decision 64 | 65 | Since the AMG API doesn't support managing the workspace APIs themselves (meaning the Grafana APIs), but only the AMG service-specific configurations, this pattern uses the [grafana-operator](https://grafana.github.io/grafana-operator/docs/) to connect to the AMG workspace and deploy dashboards into it. 66 | 67 | #### Consequences 68 | 69 | 1. The grafana-operator requires an API key stored as a Kubernetes secret in the cluster so that it can connect to the workspace. That means that the API key is generated and stored in the cluster itself 70 | 2.
The dashboards are managed as assets/files stored in a Git repo and synced to the AMG workspace continuously 71 | 3. TBD - use of an API key requires a process for rotating the key, redeploying it to Systems Manager as a secret object, and syncing it to the cluster afterwards (using ESO - External Secrets Operator) 72 | 73 | ## Deploy it 74 | 75 | To deploy this folder's resources to a specific environment, use the following commands 76 | 77 | ``` 78 | terraform init --backend-config=../../00.global/global-backend-config 79 | terraform workspace new <environment> 80 | terraform workspace select <environment> 81 | terraform apply -var-file="../../00.global/vars/dev.tfvars" 82 | ``` 83 | 84 | 85 | ## Troubleshooting 86 | 87 | ### How can I check that metrics are ingested to the AMP workspace 88 | 89 | Using `awscurl` allows you to send SigV4-signed requests to the AMP workspace query API. 90 | 91 | Use the following command samples to query collecting jobs running, labels collected, and metrics values: 92 | 93 | ```bash 94 | REGION=<region> 95 | WSID=<amp-workspace-id> 96 | 97 | # List jobs collecting metrics 98 | awscurl --service="aps" --region=$REGION "https://aps-workspaces.$REGION.amazonaws.com/workspaces/$WSID/api/v1/label/job/values" | jq 99 | 100 | # List labels collected 101 | awscurl --service="aps" --region=$REGION "https://aps-workspaces.$REGION.amazonaws.com/workspaces/$WSID/api/v1/label/__name__/values" | jq 102 | 103 | # Query values for a specific metric 104 | awscurl --service="aps" --region=$REGION "https://aps-workspaces.$REGION.amazonaws.com/workspaces/$WSID/api/v1/query?query=apiserver_request_duration_seconds_bucket" | jq 105 | 106 | ``` 107 | 108 | 109 | 110 | ## Terraform docs 111 | 112 | ## Requirements 113 | 114 | | Name | Version | 115 | |------|---------| 116 | | [terraform](#requirement\_terraform) | >= 1.1.0 | 117 | | [aws](#requirement\_aws) | >= 5.40.0 | 118 | | [helm](#requirement\_helm) | >= 2.4.1 | 119 | | [kubectl](#requirement\_kubectl) | >= 2.0.3 | 120 | | [kubernetes](#requirement\_kubernetes) | >= 2.10 | 121 | | [null](#requirement\_null) | >= 3.0 | 122 | | [time](#requirement\_time) | 0.10.0 | 123 | 124 | ## Providers 125 | 126 | | Name | Version | 127 | |------|---------| 128 | | [aws](#provider\_aws) | >= 5.40.0 | 129 | | [helm](#provider\_helm) | >= 2.4.1 | 130 | | [kubectl](#provider\_kubectl) | >= 2.0.3 | 131 | | [terraform](#provider\_terraform) | n/a | 132 | | [time](#provider\_time) | 0.10.0 | 133 | 134 | ## Modules 135 | 136 | | Name | Source | Version | 137 | |------|--------|---------| 138 | | [managed\_grafana](#module\_managed\_grafana) | terraform-aws-modules/managed-service-grafana/aws | 2.1.1 | 139 | 140 | ## Resources 141 | 142 | | Name | Type | 143 | |------|------| 144 | | [aws_grafana_workspace_api_key.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/grafana_workspace_api_key) | resource | 145 | | [aws_identitystore_group.group](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/identitystore_group) | resource | 146 | | [aws_identitystore_group_membership.group_membership](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/identitystore_group_membership) | resource | 147 | | [aws_identitystore_user.user](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/identitystore_user) | resource | 148 | | [aws_kms_key.secrets](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kms_key) | resource | 149 | |
[aws_prometheus_rule_group_namespace.recording_rules](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_rule_group_namespace) | resource | 150 | | [aws_prometheus_scraper.amp_scraper](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_scraper) | resource | 151 | | [aws_prometheus_workspace.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource | 152 | | [aws_ssm_parameter.secret](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ssm_parameter) | resource | 153 | | [helm_release.grafana_operator](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource | 154 | | [helm_release.kube_state_metrics](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource | 155 | | [helm_release.prometheus_node_exporter](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource | 156 | | [kubectl_manifest.amg_remote_identity](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 157 | | [kubectl_manifest.amp_data_source](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 158 | | [kubectl_manifest.amp_scraper_clusterrole](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 159 | | [kubectl_manifest.amp_scraper_clusterrolebinding](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 160 | | [kubectl_manifest.cluster_secretstore](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 161 | | [kubectl_manifest.default_dashboards](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 162 | | [kubectl_manifest.secret](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/resources/manifest) | resource | 163 | | [time_rotating.this](https://registry.terraform.io/providers/hashicorp/time/0.10.0/docs/resources/rotating) | resource | 164 | | [time_static.this](https://registry.terraform.io/providers/hashicorp/time/0.10.0/docs/resources/static) | resource | 165 | | [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source | 166 | | [aws_eks_cluster.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster) | data source | 167 | | [aws_eks_cluster_auth.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster_auth) | data source | 168 | | [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source | 169 | | [aws_ssoadmin_instances.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ssoadmin_instances) | data source | 170 | | [kubectl_path_documents.default_dashboards_manifest](https://registry.terraform.io/providers/alekc/kubectl/latest/docs/data-sources/path_documents) | data source | 171 | | [terraform_remote_state.eks](https://registry.terraform.io/providers/hashicorp/terraform/latest/docs/data-sources/remote_state) | data source | 172 | | [terraform_remote_state.eks_addons](https://registry.terraform.io/providers/hashicorp/terraform/latest/docs/data-sources/remote_state) | data source | 173 | | 
[terraform_remote_state.vpc](https://registry.terraform.io/providers/hashicorp/terraform/latest/docs/data-sources/remote_state) | data source | 174 | 175 | ## Inputs 176 | 177 | | Name | Description | Type | Default | Required | 178 | |------|-------------|------|---------|:--------:| 179 | | [go\_config](#input\_go\_config) | Grafana Operator configuration |
object({
create_namespace = optional(bool, true)
helm_chart = optional(string, "oci://ghcr.io/grafana-operator/helm-charts/grafana-operator")
helm_name = optional(string, "grafana-operator")
k8s_namespace = optional(string, "grafana-operator")
helm_release_name = optional(string, "grafana-operator")
helm_chart_version = optional(string, "v5.5.2")
})
|
{
"create_namespace": true,
"helm_chart": "oci://ghcr.io/grafana-operator/helm-charts/grafana-operator",
"helm_chart_version": "v5.5.2",
"helm_name": "grafana-operator",
"helm_release_name": "grafana-operator",
"k8s_namespace": "grafana-operator"
}
| no | 180 | | [grafana\_admin\_email](#input\_grafana\_admin\_email) | default email for the grafana-admin user | `string` | `"email@example.com"` | no | 181 | | [ksm\_config](#input\_ksm\_config) | Kube State metrics configuration |
object({
create_namespace = optional(bool, true)
k8s_namespace = optional(string, "kube-system")
helm_chart_name = optional(string, "kube-state-metrics")
helm_chart_version = optional(string, "5.16.1")
helm_release_name = optional(string, "kube-state-metrics")
helm_repo_url = optional(string, "https://prometheus-community.github.io/helm-charts")
helm_settings = optional(map(string), {})
helm_values = optional(map(any), {})

scrape_interval = optional(string, "60s")
scrape_timeout = optional(string, "15s")
})
| `{}` | no | 182 | | [ne\_config](#input\_ne\_config) | Node exporter configuration |
object({
create_namespace = optional(bool, true)
k8s_namespace = optional(string, "prometheus-node-exporter")
helm_chart_name = optional(string, "prometheus-node-exporter")
helm_chart_version = optional(string, "4.24.0")
helm_release_name = optional(string, "prometheus-node-exporter")
helm_repo_url = optional(string, "https://prometheus-community.github.io/helm-charts")
helm_settings = optional(map(string), {})
helm_values = optional(map(any), {})

scrape_interval = optional(string, "60s")
scrape_timeout = optional(string, "60s")
})
| `{}` | no | 183 | | [observability\_configuration](#input\_observability\_configuration) | observability configuration variable |
object({
aws_oss_tooling = optional(bool, true) // AMP & AMG
aws_native_tooling = optional(bool, false) // CW
aws_oss_tooling_config = optional(map(any), {})
})
|
{
"aws_native_tooling": false,
"aws_oss_tooling": true,
"aws_oss_tooling_config": {
"enable_grafana_operator": true,
"enable_managed_collector": true,
"enable_self_managed_collectors": false,
"prometheus_name": "prom"
}
}
| no | 184 | | [prometheus\_name](#input\_prometheus\_name) | Amazon Managed Service for Prometheus Name | `string` | `"prom"` | no | 185 | | [shared\_config](#input\_shared\_config) | Shared configuration across all modules/folders | `map(any)` | `{}` | no | 186 | | [tags](#input\_tags) | Additional tags | `map(string)` | `{}` | no | 187 | 188 | ## Outputs 189 | 190 | No outputs. 191 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Guidance for Automated Provisioning of Application-Ready Amazon EKS Clusters 2 | 3 | ## Table of Contents 4 | 22 | 23 | - [Overview](#overview) 24 | - [Features and Benefits](#features-and-benefits) 25 | - [Use cases](#use-cases) 26 | - [Architecture](#architecture-overview) 27 | - [AWS Services in this Guidance](#aws-services-in-this-guidance) 28 | - [Cost](#cost) 29 | - [Security](#security) 30 | - [Supported AWS regions](#supported-aws-regions) 31 | - [Prerequisites](#prerequisites) 32 | - [Operating System](#operating-system) 33 | - [Third-Party Tools](#third-party-tools) 34 | - [Deployment Steps](#deploy-the-guidance) 35 | - [License](#license) 36 | - [Notices](#notices) 37 | 38 | ## Overview 39 | 40 | The Guidance for Automated Provisioning of Application-Ready Amazon EKS Clusters is a collection of reference implementations for Amazon Elastic Kubernetes Service (EKS) designed to accelerate the time it takes to launch a workload-ready EKS cluster. It includes an "opinionated" set of pre-configured and integrated tools/add-ons and follows best practices to support core capabilities including Autoscaling, Observability, Networking and Security. 41 | 42 | This guidance addresses the following key points: 43 | 44 | - Provides a simplified process for setting up an application-ready EKS cluster. 45 | - Includes pre-configured tools and add-ons to support essential capabilities, including an option for automated EKS management with [EKS Auto Mode](https://aws.amazon.com/eks/auto-mode/) 46 | - Aims to reduce the learning curve associated with deploying a production-ready EKS cluster. 47 | - Allows users to focus on deploying and testing applications rather than EKS cluster setup. 48 | 49 | The motivation behind this project is to accelerate and simplify the process of setting up a cluster that is ready to support applications and workloads. We've heard from customers that there can be a learning curve associated with deploying your first application-ready EKS cluster. This project aims to simplify the undifferentiated heavy lifting, allowing you to focus on deploying and testing your applications. 50 | 51 | ## Features and Benefits 52 | 53 | 1. **Single Cluster Deployment**: Deploy one Amazon EKS cluster per environment in a single account, simplifying management and reducing complexity for users new to Kubernetes. 54 | 55 | 2. **Ready-to-Use Configuration**: Receive a pre-configured cluster with opinionated yet customizable tooling, accelerating the setup process and reducing the learning curve for containerization beginners. 56 | 57 | 3. **Environment-Specific Customization**: Easily tailor cluster configurations for different environments (e.g. `dev`, `staging`, `prod`, `automode`), enabling flexible and consistent multi-environment setups. 58 | 59 | 4.
**Integrated Tooling**: Benefit from pre-integrated tools and add-ons that support core capabilities like Autoscaling, Observability, Networking, and Security, reducing the time and expertise needed to set up a production-ready cluster. 60 | 61 | 5. **Best Practices Implementation**: Automatically apply AWS and Kubernetes best practices, enhancing security, performance, and cost-efficiency without requiring deep expertise. 62 | 63 | 6. **Terraform-Based Deployment**: Utilize Terraform for infrastructure-as-code deployment, ensuring reproducibility and easier management of cluster configurations across environments. 64 | 65 | This reference implementation is designed for customers that require a simple ready-to-use cluster, configured with a set of opinionated (yet configurable to some extent) tooling deployed alongside the cluster itself. The ideal customer profile includes: 66 | 67 | - Organizations in the early stages of containerization/Kubernetes adoption looking for a simplified deployment to run their applications 68 | - Teams with limited resources to manage cluster configurations 69 | - Projects with applications that can be deployed in a single cluster 70 | - Business units within an organization that need to deploy a multi-environment cluster for their specific workloads 71 | 72 | ## Use Cases 73 | 74 | 1. **Accelerating Initial Kubernetes Adoption** 75 | Streamlining the transition to containerization for organizations new to Kubernetes/EKS. 76 | This use case addresses organizations in the early stages of adopting containerization and Kubernetes. The reference implementation provides a pre-configured, production-ready EKS cluster, reducing the complexity of initial setup. It allows teams to focus on application containerization rather than cluster configuration, accelerating the path to leveraging containerized workloads. 77 | 78 | 2. **Optimizing DevOps Resources in Early Kubernetes Adoption Stages** 79 | Enabling efficient cluster management for teams new to Kubernetes operations. For organizations in the early phases of Kubernetes adoption, DevOps teams often need to balance learning new technologies with maintaining existing systems. This implementation offers a pre-configured EKS cluster with best practices in security and scalability. It reduces the initial operational burden, allowing teams to gradually build Kubernetes expertise while maintaining productivity. 80 | 81 | 3. **Simplified Single-Cluster Deployment for Initial Projects** 82 | Providing a streamlined infrastructure for initial Kubernetes applications. This use case caters to teams deploying their first applications on Kubernetes. The reference implementation offers a robust, single-cluster environment per development stage. It's ideal for initial projects that need containerization benefits without multi-cluster complexity, allowing teams to start small and scale as they gain experience. 83 | 84 | 4. **Consistent Multi-Environment Setup for Kubernetes Newcomers** 85 | Facilitating uniform cluster deployments across dev, staging, and production for teams new to Kubernetes. For organizations setting up their first Kubernetes environments, maintaining consistency across development stages is crucial. This implementation uses Terraform and environment-specific configurations to ensure identical setups across dev, staging, and production. It helps teams new to Kubernetes establish a solid foundation for their development pipeline from the start. 86 | 87 | 5. 
**Adopting Terraform and AWS Best Practices for Amazon EKS** 88 | Implementing infrastructure-as-code and AWS-recommended configurations for EKS deployments. This use case is tailored for organizations aiming to adopt or improve their DevOps practices using Terraform while leveraging AWS best practices for EKS. The reference implementation provides a Terraform-based deployment that incorporates AWS-recommended configurations for EKS. It allows teams to quickly implement infrastructure-as-code methodologies, ensuring reproducibility and version control of their EKS environments. Simultaneously, it applies AWS best practices, optimizing security, performance, and cost-efficiency. This approach enables companies to accelerate their DevOps transformation while ensuring their EKS deployments align with industry standards and AWS expertise. 89 | 90 | ## Architecture Overview 91 | 92 | This section provides a reference implementation architecture diagram for the components deployed with this Guidance. 93 | 94 | At its core, the solution deploys an Amazon EKS cluster within a custom Amazon VPC. This VPC is configured with public and private subnets across multiple Availability Zones for high availability and resilience. 95 | 96 | Key elements of the architecture include: 97 | 98 | 1. [**Amazon VPC**](https://aws.amazon.com/vpc/): A custom VPC with public and private subnets, providing network isolation and security. 99 | 2. [**Amazon EKS Cluster**](https://aws.amazon.com/eks/): The central component, managing the Kubernetes control plane and compute nodes 100 | 3. [**EKS Managed Node Groups**](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html): Auto-scaling groups of EC2 instances that serve as EKS compute nodes. 101 | 4. [**Amazon Identity & Access Management**](https://aws.amazon.com/iam/): Integrated with EKS for fine-grained access control. 102 | 5. [**Amazon Elastic Container Registry (ECR)**](http://aws.amazon.com/ecr/): For storing and managing container images. 103 | 6. [**AWS Load Balancer Controller**](https://aws.amazon.com/blogs/networking-and-content-delivery/deploying-aws-load-balancer-controller-on-amazon-eks/): Automatically provisions Application Load Balancers or Network Load Balancers when a Kubernetes service of type LoadBalancer is created. 104 | 7. **Observability Tools**: [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/), [Amazon Managed Grafana](https://aws.amazon.com/grafana/), and [Amazon Managed Service for Prometheus](https://aws.amazon.com/prometheus/) for comprehensive monitoring and logging. 105 | 8. [**Terraform Resources**](https://www.hashicorp.com/products/terraform): Representing infrastructure-as-code components that define and provision the guidance architecture. 106 | 107 | This architecture is designed to provide a secure, scalable, and easily manageable EKS environment, incorporating AWS best practices and ready for production workloads. 108 | 109 | ### Architecture Diagram 110 | 111 |
112 | ![Application-Ready Amazon EKS Cluster reference architecture](automated-provisioning-of-application-ready-amazon-eks-clusters.png) 113 |
114 | Figure 1: Application-Ready Amazon EKS Cluster Reference architecture 115 |
116 | 117 | ### Architecture Steps 118 | 119 | 1. Admin/DevOps engineer defines a per-environment [Terraform variable file](https://developer.hashicorp.com/terraform/language/values/variables#variable-definitions-tfvars-files) that controls all of the environment-specific configuration. This variable file is used in every step of the process by all IaC configurations. 120 | 121 | 2. DevOps engineer applies the environment configuration using Terraform following the deployment process defined in the guidance. 122 | 123 | 3. An [Amazon Virtual Private Cloud (VPC)](https://aws.amazon.com/vpc/) is provisioned and configured based on the specified configuration. According to best practices for Reliability, 3 Availability Zones (AZs) are configured with VPC Endpoints for access to resources deployed in the private VPC. Provisioned resources for private clusters, including [Amazon Elastic Container Registry (Amazon ECR)](http://aws.amazon.com/ecr/), [Amazon EKS](https://aws.amazon.com/eks/), [Amazon Elastic Compute Cloud (Amazon EC2)](http://aws.amazon.com/ec2/), and [Amazon Elastic Block Store (Amazon EBS)](http://aws.amazon.com/ebs/), are available via corresponding VPC endpoints. 124 | 125 | 4. User-facing [AWS Identity and Access Management (IAM)](https://aws.amazon.com/iam/) roles (Cluster Admin, Admin, Editor and Reader) are created for various access levels to EKS cluster resources, as recommended in Kubernetes security best practices. 126 | 127 | 5. An [Amazon Elastic Kubernetes Service (Amazon EKS)](https://aws.amazon.com/eks/) cluster is provisioned with Managed Node Groups that host critical cluster add-ons (CoreDNS, AWS Load Balancer Controller and [Karpenter](https://karpenter.sh/)) on their compute node instances. Karpenter manages compute capacity for other add-ons, as well as business applications that will be deployed by users, while prioritizing the provisioning of [AWS Graviton](https://aws.amazon.com/ec2/graviton/)-based compute node instances for the best price-performance. 128 | 129 | 6. Other important EKS add-ons are provisioned based on the configurations defined in the per-environment Terraform configuration file. 130 | 131 | 7. The AWS OSS Observability stack is deployed if configured, including [Amazon Managed Service for Prometheus (AMP)](https://aws.amazon.com/prometheus/), [AWS Managed Collector for Amazon EKS](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-collector.html), and [Amazon Managed Grafana (AMG)](https://aws.amazon.com/grafana/). In addition, a Grafana-operator addon is deployed alongside a set of predefined Grafana dashboards to get started. 132 | 133 | 8. Amazon EKS cluster(s) with critical add-ons, a configured managed Observability stack, and RBAC-based security mapped to IAM roles are available for workload deployment, and the Kubernetes API is exposed via an [AWS Network Load Balancer](https://aws.amazon.com/elasticloadbalancing/network-load-balancer/). 134 | 135 | ### AWS Services in this Guidance 136 | 137 | | **AWS Service** | **Role** | **Description** | 138 | |-----------------|----------|-----------------| 139 | | [Amazon Elastic Kubernetes Service](https://aws.amazon.com/eks/) (EKS) | Core service | Manages the Kubernetes control plane and worker nodes for container orchestration. | 140 | | [Amazon Elastic Compute Cloud](https://aws.amazon.com/ec2/) (EC2) | Core service | Provides the compute instances for EKS worker nodes and runs containerized applications.
| 141 | | [Amazon Virtual Private Cloud](https://aws.amazon.com/vpc/) (VPC) | Core service | Creates an isolated network environment with public and private subnets across multiple Availability Zones. | 142 | | [Amazon Elastic Container Registry](http://aws.amazon.com/ecr/) (ECR) | Supporting service | Stores and manages Docker container images for EKS deployments. | 143 | | [Elastic Load Balancing](https://aws.amazon.com/elasticloadbalancing/) (NLB) | Supporting service | Distributes incoming traffic across multiple targets in the EKS cluster. | 144 | | [Amazon Elastic Block Store](https://aws.amazon.com/ebs) (EBS) | Supporting service | Provides persistent block storage volumes for EC2 instances in the EKS cluster. | 145 | | [AWS Identity and Access Management](https://aws.amazon.com/iam/) (IAM) | Supporting service | Manages access to AWS services and resources securely, including EKS cluster access. | 146 | | [Amazon Managed Grafana](https://aws.amazon.com/grafana/) (AMG) | Observability service | Provides a fully managed service for metrics visualization and monitoring. | 147 | | [Amazon Managed Service for Prometheus](https://aws.amazon.com/prometheus/) (AMP) | Observability service | Offers managed Prometheus-compatible monitoring for container metrics. | 148 | | [AWS Certificate Manager](https://aws.amazon.com/certificate-manager/) (ACM) | Security service | Manages SSL/TLS certificates for secure communication within the cluster. | 149 | | [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/) | Monitoring service | Collects and tracks metrics, logs, and events from EKS and other AWS resources provisioned in the guidance | 150 | | [AWS Systems Manager](https://aws.amazon.com/systems-manager/) | Management service | Provides operational insights and takes action on AWS resources. | 151 | | [AWS Key Management Service](https://aws.amazon.com/kms/) (KMS) | Security service | Manages encryption keys for securing data in EKS and other AWS services. | 152 | 153 | ## Plan your deployment 154 | 155 | ### Cost 156 | 157 | You are responsible for the cost of the AWS services used while running this guidance. 158 | As of August 2024, the cost for running this guidance with the default settings in the US East (N. Virginia) Region is approximately **$447.47/month**. 159 | 160 | We recommend creating a [budget](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/budgets-create.html) through [AWS Cost Explorer](http://aws.amazon.com/aws-cost-management/aws-cost-explorer/) to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this guidance. 161 | 162 | ### Sample cost table 163 | 164 | The following table provides a sample cost breakdown for deploying this guidance with the default parameters in the `us-east-1` (N. Virginia) Region for one month. This estimate is based on the AWS Pricing Calculator output for the full deployment as per the guidance.
165 | 166 | | **AWS service** | Dimensions | Cost, month [USD] | 167 | |-----------------|------------|-------------------| 168 | | Amazon EKS | 1 cluster | $73.00 | 169 | | Amazon VPC | 3 NAT Gateways | $98.67 | 170 | | Amazon EC2 | 2 m6g.large instances | $112.42 | 171 | | Amazon Managed Service for Prometheus (AMP) | Metric samples, storage, and queries | $100.60 | 172 | | Amazon Managed Grafana (AMG) | Metrics visualization - Editor and Viewer users | $14.00 | 173 | | Amazon EBS | gp2 storage volumes and snapshots | $17.97 | 174 | | Application Load Balancer | 1 ALB for workloads | $16.66 | 175 | | Amazon VPC | Public IP addresses | $3.65 | 176 | | AWS Key Management Service (KMS) | Keys and requests | $7.00 | 177 | | Amazon CloudWatch | Metrics | $3.00 | 178 | | Amazon ECR | Data storage | $0.50 | 179 | | **TOTAL** | | **$447.47/month** | 180 | 181 | For a more accurate estimate based on your specific configuration and usage patterns, we recommend using the [AWS Pricing Calculator](https://calculator.aws). 182 | 183 | ## Security 184 | 185 | When you build systems on AWS infrastructure, security responsibilities are shared between you and AWS. This [shared responsibility model](https://aws.amazon.com/compliance/shared-responsibility-model/) reduces your operational burden because AWS operates, manages, and controls the components including the host operating system, the virtualization layer, and the physical security of the facilities in which the services operate. For more information about AWS security, visit [AWS Cloud Security](http://aws.amazon.com/security/). 186 | 187 | This guidance implements several security best practices and AWS services to enhance the security posture of your EKS Workload Ready Cluster. Here are the key security components and considerations: 188 | 189 | ### Identity and Access Management (IAM) 190 | 191 | - **IAM Roles**: The architecture uses predefined IAM roles (Cluster Admin, Admin, Edit, Read) to manage access to the EKS cluster resources. This follows the principle of least privilege, ensuring users and services have only the permissions necessary to perform their tasks. 192 | - **EKS Managed Node Groups**: These use IAM roles with specific permissions required for nodes to join the cluster and for pods to access AWS services. 193 | 194 | ### Network Security 195 | 196 | - **Amazon VPC**: The EKS cluster is deployed within a custom VPC with public and private subnets across multiple Availability Zones, providing network isolation. 197 | - **Security Groups**: Although not explicitly shown in the diagram, security groups are typically used to control inbound and outbound traffic to EC2 instances and other resources within the VPC. 198 | - **NAT Gateways**: Deployed in public subnets to allow outbound internet access for resources in private subnets while preventing inbound access from the internet. 199 | 200 | ### Data Protection 201 | 202 | - **Amazon EBS Encryption**: EBS volumes used by EC2 instances are typically encrypted to protect data at rest. 203 | - **AWS Key Management Service (KMS)**: Used for managing encryption keys for various services, including EBS volume encryption. 204 | 205 | ### Kubernetes-specific Security 206 | 207 | - **Kubernetes RBAC**: Role-Based Access Control is implemented within the EKS cluster to manage fine-grained access to Kubernetes resources. 208 | - **AWS Certificate Manager**: Integrated to manage SSL/TLS certificates for secure communication within the cluster. 
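The IAM-to-Kubernetes mapping above is implemented with EKS access entries (see the `access_entries` block in the cluster's `main.tf`). To audit what a given role can do, you can list the cluster's access entries and their associated access policies; a sketch, where the cluster name and account ID are placeholders:

```bash
# List the IAM principals registered as access entries on the cluster
aws eks list-access-entries --cluster-name <cluster-name>

# Show which EKS access policy (e.g. AmazonEKSEditPolicy) and scope a principal has
aws eks list-associated-access-policies \
  --cluster-name <cluster-name> \
  --principal-arn arn:aws:iam::<account-id>:role/EKSEdit
```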
209 | 210 | ### Monitoring and Logging 211 | 212 | - **Amazon CloudWatch**: Used for monitoring and logging of AWS resources and applications running on the EKS cluster. 213 | - **Amazon Managed Grafana and Prometheus**: Provide additional monitoring and observability capabilities, helping to detect and respond to security events. 214 | 215 | ### Container Security 216 | 217 | - **Amazon ECR**: Stores container images in a secure, encrypted repository. It includes vulnerability scanning to identify security issues in your container images. 218 | 219 | ### Secrets Management 220 | 221 | - **AWS Secrets Manager**: While not explicitly shown in the diagram, it's commonly used to securely store and manage sensitive information such as database credentials, API keys, and other secrets used by applications running on EKS. 222 | 223 | ### Additional Security Considerations 224 | 225 | - Regularly update and patch EKS clusters, worker nodes, and container images. 226 | - Implement network policies to control pod-to-pod communication within the cluster. 227 | - Use Pod Security Policies or Pod Security Standards to enforce security best practices for pods. 228 | - Implement proper logging and auditing mechanisms for both AWS and Kubernetes resources. 229 | - Regularly review and rotate IAM and Kubernetes RBAC permissions. 230 | 231 | ## Supported AWS Regions 232 | 233 | The core components of the Guidance for Automated Provisioning of Application-Ready Amazon EKS Clusters are available in all AWS Regions where Amazon EKS is supported. 234 | The observability components of this guidance use Amazon Managed Service for Prometheus (AMP) and Amazon Managed Grafana (AMG). These services are available in the following regions: 235 | 253 | 254 | | Region Name | Region Code | 255 | |-------------|-------------| 256 | | US East (N. Virginia) | us-east-1 | 257 | | US East (Ohio) | us-east-2 | 258 | | US West (Oregon) | us-west-2 | 259 | | Asia Pacific (Mumbai) | ap-south-1 | 260 | | Asia Pacific (Seoul) | ap-northeast-2 | 261 | | Asia Pacific (Singapore) | ap-southeast-1 | 262 | | Asia Pacific (Sydney) | ap-southeast-2 | 263 | | Asia Pacific (Tokyo) | ap-northeast-1 | 264 | | Europe (Frankfurt) | eu-central-1 | 265 | | Europe (Ireland) | eu-west-1 | 266 | | Europe (London) | eu-west-2 | 267 | | Europe (Paris) | eu-west-3 | 268 | | Europe (Stockholm) | eu-north-1 | 269 | | South America (São Paulo) | sa-east-1 | 270 | | Greater China (Beijing) | cn-north-1 | 271 | | Greater China (Ningxia) | cn-northwest-1 | 272 | | GovCloud region (US-west) | us-gov-west-1 | 273 | | GovCloud region (US-east) | us-gov-east-1 | 274 | 275 | 276 | ### Regions Supporting Core Components Only 277 | 278 | The core components of this Guidance can be deployed in any AWS Region where Amazon EKS is available. This includes all commercial AWS Regions except for the China Regions and the AWS GovCloud (US) Regions. 279 | 280 | For the most current availability of AWS services by Region, refer to the [AWS Regional Services List](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/). 281 | 282 | Note: If you deploy this guidance into a region where AMP and/or AMG are not available, you can disable the OSS observability tooling during deployment. This allows you to use the core components of the guidance without built-in observability features. 
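For example, the OSS observability stack can be switched off through the same per-environment variable file used throughout this guidance (a sketch; the keys are documented in the observability module's inputs above):

```hcl
# In your per-environment .tfvars file: deploy the core components without
# AMP/AMG, e.g. in a region where those services are not available
observability_configuration = {
  aws_oss_tooling    = false
  aws_native_tooling = false
}
```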
283 | 284 | ### Service Quotas 285 | 286 | **NOTICE** 287 | Service quotas, also referred to as limits, are the maximum number of service resources or operations for your AWS account. 288 | 289 | ### Quotas for AWS services in this Guidance 290 | 291 | Ensure you have sufficient quota for each of the AWS services utilized in this guidance. For more details, refer to [AWS service quotas](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html). 292 | 293 | If you need to view service quotas across all AWS services within the documentation, you can conveniently access this information in the [Service endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/aws-general.pdf#aws-service-information) page in the PDF. 294 | 295 | For specific implementation quotas, consider the following key components and services used in this guidance: 296 | 297 | - **Amazon EKS**: Ensure that your account has sufficient quotas for Amazon EKS clusters, node groups, and related resources. 298 | - **Amazon EC2**: Verify your EC2 instance quotas, as EKS node groups rely on these. 299 | - **Amazon VPC**: Check your VPC quotas, including subnets and Elastic IPs, to support the networking setup. 300 | - **Amazon EBS**: Ensure your account has sufficient EBS volume quotas for persistent storage. 301 | - **IAM Roles**: Verify that you have the necessary quota for IAM roles, as these are critical for securing your EKS clusters. 302 | - **AWS Systems Manager**: Review the quota for Systems Manager resources, which are used for operational insights and management. 303 | - **AWS Secrets Manager**: If you're using Secrets Manager for storing sensitive information, ensure your quota is adequate. 304 | 305 | 306 | ## Deploy the Guidance 307 | 308 | ### Prerequisites 309 | 310 | Before deploying this guidance, please ensure you have met the following prerequisites: 311 | 312 | 1. **AWS Account and Permissions**: Ensure you have an active AWS account with appropriate permissions to create and manage AWS resources like Amazon EKS, EC2, IAM, and VPC. 313 | 314 | 2. **AWS CLI and Terraform Installation**: Install and configure the [AWS CLI](https://aws.amazon.com/cli/) and [Terraform](https://www.terraform.io/downloads.html) tools on your local machine. 315 | 316 | 3. **Makefile Setup**: This guidance uses a `Makefile` to automate various tasks related to managing the Terraform infrastructure. Ensure `make` utility is installed on your system. 317 | 318 | 4. **S3 Bucket and DynamoDB Table**: Set up an S3 bucket and DynamoDB table for storing Terraform state and providing a locking mechanism for managing deployments across environments. 319 | 320 | 5. **Environment Variables**: Set the required environment variables, including `AWS_REGION` and `AWS_PROFILE`, to specify the AWS region and profile for deployment. 321 | 322 | 6. **VPC and Network Configuration**: Ensure that you have a VPC configuration ready or are prepared to deploy a new VPC as part of this guidance, with public and private subnets across multiple Availability Zones. 323 | 324 | 7. **IAM Roles**: Define and configure the necessary IAM roles and policies required for accessing the Amazon EKS cluster and related resources. 325 | 326 | 8. **Observability Tools (Optional)**: If using the AWS OSS observability stack, ensure the necessary services like Amazon Managed Service for Prometheus and Amazon Managed Grafana are configured. 
Since AMG uses [IAM Identity Center](https://docs.aws.amazon.com/grafana/latest/userguide/authentication-in-AMG-SSO.html) to provide authentication to its workspace, you'll have to enable IAM Identity Center before deploying this pattern. 327 | 328 | 331 | ### Deployment Instructions 332 | 333 | Please refer to the [Full Implementation Guide](https://aws-solutions-library-samples.github.io/compute/automated-provisioning-of-application-ready-amazon-eks-clusters.html) for detailed instructions on all deployment, configuration, and uninstallation options. 334 | 335 | ## License 336 | This library is licensed under the MIT-0 License. See the [LICENSE](./LICENSE) file. 337 | 338 | ## Notices 339 | 340 | *Customers are responsible for making their own independent assessment of the information in this Guidance. This Guidance: (a) is for informational purposes only, (b) represents AWS current product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and this Guidance is not part of, nor does it modify, any agreement between AWS and its customers.* 341 | --------------------------------------------------------------------------------