├── .release-please-manifest.json
├── .env
├── CODEOWNERS
├── renovate.json5
├── .tflint.hcl
├── examples
│   ├── complete
│   │   ├── fixtures.insecure.tfvars
│   │   ├── fixtures.secure.tfvars
│   │   ├── outputs.tf
│   │   ├── providers.tf
│   │   ├── fixtures.common.tfvars
│   │   ├── access-logging.tf
│   │   ├── variables.tf
│   │   ├── main.tf
│   │   └── README.md
│   └── README.md
├── .editorconfig
├── .checkov.yml
├── .github
│   ├── workflows
│   │   ├── pre-commit.yml
│   │   ├── release-please.yml
│   │   ├── update-command.yml
│   │   ├── scheduled-e2e-secure-test.yml
│   │   ├── pr-merge-group.yml
│   │   ├── test-command.yml
│   │   ├── slash-command-dispatch.yml
│   │   └── pull-request-opened-by-renovate.yml
│   └── ISSUE_TEMPLATE
│       └── general_issue.md
├── doc
│   └── adr
│       ├── 0001-record-architecture-decisions.md
│       ├── 0002-use-terratest-for-automated-testing-of-the-terraform-modules.md
│       ├── 0007-branch-protection-settings.md
│       ├── 0003-move-tfstate-backend-module-to-separate-repo.md
│       ├── 0012-automated-renovate-and-release-please-workflows.md
│       ├── 0006-project-decisions.md
│       ├── 0009-automated-release-process.md
│       ├── 0008-how-to-trigger-automated-tests.md
│       ├── 0005-terraform-selection.md
│       ├── 0004-module-selection-criteria.md
│       ├── 0010-e2e-testing-improvement.md
│       ├── 0011-e2e-ci-test-scaling.md
│       ├── 0014-secrets-management.md
│       └── 0013-opinionated-tofu-aws-wrapper-modules.md
├── test
│   └── e2e
│       ├── main_test.go
│       ├── examples_complete_plan_only_test.go
│       ├── examples_complete_insecure_test.go
│       └── examples_complete_secure_test.go
├── .golangci.yml
├── release-please-config.json
├── README.md
├── .gitignore
├── .pre-commit-config.yaml
├── CONTRIBUTING.md
├── go.mod
├── Makefile
├── LICENSE
└── CHANGELOG.md

/.release-please-manifest.json:
--------------------------------------------------------------------------------
 1 | {".":"0.0.12"}
 2 |
--------------------------------------------------------------------------------
/.env:
--------------------------------------------------------------------------------
 1 |
BUILD_HARNESS_REPO=ghcr.io/defenseunicorns/build-harness/build-harness 2 | # renovate: datasource=github-tags depName=defenseunicorns/build-harness 3 | BUILD_HARNESS_VERSION=2.0.9 4 | -------------------------------------------------------------------------------- /CODEOWNERS: -------------------------------------------------------------------------------- 1 | * @defenseunicorns/delivery-aws-iac 2 | 3 | # Privileged Files 4 | /CODEOWNERS @defenseunicorns/delivery-aws-iac-admin 5 | /LICENSE @defenseunicorns/delivery-aws-iac-admin 6 | -------------------------------------------------------------------------------- /renovate.json5: -------------------------------------------------------------------------------- 1 | { 2 | "$schema": "https://docs.renovatebot.com/renovate-schema.json", 3 | "extends": [ 4 | "github>defenseunicorns/narwhal-delivery-renovate-config:terraformReposAuto.json5" 5 | ] 6 | } 7 | -------------------------------------------------------------------------------- /.tflint.hcl: -------------------------------------------------------------------------------- 1 | plugin "terraform" { 2 | enabled = true 3 | preset = "recommended" 4 | } 5 | 6 | #plugin "aws" { 7 | # enabled = true 8 | # version = "0.23.0" 9 | # source = "github.com/terraform-linters/tflint-ruleset-aws" 10 | #} 11 | -------------------------------------------------------------------------------- /examples/complete/fixtures.insecure.tfvars: -------------------------------------------------------------------------------- 1 | enable_eks_managed_nodegroups = true 2 | enable_self_managed_nodegroups = true 3 | eks_worker_tenancy = "default" 4 | cluster_endpoint_public_access = true 5 | 6 | enable_bastion = true # you can turn this off if not using the go testing utilities 7 | -------------------------------------------------------------------------------- /examples/complete/fixtures.secure.tfvars: -------------------------------------------------------------------------------- 1 | 
enable_eks_managed_nodegroups = false
 2 | enable_self_managed_nodegroups = true
 3 | bastion_tenancy = "dedicated"
 4 | eks_worker_tenancy = "dedicated"
 5 | cluster_endpoint_public_access = false
 6 | create_kubernetes_resources = false # terraform won't have access to the eks cluster due to public endpoint access being disabled
--------------------------------------------------------------------------------
/.editorconfig:
--------------------------------------------------------------------------------
 1 | root = true
 2 |
 3 | [*]
 4 | charset = utf-8
 5 | end_of_line = lf
 6 | indent_size = 2
 7 | indent_style = space
 8 | insert_final_newline = true
 9 | trim_trailing_whitespace = true
10 | max_line_length = 120
11 | tab_width = 4
12 |
13 | [{Makefile,go.mod,go.sum,*.go,.gitmodules}]
14 | indent_style = tab
15 | indent_size = 4
16 |
17 | [*.md]
18 | trim_trailing_whitespace = false
--------------------------------------------------------------------------------
/.checkov.yml:
--------------------------------------------------------------------------------
 1 | directory:
 2 |   - modules/
 3 |   - examples/
 4 | download-external-modules: false # This should ideally be true, but there are a lot of findings in the upstream open source modules.
 5 | framework: terraform
 6 | compact: true
 7 | quiet: false
 8 | summary-position: bottom
 9 |
10 | skip-check:
11 |   - CKV_TF_1 # Ensure Terraform module sources use a commit hash // pending https://github.com/hashicorp/terraform/issues/29867
--------------------------------------------------------------------------------
/.github/workflows/pre-commit.yml:
--------------------------------------------------------------------------------
 1 | # Run pre-commit checks on pull request, merge group, and manual workflow_dispatch events.
2 | name: pre-commit 3 | 4 | on: 5 | pull_request: 6 | merge_group: 7 | workflow_dispatch: 8 | 9 | 10 | permissions: 11 | pull-requests: write 12 | id-token: write 13 | contents: read 14 | 15 | jobs: 16 | pre-commit: 17 | uses: defenseunicorns/delivery-github-actions-workflows/.github/workflows/pre-commit.yml@main 18 | secrets: 19 | APPLICATION_ID: ${{ secrets.NARWHAL_BOT_APP_ID }} 20 | APPLICATION_PRIVATE_KEY: ${{ secrets.NARWHAL_BOT_SECRET }} 21 | -------------------------------------------------------------------------------- /.github/workflows/release-please.yml: -------------------------------------------------------------------------------- 1 | # On every push to main, run release-please to automatically handle the release process. 2 | 3 | name: release-please 4 | 5 | on: 6 | push: 7 | branches: 8 | - main 9 | 10 | permissions: 11 | contents: write 12 | pull-requests: write 13 | 14 | jobs: 15 | release-please: 16 | uses: defenseunicorns/delivery-github-actions-workflows/.github/workflows/release-please.yml@main 17 | secrets: 18 | APPLICATION_ID: ${{ secrets.NARWHAL_BOT_APP_ID }} 19 | APPLICATION_PRIVATE_KEY: ${{ secrets.NARWHAL_BOT_SECRET }} 20 | -------------------------------------------------------------------------------- /doc/adr/0001-record-architecture-decisions.md: -------------------------------------------------------------------------------- 1 | # 1. Record architecture decisions 2 | 3 | Date: 2023-01-23 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | We need to record the architectural decisions made on this project. 12 | 13 | ## Decision 14 | 15 | We will use Architecture Decision Records, as [described by Michael Nygard](http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions). 16 | 17 | ## Consequences 18 | 19 | See Michael Nygard's article, linked above. For a lightweight ADR toolset, see Nat Pryce's [adr-tools](https://github.com/npryce/adr-tools). 
20 | -------------------------------------------------------------------------------- /.github/workflows/update-command.yml: -------------------------------------------------------------------------------- 1 | # This workflow is triggered by a comment on a pull request. The comment must contain "/update " to trigger the workflow. 2 | 3 | name: update 4 | on: 5 | repository_dispatch: 6 | types: [update-command] 7 | 8 | permissions: 9 | id-token: write 10 | contents: write 11 | 12 | defaults: 13 | run: 14 | # We need -e -o pipefail for consistency with GitHub Actions' default behavior 15 | shell: bash -e -o pipefail {0} 16 | 17 | jobs: 18 | update: 19 | uses: defenseunicorns/delivery-github-actions-workflows/.github/workflows/update.yml@main 20 | secrets: 21 | APPLICATION_ID: ${{ secrets.NARWHAL_BOT_APP_ID }} 22 | APPLICATION_PRIVATE_KEY: ${{ secrets.NARWHAL_BOT_SECRET }} 23 | -------------------------------------------------------------------------------- /.github/workflows/scheduled-e2e-secure-test.yml: -------------------------------------------------------------------------------- 1 | name: scheduled-e2e-secure-test 2 | 3 | on: 4 | schedule: 5 | # weekly on Mondays at 12:00 UTC 6 | - cron: '0 12 * * 1' 7 | 8 | defaults: 9 | run: 10 | shell: bash -eo pipefail {0} 11 | 12 | permissions: 13 | id-token: write # needed for oidc auth for AWS creds 14 | contents: read 15 | 16 | jobs: 17 | scheduled-e2e-secure-test: 18 | uses: defenseunicorns/delivery-github-actions-workflows/.github/workflows/secure-test-with-chatops.yml@main 19 | secrets: 20 | APPLICATION_ID: ${{ secrets.NARWHAL_BOT_APP_ID }} 21 | APPLICATION_PRIVATE_KEY: ${{ secrets.NARWHAL_BOT_SECRET }} 22 | AWS_GOVCLOUD_ROLE_TO_ASSUME: ${{ secrets.NARWHAL_AWS_GOVCLOUD_ROLE_TO_ASSUME }} 23 | SLACK_WEBHOOK_URL: ${{ secrets.NARWHAL_SLACK_URL }} 24 | -------------------------------------------------------------------------------- /test/e2e/main_test.go: 
--------------------------------------------------------------------------------
 1 | package e2e_test
 2 |
 3 | import (
 4 | 	"context"
 5 | 	"os"
 6 | 	"testing"
 7 | 	"time"
 8 |
 9 | 	"github.com/defenseunicorns/delivery_aws_iac_utils/pkg/utils"
10 | )
11 |
12 | // TestMain is the entry point for all tests. We use a custom one so that we can log a heartbeat message to the console every few minutes. Without it, GitHub Actions may kill the test run if it believes the run is hung.
13 | func TestMain(m *testing.M) {
14 | 	ctx, cancel := context.WithCancel(context.Background())
15 | 	go func() {
16 | 		for {
17 | 			select {
18 | 			case <-ctx.Done():
19 | 				return
20 | 			default:
21 | 				utils.DoLog("The test is still running! Don't kill me!")
22 | 			}
23 | 			time.Sleep(180 * time.Second)
24 | 		}
25 | 	}()
26 | 	exitVal := m.Run()
27 | 	cancel()
28 | 	os.Exit(exitVal)
29 | }
30 |
--------------------------------------------------------------------------------
/.github/workflows/pr-merge-group.yml:
--------------------------------------------------------------------------------
 1 | # Triggers on merge_group and pull_request events.
 2 | # Only use this if merge queue is enabled; otherwise stick to test-command for e2e testing.
 3 |
 4 | name: pr-merge-group
 5 | on:
 6 |   merge_group:
 7 |     types: [checks_requested]
 8 |   pull_request:
 9 |
10 | defaults:
11 |   run:
12 |     shell: bash -eo pipefail {0}
13 |
14 | permissions:
15 |   id-token: write # needed for oidc auth for AWS creds
16 |   contents: read
17 |
18 | jobs:
19 |   pr-merge-group-test:
20 |     uses: defenseunicorns/delivery-github-actions-workflows/.github/workflows/pr-merge-group-test.yml@main
21 |     secrets:
22 |       APPLICATION_ID: ${{ secrets.NARWHAL_BOT_APP_ID }}
23 |       APPLICATION_PRIVATE_KEY: ${{ secrets.NARWHAL_BOT_SECRET }}
24 |       AWS_COMMERCIAL_ROLE_TO_ASSUME: ${{ secrets.NARWHAL_AWS_COMMERCIAL_ROLE_TO_ASSUME }}
25 |       AWS_GOVCLOUD_ROLE_TO_ASSUME: ${{ secrets.NARWHAL_AWS_GOVCLOUD_ROLE_TO_ASSUME }}
26 |
-------------------------------------------------------------------------------- /.golangci.yml: -------------------------------------------------------------------------------- 1 | run: 2 | timeout: 5m 3 | linters: 4 | enable-all: true 5 | disable: 6 | - depguard 7 | - exhaustivestruct 8 | - exhaustruct 9 | - gci 10 | - goerr113 11 | - gofumpt 12 | - goimports 13 | - gomnd 14 | - lll 15 | - nlreturn 16 | - stylecheck 17 | # - testpackage 18 | - varnamelen 19 | # - wrapcheck 20 | - wsl 21 | linters-settings: 22 | funlen: 23 | lines: 120 24 | testifylint: 25 | enable-all: false 26 | enable: 27 | - bool-compare 28 | - compares 29 | - empty 30 | - error-is-as 31 | - error-nil 32 | - expected-actual 33 | - float-compare 34 | - len 35 | - suite-dont-use-pkg 36 | - suite-extra-assert-call 37 | - suite-thelper 38 | # -require-error causes errors in our e2e test patterns 39 | issues: 40 | exclude: 41 | - "G304" # Potential file inclusion via variable 42 | exclude-use-default: false 43 | -------------------------------------------------------------------------------- /test/e2e/examples_complete_plan_only_test.go: -------------------------------------------------------------------------------- 1 | package e2e_test 2 | 3 | import ( 4 | "testing" 5 | 6 | "github.com/gruntwork-io/terratest/modules/terraform" 7 | teststructure "github.com/gruntwork-io/terratest/modules/test-structure" 8 | ) 9 | 10 | func TestExamplesCompletePlanOnly(t *testing.T) { 11 | t.Parallel() 12 | tempFolder := teststructure.CopyTerraformFolderToTemp(t, "../..", "examples/complete") 13 | terraformOptionsPlan := &terraform.Options{ 14 | TerraformDir: tempFolder, 15 | Upgrade: false, 16 | VarFiles: []string{ 17 | "fixtures.common.tfvars", 18 | "fixtures.insecure.tfvars", 19 | }, 20 | // Set any overrides for variables you would like to validate 21 | Vars: map[string]interface{}{ 22 | "keycloak_enabled": false, 23 | }, 24 | SetVarsAfterVarFiles: true, 25 | } 26 | teststructure.RunTestStage(t, "SETUP", func() 
{ 27 | terraform.Init(t, terraformOptionsPlan) 28 | terraform.Plan(t, terraformOptionsPlan) 29 | }) 30 | } 31 | -------------------------------------------------------------------------------- /doc/adr/0002-use-terratest-for-automated-testing-of-the-terraform-modules.md: -------------------------------------------------------------------------------- 1 | # 2. Use Terratest for automated testing of the Terraform modules 2 | 3 | Date: 2023-01-23 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | We need a way to automatically test the Terraform modules that we create. 2 options were suggested: 12 | 13 | * [Terratest](https://github.com/gruntwork-io/terratest) -- A golang library from Gruntwork 14 | * [terraform-testing](https://github.com/antonbabenko/terraform-testing) -- A project by Anton Babenko, that now looks to have been either abandoned or moved 15 | 16 | ## Decision 17 | 18 | For the time being we will use Terratest for automated testing of the Terraform modules until such time that a different option is selected at a company-wide level. 19 | 20 | ## Consequences 21 | 22 | * Terratest is already used in other areas in the company (namely [DI2-ME](https://github.com/defenseunicorns/zarf-package-software-factory)) so it should be easier to adopt as we can copy/paste existing work. 
23 | -------------------------------------------------------------------------------- /test/e2e/examples_complete_insecure_test.go: -------------------------------------------------------------------------------- 1 | package e2e_test 2 | 3 | import ( 4 | "testing" 5 | "time" 6 | 7 | "github.com/gruntwork-io/terratest/modules/terraform" 8 | teststructure "github.com/gruntwork-io/terratest/modules/test-structure" 9 | ) 10 | 11 | func TestExamplesCompleteInsecure(t *testing.T) { 12 | t.Parallel() 13 | tempFolder := teststructure.CopyTerraformFolderToTemp(t, "../..", "examples/complete") 14 | terraformOptions := &terraform.Options{ 15 | TerraformBinary: "tofu", 16 | TerraformDir: tempFolder, 17 | Upgrade: false, 18 | VarFiles: []string{ 19 | "fixtures.common.tfvars", 20 | "fixtures.insecure.tfvars", 21 | }, 22 | RetryableTerraformErrors: map[string]string{ 23 | ".*": "Failed to apply Terraform configuration due to an error.", 24 | }, 25 | MaxRetries: 5, 26 | TimeBetweenRetries: 5 * time.Second, 27 | } 28 | 29 | // Defer the teardown 30 | defer func() { 31 | t.Helper() 32 | teststructure.RunTestStage(t, "TEARDOWN", func() { 33 | terraform.Destroy(t, terraformOptions) 34 | }) 35 | }() 36 | 37 | // Set up the infra 38 | teststructure.RunTestStage(t, "SETUP", func() { 39 | terraform.InitAndApply(t, terraformOptions) 40 | }) 41 | } 42 | -------------------------------------------------------------------------------- /doc/adr/0007-branch-protection-settings.md: -------------------------------------------------------------------------------- 1 | # 7. Branch Protection Settings 2 | 3 | Date: 2023-07-27 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | We need to decide as a team what the branch protection setting will be on our repo(s). 
12 |
13 | ## Decision
14 |
15 | - We will have a Branch protection rule with the branch name pattern of `main` that contains the following settings:
16 |   - Require pull request reviews before merging
17 |     - Require at least 1 approving review
18 |     - Dismiss stale pull request approvals when new commits are pushed
19 |     - Require review from Code Owners
20 |     - Restrict who can dismiss pull request reviews to organization and repository administrators
21 |     - Do not allow specified actors to bypass required pull requests
22 |     - Do not require approval of the most recent reviewable push
23 |   - Require status checks to pass before merging
24 |     - pre-commit status checks
25 |     - Integration/E2E tests
26 |   - Require conversation resolution before merging
27 |   - Require signed commits
28 |   - Require linear history
29 |   - Do not require merge queue
30 |   - Do not require deployments to succeed before merging
31 |   - Do not lock the branch
32 |   - Do not allow bypassing the above settings
33 |   - Restrict who can push to matching branches to organization administrators, repository administrators, and users with the Maintain role only
34 |   - Do not allow force pushes
35 |   - Do not allow deletions
36 |
--------------------------------------------------------------------------------
/examples/complete/outputs.tf:
--------------------------------------------------------------------------------
 1 | # Root module outputs
 2 | # Most of them are set to sensitive = true to avoid having their details logged to the console in our public CI pipelines
 3 |
 4 | output "bastion_instance_id" {
 5 |   description = "The ID of the bastion host"
 6 |   value       = try(module.bastion[0].instance_id, null)
 7 |   sensitive   = true
 8 | }
 9 |
10 | output "bastion_region" {
11 |   description = "The region that the bastion host was deployed to"
12 |   value       = try(module.bastion[0].region, null)
13 |   sensitive   = true
14 | }
15 |
16 | output "bastion_private_dns" {
17 |   description = "The private DNS address of the bastion host"
18 | value = try(module.bastion[0].private_dns, null) 19 | sensitive = true 20 | } 21 | 22 | output "vpc_cidr" { 23 | description = "The CIDR block of the VPC" 24 | value = module.vpc.vpc_cidr_block 25 | sensitive = true 26 | } 27 | 28 | output "eks_cluster_name" { 29 | description = "The name of the EKS cluster" 30 | value = module.eks.cluster_name 31 | sensitive = true 32 | } 33 | 34 | output "efs_storageclass_name" { 35 | description = "The name of the EFS storageclass that was created (if var.enable_amazon_eks_aws_efs_csi_driver was set to true)" 36 | value = try(module.eks.efs_storageclass_name, null) 37 | } 38 | 39 | output "lambda_password_function_arn" { 40 | description = "Arn for lambda password function" 41 | value = try(module.password_lambda[0].lambda_password_function_arn, null) 42 | } 43 | -------------------------------------------------------------------------------- /release-please-config.json: -------------------------------------------------------------------------------- 1 | { 2 | "packages": { 3 | ".": { 4 | "bump-minor-pre-major": true, 5 | "bump-patch-for-minor-pre-major": true, 6 | "changelog-host": "https://github.com", 7 | "changelog-path": "CHANGELOG.md", 8 | "changelog-sections": [ 9 | { "type": "feat", "section": "Features" }, 10 | { "type": "feature", "section": "Features" }, 11 | { "type": "fix", "section": "Bug Fixes" }, 12 | { "type": "perf", "section": "Performance Improvements" }, 13 | { "type": "revert", "section": "Reverts" }, 14 | { "type": "docs", "section": "Documentation" }, 15 | { "type": "style", "section": "Styles" }, 16 | { "type": "chore", "section": "Miscellaneous Chores" }, 17 | { "type": "refactor", "section": "Code Refactoring" }, 18 | { "type": "test", "section": "Tests" }, 19 | { "type": "build", "section": "Build System" }, 20 | { "type": "ci", "section": "Continuous Integration" } 21 | ], 22 | "changelog-type": "default", 23 | "draft": false, 24 | "draft-pull-request": false, 25 | "include-component-in-tag": 
false,
26 |       "include-v-in-tag": true,
27 |       "prerelease": false,
28 |       "pull-request-header": ":robot: I have created a release *beep* *boop*",
29 |       "pull-request-title-pattern": "chore${scope}: release${component} ${version}",
30 |       "release-type": "simple",
31 |       "separate-pull-requests": false,
32 |       "skip-github-release": false,
33 |       "versioning": "default"
34 |     }
35 |   }
36 | }
37 |
--------------------------------------------------------------------------------
/.github/workflows/test-command.yml:
--------------------------------------------------------------------------------
 1 | # usage:
 2 | # A user with write access to the repo can run the following from a PR comment:
 3 |
 4 | # run a single test
 5 | # /test make=<target> region=<region>
 6 |
 7 | # run ping test
 8 | # /test ping
 9 |
10 | # run all tests in the makefile
11 | # /test
12 |
13 | name: test
14 | on:
15 |   repository_dispatch:
16 |     types: [test-command]
17 |
18 |
19 | permissions:
20 |   id-token: write
21 |   contents: read
22 |
23 | defaults:
24 |   run:
25 |     # We need -e -o pipefail for consistency with GitHub Actions' default behavior
26 |     shell: bash -e -o pipefail {0}
27 |
28 | jobs:
29 |   e2e-test:
30 |     uses: defenseunicorns/delivery-github-actions-workflows/.github/workflows/e2e-test.yml@main
31 |     secrets:
32 |       APPLICATION_ID: ${{ secrets.NARWHAL_BOT_APP_ID }}
33 |       APPLICATION_PRIVATE_KEY: ${{ secrets.NARWHAL_BOT_SECRET }}
34 |       AWS_COMMERCIAL_ROLE_TO_ASSUME: ${{ secrets.NARWHAL_AWS_COMMERCIAL_ROLE_TO_ASSUME }}
35 |       AWS_GOVCLOUD_ROLE_TO_ASSUME: ${{ secrets.NARWHAL_AWS_GOVCLOUD_ROLE_TO_ASSUME }}
36 |     with:
37 |       # check if the required slash command args are present; if so, populate the json matrix, else pass in null and relevant e2e tests that would require a make target and region will be skipped
38 |       e2e-test-matrix: ${{ (contains(github.event.client_payload.slash_command_args.named, 'make') && contains(github.event.client_payload.slash_command_args.named, 'region')) && format('[{{"make-target":"{0}", "region":"{1}"}}]',
github.event.client_payload.slash_command_args.named.make, github.event.client_payload.slash_command_args.named.region) || null }} 39 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Infrastructure-as-Code 2 | 3 | This repository is used as an example project for the upstream UDS-IaC projects, as well as for integration testing those upstream modules when they have updates. When any of the upstream IaC modules receive a new release Renovate will open a PR and trigger a pipeline in this repository that performs integration and E2E tests. 4 | 5 | ## Getting Started 6 | 7 | ### Contributing 8 | 9 | See [CONTRIBUTING.md](CONTRIBUTING.md) for more information on how to contribute to this repository. 10 | 11 | ## Upstream Modules 12 | 13 | Details of the UDS-IaC modules tested by this repository can be found in each of the upstream repositories: 14 | 15 | - [VPC Module](https://github.com/defenseunicorns/terraform-aws-vpc) 16 | - [RDS Module](https://github.com/defenseunicorns/terraform-aws-rds) 17 | - [EKS Module](https://github.com/defenseunicorns/terraform-aws-eks) 18 | - [Bastion Module](https://github.com/defenseunicorns/terraform-aws-bastion) 19 | - [Lambda Module](https://github.com/defenseunicorns/terraform-aws-lambda) 20 | 21 | ## Supported Integrations 22 | 23 | ### EKS 24 | 25 | See the `cluster_version` variable in [variables.tf](examples/complete/variables.tf) for the list of supported EKS versions. 26 | 27 | ### Defense Unicorns Big Bang Distribution (DUBBD) 28 | 29 | We intend for the latest version of DUBBD to be deployable on top of the infrastructure created by [the example root module](examples/complete), but we aren't yet testing this in our automated tests. If you encounter any issues with this, please [open an issue](https://github.com/defenseunicorns/delivery-aws-iac/issues/new/choose). 
30 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
 1 | .cache/
 2 | .idea/
 3 | .DS_Store
 4 | .vscode
 5 |
 6 |
 7 | # Local .terraform directories
 8 | .terraform/
 9 | *.terraform.*
10 | # .tfstate files
11 | *.tfstate
12 | *.tfstate.*
13 |
14 | *.terraform.lock.hcl
15 |
16 | # Crash log files
17 | crash.log
18 | crash.*.log
19 |
20 | # Exclude all .tfvars files, which are likely to contain sensitive data, such as
21 | # passwords, private keys, and other secrets. These should not be part of version
22 | # control as they are data points which are potentially sensitive and subject
23 | # to change depending on the environment.
24 | *.tfvars
25 | *.tfvars.json
26 |
27 | # Except ones that we do want to commit because they are used for automated tests
28 | !examples/complete/fixtures.common.tfvars
29 | !examples/complete/fixtures.insecure.tfvars
30 | !examples/complete/fixtures.secure.tfvars
31 | !modules/cloudtrail/examples/complete/fixtures.create-bucket.tfvars
32 |
33 | # Ignore override files as they are usually used to override resources locally and so
34 | # are not checked in
35 | override.tf
36 | override.tf.json
37 | *_override.tf
38 | *_override.tf.json
39 |
40 | # Include override files you do wish to add to version control using negated pattern
41 | # !example_override.tf
42 |
43 | # Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
44 | # example: *tfplan*
45 |
46 | # Ignore CLI configuration files
47 | .terraformrc
48 | terraform.rc
49 |
50 | # Ignore Terraform cache
51 | .terragrunt-cache*
52 |
53 | # Ignore Terraform state files
54 | backend.tf
55 |
56 | # Ignore Checkov external module downloads
57 | .external_modules
58 |
59 | examples/zarf-complete-example/build
60 |
61 | # Ignore the lambda builds json file created from deploying the lambda resource
62 |
63 | **/ignore
64 |
65 | **/builds
66 |
-------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/general_issue.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: General Issue 3 | about: Suggest a new feature, report a bug, or just ask a question 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | 11 | 12 | ### Persona 13 | 14 | 15 | 16 | 17 | 18 | ### Description 19 | 20 | 21 | 22 | 23 | 24 | ### Use Case 25 | 26 | 27 | 28 | 29 | 30 | ### Impact 31 | 32 | 33 | 34 | 35 | 36 | ### Completion 37 | 38 | 43 | 44 | 45 | 46 | ### Additional Context 47 | 48 | 49 | -------------------------------------------------------------------------------- /doc/adr/0003-move-tfstate-backend-module-to-separate-repo.md: -------------------------------------------------------------------------------- 1 | # 3. Move tfstate-backend module to separate repo 2 | 3 | Date: 2023-02-21 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | As we start moving more toward treating our Terraform infrastructure code as a product, additional focus is needed on making each module into an independently consumable product. This means that each module should: 12 | 13 | * Be versioned independently 14 | * Run automated tests 15 | * Have sufficient documentation 16 | 17 | ## Decision 18 | 19 | To assist with being versioned independently, we will move the `tfstate-backend` module to a [separate repository](https://github.com/defenseunicorns/terraform-aws-tfstate-backend). This will allow us to develop and release new versions of the module independently of the rest of the infrastructure code. 20 | 21 | This decision is, for now, just being made for the `tfstate-backend` module. We will evaluate other modules for similar treatment in the future as we uncover better ways and best practices for managing reusable production-level Terraform work. 
22 |
23 | ## Consequences
24 |
25 | * It will be easier to version the module independently of the rest of the infrastructure code.
26 | * It will be easier to run automated tests on the module since we won't need any custom logic to figure out when certain tests can be skipped (e.g. when this module has not been changed but another has, only run the other module's tests)
27 | * Our work will be less DRY (Don't Repeat Yourself) since each independent module repo will need its own set of GitHub Actions workflows/scripts/Makefile, etc. This can potentially be mitigated by using automation to keep code that is the same across all modules in sync.
28 |
--------------------------------------------------------------------------------
/doc/adr/0012-automated-renovate-and-release-please-workflows.md:
--------------------------------------------------------------------------------
 1 | # 12. automated renovate and release-please workflows
 2 |
 3 | Date: 2023-11-06
 4 |
 5 | ## Status
 6 |
 7 | Accepted
 8 |
 9 | ## Context
10 |
11 | We have many repos that have automated dependency updates, but they need manual approvals and linting after the fact in the pull requests.
12 |
13 | These maintenance PRs need to be automated as much as possible.
14 |
15 | ## Decision
16 |
17 | - Renovate will run less often: **weekly**
18 | - Renovate PRs will automatically be handled by renovate-bot and narwhal-bot
19 | - Renovate PR workflow steps:
20 |   - Renovate PR opened by renovate bot
21 |   - Narwhal-bot runs pre-commit hooks for linting and documentation updates, pushes changes to renovate branch
22 |   - Narwhal-bot modifies the repository settings' branch protection for main so that it may approve PRs (removes CODEOWNER approval requirement for the PR)
23 |     - *Note*: you can't add bots to a team, and you can't add bots to a CODEOWNERS file, so we are getting hacky with repo settings
24 |   - Narwhal-bot approves the PR via GitHub Action
25 |   - Narwhal-bot `auto-merges` the PR via GraphQL mutation
26 |   - Renovate PR gets added to merge queue
27 |     - Workflow step queries the repo's merge queue in a loop via GraphQL to check whether the PR has been added to the queue
28 |   - Narwhal-bot adds CODEOWNERS approval back to branch protection
29 |   - PR is merged into main and closed
30 |
31 | ## Consequences
32 |
33 | No more babysitting renovate PRs. Less maintenance overhead.
34 | Failing code still doesn't make it into main. The merge queue pattern will kick out anything that fails tests as determined by the e2e test pipeline patterns; that will then require manual intervention.
35 |
36 | ### Risks
37 |
38 | We need to be conscious of cloud resource quotas. Renovate needs to be configured with proper deployment windows.
39 |
--------------------------------------------------------------------------------
/doc/adr/0006-project-decisions.md:
--------------------------------------------------------------------------------
 1 | # 6.
Project Decisions
 2 |
 3 | Date: 2023-05-11 (Effective 2023-01-02)
 4 |
 5 | ## Status
 6 |
 7 | Accepted
 8 |
 9 | ## Context
10 |
11 | Why and how opinionation decisions were made for this project.
12 |
13 | Guiding Principles:
14 | * AWS is the target environment
15 | * zarf will be the GitOps mechanism for IaC
16 | * this project will provision requisite AWS resources to `zarf init` an EKS cluster and deploy bigbang
17 | * IL5 controls are met
18 | * make it simpler to deploy / update *the same* zarf package (base) across multiple environments
19 |   - extensible to & for dev -> stg -> prd deployment values via the same base package
20 | * limit deployment options to:
21 |   - be secure by default / allow for additional security restrictions
22 |   - standardize access to the cloud env / how we intend interaction with EKS
23 |   - sufficiently meet access controls (NIST 800-53), network controls and STIG requirements
24 |
25 | ## Decision
26 |
27 | * The bastion module will:
28 |   - inform how users can/are expected to interact with the environment
29 |   - be provisioned in private or intra subnets and leverage SSM for access
30 |   - standardize access for the environment in an approved way that follows a common / established pattern for cloud, air gap & on prem
31 |   - establish a common pattern for users to interact with EKS / AWS managed services
32 | * Customizable terraform root module (see complete example) will:
33 |   - enable zarf to be the highly opinionated wrapper / version controlled mechanism
34 |   - be extensible for different mission hero use cases / environments
35 | * Sops module will:
36 |   - create a common pattern for handling secrets & leverage a managed service via IAM roles for key rotations
37 |   - assume flux or zarf will handle decryption of values via the provided roles
38 | * AWS Private Link will:
39 |   - be enabled by the VPC and used by all resources
40 |   - be enforced via IAM policy conditions to ensure that services are
only accessible from within a VPC 41 | 42 | ## Consequences 43 | -------------------------------------------------------------------------------- /.github/workflows/slash-command-dispatch.yml: -------------------------------------------------------------------------------- 1 | # When someone with write access to the repo adds a comment to a PR that contains "/test ", dispatch the workflow found in "test-command.yml" 2 | # When someone with write access to the repo adds a comment to a PR that contains "/update ", dispatch the workflow found in "update-command.yml" 3 | 4 | name: Slash Command Dispatch 5 | 6 | on: 7 | issue_comment: 8 | types: [created] 9 | 10 | jobs: 11 | 12 | slashCommandDispatchTest: 13 | if: github.event.issue.pull_request && contains(github.event.comment.body, '/test') 14 | runs-on: ubuntu-latest 15 | steps: 16 | - name: Get token 17 | id: get_workflow_token 18 | uses: peter-murray/workflow-application-token-action@v3 19 | with: 20 | application_id: ${{ secrets.NARWHAL_BOT_APP_ID }} 21 | application_private_key: ${{ secrets.NARWHAL_BOT_SECRET }} 22 | 23 | - name: Slash Command Dispatch 24 | uses: peter-evans/slash-command-dispatch@v4 25 | with: 26 | token: ${{ steps.get_workflow_token.outputs.token }} 27 | reaction-token: ${{ steps.get_workflow_token.outputs.token }} 28 | commands: test 29 | permission: write 30 | issue-type: pull-request 31 | 32 | slashCommandDispatchUpdate: 33 | if: github.event.issue.pull_request && contains(github.event.comment.body, '/update') 34 | runs-on: ubuntu-latest 35 | steps: 36 | - name: Get token 37 | id: get_workflow_token 38 | uses: peter-murray/workflow-application-token-action@v3 39 | with: 40 | application_id: ${{ secrets.NARWHAL_BOT_APP_ID }} 41 | application_private_key: ${{ secrets.NARWHAL_BOT_SECRET }} 42 | 43 | - name: Slash Command Dispatch 44 | uses: peter-evans/slash-command-dispatch@v4 45 | with: 46 | token: ${{ steps.get_workflow_token.outputs.token }} 47 | reaction-token: ${{ 
steps.get_workflow_token.outputs.token }} 48 | commands: update 49 | permission: write 50 | issue-type: pull-request 51 | -------------------------------------------------------------------------------- /doc/adr/0009-automated-release-process.md: -------------------------------------------------------------------------------- 1 | # 9. Automated Release Process 2 | 3 | Date: 2023-07-27 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | We need to decide as a team how we will do releases for this repo. We have a few options: 12 | 13 | - Do a SemVer version on every push to main 14 | - Do a CalVer version on every push to main (something like `vYYYYMMDDss`) 15 | - Do a SemVer version periodically 16 | - Do a CalVer version periodically 17 | 18 | We also need to decide how we will do the releases. The current options being discussed are: 19 | 20 | - A GitHub Actions workflow that creates a tag 21 | - The ReleasePlease bot 22 | 23 | ## Decision 24 | 25 | - We will do a SemVer version periodically, with no fixed release cadence. We will release when we feel it is important to do so. 26 | - We will set up a GitHub Action that runs every day that alerts us via Slack if a release has not been made in the last 14 days. 27 | - We will use the ReleasePlease bot to do the releases. 28 | - We will delete any CalVer tags that are present in the repo so that ReleasePlease doesn't try to use them to determine the next version. 29 | 30 | Why: 31 | - We don't want to release on every commit to main because sometimes small commits to main do not warrant new releases. 32 | - We want to set up an automated test that tests the upgrade path between the last release and the latest commit to main. If releases are happening too frequently, this type of test becomes far less useful.
33 | - We want to use SemVer because it gives any consumer a better understanding of what happened in the release. 34 | - We want to use ReleasePlease because it will help us automate good habits of maintaining a changelog and good release notes. 35 | 36 | ## Consequences 37 | 38 | - Maintainers will need to use [Conventional Commit](https://www.conventionalcommits.org/en/v1.0.0/) messages when merging PRs for ReleasePlease to work properly. 39 | - We will need to set up a GitHub Action that runs on commits to main that alerts us via Slack if a commit is made to main that does not conform to the Conventional Commit format. 40 | -------------------------------------------------------------------------------- /examples/README.md: -------------------------------------------------------------------------------- 1 | # Examples 2 | 3 | This directory contains examples of how to use the various modules in this repository. 4 | 5 | ## How to Deploy 6 | 7 | ### Prerequisites 8 | 9 | - *Nix operating system (Linux, macOS, WSL2) 10 | - AWS CLI environment variables 11 | - At minimum: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and either `AWS_REGION` or `AWS_DEFAULT_REGION` 12 | - Preferred: the above plus `AWS_SESSION_TOKEN`, `AWS_SECURITY_TOKEN`, and `AWS_SESSION_EXPIRATION` 13 | > If the account is set up to require MFA, you'll be required to have the session variables. We recommend that you use [aws-vault](https://github.com/99designs/aws-vault). Friends don't let friends use unencrypted AWS creds. 14 | - `docker` 15 | - `make` 16 | - various standard CLI tools that usually come with running on *Nix (grep, sed, etc.) 17 | 18 | ### Deploy 19 | 20 | We'll be using our automated tests to stand up environments. They use [Terratest](https://github.com/gruntwork-io/terratest). Each test is based on one of the examples in the `examples` directory. For example, if you want to stand up the "complete" example in "insecure" mode, you'll run the `test-ci-complete-insecure` target.
21 | 22 | ```shell 23 | export SKIP_TEARDOWN=1 24 | unset SKIP_SETUP 25 | unset SKIP_TEST 26 | make test-ci-complete-insecure 27 | ``` 28 | > `SKIP_TEARDOWN` tells Terratest to skip running the test stage called "TEARDOWN", which is the stage that destroys the environment. We want things to stay up, so we set this variable. We also make sure `SKIP_SETUP` and `SKIP_TEST` are unset. 29 | 30 | > Run `make help` to see all the available targets. Any of them can be used to stand up an environment with different parameters. Do not run `make test` directly, as it will run all the tests in parallel and is not compatible with `SKIP_TEARDOWN`. 31 | 32 | ### Destroy 33 | 34 | ```shell 35 | unset SKIP_TEARDOWN 36 | export SKIP_SETUP=1 37 | export SKIP_TEST=1 38 | make test-ci-complete-insecure 39 | ``` 40 | > Since we're tearing down this time, we don't want `SKIP_TEARDOWN` to be set. Instead, we are setting `SKIP_SETUP` and `SKIP_TEST` to skip the setup and test stages. 41 | -------------------------------------------------------------------------------- /test/e2e/examples_complete_secure_test.go: -------------------------------------------------------------------------------- 1 | package e2e_test 2 | 3 | import ( 4 | "testing" 5 | "time" 6 | 7 | "github.com/gruntwork-io/terratest/modules/terraform" 8 | teststructure "github.com/gruntwork-io/terratest/modules/test-structure" 9 | ) 10 | 11 | // This test deploys the complete example in govcloud, "secure mode". Secure mode is: 12 | // - Self-managed nodegroups only 13 | // - Dedicated instance tenancy 14 | // - EKS public endpoint disabled.
15 | func TestExamplesCompleteSecure(t *testing.T) { 16 | t.Parallel() 17 | // Setup options 18 | tempFolder := teststructure.CopyTerraformFolderToTemp(t, "../..", "examples/complete") 19 | terraformOptions := &terraform.Options{ 20 | TerraformBinary: "tofu", 21 | TerraformDir: tempFolder, 22 | VarFiles: []string{ 23 | "fixtures.common.tfvars", 24 | "fixtures.secure.tfvars", 25 | }, 26 | RetryableTerraformErrors: map[string]string{ 27 | ".*": "Failed to apply Terraform configuration due to an error.", 28 | }, 29 | MaxRetries: 5, 30 | TimeBetweenRetries: 5 * time.Second, 31 | } 32 | 33 | // Defer the teardown 34 | 35 | defer func() { 36 | t.Helper() 37 | teststructure.RunTestStage(t, "TEARDOWN", func() { 38 | terraform.Destroy(t, terraformOptions) 39 | }) 40 | }() 41 | // Set up the infra 42 | teststructure.RunTestStage(t, "SETUP", func() { 43 | terraform.InitAndApply(t, terraformOptions) 44 | }) 45 | 46 | // Run assertions 47 | // add tests here to do stuff to the cluster with sshuttle because the public endpoint is disabled 48 | 49 | // teststructure.RunTestStage(t, "TEST", func() { 50 | // // Start sshuttle 51 | // cmd, err := utils.RunSshuttleInBackground(t, tempFolder) 52 | // require.NoError(t, err) 53 | // defer func(t *testing.T, cmd *exec.Cmd) { 54 | // t.Helper() 55 | // err := utils.StopSshuttle(t, cmd) 56 | // require.NoError(t, err) 57 | // }(t, cmd) 58 | // utils.ValidateEFSFunctionality(t, tempFolder) 59 | // utils.DownloadZarfInitPackage(t) 60 | // utils.ConfigureKubeconfig(t, tempFolder) 61 | // utils.ValidateZarfInit(t, tempFolder) 62 | // }) 63 | } 64 | -------------------------------------------------------------------------------- /.github/workflows/pull-request-opened-by-renovate.yml: -------------------------------------------------------------------------------- 1 | # If Renovate is not the author of the PR that triggers this workflow, it will do nothing.
2 | # If Renovate is the author of the PR that triggers this workflow, but the workflow event is anything but "opened", it will do nothing. 3 | # If Renovate is the author of the PR that triggers this workflow, and the workflow event is "opened", it will: 4 | # 1. Autoformat using pre-commit and, if necessary, push an additional commit to the PR with the autoformat fixes. 5 | # 2. Change the branch protection rules to turn off required codeowner approval, since GitHub Apps can't be codeowners or added to teams. 6 | # 3. narwhal-bot approves the PR. 7 | # 4. narwhal-bot merges the PR. 8 | # 5. PR is added to merge queue. 9 | # 6. tests are run. 10 | # a. If tests pass, PR is merged. 11 | # i. If PR is merged, it is closed and branch is deleted. 12 | # b. If tests fail, PR stays open and it is removed from merge queue. 13 | # 7. Branch protection is always set back to the original state. 14 | # 15 | # See ADR #0008. 16 | name: auto-test 17 | on: 18 | pull_request: 19 | # WARNING: DO NOT ADD MORE EVENT TYPES HERE! Because this workflow will push a new commit to the PR in the Autoformat step, adding more event types may cause an infinite loop.
20 | types: 21 | - opened 22 | 23 | permissions: 24 | id-token: write 25 | contents: write 26 | 27 | defaults: 28 | run: 29 | # We need -e -o pipefail for consistency with GitHub Actions' default behavior 30 | shell: bash -e -o pipefail {0} 31 | 32 | jobs: 33 | renovate-test: 34 | if: github.event.client_payload.github.actor == 'renovate[bot]' || github.actor == 'renovate[bot]' 35 | uses: defenseunicorns/delivery-github-actions-workflows/.github/workflows/renovate-test.yml@main 36 | secrets: 37 | APPLICATION_ID: ${{ secrets.NARWHAL_BOT_APP_ID }} 38 | APPLICATION_PRIVATE_KEY: ${{ secrets.NARWHAL_BOT_SECRET }} 39 | AWS_COMMERCIAL_ROLE_TO_ASSUME: ${{ secrets.NARWHAL_AWS_COMMERCIAL_ROLE_TO_ASSUME }} 40 | AWS_GOVCLOUD_ROLE_TO_ASSUME: ${{ secrets.NARWHAL_AWS_GOVCLOUD_ROLE_TO_ASSUME }} 41 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | repos: 2 | - repo: https://github.com/pre-commit/pre-commit-hooks 3 | rev: v4.6.0 4 | hooks: 5 | - id: check-added-large-files 6 | args: ["--maxkb=1024"] 7 | - id: check-merge-conflict 8 | - id: detect-aws-credentials 9 | args: 10 | - "--allow-missing-credentials" 11 | - id: detect-private-key 12 | - id: end-of-file-fixer 13 | - id: fix-byte-order-marker 14 | - id: trailing-whitespace 15 | args: [--markdown-linebreak-ext=md] 16 | - id: check-yaml 17 | args: 18 | - "--allow-multiple-documents" 19 | - repo: https://github.com/sirosen/texthooks 20 | rev: 0.6.6 21 | hooks: 22 | - id: fix-smartquotes 23 | - repo: https://github.com/tekwizely/pre-commit-golang 24 | rev: v1.0.0-rc.1 25 | hooks: 26 | - id: go-fmt 27 | args: [-s, -w, ./test] 28 | - id: golangci-lint 29 | args: 30 | - "--timeout=10m" 31 | - "--verbose" 32 | - "--allow-parallel-runners" 33 | - repo: https://github.com/antonbabenko/pre-commit-terraform 34 | rev: v1.91.0 35 | hooks: 36 | - id: terraform_fmt 37 | args: 38 | - 
--hook-config=--tf-path=tofu 39 | - id: terraform_tflint 40 | args: 41 | - --args=--config=__GIT_WORKING_DIR__/.tflint.hcl 42 | - --hook-config=--tf-path=tofu 43 | - repo: https://github.com/tofuutils/pre-commit-opentofu 44 | rev: v1.0.3 # Get the latest from: https://github.com/tofuutils/pre-commit-opentofu/releases 45 | hooks: 46 | - id: tofu_docs 47 | args: 48 | - --args=--lockfile=false 49 | - --hook-config=--path-to-file=README.md # Valid UNIX path. I.e. ../TFDOC.md or docs/README.md etc. 50 | - --hook-config=--add-to-existing-file=true # Boolean. true or false 51 | - --hook-config=--create-file-if-not-exist=true # Boolean. true or false 52 | - id: tofu_checkov 53 | args: 54 | - --args=--config-file __GIT_WORKING_DIR__/.checkov.yml 55 | - repo: https://github.com/renovatebot/pre-commit-hooks 56 | rev: 37.410.2 57 | hooks: 58 | - id: renovate-config-validator 59 | -------------------------------------------------------------------------------- /doc/adr/0008-how-to-trigger-automated-tests.md: -------------------------------------------------------------------------------- 1 | # 8. How To Trigger Automated Tests 2 | 3 | Date: 2023-07-27 4 | 5 | ## Status 6 | 7 | Superseded by [10. e2e testing improvement](0010-e2e-testing-improvement.md) 8 | 9 | ## Context 10 | 11 | We need to decide as a team how tests should be triggered, whether automatically, manually or both. 12 | 13 | ### Our Options 14 | 15 | 1. Use manual test triggers using Slash Command Dispatch. 16 | 1. Run automatically on a variety of `pull_request` events 17 | 1.
A combination of manual triggers using Slash Command Dispatch and automatic triggers 18 | 19 | ## Decision 20 | 21 | - We will automatically trigger the tests if and only if all the following conditions are met: 22 | - The author of the pull request is Renovate 23 | - The pull request was just opened (i.e., it should only ever run automatically once per pull request) 24 | - If the author is Renovate and an automatic test run is triggered, before running the automatic test run, we will first commit and push any pre-commit changes to the branch, so that the tests don't need to be run twice (once for the initial commit, again after making the changes pre-commit requires). 25 | > This allows us to quickly merge Renovate PRs that were created overnight without having to wait for the tests to finish after a manual trigger. 26 | - Otherwise, the tests will be triggered manually by a person by adding a comment to the PR that says `/test `. Most of the time that will be `/test all` but there will likely be times when we may want to run a specific test, in which case we would use something like `/test e2e-commercial-insecure` 27 | > This lets us minimize the number of unnecessary tests we run. 28 | - We will stop running the tests on every commit to main. 29 | > Since [ADR #7](./0007-branch-protection-settings.md) was accepted, we will no longer be able to merge to main without a successful test run. So there is no need to run the tests on every commit to main. 30 | 31 | ## Consequences 32 | 33 | ### Pros 34 | - We will have more control over when it is appropriate to execute a test as some tests will cost real infrastructure $$$. 35 | - We can easily manually trigger subsequent tests as discussion is had and changes are made. 36 | - 3rd party contributors who are submitting PRs from a fork may fully participate in the development process.
37 | 38 | ### Cons 39 | - It requires human interaction to run the tests, which will likely increase our development cycle times a small amount. 40 | -------------------------------------------------------------------------------- /examples/complete/providers.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.0.0" 3 | required_providers { 4 | aws = { 5 | source = "hashicorp/aws" 6 | version = ">= 4.62.0" 7 | } 8 | cloudinit = { 9 | source = "hashicorp/cloudinit" 10 | version = ">= 2.0.0" 11 | } 12 | helm = { 13 | source = "hashicorp/helm" 14 | version = ">= 2.5.1" 15 | } 16 | kubernetes = { 17 | source = "hashicorp/kubernetes" 18 | version = ">= 2.10.0" 19 | } 20 | local = { 21 | source = "hashicorp/local" 22 | version = ">= 2.1.0" 23 | } 24 | null = { 25 | source = "hashicorp/null" 26 | version = ">= 3.1.0" 27 | } 28 | random = { 29 | source = "hashicorp/random" 30 | version = ">= 3.1.0" 31 | } 32 | time = { 33 | source = "hashicorp/time" 34 | version = ">= 0.9.1" 35 | } 36 | tls = { 37 | source = "hashicorp/tls" 38 | version = ">= 3.0.0" 39 | } 40 | http = { 41 | source = "terraform-aws-modules/http" 42 | version = "2.4.1" 43 | } 44 | archive = { 45 | source = "hashicorp/archive" 46 | version = "2.4.2" 47 | } 48 | } 49 | } 50 | 51 | provider "aws" { 52 | region = var.region 53 | # default_tags { 54 | # tags = var.tags #bug https://github.com/hashicorp/terraform-provider-aws/issues/19583#issuecomment-855773246 55 | # } 56 | } 57 | 58 | provider "kubernetes" { 59 | host = module.eks.cluster_endpoint 60 | cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data) 61 | exec { 62 | api_version = "client.authentication.k8s.io/v1beta1" 63 | command = "/bin/sh" 64 | args = ["-c", "for i in $(seq 1 30); do curl -s -k -f ${module.eks.cluster_endpoint}/healthz > /dev/null && break || sleep 10; done && aws eks --region ${var.region} get-token --cluster-name 
${local.cluster_name}"] 65 | } 66 | } 67 | 68 | provider "helm" { 69 | kubernetes { 70 | host = module.eks.cluster_endpoint 71 | cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data) 72 | exec { 73 | api_version = "client.authentication.k8s.io/v1beta1" 74 | command = "/bin/sh" 75 | args = ["-c", "for i in $(seq 1 30); do curl -s -k -f ${module.eks.cluster_endpoint}/healthz > /dev/null && break || sleep 10; done && aws eks --region ${var.region} get-token --cluster-name ${local.cluster_name}"] 76 | } 77 | } 78 | } 79 | -------------------------------------------------------------------------------- /doc/adr/0005-terraform-selection.md: -------------------------------------------------------------------------------- 1 | # 5. Use Terraform for IaC 2 | 3 | Date: 2023-05-05 (Effective 2023-01-02) 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | We need to select a tool for Infrastructure as Code that: 12 | 13 | * establishes a common pattern across multiple environments (on prem, cloud providers, etc.)
14 | * is portable 15 | * is widely supported 16 | * can be easily adopted / extended by external entities running day 2 ops 17 | * is extensible for internal use cases 18 | * doesn't introduce significant complexity / unnecessary cost 19 | 20 | Tools that were considered: 21 | 22 | * Pulumi 23 | * Crossplane 24 | * Terragrunt 25 | * Terraform 26 | * CDKTF 27 | 28 | ## Decision 29 | 30 | We chose Terraform because 31 | - it is the most widely adopted IaC tool (we felt it would resonate best with external day 2 ops partners) 32 | - it is portable (easily supports air gapped environments) 33 | - allowed us to leverage existing capabilities / experience (expedited delivery of capabilities) 34 | - didn't introduce additional complexity (minimal dependencies) 35 | - could be converted to a more versatile language via CDKTF 36 | - easily tested in pipelines 37 | - easily deployable with zarf 38 | 39 | Why we didn't choose one of the other tools: 40 | 41 | * Pulumi 42 | - limited engineering experience internally (with an assumption that many day 2 ops teams would also have this) 43 | - delivery time constraints 44 | - not as widely supported 45 | 46 | * Crossplane 47 | - requires a utility cluster / hub and spoke architecture (chicken and egg scenario) 48 | - incurs additional cloud costs - many DoD environments require dedicated tenancy which is 3x more expensive 49 | - adds significant complexity to mission environments and e2e testing pipelines 50 | - portability concerns 51 | - adoption / extensibility concerns 52 | - not as widely supported 53 | 54 | 55 | * Terragrunt (we intentionally refactored away from terragrunt because) 56 | - zarf is effectively a wrapper for terraform and could handle those functions eventually™ 57 | - it didn't feel like this was the right layer to opinionate / abstract from 58 | - additional complexity in maintaining both terraform modules and a highly opinionated terragrunt folder structure 59 | - compatibility issues with some opinionated
upstream modules 60 | 61 | * CDKTF 62 | - limited engineering experience internally (with an assumption that many day 2 ops teams would also have this) 63 | - only supports cloud providers 64 | - not as widely supported 65 | - because we chose terraform, there is an option to migrate to this at a later time 66 | 67 | ## Consequences 68 | -------------------------------------------------------------------------------- /doc/adr/0004-module-selection-criteria.md: -------------------------------------------------------------------------------- 1 | # 4. Selection Criteria for Upstream Terraform Modules 2 | 3 | Date: 2023-05-04 (Effective 2023-01-02) 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | This project was started to provide a highly opinionated secure and declarative infrastructure baseline that supports a Big Bang deployment in AWS. Below is a list of modules as of the first tag cut in this repository and how they primarily function. 12 | 13 | * bastion 14 | - highly opinionated EC2 terraform module that informs how users can/are expected to interact with the environment 15 | - fully built / maintained by this project 16 | * eks 17 | - opinionated wrapper of EKS Blueprints module (maintained by AWS Solution Architects) 18 | * rds 19 | - opinionated wrapper of terraform-aws-rds module 20 | - supports big bang add-ons that need a managed service database 21 | * s3-irsa 22 | - opinionated wrapper of terraform-aws-modules/s3-bucket 23 | - adds iam / kms AWS resources to enable irsa but depends on k8s configuration (svc account) to be handled by GitOps 24 | * sops 25 | - highly opinionated iam / kms module that allows encryption / decryption of GitOps secrets via the bastion & flux in eks 26 | - adds iam / kms AWS resources to enable irsa but depends on k8s configuration (svc account) to be handled by GitOps 27 | * tfstate-backend 28 | - opinionated wrapper of terraform-aws-modules/s3-bucket 29 | - adds
dynamodb / kms resources to enable secure tf state backend 31 | * vpc 32 | - opinionated wrapper of terraform-aws-modules/vpc module 33 | - adds security group for VPC endpoints 34 | 35 | As we start moving more toward treating our Terraform infrastructure code as a product, there is a need to capture previous decisions for module selection in order to enhance the process going forward. Below are the guiding principles that were taken into account for the initial selection of the modules listed above: 36 | 37 | * License 38 | * Upstream had Active Community Support / Engagement 39 | * Upstream was Well Maintained / Managed 40 | * Upstream was Extensible (it can do what we need) 41 | * Simple design / didn't introduce additional complexity 42 | * Upstream was tested (important for highly complex modules because we were doing it manually at the time) 43 | * Ease of maintaining it ourselves 44 | 45 | ## Decision 46 | 47 | Below is a list of the original modules and why those early decisions were made. 48 | 49 | * bastion (no upstream module selected) 50 | - We looked at the CloudPosse ec2-bastion-server and terraform-aws-ec2-instance but ultimately decided to completely "own" this for the following reasons. 51 | - The CloudPosse module hadn't been updated since mid-2021 and added complexity with the additional upstream opinionation (we were already VERY opinionated on this module) 52 | - We would have had to refactor to leverage the upstream terraform-aws-ec2-instance module and already had a ton of resources that were outside of it, so we decided at the time that the juice wasn't worth the squeeze. 53 | * eks (eks blueprints upstream) 54 | - We looked at several upstream modules and, because of the complexity, testing / upstream support was important. We ultimately chose EKS Blueprints because it was VERY active, well maintained (by AWS Solutions Architects) and tested.
55 | * rds (terraform-aws-rds module upstream) 56 | - opinionated wrapper of terraform-aws-rds module 57 | - supports big bang add-ons that need a managed service database 58 | * s3-irsa (terraform-aws-modules/s3-bucket) 59 | - opinionated wrapper of terraform-aws-modules/s3-bucket 60 | - adds iam / kms AWS resources to enable irsa but depends on k8s configuration (svc account) to be handled by GitOps 61 | * sops (no upstream module selected) 62 | - this decision was made because it was simple, easy to maintain and there weren't any upstream options. 63 | * tfstate-backend (terraform-aws-modules/s3-bucket) 64 | - We looked at the CloudPosse tfstate-backend module but it hadn't been updated since Nov 2021 and added complexity with the additional upstream opinionation. 65 | - this decision was made because it was simple and easy to maintain. 66 | * vpc (terraform-aws-modules/vpc upstream) 67 | - the vpc submodule in EKS blueprints wasn't easily extensible for our use case and blueprints was simply opinionating the upstream module that we selected. 68 | - we chose this upstream module because it was simple and well used in many places.
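The "opinionated wrapper" pattern described above can be sketched roughly as follows. This is an illustrative Terraform fragment, not code from this repository: the pinned version, variable names, and hard-coded defaults are hypothetical, but they show the shape of the approach — pin the upstream module, bake in the opinionated settings, expose a small variable surface, and bolt on the extra resources the upstream lacks (such as the VPC endpoints security group mentioned for the vpc module).

```hcl
# Hypothetical opinionated wrapper around terraform-aws-modules/vpc.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # pinned so Renovate can propose upgrades via PR

  name = var.name
  cidr = var.vpc_cidr

  # Opinionated defaults baked into the wrapper rather than exposed:
  enable_dns_hostnames = true
  enable_flow_log      = true
}

# The wrapper also adds resources the upstream module doesn't provide,
# e.g. a security group scoped to VPC interface endpoints.
resource "aws_security_group" "vpc_endpoints" {
  name_prefix = "${var.name}-vpc-endpoints-"
  vpc_id      = module.vpc.vpc_id
}
```

Consumers then see only the wrapper's curated inputs, while upgrades to the upstream module flow through a single `version` bump.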
69 | 70 | ## Consequences 71 | -------------------------------------------------------------------------------- /examples/complete/fixtures.common.tfvars: -------------------------------------------------------------------------------- 1 | ########################################################### 2 | ################## Global Settings ######################## 3 | 4 | tags = { 5 | Environment = "dev" 6 | Project = "du-iac-cicd" 7 | } 8 | name_prefix = "ex-complete" 9 | 10 | ########################################################### 11 | #################### VPC Config ########################### 12 | 13 | vpc_cidr = "10.200.0.0/16" 14 | secondary_cidr_blocks = ["100.64.0.0/16"] #https://aws.amazon.com/blogs/containers/optimize-ip-addresses-usage-by-pods-in-your-amazon-eks-cluster/ 15 | 16 | ########################################################### 17 | ################## Bastion Config ######################### 18 | 19 | bastion_ssh_user = "ec2-user" # local user in bastion used to ssh 20 | bastion_ssh_password = "my-password" 21 | # renovate: datasource=github-tags depName=defenseunicorns/zarf 22 | zarf_version = "v0.29.2" 23 | 24 | ########################################################### 25 | #################### EKS Config ########################### 26 | # renovate: datasource=endoflife-date depName=amazon-eks versioning=loose extractVersion=^(?.*)-eks.+$ 27 | cluster_version = "1.27" 28 | eks_use_mfa = false 29 | 30 | ########################################################### 31 | ############## Big Bang Dependencies ###################### 32 | 33 | keycloak_enabled = true # provisions keycloak dedicated nodegroup 34 | 35 | # #################### EKS Addons ######################### 36 | # add other "eks native" marketplace addons and configs to this list 37 | cluster_addons = { 38 | vpc-cni = { 39 | most_recent = true 40 | before_compute = true 41 | configuration_values = <<-JSON 42 | { 43 | "env": { 44 | "AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG": "true", 
45 | "ENABLE_PREFIX_DELEGATION": "true", 46 | "ENI_CONFIG_LABEL_DEF": "topology.kubernetes.io/zone", 47 | "WARM_PREFIX_TARGET": "1", 48 | "ANNOTATE_POD_IP": "true", 49 | "POD_SECURITY_GROUP_ENFORCING_MODE": "standard" 50 | }, 51 | "enableNetworkPolicy": "true" 52 | } 53 | JSON 54 | } 55 | coredns = { 56 | preserve = true 57 | most_recent = true 58 | 59 | timeouts = { 60 | create = "2m" 61 | delete = "10m" 62 | } 63 | } 64 | kube-proxy = { 65 | most_recent = true 66 | } 67 | aws-ebs-csi-driver = { 68 | most_recent = true 69 | 70 | timeouts = { 71 | create = "4m" 72 | delete = "10m" 73 | } 74 | } 75 | aws-efs-csi-driver = { 76 | most_recent = true 77 | timeouts = { 78 | create = "20m" 79 | delete = "10m" 80 | } 81 | } 82 | } 83 | 84 | enable_amazon_eks_aws_ebs_csi_driver = true 85 | enable_gp3_default_storage_class = true 86 | storageclass_reclaim_policy = "Delete" # set to `Retain` for non-dev use 87 | 88 | #################### Blueprints addons ################### 89 | #wait false for all addons, as it times out on teardown in the test pipeline 90 | 91 | enable_amazon_eks_aws_efs_csi_driver = true 92 | #todo - move from blueprints to marketplace addons in terraform-aws-eks 93 | aws_efs_csi_driver = { 94 | wait = false 95 | chart_version = "2.4.8" 96 | } 97 | 98 | enable_aws_node_termination_handler = true 99 | aws_node_termination_handler = { 100 | wait = false 101 | 102 | # renovate: datasource=docker depName=public.ecr.aws/aws-ec2/helm/aws-node-termination-handler 103 | chart_version = "0.22.0" 104 | chart = "aws-node-termination-handler" 105 | repository = "oci://public.ecr.aws/aws-ec2/helm" 106 | } 107 | 108 | enable_cluster_autoscaler = true 109 | cluster_autoscaler = { 110 | wait = false 111 | # renovate: datasource=github-tags depName=kubernetes/autoscaler extractVersion=^cluster-autoscaler-chart-(?.*)$ 112 | chart_version = "v9.29.3" 113 | } 114 | 115 | enable_metrics_server = true 116 | metrics_server = { 117 | wait = false 118 | # renovate: 
datasource=github-tags depName=kubernetes-sigs/metrics-server extractVersion=^metrics-server-helm-chart-(?.*)$ 119 | chart_version = "v3.11.0" 120 | 121 | ###################################################### 122 | ################## Lambda Config ##################### 123 | 124 | ################# Password Rotation ################## 125 | # Add users that will be on your ec2 instances. 126 | users = ["ec2-user", "Administrator"] 127 | 128 | cron_schedule_password_rotation = "cron(0 0 1 * ? *)" 129 | 130 | slack_notification_enabled = false 131 | 132 | slack_webhook_url = "" 133 | -------------------------------------------------------------------------------- /doc/adr/0010-e2e-testing-improvement.md: -------------------------------------------------------------------------------- 1 | # 10. e2e Testing Improvement 2 | 3 | Date: 2023-09-14 4 | 5 | ## Status 6 | 7 | Supersedes [08. how to trigger automated tests](0008-how-to-trigger-automated-tests.md) 8 | 9 | Superseded by [11. e2e ci test scaling](0011-e2e-ci-test-scaling.md) 10 | 11 | ## Context 12 | 13 | End-to-end (e2e) testing is a crucial component for validating the robustness and functionality of our Infrastructure as Code (IaC). While our current e2e testing workflow does the job, there are opportunities for making it more efficient and effective. This ADR aims to address various issues like test redundancy, inefficient triggering of tests, and the efficacy of secure tests. 14 | 15 | ### Only Run e2e Tests if Code-related Files Have Changed 16 | 17 | 1. **Skip Unnecessary Runs**: Running e2e tests for changes that don't affect the code (e.g., updating READMEs) is inefficient. 18 | - To achieve this, we'll need to add custom logic to our GitHub workflows, as GitHub's branch protection rules and required status checks do not offer this granularity. 19 | 20 | 2. **Conditional Workflows**: Workflows should be `skipped` if no relevant files have changed; otherwise, run the tests.
21 | - **Implementation Plan**: Use the [paths-filter](https://github.com/dorny/paths-filter) GitHub Action to conditionally execute workflow steps and jobs based on the files modified. 22 | - **Note**: Jobs that are `skipped` will still report "Success" as their status, so they won't block pull requests even if they are marked as required checks. 23 | 24 | ### Revise or Change how we run the Secure Test 25 | 26 | The secure test takes significantly longer to complete compared to the insecure test, doubling the time required for e2e tests to complete. 27 | 28 | 1. **Relevance of Secure Test**: It's important to consider what the secure test is verifying about our IaC that isn't covered by the insecure test. 29 | - Is it the deployment pattern, the cloud environment, or the instance types that make it essential? The private eks endpoint pattern is essential to match our target environments. 30 | 31 | If we only utilize public EKS endpoints for pipeline purposes, our e2e tests will run much faster, eliminating the need for sshuttle and multiple Terraform setup & apply cycles in terratest. 32 | 33 | ### Avoid Duplication of Required Test Passing 34 | 35 | 1. **Optimize Testing**: Having to pass all tests twice—once for the pull request and once before (merge queue) or after merging into `main`—seems redundant. 36 | 37 | ### Leverage Merge Queue Feature 38 | 39 | 1. **Streamline e2e Testing with Merge Queue**: We can optimize our testing process by integrating it with the merge queue. 40 | - **Implementation Plan**: A workflow triggered on `pull request` will either skip the e2e test status checks or succeed them, followed by the actual tests running in the `merge_group` workflow. 41 | 42 | ## Decision 43 | 44 | - Workflows should be `skipped` if no relevant files have changed; otherwise, run the tests. 45 | - We will use the [paths-filter](https://github.com/dorny/paths-filter) GitHub Action to conditionally execute workflow steps and jobs based on the files modified. 
46 | - **Note**: Jobs that are `skipped` will still report "Success" as their status, so they won't block pull requests even if they are marked as required checks. 47 | - In a PR we are able to run both the insecure and secure tests 48 | - Tests are not required to pass to be added to the merge queue 49 | - Maintainers can use slash command dispatch 50 | - The E2E Insecure test is required to pass in the `merge_group` event. This will ensure only code that passes this test is merged to `main`. 51 | - The E2E Secure test is required to run at least nightly and be integrated with Slack notifications 52 | - This maintains successful deployment validation for our target environments 53 | - Resolving failures must become the top priority of the IaC team 54 | - The Release Please PR should not be added to the merge queue unless the merge queue is empty. Release Please needs to be at the head of the queue; otherwise its CHANGELOG.md commit will miss other commits to `main`. 55 | - Release Please PRs will run both tests to ensure that all tests pass before a tag is cut. This ensures that releases will be valid. 56 | 57 | ## Consequences 58 | 59 | ### What Becomes Easier 60 | 61 | - Developers experience fewer delays as irrelevant or redundant tests are skipped. 62 | - Efficient use of resources, both time and compute, leading to cost savings. 63 | 64 | ### What Becomes More Difficult 65 | 66 | - Initial setup of the refined GitHub logic and any new workflows will need extra time and technical acumen. 67 | 68 | ### Risks 69 | 70 | - The custom logic for triggering tests could introduce bugs, necessitating additional debugging. 71 | - Altering the "secure test" needs a comprehensive review to ensure that it doesn't compromise the quality of our IaC.
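A sketch of the paths-filter gating described in the Decision (the job names, filter paths, and action versions here are illustrative, not taken from this repo's actual workflows):

```yaml
# Gate the e2e jobs on whether any IaC-relevant files changed.
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      iac: ${{ steps.filter.outputs.iac }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            iac:
              - '**/*.tf'
              - 'test/**'
  e2e-insecure:
    needs: changes
    # When this condition is false the job is skipped, which still
    # satisfies a required status check.
    if: needs.changes.outputs.iac == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test-ci-complete-insecure
```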
72 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributor Guide 2 | 3 | Thanks so much for wanting to help out! :tada: 4 | 5 | Most of what you'll see in this document is our attempt at documenting the lightweight development process that works for our team. We're always open to feedback and suggestions for improvement. The intention is not to force people to follow this process step by step, but rather to document it as a norm and provide a baseline for discussion. 6 | 7 | ## Developer Experience 8 | 9 | Continuous Delivery is core to our development philosophy. Check out [https://minimumcd.org](https://minimumcd.org/) for a good baseline agreement on what that means. 10 | 11 | Specifically: 12 | 13 | - We do trunk-based development (`main`) with short-lived feature branches that originate from the trunk, get merged to the trunk, and are deleted after the merge. 14 | - We don't merge work into `main` that isn't releasable. 15 | - We perform automated testing on all pushes to `main`. Fixing failing pipelines in `main` is prioritized over all other work. 16 | - We create immutable release artifacts. 17 | 18 | ### Developer Workflow 19 | 20 | :key: == Required by automation 21 | 22 | 1. Pick an issue to work on, assign it to yourself, and drop a comment in the issue to let everyone know you're working on it. 23 | 2. Create a Draft Pull Request targeting the `main` branch as soon as you are able to, even if it is just 5 minutes after you started working on it. We lean towards working in the open as much as we can. If you're not sure what to put in the PR description, just put a link to the issue you're working on. If you're not sure what to put in the PR title, just put "WIP" (Work In Progress) and we'll help you out with the rest. 24 | 3. :key: The automated tests have to pass for the PR to be merged.
To run the tests in the PR, add a comment to the PR that says `/test`. **NOTE** tests still have to pass in the merge queue, **you do not need to have tests pass in the PR, status checks are automatically reported as success in the PR**. If you want to run a specific test manually in the PR, you can use `/test make= region=`. The available CI tests are found in the [Makefile](./Makefile) and start with the string "test-ci-". 25 | 4. If your PR is still set as a Draft, transition it to "Ready for Review" 26 | 5. Get it reviewed by a [CODEOWNER](./CODEOWNERS) 27 | 6. Add the PR to the merge queue 28 | 7. The merge queue will run different tests based on whether it's a `release-please` pull request or just a regular pull request. If it's a `release-please` pull request, it will run all make targets starting with `test-ci-` and `test-release-` by default. If it's a regular pull request, it will run all make targets starting with `test-ci-` by default. If the tests fail, the PR will be removed from the merge queue and will stay open. If the tests pass, the PR will be merged to `main` and the PR will be closed. 29 | 8. If the issue is fully resolved, close it. _Hint: You can add "Closes #XXX" to the PR description to automatically close the issue when the PR is merged._ 30 | 31 | ### Pre-Commit Hooks 32 | 33 | This project uses [pre-commit](https://pre-commit.com/) to run a set of checks on your code before you commit it. You have the option to either install pre-commit and all other needed tools locally or use our docker-based build harness. To use the build harness, run 34 | 35 | ```shell 36 | make run-pre-commit-hooks 37 | ``` 38 | > NOTE: Sometimes file ownership of stuff in the `.cache` folder can get messed up. You can optionally add the `fix-cache-permissions` target to the above command to fix that. It is idempotent so it is safe to run it every time.
39 | 40 | ### Commit Messages 41 | 42 | Because we use the [release-please](https://github.com/googleapis/release-please) bot, commit messages to `main` must follow the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) specification. This is enforced by the [commitlint](https://commitlint.js.org/#/) tool. This requirement is only enforced on the `main` branch. Commit messages in PRs can be whatever you want them to be. "Squash" mode must be used when merging a PR, with a commit message that follows the Conventional Commits specification. 43 | 44 | ### Release Process 45 | 46 | This repo uses the [release-please](https://github.com/googleapis/release-please) bot. Release-please will automatically open a PR to update the version of the repo when a commit is merged to `main` that follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) specification. The bot will automatically keep the PR up to date until a human merges it. When that happens, the bot will automatically create a new release. 47 | 48 | ### Backlog Management 49 | 50 | - We use [GitHub Issues](https://github.com/defenseunicorns/delivery-aws-iac/issues) to manage our backlog. 51 | - Issues need to meet our Definition of Ready (see below). If an issue does not meet the Definition of Ready, we may close it and ask the requester to re-open it once it does. 52 | 53 | #### Definition of Ready for a Backlog Item 54 | 55 | To meet the Definition of Ready, the issue needs to answer the following questions: 56 | - Who is requesting it? 57 | - What is being requested? 58 | - Why is it needed? 59 | - What is the impact? What will happen if the request is not fulfilled? 60 | - How do we know that we are done? 61 | 62 | This can take various forms, and we don't care which form the issue takes as long as it answers the questions above.
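The Conventional Commits requirement described above can be sketched as a simple header check. This is a rough approximation of what commitlint enforces; the real rule set is configurable, and the type list below mirrors common config-conventional defaults rather than this repo's exact configuration:

```shell
# Approximate check for a Conventional Commits header such as
# "fix(eks): handle empty nodegroup list".
header_ok() {
  printf '%s' "$1" | grep -qE '^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([a-z0-9-]+\))?!?: .+'
}

header_ok 'fix(eks): handle empty nodegroup list' && echo 'valid header'
header_ok 'updated some stuff' || echo 'invalid header'
```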
63 | -------------------------------------------------------------------------------- /examples/complete/access-logging.tf: -------------------------------------------------------------------------------- 1 | # TODO: Evaluate whether this should all go into a new module 2 | 3 | # Create a KMS key and corresponding alias. This KMS key will be used whenever encryption is needed in creating this infrastructure deployment 4 | resource "aws_kms_key" "default" { 5 | description = "SSM Key" 6 | deletion_window_in_days = var.kms_key_deletion_window 7 | enable_key_rotation = true 8 | policy = data.aws_iam_policy_document.kms_access.json 9 | tags = local.tags 10 | multi_region = true 11 | } 12 | 13 | resource "aws_kms_alias" "default" { 14 | name_prefix = local.kms_key_alias_name_prefix 15 | target_key_id = aws_kms_key.default.key_id 16 | } 17 | 18 | # Create custom policy for KMS 19 | data "aws_iam_policy_document" "kms_access" { 20 | # checkov:skip=CKV_AWS_111: todo reduce perms on key 21 | # checkov:skip=CKV_AWS_109: todo be more specific with resources 22 | # checkov:skip=CKV_AWS_356: todo be more specific with kms resources 23 | statement { 24 | sid = "KMS Key Default" 25 | principals { 26 | type = "AWS" 27 | identifiers = [ 28 | "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:root" 29 | ] 30 | } 31 | 32 | actions = [ 33 | "kms:*", 34 | ] 35 | 36 | resources = ["*"] 37 | } 38 | statement { 39 | sid = "CloudWatchLogsEncryption" 40 | principals { 41 | type = "Service" 42 | identifiers = ["logs.${var.region}.amazonaws.com"] 43 | } 44 | actions = [ 45 | "kms:Encrypt*", 46 | "kms:Decrypt*", 47 | "kms:ReEncrypt*", 48 | "kms:GenerateDataKey*", 49 | "kms:Describe*", 50 | ] 51 | 52 | resources = ["*"] 53 | } 54 | statement { 55 | sid = "Cloudtrail KMS permissions" 56 | principals { 57 | type = "Service" 58 | identifiers = [ 59 | "cloudtrail.amazonaws.com" 60 | ] 61 | } 62 | actions = [ 63 | "kms:Encrypt*", 64 | "kms:Decrypt*", 65 
| "kms:ReEncrypt*", 66 | "kms:GenerateDataKey*", 67 | "kms:Describe*", 68 | ] 69 | resources = ["*"] 70 | } 71 | } 72 | 73 | # Create S3 bucket for access logs with versioning, encryption, blocked public access enabled 74 | resource "aws_s3_bucket" "access_log_bucket" { 75 | # checkov:skip=CKV_AWS_144: Cross region replication is overkill 76 | # checkov:skip=CKV_AWS_18: "Ensure the S3 bucket has access logging enabled" -- This is the access logging bucket. Logging to the logging bucket would cause an infinite loop. 77 | bucket_prefix = local.access_logging_name_prefix 78 | force_destroy = true 79 | tags = local.tags 80 | 81 | lifecycle { 82 | precondition { 83 | condition = length(local.access_logging_name_prefix) <= 37 84 | error_message = "Bucket name prefixes may not be longer than 37 characters." 85 | } 86 | } 87 | } 88 | 89 | resource "aws_s3_bucket_versioning" "access_log_bucket" { 90 | bucket = aws_s3_bucket.access_log_bucket.id 91 | 92 | versioning_configuration { 93 | status = "Enabled" 94 | } 95 | } 96 | 97 | resource "aws_s3_bucket_server_side_encryption_configuration" "access_log_bucket" { 98 | bucket = aws_s3_bucket.access_log_bucket.id 99 | 100 | rule { 101 | apply_server_side_encryption_by_default { 102 | kms_master_key_id = aws_kms_key.default.arn 103 | sse_algorithm = "aws:kms" 104 | } 105 | } 106 | } 107 | 108 | resource "aws_s3_bucket_public_access_block" "access_log_bucket" { 109 | bucket = aws_s3_bucket.access_log_bucket.id 110 | block_public_acls = true 111 | block_public_policy = true 112 | ignore_public_acls = true 113 | restrict_public_buckets = true 114 | } 115 | 116 | resource "aws_s3_bucket_lifecycle_configuration" "access_log_bucket" { 117 | bucket = aws_s3_bucket.access_log_bucket.id 118 | 119 | rule { 120 | id = "delete_after_X_days" 121 | status = "Enabled" 122 | 123 | expiration { 124 | days = var.access_log_expire_days 125 | } 126 | } 127 | 128 | rule { 129 | id = "abort_incomplete_multipart_upload" 130 | status = "Enabled" 131 | 
abort_incomplete_multipart_upload { 132 | days_after_initiation = 7 133 | } 134 | } 135 | } 136 | 137 | resource "aws_sqs_queue" "access_log_queue" { 138 | count = var.enable_sqs_events_on_access_log_access ? 1 : 0 139 | name = local.access_log_sqs_queue_name 140 | kms_master_key_id = aws_kms_key.default.arn 141 | kms_data_key_reuse_period_seconds = 300 142 | visibility_timeout_seconds = 300 143 | 144 | policy = < 67 | make target and region to run e2e tests on, must be JSON formatted 68 | If not provided, upstream workflow will run all make targets in the caller repo's Makefile that start with 'test-ci-' 69 | These tests are run when: 70 | - In the merge queue when merging into main 71 | - In the merge queue when merging into a release-please branch that would cut a release of the repo 72 | 73 | example input: 74 | [ 75 | { 76 | "make-target": "test-ci-complete-common", 77 | "region": "us-east-2" 78 | } 79 | ] 80 | type: string 81 | required: false 82 | release-e2e-test-matrix: 83 | description: > 84 | make target and region to run e2e tests on, must be JSON formatted 85 | If not provided, upstream workflow will run all make targets in the caller repo's Makefile that start with 'test-release-' and 'test-ci-' 86 | These tests are run when: 87 | - In the merge queue when merging into a release-please branch that would cut a release of the repo 88 | 89 | example input: 90 | [ 91 | { 92 | "make-target": "test-ci-complete-common", 93 | "region": "us-east-2" 94 | }, 95 | { 96 | "make-target": "test-release-complete-common", 97 | "region": "us-gov-west-1" 98 | }, 99 | { 100 | "make-target": "test-release-complete-whatever", 101 | "region": "us-gov-east-1" 102 | } 103 | ] 104 | type: string 105 | required: false 106 | e2e-required-status-check: 107 | description: "status check to report when e2e tests are complete" 108 | type: string 109 | required: false 110 | default: '["e2e-tests"]' 111 | ``` 112 | 113 | Essentially, caller repos can input both or none of these and it
will work. They are overridable if needed to keep our pipelines DRY. 114 | 115 | ### e2e-test.yml reusable workflow 116 | This workflow is used by the `pr-merge-group-test.yml` reusable workflow and the `/test` slash command dispatch workflow pattern. 117 | #### inputs 118 | 119 | `e2e-test-matrix`: 120 | - This matrix of jobs has precedence over the default e2e test matrices generated by the caller repo's makefile 121 | 122 | `release-branch`: 123 | - Defaults to false 124 | - This boolean switch determines whether CI will add the `test-release-` make targets to the e2e-test matrix 125 | 126 | ### pre-commit.yml reusable workflow 127 | 128 | `pre-commit checks`: 129 | - This matrix of jobs will still be run in PRs, when merging into main, and before a release is cut 130 | - Reworked to fetch the caller repo's makefile and find all make targets starting with `pre-commit-` (excluding `pre-commit-all`) 131 | 132 | ### slash command dispatch 133 | 134 | `/test` changes: 135 | 136 | From a PR comment, a user with write access to the repo can: 137 | 138 | run a single test 139 | `/test make= region=` 140 | 141 | run a ping test and print debug info to the pipeline logs 142 | `/test ping` 143 | 144 | run all tests in the makefile beginning with `test-ci-` 145 | `/test` 146 | 147 | ## Consequences 148 | 149 | - Pipelines will scale to the caller repo's needs if these makefile patterns are followed. 150 | - Logic is simplified in the pipelines. 151 | - Repo management is simplified because required status checks are standardized. 152 | - Hard logic is dictated by the makefile. The pipeline will report status from the inputs fed to it by the caller workflows.
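The prefix-based target discovery this ADR describes can be sketched as follows. The sample Makefile and file path are illustrative stand-ins; the real workflows fetch the caller repo's Makefile:

```shell
# Build the default e2e matrix the way the reusable workflows are described
# to: collect every make target whose name starts with a given prefix.
cat > /tmp/sample-makefile <<'EOF'
test-ci-complete-insecure:
test-release-complete-secure:
pre-commit-golang:
pre-commit-all:
EOF

# Regular PRs / merge_group: targets starting with "test-ci-"
grep -oE '^test-ci-[A-Za-z0-9_-]+' /tmp/sample-makefile

# Release branches additionally pick up "test-release-" targets
grep -oE '^test-release-[A-Za-z0-9_-]+' /tmp/sample-makefile
```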
153 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | include .env 2 | 3 | .DEFAULT_GOAL := help 4 | 5 | # Optionally add the "-it" flag for docker run commands if the env var "CI" is not set (meaning we are on a local machine and not in github actions) 6 | TTY_ARG := 7 | ifndef CI 8 | TTY_ARG := -it 9 | endif 10 | 11 | # Silent mode by default. Run `make VERBOSE=1` to turn off silent mode. 12 | ifndef VERBOSE 13 | .SILENT: 14 | endif 15 | 16 | # Idiomatic way to force a target to always run, by having it depend on this dummy target 17 | FORCE: 18 | 19 | .PHONY: help 20 | help: ## Show a list of all targets 21 | grep -E '^[a-zA-Z0-9_-]+:.*?## .*$$' $(MAKEFILE_LIST) \ 22 | | sed -n 's/^\(.*\): \(.*\)##\(.*\)/\1:\3/p' \ 23 | | column -t -s ":" 24 | 25 | .PHONY: _create-folders 26 | _create-folders: 27 | mkdir -p .cache/docker 28 | mkdir -p .cache/pre-commit 29 | mkdir -p .cache/go 30 | mkdir -p .cache/go-build 31 | mkdir -p .cache/tmp 32 | mkdir -p .cache/.terraform.d/plugin-cache 33 | mkdir -p .cache/.zarf-cache 34 | 35 | .PHONY: _test-all 36 | _test-all: _create-folders 37 | # import any TF_VAR_ environment variables into the docker container. 38 | echo "Running automated tests. This will take several minutes. At times it does not log anything to the console. 
If you interrupt the test run you will need to log into AWS console and manually delete any orphaned infrastructure.";\ 39 | TF_VARS=$$(env | grep '^TF_VAR_' | awk -F= '{printf "-e %s ", $$1}'); \ 40 | docker run $(TTY_ARG) --rm \ 41 | --cap-add=NET_ADMIN \ 42 | --cap-add=NET_RAW \ 43 | -v "${PWD}:/app" \ 44 | -v "${PWD}/.cache/tmp:/tmp" \ 45 | -v "${PWD}/.cache/go:/root/go" \ 46 | -v "${PWD}/.cache/go-build:/root/.cache/go-build" \ 47 | -v "${PWD}/.cache/.terraform.d/plugin-cache:/root/.terraform.d/plugin-cache" \ 48 | -v "${PWD}/.cache/.zarf-cache:/root/.zarf-cache" \ 49 | --workdir "/app" \ 50 | -e TF_LOG_PATH \ 51 | -e TF_LOG \ 52 | -e GOPATH=/root/go \ 53 | -e GOCACHE=/root/.cache/go-build \ 54 | -e TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE=true \ 55 | -e TF_PLUGIN_CACHE_DIR=/root/.terraform.d/plugin-cache \ 56 | -e AWS_REGION \ 57 | -e AWS_DEFAULT_REGION \ 58 | -e AWS_ACCESS_KEY_ID \ 59 | -e AWS_SECRET_ACCESS_KEY \ 60 | -e AWS_SESSION_TOKEN \ 61 | -e AWS_SECURITY_TOKEN \ 62 | -e AWS_SESSION_EXPIRATION \ 63 | -e SKIP_SETUP \ 64 | -e SKIP_TEST \ 65 | -e SKIP_TEARDOWN \ 66 | $${TF_VARS} \ 67 | ${BUILD_HARNESS_REPO}:${BUILD_HARNESS_VERSION} \ 68 | bash -c 'git config --global --add safe.directory /app && cd examples/complete && tofu init -upgrade=true && cd ../../test/e2e && go test -count 1 -v $(EXTRA_TEST_ARGS) .' 69 | 70 | .PHONY: bastion-connect 71 | bastion-connect: _create-folders ## To be used after deploying "secure mode" of examples/complete. It (a) creates a tunnel through the bastion host using sshuttle, and (b) sets up the KUBECONFIG so that the EKS cluster is able to be interacted with. Requires the standard AWS cred environment variables to be set. We recommend using 'aws-vault' to set them. 72 | # TODO: Figure out a better way to deal with the bastion's SSH password. 
Ideally it should come from a terraform output but you can't directly pass inputs to outputs (at least not when you are using "-target") 73 | docker run $(TTY_ARG) --rm \ 74 | --cap-add=NET_ADMIN \ 75 | --cap-add=NET_RAW \ 76 | -v "${PWD}:/app" \ 77 | -v "${PWD}/.cache/tmp:/tmp" \ 78 | -v "${PWD}/.cache/go:/root/go" \ 79 | -v "${PWD}/.cache/go-build:/root/.cache/go-build" \ 80 | -v "${PWD}/.cache/.terraform.d/plugin-cache:/root/.terraform.d/plugin-cache" \ 81 | -v "${PWD}/.cache/.zarf-cache:/root/.zarf-cache" \ 82 | --workdir "/app/examples/complete" \ 83 | -e TF_LOG_PATH \ 84 | -e TF_LOG \ 85 | -e GOPATH=/root/go \ 86 | -e GOCACHE=/root/.cache/go-build \ 87 | -e TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE=true \ 88 | -e TF_PLUGIN_CACHE_DIR=/root/.terraform.d/plugin-cache \ 89 | -e AWS_REGION \ 90 | -e AWS_DEFAULT_REGION \ 91 | -e AWS_ACCESS_KEY_ID \ 92 | -e AWS_SECRET_ACCESS_KEY \ 93 | -e AWS_SESSION_TOKEN \ 94 | -e AWS_SECURITY_TOKEN \ 95 | -e AWS_SESSION_EXPIRATION \ 96 | ${BUILD_HARNESS_REPO}:${BUILD_HARNESS_VERSION} \ 97 | bash -c 'git config --global --add safe.directory /app \ 98 | && tofu init -upgrade=true \ 99 | && sshuttle -D -e '"'"'sshpass -p "my-password" ssh -q -o CheckHostIP=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand="aws ssm --region $(shell cd examples/complete && terraform output -raw bastion_region) start-session --target %h --document-name AWS-StartSSHSession --parameters portNumber=%p"'"'"' --dns --disable-ipv6 -vr ec2-user@$(shell cd examples/complete && terraform output -raw bastion_instance_id) $(shell cd examples/complete && terraform output -raw vpc_cidr) \ 100 | && aws eks --region $(shell cd examples/complete && terraform output -raw bastion_region) update-kubeconfig --name $(shell cd examples/complete && terraform output -raw eks_cluster_name) \ 101 | && echo "SShuttle is running and KUBECONFIG has been set. Try running kubectl get nodes." 
\ 102 | && bash' 103 | 104 | .PHONY: test 105 | test: ## Run all automated tests. Requires access to an AWS account. Costs real money. 106 | $(MAKE) _test-all EXTRA_TEST_ARGS="-timeout 3h" 107 | 108 | .PHONY: test-ci-complete-insecure 109 | test-ci-complete-insecure: ## Run one test (TestExamplesCompleteInsecure). Requires access to an AWS account. Costs real money. 110 | $(eval export TF_VAR_region := $(or $(REGION),$(TF_VAR_region),us-east-2)) 111 | $(MAKE) _test-all EXTRA_TEST_ARGS="-timeout 3h -run TestExamplesCompleteInsecure" 112 | 113 | .PHONY: test-release-complete-secure 114 | test-release-complete-secure: ## Run one test (TestExamplesCompleteSecure). Requires access to an AWS account. Costs real money. 115 | $(eval export TF_VAR_region := $(or $(REGION),$(TF_VAR_region),us-gov-west-1)) 116 | $(MAKE) _test-all EXTRA_TEST_ARGS="-timeout 3h -run TestExamplesCompleteSecure" 117 | 118 | .PHONY: test-complete-plan-only 119 | test-complete-plan-only: ## Run one test (TestExamplesCompletePlanOnly). Requires access to an AWS account. It will not cost money or create any resources since it is just running `terraform plan`. 
120 | $(eval export TF_VAR_region := $(or $(REGION),$(TF_VAR_region),us-east-2)) 121 | $(MAKE) _test-all EXTRA_TEST_ARGS="-timeout 2h -run TestExamplesCompletePlanOnly" 122 | 123 | .PHONY: docker-save-build-harness 124 | docker-save-build-harness: _create-folders ## Pulls the build harness docker image and saves it to a tarball 125 | docker pull ${BUILD_HARNESS_REPO}:${BUILD_HARNESS_VERSION} 126 | docker save -o .cache/docker/build-harness.tar ${BUILD_HARNESS_REPO}:${BUILD_HARNESS_VERSION} 127 | 128 | .PHONY: docker-load-build-harness 129 | docker-load-build-harness: ## Loads the saved build harness docker image 130 | docker load -i .cache/docker/build-harness.tar 131 | 132 | .PHONY: _runhooks 133 | _runhooks: _create-folders 134 | docker run $(TTY_ARG) --rm \ 135 | -v "${PWD}:/app" \ 136 | -v "${PWD}/.cache/tmp:/tmp" \ 137 | -v "${PWD}/.cache/go:/root/go" \ 138 | -v "${PWD}/.cache/go-build:/root/.cache/go-build" \ 139 | -v "${PWD}/.cache/.terraform.d/plugin-cache:/root/.terraform.d/plugin-cache" \ 140 | -v "${PWD}/.cache/.zarf-cache:/root/.zarf-cache" \ 141 | --workdir "/app" \ 142 | -e GOPATH=/root/go \ 143 | -e GOCACHE=/root/.cache/go-build \ 144 | -e TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE=true \ 145 | -e TF_PLUGIN_CACHE_DIR=/root/.terraform.d/plugin-cache \ 146 | -e "SKIP=$(SKIP)" \ 147 | -e "PRE_COMMIT_HOME=/app/.cache/pre-commit" \ 148 | ${BUILD_HARNESS_REPO}:${BUILD_HARNESS_VERSION} \ 149 | bash -c 'git config --global --add safe.directory /app && pre-commit run -a --show-diff-on-failure $(HOOK)' 150 | 151 | .PHONY: pre-commit-all 152 | pre-commit-all: ## Run all pre-commit hooks. Returns nonzero exit code if any hooks fail. Uses Docker for maximum compatibility 153 | $(MAKE) _runhooks HOOK="" SKIP="" 154 | 155 | .PHONY: pre-commit-terraform 156 | pre-commit-terraform: ## Run the terraform pre-commit hooks. Returns nonzero exit code if any hooks fail. 
Uses Docker for maximum compatibility 157 | $(MAKE) _runhooks HOOK="" SKIP="check-added-large-files,check-merge-conflict,detect-aws-credentials,detect-private-key,end-of-file-fixer,fix-byte-order-marker,trailing-whitespace,check-yaml,fix-smartquotes,go-fmt,golangci-lint,renovate-config-validator" 158 | 159 | .PHONY: pre-commit-golang 160 | pre-commit-golang: ## Run the golang pre-commit hooks. Returns nonzero exit code if any hooks fail. Uses Docker for maximum compatibility 161 | $(MAKE) _runhooks HOOK="" SKIP="check-added-large-files,check-merge-conflict,detect-aws-credentials,detect-private-key,end-of-file-fixer,fix-byte-order-marker,trailing-whitespace,check-yaml,fix-smartquotes,terraform_fmt,tofu_docs,tofu_checkov,terraform_tflint,renovate-config-validator" 162 | 163 | .PHONY: pre-commit-renovate 164 | pre-commit-renovate: ## Run the renovate pre-commit hooks. Returns nonzero exit code if any hooks fail. Uses Docker for maximum compatibility 165 | $(MAKE) _runhooks HOOK="renovate-config-validator" SKIP="" 166 | 167 | .PHONY: pre-commit-common 168 | pre-commit-common: ## Run the common pre-commit hooks. Returns nonzero exit code if any hooks fail. Uses Docker for maximum compatibility 169 | $(MAKE) _runhooks HOOK="" SKIP="go-fmt,golangci-lint,terraform_fmt,tofu_docs,tofu_checkov,terraform_tflint,renovate-config-validator" 170 | 171 | .PHONY: fix-cache-permissions 172 | fix-cache-permissions: ## Fixes the permissions on the pre-commit cache 173 | docker run $(TTY_ARG) --rm -v "${PWD}:/app" --workdir "/app" -e "PRE_COMMIT_HOME=/app/.cache/pre-commit" ${BUILD_HARNESS_REPO}:${BUILD_HARNESS_VERSION} chmod -R a+rx .cache 174 | 175 | .PHONY: autoformat 176 | autoformat: ## Update files with automatic formatting tools. Uses Docker for maximum compatibility. 
177 | $(MAKE) _runhooks HOOK="" SKIP="check-added-large-files,check-merge-conflict,detect-aws-credentials,detect-private-key,check-yaml,golangci-lint,tofu_checkov,terraform_tflint,renovate-config-validator" 178 | -------------------------------------------------------------------------------- /doc/adr/0014-secrets-management.md: -------------------------------------------------------------------------------- 1 | # 14. Secrets Management 2 | 3 | Date: 2025-03-17 4 | 5 | ## Status 6 | Accepted 7 | 8 | ## Context 9 | 10 | Mission applications require robust secrets management, especially when secret values (credentials, API keys, certificates, etc.) are stored and rotated outside the cluster (e.g., in cloud secret managers or vaults). In our Kubernetes clusters, we need a way to pull in these externally managed secrets and keep them in sync. Without an automated solution, a secret update in an external store might not reflect in the cluster, leaving applications running with stale or invalid credentials. Furthermore, Kubernetes does not automatically propagate updated secret values to running pods’ environment variables, meaning applications won’t pick up new secret values unless the pods are reloaded ([Secrets store CSI driver vs external secrets in a nutshel](https://www.yuribacciarini.com/secrets-store-csi-driver-vs-external-secrets-in-a-nutshel/#:~:text=Finally%20take%20in%20mind%20that%2C,com%2Fstakater%2FReloader)). This underscores the importance of a solution that **both** updates Kubernetes secrets from external sources and refreshes dependent workloads when those secrets change. 11 | 12 | ## Decision 13 | 14 | We will use **External Secrets Operator (ESO)** in combination with **Reloader** to manage external secrets in our Kubernetes clusters. This approach was chosen for its simplicity and alignment with our deployment practices. 
Key advantages of this decision include: 15 | 16 | - **Simple configuration and minimal overhead**: External Secrets Operator runs as a single controller in the cluster, not a per-node daemon, which keeps resource usage low ([Clarity: secrets store CSI driver vs external secrets... what to use? · Issue #478 · external-secrets/external-secrets · GitHub](https://github.com/external-secrets/external-secrets/issues/478#:~:text=One%20of%20the%20differences%20is,and%20save%20resources%20as%20well)). We only need to deploy the operator and its Custom Resources. Reloader is a lightweight add-on that watches for secret changes and triggers pod restarts as needed. 17 | - **Manifests remain in code (UDS package compatibility)**: Using ESO allows us to define **ExternalSecret** resources in our configuration manifests (e.g., our UDS packages) just like any other Kubernetes object. This means we retain our Infrastructure-as-Code approach — the source-of-truth for what secrets are needed and where they come from stays in our git manifests, with no manual secret injection steps. 18 | - **Minimal prerequisites (IRSA integration)**: The only upfront requirement is an AWS IAM Role for Service Account (IRSA) or equivalent credentials setup, which grants the operator access to external secret stores (like AWS Secrets Manager). Once that IRSA role is prepared and provided at deploy time, no further external setup is needed in the cluster. We do not need to install complex storage drivers or additional node-level components. 19 | 20 | With External Secrets Operator pulling in the latest secret values and Reloader ensuring pods reload those updates, our cluster will automatically stay up-to-date with externally managed secrets. This decision strikes a balance between operational simplicity and reliability in secret management. 
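As a sketch of what this looks like in practice (the resource names, secret store, and Secrets Manager key below are hypothetical), an ExternalSecret paired with a Reloader opt-in annotation:

```yaml
# Hypothetical ExternalSecret: ESO syncs an AWS Secrets Manager entry into a
# standard Kubernetes Secret and refreshes it on an interval.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager   # a store configured with the IRSA role
  target:
    name: app-db-credentials    # the Secret ESO creates and keeps in sync
  data:
    - secretKey: password
      remoteRef:
        key: prod/app/db        # hypothetical Secrets Manager entry
        property: password
---
# Hypothetical Deployment: the Stakater Reloader annotation triggers a
# rolling restart whenever the named Secret changes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  annotations:
    secret.reloader.stakater.com/reload: "app-db-credentials"
spec:
  replicas: 1
  selector:
    matchLabels: {app: app}
  template:
    metadata:
      labels: {app: app}
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest  # placeholder image
          envFrom:
            - secretRef:
                name: app-db-credentials
```

Because the application consumes the Secret through `envFrom`, the Reloader-triggered restart is what actually delivers the rotated value to running pods.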
21 | 22 | ### Alternatives Considered 23 | 24 | - **External Secrets Operator (ESO) – *Chosen***: ESO natively integrates with external secret managers and was ultimately selected for its ease of use and lightweight footprint. It leverages Kubernetes Custom Resource Definitions to track external secrets and automatically creates/updates standard Kubernetes Secret objects. This means our existing apps can consume secrets as usual, but now those secrets stay in sync with the external source. The operator’s simple deployment (no per-node agents) and compatibility with our manifest-driven workflow made it the preferred choice. 25 | 26 | - **Secrets Store CSI Driver (AWS Secrets CSI)**: We considered Kubernetes Secrets Store CSI Driver with an AWS Secrets Manager provider. This solution was **not chosen** because it introduces additional overhead and complexity. The CSI driver runs as a DaemonSet on every node (along with provider pods), often with elevated privileges ([Secrets store CSI driver vs external secrets in a nutshel](https://www.yuribacciarini.com/secrets-store-csi-driver-vs-external-secrets-in-a-nutshel/#:~:text=,pod)). While it can mount secrets as volumes or create Secrets, using it typically requires modifying application manifests to use CSI volumes or enabling secret syncing. This represents a heavier lift and new moving parts in the cluster. In contrast, ESO provides a more straightforward pull-and-create mechanism for secrets without per-node infrastructure, reducing complexity and resource usage ([Clarity: secrets store CSI driver vs external secrets... what to use? · Issue #478 · external-secrets/external-secrets · GitHub](https://github.com/external-secrets/external-secrets/issues/478#:~:text=One%20of%20the%20differences%20is,and%20save%20resources%20as%20well)). 27 | 28 | - **Manual Secret Management (e.g. 
scheduled `uds run`)**: Another alternative was to handle external secrets by manually retrieving and applying them to the cluster (for example, running `uds run` on a schedule or during deployments to update secrets). We rejected this approach due to its operational drawbacks. Relying on manual or scheduled updates would require continual human involvement or custom scripting, increasing the chance of delays or errors. There’s a risk that a secret could be updated externally and not reflected in the cluster until the next manual run, potentially causing application failures or security exposure. This method does not scale well and adds operational burden, whereas an operator-based solution automates the process continuously. 29 | 30 | **Security Considerations**: 31 | Security and compliance were factored into this decision, especially regarding container images and access controls: 32 | 33 | - **External Secrets Operator**: The ESO container image is available in trusted repositories (it’s included in Chainguard’s secure image catalog and is also present in IronBank). This means it has been vetted for security vulnerabilities and can meet our organization’s compliance requirements. We will use the approved image and follow best practices (least-privilege IAM role, Kubernetes RBAC restrictions for the operator) to ensure the operator only accesses the secrets it needs. 34 | 35 | - **Reloader**: The Reloader component (e.g., Stakater’s Reloader) is available as a Chainguard image, which provides a high level of supply-chain security. However, it is not currently listed in IronBank’s container catalog. This implies that we may need to get Reloader separately approved or use the Chainguard image with our internal accreditation. We acknowledge this as a minor concern and will mitigate it by tracking Reloader’s image source and updates closely. 
Notably, Reloader’s functionality is simple (watching for secret/configmap changes and triggering rollouts), and it doesn’t require elevated privileges, which limits its security risk profile. 36 | 37 | **Consequences**: 38 | Adopting External Secrets Operator and Reloader has several consequences, both positive outcomes and trade-offs: 39 | 40 | - **Automated secret rotation**: Our cluster resources will always reflect the latest external secret values. When an external secret (e.g., in AWS Secrets Manager) changes, ESO will quickly sync the new value into the corresponding Kubernetes Secret. This reduces the risk of applications using outdated credentials and improves our security posture by ensuring rotations actually propagate. 41 | 42 | - **Automatic application reloads**: With Reloader in place, any update to a Kubernetes Secret (or ConfigMap) will trigger a rolling restart of pods that reference that secret. Applications will seamlessly pick up new secrets without manual intervention. This greatly simplifies operations — for example, if a certificate or password is rotated, the new value is used by pods within minutes, with no need to orchestrate a manual rollout. We must ensure our workloads handle restarts gracefully, but in exchange we get a more self-healing and up-to-date system. 43 | 44 | - **Operational simplicity**: This solution fits into our existing deployment model (manifests and GitOps). Development teams don’t need to learn new manual processes or volume mounting patterns to use external secrets — they continue to define required secrets in manifests, and the system handles the rest. It offloads the complexity of secret management to the operator, which is its core purpose. 45 | 46 | - **Additional components to manage**: By implementing this decision, we introduce two new components in our clusters (ESO and Reloader). This comes with a small overhead of monitoring these components and keeping them updated. 
The team will need to include the operator and reloader in our upgrade/testing cycle to ensure they remain compatible with cluster upgrades. However, these components are well-maintained open-source projects, and the burden of running them is low compared to the effort saved in manual secret management. 47 | 48 | - **Compliance trade-off**: Because Reloader is not in IronBank, there is a slight deviation from using only IronBank-approved images. The consequence is that we will either use the Chainguard-provided Reloader image or build our own hardened image. We judge this trade-off acceptable given Reloader’s benefits, but it will be documented and reviewed by our security team. In the future, if an IronBank-certified Reloader becomes available or if ESO introduces a similar reload feature, we can re-evaluate and potentially phase out the separate Reloader component. 49 | 50 | In summary, the decision to use External Secrets Operator with Reloader provides a **secure, automated, and maintainable** way to manage externally sourced secrets. It significantly reduces manual work and the risk of secret drift, at the cost of running two additional lightweight services in the cluster. This trade-off is justified by the improvements in security hygiene and operational efficiency for our Kubernetes environments. 51 | 52 | ## Consequences 53 | 54 | - Update the UDS Base Bundle deployed to clusters to include the External Secrets Operator and Reloader packages. 
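As a sketch of how the two components interact (resource names, the secret store, and the Secrets Manager key below are illustrative assumptions, not the actual UDS configuration):

```yaml
# Hypothetical ExternalSecret: ESO reads prod/app/db from AWS Secrets Manager
# and creates/updates a standard Kubernetes Secret named app-db-credentials.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-db-credentials
spec:
  refreshInterval: 1h            # how often ESO re-syncs from the external store
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager    # assumed store, configured separately with IAM access
  target:
    name: app-db-credentials     # the Kubernetes Secret that ESO manages
  data:
    - secretKey: password
      remoteRef:
        key: prod/app/db         # key in AWS Secrets Manager
        property: password
---
# Hypothetical Deployment annotated for Stakater Reloader: when the referenced
# Secret changes, Reloader triggers a rolling restart of the pods.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  annotations:
    secret.reloader.stakater.com/reload: "app-db-credentials"
spec:
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest   # placeholder image
          envFrom:
            - secretRef:
                name: app-db-credentials
```

Together these give the end-to-end flow described above: an external rotation lands in the Kubernetes Secret via ESO, and Reloader rolls the workload so pods pick up the new value.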
55 | -------------------------------------------------------------------------------- /examples/complete/variables.tf: -------------------------------------------------------------------------------- 1 | ########################################################### 2 | ################## Global Settings ######################## 3 | 4 | variable "region" { 5 | description = "The AWS region to deploy into" 6 | type = string 7 | } 8 | 9 | variable "name_prefix" { 10 | description = "The prefix to use when naming all resources" 11 | type = string 12 | default = "ex-complete" 13 | validation { 14 | condition = length(var.name_prefix) <= 20 15 | error_message = "The name prefix cannot be more than 20 characters" 16 | } 17 | } 18 | 19 | variable "iam_role_permissions_boundary" { 20 | description = "ARN of the policy that is used to set the permissions boundary for IAM roles" 21 | type = string 22 | default = null 23 | } 24 | 25 | variable "aws_admin_usernames" { 26 | description = "A list of one or more AWS usernames with authorized access to KMS and EKS resources; the user running Terraform is automatically added as an admin" 27 | type = list(string) 28 | default = [] 29 | } 30 | 31 | variable "tags" { 32 | description = "A map of tags to apply to all resources" 33 | type = map(string) 34 | default = {} 35 | } 36 | 37 | variable "kms_key_deletion_window" { 38 | description = "Waiting period for scheduled KMS Key deletion. Can be 7-30 days." 39 | type = number 40 | default = 7 41 | } 42 | 43 | variable "access_log_expire_days" { 44 | description = "Number of days to wait before deleting access logs" 45 | type = number 46 | default = 30 47 | } 48 | 49 | variable "enable_sqs_events_on_access_log_access" { 50 | description = "If true, generates an SQS event whenever an object is created in the Access Log bucket, which happens whenever a server access log is generated by any entity. This will potentially generate a lot of events, so use with caution."
51 | type = bool 52 | default = false 53 | } 54 | 55 | variable "eks_use_mfa" { 56 | description = "Use MFA for auth_eks_role" 57 | type = bool 58 | } 59 | 60 | ########################################################### 61 | #################### VPC Config ########################### 62 | variable "vpc_cidr" { 63 | description = "The CIDR block for the VPC" 64 | type = string 65 | } 66 | 67 | variable "secondary_cidr_blocks" { 68 | description = "A list of secondary CIDR blocks for the VPC" 69 | type = list(string) 70 | default = [] 71 | } 72 | 73 | variable "num_azs" { 74 | description = "The number of AZs to use" 75 | type = number 76 | default = 3 77 | } 78 | 79 | ########################################################### 80 | #################### EKS Config ########################### 81 | variable "access_entries" { 82 | description = "Map of access entries to add to the cluster" 83 | type = any 84 | default = {} 85 | } 86 | 87 | variable "authentication_mode" { 88 | description = "The authentication mode for the cluster. 
Valid values are `CONFIG_MAP`, `API` or `API_AND_CONFIG_MAP`" 89 | type = string 90 | default = "API" 91 | } 92 | 93 | variable "enable_cluster_creator_admin_permissions" { 94 | description = "Indicates whether or not to add the cluster creator (the identity used by Terraform) as an administrator via access entry" 95 | type = bool 96 | default = true 97 | } 98 | 99 | variable "eks_worker_tenancy" { 100 | description = "The tenancy of the EKS worker nodes" 101 | type = string 102 | default = "default" 103 | } 104 | 105 | variable "cluster_version" { 106 | description = "Kubernetes version to use for EKS cluster" 107 | type = string 108 | # renovate: datasource=endoflife-date depName=amazon-eks versioning=loose extractVersion=^(?<version>.*)-eks.+$ 109 | default = "1.27" 110 | } 111 | 112 | variable "cluster_endpoint_public_access" { 113 | description = "Whether to enable public access to the EKS cluster endpoint" 114 | type = bool 115 | default = false 116 | } 117 | 118 | variable "enable_eks_managed_nodegroups" { 119 | description = "Enable managed node groups" 120 | type = bool 121 | } 122 | 123 | variable "enable_self_managed_nodegroups" { 124 | description = "Enable self managed node groups" 125 | type = bool 126 | } 127 | 128 | variable "dataplane_wait_duration" { 129 | description = "The duration to wait for the EKS cluster to be ready before creating the node groups" 130 | type = string 131 | default = "30s" 132 | } 133 | 134 | ########################################################### 135 | ################## EKS Addons Config ###################### 136 | 137 | variable "cluster_addons" { 138 | description = <<-EOD 139 | Nested map of EKS native add-ons and their associated parameters. 140 | See https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_add-on for supported values. 141 | See https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/complete/main.tf#L44-L60 for upstream example.
142 | 143 | To see the EKS Marketplace add-ons available for your cluster's version, run: 144 | aws eks describe-addon-versions --kubernetes-version $k8s_cluster_version --query 'addons[].{MarketplaceProductUrl: marketplaceInformation.productUrl, Name: addonName, Owner: owner, Publisher: publisher, Type: type}' --output table 145 | EOD 146 | type = any 147 | default = {} 148 | } 149 | 150 | variable "create_kubernetes_resources" { 151 | description = "If true, kubernetes resources related to non-marketplace addons will be created" 152 | type = bool 153 | default = true 154 | } 155 | 156 | variable "create_ssm_parameters" { 157 | description = "Create SSM parameters for values from eks blueprints addons" 158 | type = bool 159 | default = true 160 | } 161 | 162 | #----------------AWS EBS CSI Driver------------------------- 163 | variable "enable_amazon_eks_aws_ebs_csi_driver" { 164 | description = "Enable EKS Managed AWS EBS CSI Driver add-on" 165 | type = bool 166 | default = false 167 | } 168 | 169 | variable "enable_gp3_default_storage_class" { 170 | description = "Enable gp3 as default storage class" 171 | type = bool 172 | default = false 173 | } 174 | 175 | variable "storageclass_reclaim_policy" { 176 | description = "Reclaim policy for gp3 storage class, valid options are Delete and Retain" 177 | type = string 178 | default = "Delete" 179 | } 180 | 181 | #----------------Metrics Server------------------------- 182 | variable "enable_metrics_server" { 183 | description = "Enable metrics server add-on" 184 | type = bool 185 | default = false 186 | } 187 | 188 | variable "metrics_server" { 189 | description = "Metrics Server config for aws-ia/eks-blueprints-addon/aws" 190 | type = any 191 | default = {} 192 | } 193 | 194 | #----------------AWS Node Termination Handler------------------------- 195 | variable "enable_aws_node_termination_handler" { 196 | description = "Enable AWS Node Termination Handler add-on" 197 | type = bool 198 | default = false 199 | }
200 | 201 | variable "aws_node_termination_handler" { 202 | description = "AWS Node Termination Handler config for aws-ia/eks-blueprints-addon/aws" 203 | type = any 204 | default = {} 205 | } 206 | 207 | #----------------Cluster Autoscaler------------------------- 208 | variable "enable_cluster_autoscaler" { 209 | description = "Enable Cluster autoscaler add-on" 210 | type = bool 211 | default = false 212 | } 213 | 214 | variable "cluster_autoscaler" { 215 | description = "Cluster Autoscaler Helm Chart config" 216 | type = any 217 | default = {} 218 | } 219 | 220 | #----------------Enable_EFS_CSI------------------------- 221 | variable "enable_amazon_eks_aws_efs_csi_driver" { 222 | description = "Enable EFS CSI add-on" 223 | type = bool 224 | default = false 225 | } 226 | 227 | variable "aws_efs_csi_driver" { 228 | description = "AWS EFS CSI Driver helm chart config" 229 | type = any 230 | default = {} 231 | } 232 | 233 | variable "reclaim_policy" { 234 | description = "Reclaim policy for EFS storage class, valid options are Delete and Retain" 235 | type = string 236 | default = "Delete" 237 | } 238 | 239 | #----------------AWS Loadbalancer Controller------------------------- 240 | variable "enable_aws_load_balancer_controller" { 241 | description = "Enable AWS Loadbalancer Controller add-on" 242 | type = bool 243 | default = false 244 | } 245 | 246 | variable "aws_load_balancer_controller" { 247 | description = "AWS Loadbalancer Controller Helm Chart config" 248 | type = any 249 | default = {} 250 | } 251 | 252 | #----------------k8s Secret Store CSI Driver------------------------- 253 | variable "enable_secrets_store_csi_driver" { 254 | description = "Enable k8s Secret Store CSI Driver add-on" 255 | type = bool 256 | default = false 257 | } 258 | 259 | variable "secrets_store_csi_driver" { 260 | description = "k8s Secret Store CSI Driver Helm Chart config" 261 | type = any 262 | default = {} 263 | } 264 | 265 | 
########################################################### 266 | ################## Bastion Config ######################### 267 | variable "enable_bastion" { 268 | description = "If true, a bastion will be created" 269 | type = bool 270 | default = true 271 | } 272 | 273 | variable "bastion_tenancy" { 274 | description = "The tenancy of the bastion" 275 | type = string 276 | default = "default" 277 | } 278 | 279 | variable "bastion_instance_type" { 280 | description = "The instance type to use for the bastion" 281 | type = string 282 | default = "m5.xlarge" 283 | } 284 | 285 | variable "bastion_ssh_user" { 286 | description = "The SSH user to use for the bastion" 287 | type = string 288 | default = "ec2-user" 289 | } 290 | 291 | variable "bastion_ssh_password" { 292 | description = "The SSH password to use for the bastion if SSM authentication is used" 293 | type = string 294 | default = "my-password" 295 | } 296 | 297 | variable "zarf_version" { 298 | description = "The version of Zarf to use" 299 | type = string 300 | default = "" 301 | } 302 | 303 | ############################################################################ 304 | ####################### DUBBD Add-on Dependencies ######################## 305 | 306 | variable "keycloak_enabled" { 307 | description = "Enable Keycloak dedicated nodegroup" 308 | type = bool 309 | default = false 310 | } 311 | 312 | ############################################################################ 313 | ################## Lambda Password Rotation Config ######################### 314 | 315 | variable "users" { 316 | description = "A list of users on your EC2 instances whose passwords need to be rotated" 317 | type = list(string) 318 | default = [] 319 | } 320 | 321 | variable "cron_schedule_password_rotation" { 322 | description = "Schedule for password change function to run on" 323 | type = string 324 | default = "cron(0 0 1 * ?
*)" 325 | } 326 | 327 | variable "slack_notification_enabled" { 328 | description = "Enable Slack notifications for the password rotation function. If enabled, a Slack webhook URL must also be provided" 329 | type = bool 330 | default = false 331 | } 332 | 333 | variable "slack_webhook_url" { 334 | description = "The Slack webhook URL used for password rotation notifications" 335 | type = string 336 | default = null 337 | } 338 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files.
29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /doc/adr/0013-opinionated-tofu-aws-wrapper-modules.md: -------------------------------------------------------------------------------- 1 | # 13. Opinionated Tofu AWS Wrapper Modules 2 | 3 | Date: 2024-10-01 4 | 5 | ## Status 6 | 7 | Accepted 8 | 9 | ## Context 10 | 11 | We have a collection of Defense Unicorns terraform modules that wrap official AWS modules. We want these modules to be more opinionated such that new missions and product workflows can consume them quickly and understand context for the design. 12 | 13 | At a high level, what questions do we need to ask when using Tofu to bootstrap infrastructure on a new mission? 14 | 15 | - Cloud (AWS, Azure, etc...) 16 | - Impact Level (DoD Cloud Computing requirements: IL4, IL5, etc...) 17 | - k8s platform (eks, k0s, k3s) 18 | - Network Type (Air-Gapped, Partially Isolated, Open) 19 | 20 | Modules with defaults based on this mission context allow for informed, standardized choices.
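The questions above suggest capturing these four answers once and deriving defaults from them. A minimal sketch using the Cloud Posse context provider (referenced later in this ADR); the property names and schema here are illustrative assumptions, not an agreed design:

```hcl
# Hypothetical mission context: one place to answer the four bootstrap
# questions, which opinionated wrapper modules then use to select defaults.
provider "context" {
  properties = {
    cloud        = { required = true } # e.g. aws, azure
    impact_level = { required = true } # e.g. il4, il5
    k8s_platform = { required = true } # e.g. eks, k0s, k3s
    network_type = { required = true } # e.g. air-gapped, partially-isolated, open
  }

  values = {
    cloud        = "aws"
    impact_level = "il5"
    k8s_platform = "eks"
    network_type = "air-gapped"
  }
}
```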
21 | 22 | We'll focus on the modules used by the reference deployments and collaborate with Delivery org technical leads to streamline choices and associate defaults to mission contexts. 23 | 24 | Initial modules: 25 | 26 | - https://github.com/defenseunicorns/terraform-aws-uds-vpc 27 | - https://github.com/defenseunicorns/terraform-aws-uds-bastion 28 | - https://github.com/defenseunicorns/terraform-aws-uds-eks 29 | 30 | Module Demo: 31 | 32 | - https://github.com/defenseunicorns/uds-reference-deployments/tree/chore/opinionated-tofu-modules-update/atmos 33 | 34 | ## Decision 35 | 36 | We will... 37 | 38 | - Make it clear to the consumer what's required and what's intended for them to decide for the mission. 39 | - We want to avoid modules that take on the responsibility of being flexible to what may or may not be needed downstream of their deployment. 40 | This doesn't mean we don't want this flexibility; it means it should live outside of architectural modules. For example, selection of Availability Zones may be a mission decision 41 | based on capabilities needed at the EKS layer, but the decision must be applied at the VPC layer. 42 | We can provide a `mission-init` module that can help dynamically select Availability Zones, instance types, etc... This approach may also be used to help organize what should be decided up 43 | front based on mission requirements. 44 | - This is one of the motivations for adding context and more opinions. Another example: VPCs may 45 | need to know the name of the EKS cluster before the cluster is created in order to set tags 46 | for internal load balancing. The context makes the names deterministic. 47 | - Add one layer with Defense Unicorns opinions around official AWS modules. (Don't wrap our wrappers.) 48 | - This layer shall be highly opinionated. We'll be removing most choices in favor of "classes" of settings 49 | based on context. 50 | - All defaults for the official AWS module shall be captured in a single file.
(i.e.: `locals-defaults-terraform-aws-vpc.tf`). 51 | This enables programmatic updates from the source module and ensures that defaults do not get changed out from under us. 52 | - We'll be removing most options in favor of decisions. This doesn't mean we can't add flexibility later, but we will start with stronger opinions with options based on UDS bundle needs. Examples... 53 | - Kubernetes version shall be fixed in the module. 54 | - Node group options will be fixed for core (specific instances labeled for Keycloak SSO). 55 | - Providing public access for bundle development or impact level 2 shall be via a transit gateway into 56 | an IL5 private deployment. 57 | - GPU node group for EKS based on LeapfrogAI requirements is still a conditional option. 58 | - Organize wrapper module vars by what's required versus optional, with secrets broken out. 59 | - Use high level cloud ownership context to guide breakdown of top level config object parameters. 60 | Make it obvious where to start with settings related to the following. 61 | - IAM 62 | - Compute 63 | - Networking 64 | - Observability 65 | - Storage 66 | - Security 67 | - UDS 68 | - Set defaults based on Impact Level using overrides from a base. 69 | - Allow for config defaults to be selected from criteria such as impact level from the global context. 70 | - Organize locals for context based configs into a separate file such that 71 | CODEOWNERS can be used to keep SMEs in the loop. 72 | - [Prefer single objects over multiple simple inputs for related configuration](https://docs.cloudposse.com/best-practices/terraform/#prefer-a-single-object-over-multiple-simple-inputs-for-related-configuration) 73 | - Don't mix secrets with non-secrets, to aid in troubleshooting. Mixing will mask non-secret data in
75 | - Use [Cloud Posse context provider](https://github.com/cloudposse/terraform-provider-context/) 76 | - for shared context between modules and applies 77 | - common attributes such as name prefix/suffix (labels), tags and other global configuration. 78 | - Avoid directly passing attributes to wrapped modules in favor of defaults organized by context. 79 | 80 | We will not... 81 | 82 | - Take on the scope to keep secrets in a store and out of module output. 83 | - It's best practices to keep secrets out of tofu state (in favor of pointers to their location in a secret store). 84 | - This can be addressed in a future ADR. 85 | 86 | ## Design Patterns 87 | 88 | - Context is set by the context provider and used for resource names and tags. 89 | - Defaults for wrapped modules are explicitly set such that they cannot change out from under us. 90 | - Each module shall have it's own file to set a local object with default settings. 91 | - The names of the attributes much match the names from the wrapped module. 92 | - File name: `locals-defaults-.tf` 93 | ``` 94 | locals { 95 | _defaults = { 96 | # Defaults generated from wrapped module 97 | # terraform-docs --sort-by required json ${path} | jq -r '.inputs[]|.name + " = " + if (.type == "string" and .default != null) then "\"" + .default + "\"" else (.default| tostring) end' 98 | } 99 | keycloak_config_defaults = { 100 | kms_config = local.aws_kms_defaults 101 | db_config = local.aws_rds_defaults 102 | } 103 | } 104 | ``` 105 | - Defense Unicorn opinions are organized by context. Impact Level shall be used for initial context with a default of IL5. 106 | 107 | - Context based overrides shall have their own files and broken down into subcategories and context as needed. 108 | When breaking into subcategories or contexts they should organize such that codeowners is used for maintainability. 
109 | - File name: `locals-overrides---.tf` 110 | 111 | ``` 112 | locals { 113 | ___overrides = { 114 | # Overrides should strive to match the attribute names of the modules they wrap as much as possible. 115 | } 116 | 117 | base_uds_keycloak_overrides = { 118 | db_config = { 119 | # Attribute names match those needed for the wrapped RDS module 120 | } 121 | kms_config = { 122 | # Attribute names match those needed for the wrapped KMS module 123 | description = "UDS Keycloak Key" 124 | } 125 | tags = data.context_tags.this.tags 126 | # other wrapper module attributes. 127 | } 128 | } 129 | ``` 130 | 131 | - An advanced overrides variable shall be provided to allow runtime override of context-based settings. 132 | - Consumers with an advanced understanding of wrapped modules and override data structures can change 133 | settings directly without the wrapper module having to expose everything inside. 134 | - TF var file 135 | ``` 136 | advanced_overrides = { 137 | kms_config = { 138 | description = "Override Keycloak Key Description" 139 | } 140 | } 141 | ``` 142 | - Overrides deep merge 143 | - A deep merge of Defaults <- Context-based overrides <- advanced overrides variable is performed 144 | before passing attributes to wrapped modules and resources.
145 | ``` 146 | locals { 147 | context_key = "impact_level" 148 | keycloak_config_contexts = { 149 | base = [local.base_uds_keycloak_overrides, ] 150 | il4 = [local.base_uds_keycloak_overrides, ] 151 | il5 = [local.base_uds_keycloak_overrides, ] 152 | devx = [local.base_uds_keycloak_overrides, local.devx_overrides] 153 | } 154 | context_overrides = local.keycloak_config_contexts[data.context_config.this.values[local.context_key]] 155 | uds_keycloak_config = module.config_deepmerge.merged 156 | } 157 | module "config_deepmerge" { 158 | source = "cloudposse/config/yaml//modules/deepmerge" 159 | version = "0.2.0" 160 | maps = concat( 161 | [local.keycloak_config_defaults], 162 | local.context_overrides, 163 | [var.advanced_overrides], 164 | ) 165 | } 166 | ``` 167 | - Wrapped Module configuration 168 | 169 | - Wrapped modules are configured with the merged defaults, context overrides and advanced overrides. 170 | ``` 171 | module "kms" { 172 | source = "terraform-aws-modules/kms/aws" 173 | version = "3.1.0" 174 | description = local.uds_keycloak_config.kms_config.description 175 | deletion_window_in_days = local.uds_keycloak_config.kms_config.deletion_window_in_days 176 | enable_key_rotation = local.uds_keycloak_config.kms_config.enable_key_rotation 177 | policy = data.aws_iam_policy_document.kms_access.json 178 | multi_region = local.uds_keycloak_config.kms_config.multi_region 179 | key_owners = local.uds_keycloak_config.kms_config.key_owners 180 | tags = local.uds_keycloak_config.kms_config.tags 181 | create_external = local.uds_keycloak_config.kms_config.create_external 182 | key_usage = local.uds_keycloak_config.kms_config.key_usage 183 | customer_master_key_spec = local.uds_keycloak_config.kms_config.customer_master_key_spec 184 | } 185 | ``` 186 | 187 | - Wrapper modules inputs 188 | 189 | - Use objects to organize classes of settings. 190 | - Sensitive data shall not be mixed with non-sensitive data in objects. 
191 | - All sensitive data must be flagged as such. 192 | - Use context for tagging and resource labels. 193 | 194 | ``` 195 | terraform { 196 | required_providers { 197 | context = { 198 | source = "registry.terraform.io/cloudposse/context" 199 | version = "~> 0.4.0" 200 | } 201 | } 202 | } 203 | // Context data sources that span modules and deploys. 204 | data "context_config" "this" {} 205 | data "context_label" "this" {} 206 | data "context_tags" "this" {} 207 | 208 | // Standardize on config objects. Use `optional()` to set defaults as needed. 209 | variable "vpc_config" { 210 | description = "Existing VPC configuration for EKS" 211 | type = object({ 212 | vpc_id = string 213 | subnet_ids = list(string) 214 | azs = list(string) 215 | private_subnets = list(string) 216 | intra_subnets = list(string) 217 | database_subnets = optional(list(string)) 218 | database_subnet_group_name = optional(string) 219 | }) 220 | } 221 | 222 | // EKS configuration options. We can put in defaults; however, defaults 223 | // should not be provided for items that need to be a mission decision. 224 | variable "eks_config_opts" { 225 | description = "EKS Configuration options to be determined by mission needs."
226 | type = object({ 227 | cluster_version = optional(string, "1.30") 228 | }) 229 | } 230 | 231 | variable "eks_sensitive_config_opts" { 232 | sensitive = true 233 | type = object({ 234 | eks_sensitive_opt1 = optional(string) 235 | eks_sensitive_opt2 = optional(string) 236 | }) 237 | } 238 | module "aws_eks" { 239 | source = "git::https://github.com/terraform-aws-modules/terraform-aws-eks.git?ref=v20.24.0" 240 | cluster_name = data.context_label.this.rendered 241 | tags = data.context_tags.this.tags 242 | } 243 | ``` 244 | 245 | - Wrapper modules outputs 246 | 247 | - Use a single object to output all non-sensitive data 248 | - Sensitive outputs should be grouped together and flagged as sensitive 249 | - Only output information deemed necessary for other modules to consume 250 | 251 | ``` 252 | output "vpc_properties" { 253 | description = "Configuration of the VPC including subnet groups, subnets, and VPC ID" 254 | value = { 255 | azs = module.vpc.azs 256 | database_subnet_group_name = module.vpc.database_subnet_group_name 257 | database_subnets = module.vpc.database_subnets 258 | intra_subnets = module.vpc.intra_subnets 259 | private_subnets = module.vpc.private_subnets 260 | private_subnets_cidr_blocks = module.vpc.private_subnets_cidr_blocks 261 | public_subnets = module.vpc.public_subnets 262 | vpc_id = module.vpc.vpc_id 263 | } 264 | } 265 | ``` 266 | 267 | - Input and Output naming 268 | 269 | - Variables should be named: 270 | - `-options` 271 | - For optional inputs 272 | - `-requirements` 273 | - For required inputs 274 | - `-advanced-overrides` 275 | - For advanced variable overrides 276 | - `-properties` 277 | - For module outputs 278 | 279 | An example of the data flow through modules connected via a stack is provided [here](https://github.com/defenseunicorns/uds-reference-deployments/blob/chore/opinionated-tofu-modules-update/atmos/stacks/catalog/aws/eks.yaml).
You can see how, in 280 | lieu of vars for name prefixes, tags, and other global config, the context is used. 281 | 282 | ## Consequences 283 | 284 | Bootstrapping new missions becomes context-driven by standards put in place by impact level. It's clear why one is choosing configuration options. 285 | 286 | We do run the risk of not exposing the appropriate config options for new missions. The `advanced_overrides` pattern mitigates that risk. 287 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # Changelog 2 | 3 | ## [0.0.12](https://github.com/defenseunicorns/delivery-aws-iac/compare/v0.0.11...v0.0.12) (2024-02-26) 4 | 5 | 6 | ### Bug Fixes 7 | 8 | * cleanup for new patterns ([#398](https://github.com/defenseunicorns/delivery-aws-iac/issues/398)) ([2f2ca52](https://github.com/defenseunicorns/delivery-aws-iac/commit/2f2ca52105526b2a93f61a40360e9f5936fc009a)) 9 | * **deps:** update all dependencies ([#393](https://github.com/defenseunicorns/delivery-aws-iac/issues/393)) ([9365089](https://github.com/defenseunicorns/delivery-aws-iac/commit/93650891ea0b0109f10557d477dd3f5cd59e54b0)) 10 | 11 | 12 | ### Miscellaneous Chores 13 | 14 | * **deps:** update all dependencies ([#385](https://github.com/defenseunicorns/delivery-aws-iac/issues/385)) ([bbfcd7b](https://github.com/defenseunicorns/delivery-aws-iac/commit/bbfcd7bc0c349e3d5b9f1d315a951e0b21bb713f)) 15 | * **deps:** update all dependencies ([#386](https://github.com/defenseunicorns/delivery-aws-iac/issues/386)) ([6c6876c](https://github.com/defenseunicorns/delivery-aws-iac/commit/6c6876cbd665bdc862d0433455c0801971d9ffa4)) 16 | * **deps:** update all dependencies ([#387](https://github.com/defenseunicorns/delivery-aws-iac/issues/387)) ([15f5614](https://github.com/defenseunicorns/delivery-aws-iac/commit/15f5614397224878ce15ac6200f3cf1987f92eb1)) 17 | * **deps:** update all dependencies
([#390](https://github.com/defenseunicorns/delivery-aws-iac/issues/390)) ([d5741eb](https://github.com/defenseunicorns/delivery-aws-iac/commit/d5741eb8cbfd5f6db78711d26c66d4184c632e2b)) 18 | * **deps:** update all dependencies ([#391](https://github.com/defenseunicorns/delivery-aws-iac/issues/391)) ([643b90f](https://github.com/defenseunicorns/delivery-aws-iac/commit/643b90f2e88d74e775d57ffcf6e39f08b082cdba)) 19 | * **deps:** update all dependencies ([#395](https://github.com/defenseunicorns/delivery-aws-iac/issues/395)) ([ef9e4f5](https://github.com/defenseunicorns/delivery-aws-iac/commit/ef9e4f54ed579ef048d4d81f4a5d01f01255466a)) 20 | * **deps:** update pre-commit hook renovatebot/pre-commit-hooks to v37.142.1 ([#394](https://github.com/defenseunicorns/delivery-aws-iac/issues/394)) ([ac42947](https://github.com/defenseunicorns/delivery-aws-iac/commit/ac42947225995d88585eb9442ef2b2ad92f23607)) 21 | * update examples from EKS module ([#392](https://github.com/defenseunicorns/delivery-aws-iac/issues/392)) ([f6cb1ed](https://github.com/defenseunicorns/delivery-aws-iac/commit/f6cb1edd262479a009cc8fc034f50c95b2a61cc6)) 22 | 23 | 24 | ### Tests 25 | 26 | * tweak test spam ([#383](https://github.com/defenseunicorns/delivery-aws-iac/issues/383)) ([9ef7714](https://github.com/defenseunicorns/delivery-aws-iac/commit/9ef7714778c15330609343cc5f202695df014bf6)) 27 | 28 | 29 | ### Continuous Integration 30 | 31 | * renovate window update and vuln handling ([#389](https://github.com/defenseunicorns/delivery-aws-iac/issues/389)) ([6e14b13](https://github.com/defenseunicorns/delivery-aws-iac/commit/6e14b132c1934973a1b0c3874d3ce5851a19b10b)) 32 | 33 | ## [0.0.11](https://github.com/defenseunicorns/delivery-aws-iac/compare/v0.0.10...v0.0.11) (2023-11-28) 34 | 35 | 36 | ### Features 37 | 38 | * migrate lambda module to its own repo ([#375](https://github.com/defenseunicorns/delivery-aws-iac/issues/375)) 
([1447362](https://github.com/defenseunicorns/delivery-aws-iac/commit/14473623f34561aaa14940855243f0ee79c88e1c)) 39 | 40 | 41 | ### Bug Fixes 42 | 43 | * **deps:** update all dependencies ([#382](https://github.com/defenseunicorns/delivery-aws-iac/issues/382)) ([58483e5](https://github.com/defenseunicorns/delivery-aws-iac/commit/58483e5ba53be677329a79314b2af1b91fd4602f)) 44 | 45 | 46 | ### Documentation 47 | 48 | * ADR 11 for ci testing refactor ([#374](https://github.com/defenseunicorns/delivery-aws-iac/issues/374)) ([f35c424](https://github.com/defenseunicorns/delivery-aws-iac/commit/f35c4244a84b38aedad855f46a36289728495c6c)) 49 | * spike/adr for e2e testing ([#354](https://github.com/defenseunicorns/delivery-aws-iac/issues/354)) ([360cc0b](https://github.com/defenseunicorns/delivery-aws-iac/commit/360cc0be575dc002478e881b4cd376516f9314ea)) 50 | * update ADR ([#378](https://github.com/defenseunicorns/delivery-aws-iac/issues/378)) ([df4a7ec](https://github.com/defenseunicorns/delivery-aws-iac/commit/df4a7ec2eefd089d4ff267c93c0cbb33c787f50d)) 51 | 52 | 53 | ### Miscellaneous Chores 54 | 55 | * **deps:** update all dependencies ([#352](https://github.com/defenseunicorns/delivery-aws-iac/issues/352)) ([122308a](https://github.com/defenseunicorns/delivery-aws-iac/commit/122308a1f73f808fe158a255d98d3de4bd5030dd)) 56 | * remove uds references ([#376](https://github.com/defenseunicorns/delivery-aws-iac/issues/376)) ([ba9f8b2](https://github.com/defenseunicorns/delivery-aws-iac/commit/ba9f8b2bdc7d88d68e967194f9b94119ebaca855)) 57 | 58 | 59 | ### Continuous Integration 60 | 61 | * e2e secure test once a week ([#379](https://github.com/defenseunicorns/delivery-aws-iac/issues/379)) ([0a56de6](https://github.com/defenseunicorns/delivery-aws-iac/commit/0a56de6e69bc64b20e41a053393be6b3fab90737)) 62 | * fix permissions issues and move release-please to reusable workflow ([#365](https://github.com/defenseunicorns/delivery-aws-iac/issues/365)) 
([f847649](https://github.com/defenseunicorns/delivery-aws-iac/commit/f84764990254ceea749651a77f5ee2d7578cdf35)) 63 | * make ci flexible ([#372](https://github.com/defenseunicorns/delivery-aws-iac/issues/372)) ([c1aa0e0](https://github.com/defenseunicorns/delivery-aws-iac/commit/c1aa0e06e37190702fc9c1c61809facd18b1af21)) 64 | * merge queue testing implementation ([#364](https://github.com/defenseunicorns/delivery-aws-iac/issues/364)) ([9ef9fcf](https://github.com/defenseunicorns/delivery-aws-iac/commit/9ef9fcff49e5bc87397cc2747649df8895562ae8)) 65 | * remove asdf install ([#367](https://github.com/defenseunicorns/delivery-aws-iac/issues/367)) ([6b57a1d](https://github.com/defenseunicorns/delivery-aws-iac/commit/6b57a1de742d4713e8729c7befc399ee5e0b8153)) 66 | * update renovate window ([#381](https://github.com/defenseunicorns/delivery-aws-iac/issues/381)) ([7c495ac](https://github.com/defenseunicorns/delivery-aws-iac/commit/7c495ac62e3a4710f84bd8ee58a1303fb0b8bab6)) 67 | 68 | ## [0.0.10](https://github.com/defenseunicorns/delivery-aws-iac/compare/v0.0.9...v0.0.10) (2023-09-13) 69 | 70 | 71 | ### Miscellaneous Chores 72 | 73 | * **deps:** update all dependencies ([#315](https://github.com/defenseunicorns/delivery-aws-iac/issues/315)) ([ccc71d9](https://github.com/defenseunicorns/delivery-aws-iac/commit/ccc71d9c94ee8aa904a0decb4bb776335dda15e5)) 74 | * **deps:** update all dependencies ([#345](https://github.com/defenseunicorns/delivery-aws-iac/issues/345)) ([41e5da0](https://github.com/defenseunicorns/delivery-aws-iac/commit/41e5da049e99e09ed5f65e605f8fc2bf85ef59f1)) 75 | * **deps:** update all dependencies ([#349](https://github.com/defenseunicorns/delivery-aws-iac/issues/349)) ([6144bf9](https://github.com/defenseunicorns/delivery-aws-iac/commit/6144bf9b4761648ddcb85620cf60a28aff2bcc54)) 76 | 77 | 78 | ### Code Refactoring 79 | 80 | * removal of kubectl provider ([#348](https://github.com/defenseunicorns/delivery-aws-iac/issues/348)) 
([98ca153](https://github.com/defenseunicorns/delivery-aws-iac/commit/98ca153745879625dc7ab3cbf5816a71678e061b)) 81 | 82 | 83 | ### Continuous Integration 84 | 85 | * fix some inputs for renovate to monitor properly ([#347](https://github.com/defenseunicorns/delivery-aws-iac/issues/347)) ([bf60547](https://github.com/defenseunicorns/delivery-aws-iac/commit/bf60547b1d206266c030a537baf40cba5acf5ddb)) 86 | * refactor for shared workflows ([#346](https://github.com/defenseunicorns/delivery-aws-iac/issues/346)) ([5c4bb84](https://github.com/defenseunicorns/delivery-aws-iac/commit/5c4bb845d1b293d83a2b2657ecf2ec42bb53c312)) 87 | 88 | ## [0.0.9](https://github.com/defenseunicorns/delivery-aws-iac/compare/v0.0.8...v0.0.9) (2023-08-23) 89 | 90 | 91 | ### Features 92 | 93 | * update eks module for blueprints v5 migration ([#337](https://github.com/defenseunicorns/delivery-aws-iac/issues/337)) ([2b7e7d5](https://github.com/defenseunicorns/delivery-aws-iac/commit/2b7e7d5f136f6eaa5f1fa74ff0318a26a02eefdb)) 94 | 95 | 96 | ### Miscellaneous Chores 97 | 98 | * **ci:** Add support for using the merge queue, refactor workflow file names, and add more docs to the workflow files ([#340](https://github.com/defenseunicorns/delivery-aws-iac/issues/340)) ([410ee6c](https://github.com/defenseunicorns/delivery-aws-iac/commit/410ee6ca4c1ba659cfce3b423b64d75793d0a1e5)) 99 | * **ci:** Fix typo in merge_group test jobs ([#342](https://github.com/defenseunicorns/delivery-aws-iac/issues/342)) ([cc15c5c](https://github.com/defenseunicorns/delivery-aws-iac/commit/cc15c5c9e85df42b44b3b8797079644bb4118386)) 100 | 101 | ## 0.0.8 (2023-08-16) 102 | 103 | 104 | ### Features 105 | 106 | * add password-rotation Lambda module and utilize it in the examples/complete root module ([#311](https://github.com/defenseunicorns/delivery-aws-iac/issues/311)) ([97cfb65](https://github.com/defenseunicorns/delivery-aws-iac/commit/97cfb65254940b0042385ca1f989ffd9853ecfe7)) 107 | 108 | 109 | ### Bug Fixes 110 | 111 | * 
add eks 1.27 & gp3 storage class ([#325](https://github.com/defenseunicorns/delivery-aws-iac/issues/325)) ([b4ecfb1](https://github.com/defenseunicorns/delivery-aws-iac/commit/b4ecfb1f2e399419e855fe9eaff871ea9f304219)) 112 | * **ci:** Change the parse job in test and update workflows to always use main ([#332](https://github.com/defenseunicorns/delivery-aws-iac/issues/332)) ([2a92fae](https://github.com/defenseunicorns/delivery-aws-iac/commit/2a92fae5cdfd20eaada800e692a76570326aed09)) 113 | * remove big bang dependencies ([#325](https://github.com/defenseunicorns/delivery-aws-iac/issues/325)) ([b4ecfb1](https://github.com/defenseunicorns/delivery-aws-iac/commit/b4ecfb1f2e399419e855fe9eaff871ea9f304219)) 114 | * stop using spot instances in the example root module ([#329](https://github.com/defenseunicorns/delivery-aws-iac/issues/329)) ([656cee6](https://github.com/defenseunicorns/delivery-aws-iac/commit/656cee66e6d591309745dba287c5b35685db7293)) 115 | * upgrade addon versions ([#325](https://github.com/defenseunicorns/delivery-aws-iac/issues/325)) ([b4ecfb1](https://github.com/defenseunicorns/delivery-aws-iac/commit/b4ecfb1f2e399419e855fe9eaff871ea9f304219)) 116 | 117 | 118 | ### Documentation 119 | 120 | * **adr:** add ADR for how to trigger automated tests ([#321](https://github.com/defenseunicorns/delivery-aws-iac/issues/321)) ([49d1f77](https://github.com/defenseunicorns/delivery-aws-iac/commit/49d1f77ed3c4bd188e0f782b543e0b0d2cbe936d)) 121 | 122 | 123 | ### Miscellaneous Chores 124 | 125 | * **deps:** enable renovate updates in examples/complete/fixtures.common.tfvars for Zarf, EKS, aws-node-termination-handler, kubernetes autoscaler chart and underlying image, metrics-server, and calico ([#309](https://github.com/defenseunicorns/delivery-aws-iac/issues/309)) ([1e48f68](https://github.com/defenseunicorns/delivery-aws-iac/commit/1e48f68c40c4201eff4b41c6bb146164fa6742c0)) 126 | * **deps:** help Renovate get around Checkov's IP block 
([#324](https://github.com/defenseunicorns/delivery-aws-iac/issues/324)) ([71f4534](https://github.com/defenseunicorns/delivery-aws-iac/commit/71f4534cb872bab766b818b79745bf5d7fa2358c)) 127 | * **deps:** Upgrade aws/aws-sdk-go from v1.44.293 to v1.44.299 ([#305](https://github.com/defenseunicorns/delivery-aws-iac/issues/305)) ([7cfec56](https://github.com/defenseunicorns/delivery-aws-iac/commit/7cfec56cff02a74502296456131d89e2aeaa3c7d)) 128 | * **deps:** upgrade awscli from 2.13.0 to 2.13.1 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 129 | * **deps:** upgrade awscli from v2.12.5 to v2.13.0 ([#305](https://github.com/defenseunicorns/delivery-aws-iac/issues/305)) ([7cfec56](https://github.com/defenseunicorns/delivery-aws-iac/commit/7cfec56cff02a74502296456131d89e2aeaa3c7d)) 130 | * **deps:** upgrade defenseunicorns/build-harness from 1.7.0 to 1.8.0 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 131 | * **deps:** upgrade defenseunicorns/build-harness from v1.4.2 to v1.7.0 ([#305](https://github.com/defenseunicorns/delivery-aws-iac/issues/305)) ([7cfec56](https://github.com/defenseunicorns/delivery-aws-iac/commit/7cfec56cff02a74502296456131d89e2aeaa3c7d)) 132 | * **deps:** upgrade defenseunicorns/zarf from 0.28.1 to 0.28.2 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 133 | * **deps:** upgrade defenseunicorns/zarf from v0.28.0 to v0.28.1 ([#305](https://github.com/defenseunicorns/delivery-aws-iac/issues/305)) ([7cfec56](https://github.com/defenseunicorns/delivery-aws-iac/commit/7cfec56cff02a74502296456131d89e2aeaa3c7d)) 134 | * **deps:** upgrade 
github.com/aws/aws-sdk-go from 1.44.299 to 1.44.301 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 135 | * **deps:** upgrade github.com/defenseunicorns/terraform-aws-eks from 0.0.1-alpha to 0.0.2 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 136 | * **deps:** upgrade github.com/defenseunicorns/terraform-aws-vpc from 0.0.2-alpha to 0.0.2 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 137 | * **deps:** upgrade github.com/terraform-aws-modules/terraform-aws-lambda from 5.0.0 to 5.3.0 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 138 | * **deps:** upgrade golang from 1.20.5 to 1.20.6 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 139 | * **deps:** upgrade gruntwork-ioterratest from v0.43.6 to v0.43.8 ([#305](https://github.com/defenseunicorns/delivery-aws-iac/issues/305)) ([7cfec56](https://github.com/defenseunicorns/delivery-aws-iac/commit/7cfec56cff02a74502296456131d89e2aeaa3c7d)) 140 | * **deps:** upgrade renovatebot/pre-commit-hooks from 36.7.0 to 36.10..0 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 141 | * **deps:** upgrade renovatebot/pre-commit-hooks from v35.147.0 to v36.7.0 
([#305](https://github.com/defenseunicorns/delivery-aws-iac/issues/305)) ([7cfec56](https://github.com/defenseunicorns/delivery-aws-iac/commit/7cfec56cff02a74502296456131d89e2aeaa3c7d)) 142 | * **deps:** upgrade terraform from 1.5.2 to 1.5.3 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 143 | * release 0.0.8 ([6011188](https://github.com/defenseunicorns/delivery-aws-iac/commit/601118878b7d4ef030fbcdfb3c5dedfa0758b8a4)) 144 | 145 | 146 | ### Code Refactoring 147 | 148 | * **automation:** refactor parse logic to make it a bit simpler ([#323](https://github.com/defenseunicorns/delivery-aws-iac/issues/323)) ([113758a](https://github.com/defenseunicorns/delivery-aws-iac/commit/113758a2a09ce993cbb85149c976cb1d0fba64b9)) 149 | * **automation:** refactor the update-command workflow so that the bulk of the logic is in a reusable action, since it will now be used in 2 different places ([#323](https://github.com/defenseunicorns/delivery-aws-iac/issues/323)) ([113758a](https://github.com/defenseunicorns/delivery-aws-iac/commit/113758a2a09ce993cbb85149c976cb1d0fba64b9)) 150 | * **automation:** simplify the if statements now that we aren't trying to run tests on push to main anymore ([#323](https://github.com/defenseunicorns/delivery-aws-iac/issues/323)) ([113758a](https://github.com/defenseunicorns/delivery-aws-iac/commit/113758a2a09ce993cbb85149c976cb1d0fba64b9)) 151 | * **pr-automation:** simplify the GitHub context name that the E2E tests use so that we can require them regardless of how they were triggered ([#323](https://github.com/defenseunicorns/delivery-aws-iac/issues/323)) ([113758a](https://github.com/defenseunicorns/delivery-aws-iac/commit/113758a2a09ce993cbb85149c976cb1d0fba64b9)) 152 | 153 | 154 | ### Continuous Integration 155 | 156 | * **main:** delete the pre-commit-trunk workflow now that we aren't trying to run it on 
commits to main ([#323](https://github.com/defenseunicorns/delivery-aws-iac/issues/323)) ([113758a](https://github.com/defenseunicorns/delivery-aws-iac/commit/113758a2a09ce993cbb85149c976cb1d0fba64b9)) 157 | * **pr-automation:** delete the auto-labeling workflow ([#323](https://github.com/defenseunicorns/delivery-aws-iac/issues/323)) ([113758a](https://github.com/defenseunicorns/delivery-aws-iac/commit/113758a2a09ce993cbb85149c976cb1d0fba64b9)) 158 | * **release:** change release-please config to bump patch for pre-1.0.0 minor changes ([#331](https://github.com/defenseunicorns/delivery-aws-iac/issues/331)) ([a556560](https://github.com/defenseunicorns/delivery-aws-iac/commit/a556560f443aa763f4d7449b3558cdf31ffd914b)) 159 | * **release:** fix auto-tagging workflow previously released ([#308](https://github.com/defenseunicorns/delivery-aws-iac/issues/308)) ([12310eb](https://github.com/defenseunicorns/delivery-aws-iac/commit/12310eb419c623ddbcf58cdb2f42e43af76381a1)) 160 | * **release:** update automated release process to use release-please ([#326](https://github.com/defenseunicorns/delivery-aws-iac/issues/326)) ([bd2cf4d](https://github.com/defenseunicorns/delivery-aws-iac/commit/bd2cf4d49f77b794babbc54e1368a6a47990cc9a)) 161 | * **test:** Add ability to run Secure E2E and Insecure E2E tests separately, and only run Insecure test in commercial and Secure test in govcloud ([#333](https://github.com/defenseunicorns/delivery-aws-iac/issues/333)) ([bd2e23a](https://github.com/defenseunicorns/delivery-aws-iac/commit/bd2e23a04f32cbd33cea9e147312dfeeeb644aa1)) 162 | * **test:** change commercial primary test region from us-east-1 to us-east-2 ([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 163 | * **test:** change commercial secondary test region from us-east-2 to us-east-1 
([#313](https://github.com/defenseunicorns/delivery-aws-iac/issues/313)) ([f4a7cae](https://github.com/defenseunicorns/delivery-aws-iac/commit/f4a7caebefd8338214c2fb606c49a697ce7dc3ee)) 164 | * **test:** update the auto-test workflow to reflect the decisions documented in ADR 8 ([#323](https://github.com/defenseunicorns/delivery-aws-iac/issues/323)) ([113758a](https://github.com/defenseunicorns/delivery-aws-iac/commit/113758a2a09ce993cbb85149c976cb1d0fba64b9)) 165 | -------------------------------------------------------------------------------- /examples/complete/main.tf: -------------------------------------------------------------------------------- 1 | data "aws_partition" "current" {} 2 | 3 | data "aws_caller_identity" "current" {} 4 | data "aws_iam_session_context" "current" { 5 | # This data source provides information on the IAM source role of an STS assumed role 6 | # For non-role ARNs, this data source simply passes the ARN through issuer ARN 7 | # Ref https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2327#issuecomment-1355581682 8 | # Ref https://github.com/hashicorp/terraform-provider-aws/issues/28381 9 | arn = data.aws_caller_identity.current.arn 10 | } 11 | 12 | data "aws_availability_zones" "available" { 13 | filter { 14 | name = "opt-in-status" 15 | values = ["opt-in-not-required"] 16 | } 17 | } 18 | 19 | data "aws_ami" "eks_default_bottlerocket" { 20 | most_recent = true 21 | owners = ["amazon"] 22 | 23 | filter { 24 | name = "name" 25 | values = ["bottlerocket-aws-k8s-${var.cluster_version}-x86_64-*"] 26 | } 27 | } 28 | 29 | resource "random_id" "default" { 30 | byte_length = 2 31 | } 32 | 33 | locals { 34 | vpc_name = "${var.name_prefix}-${lower(random_id.default.hex)}" 35 | cluster_name = "${var.name_prefix}-${lower(random_id.default.hex)}" 36 | bastion_name = "${var.name_prefix}-bastion-${lower(random_id.default.hex)}" 37 | access_logging_name_prefix = "${var.name_prefix}-accesslog-${lower(random_id.default.hex)}" 38 | 
kms_key_alias_name_prefix = "alias/${var.name_prefix}-${lower(random_id.default.hex)}" 39 | access_log_sqs_queue_name = "${var.name_prefix}-accesslog-access-${lower(random_id.default.hex)}" 40 | tags = merge( 41 | var.tags, 42 | { 43 | RootTFModule = replace(basename(path.cwd), "_", "-") # tag names based on the directory name 44 | GithubRepo = "github.com/defenseunicorns/delivery-aws-iac" 45 | } 46 | ) 47 | } 48 | 49 | ################################################################################ 50 | # VPC 51 | ################################################################################ 52 | 53 | locals { 54 | azs = [for az_name in slice(data.aws_availability_zones.available.names, 0, min(length(data.aws_availability_zones.available.names), var.num_azs)) : az_name] 55 | } 56 | 57 | module "vpc" { 58 | source = "git::https://github.com/defenseunicorns/terraform-aws-vpc.git?ref=v0.1.7" 59 | 60 | name = local.vpc_name 61 | vpc_cidr = var.vpc_cidr 62 | secondary_cidr_blocks = var.secondary_cidr_blocks 63 | azs = local.azs 64 | public_subnets = [for k, v in module.vpc.azs : cidrsubnet(module.vpc.vpc_cidr_block, 5, k)] 65 | private_subnets = [for k, v in module.vpc.azs : cidrsubnet(module.vpc.vpc_cidr_block, 5, k + 4)] 66 | database_subnets = [for k, v in module.vpc.azs : cidrsubnet(module.vpc.vpc_cidr_block, 5, k + 8)] 67 | intra_subnets = [for k, v in module.vpc.azs : cidrsubnet(element(module.vpc.vpc_secondary_cidr_blocks, 0), 5, k)] 68 | single_nat_gateway = true 69 | enable_nat_gateway = true 70 | 71 | private_subnet_tags = { 72 | "kubernetes.io/cluster/${local.cluster_name}" = "shared" 73 | "kubernetes.io/role/internal-elb" = 1 74 | } 75 | create_database_subnet_group = true 76 | 77 | instance_tenancy = "default" 78 | vpc_flow_log_permissions_boundary = var.iam_role_permissions_boundary 79 | 80 | tags = local.tags 81 | } 82 | 83 | ################################################################################ 84 | # Bastion instance 85 | 
################################################################################ 86 | 87 | locals { 88 | ingress_bastion_to_cluster = { 89 | description = "Bastion SG to Cluster" 90 | security_group_id = module.eks.cluster_security_group_id 91 | from_port = 443 92 | to_port = 443 93 | protocol = "tcp" 94 | type = "ingress" 95 | source_security_group_id = try(module.bastion[0].security_group_ids[0], null) 96 | } 97 | 98 | } 99 | 100 | data "aws_ami" "amazonlinux2" { 101 | count = var.enable_bastion ? 1 : 0 102 | 103 | most_recent = true 104 | 105 | filter { 106 | name = "name" 107 | values = ["amzn2-ami-hvm*x86_64-gp2"] 108 | } 109 | 110 | owners = ["amazon"] 111 | } 112 | 113 | module "bastion" { 114 | source = "git::https://github.com/defenseunicorns/terraform-aws-bastion.git?ref=v0.0.13" 115 | 116 | count = var.enable_bastion ? 1 : 0 117 | 118 | enable_bastion_terraform_permissions = true 119 | 120 | ami_id = data.aws_ami.amazonlinux2[0].id 121 | instance_type = var.bastion_instance_type 122 | root_volume_config = { 123 | volume_type = "gp3" 124 | volume_size = "20" 125 | encrypted = true 126 | } 127 | name = local.bastion_name 128 | vpc_id = module.vpc.vpc_id 129 | subnet_id = module.vpc.private_subnets[0] 130 | region = var.region 131 | access_logs_bucket_name = aws_s3_bucket.access_log_bucket.id 132 | session_log_bucket_name_prefix = "${local.bastion_name}-sessionlogs" 133 | kms_key_arn = aws_kms_key.default.arn 134 | ssh_user = var.bastion_ssh_user 135 | ssh_password = var.bastion_ssh_password 136 | assign_public_ip = false 137 | enable_log_to_s3 = true 138 | enable_log_to_cloudwatch = true 139 | tenancy = var.bastion_tenancy 140 | zarf_version = var.zarf_version 141 | permissions_boundary = var.iam_role_permissions_boundary 142 | tags = merge( 143 | local.tags, 144 | { Function = "bastion-ssm" }) 145 | } 146 | 147 | ################################################################################ 148 | # EKS Cluster 149 | 
################################################################################ 150 | 151 | locals { 152 | cluster_security_group_additional_rules = merge( 153 | var.enable_bastion ? { ingress_bastion_to_cluster = local.ingress_bastion_to_cluster } : {}, 154 | #other rules here 155 | ) 156 | 157 | # eks managed node groups settings 158 | eks_managed_node_group_defaults = { 159 | # https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/node_groups.tf 160 | iam_role_permissions_boundary = var.iam_role_permissions_boundary 161 | ami_type = "AL2_x86_64" 162 | instance_types = ["m5a.large", "m5.large", "m6i.large"] 163 | tags = { 164 | subnet_type = "private", 165 | cluster = local.cluster_name 166 | } 167 | } 168 | 169 | mission_app_mg_node_group = { 170 | managed_ng1 = { 171 | min_size = 2 172 | max_size = 2 173 | desired_size = 2 174 | disk_size = 50 175 | } 176 | } 177 | 178 | eks_managed_node_groups = merge( 179 | var.enable_eks_managed_nodegroups ? local.mission_app_mg_node_group : {}, 180 | # var.enable_eks_managed_nodegroups && var.keycloak_enabled ? 
local.keycloak_mg_node_group : {} 181 | ) 182 | 183 | # self managed node groups settings 184 | self_managed_node_group_defaults = { 185 | iam_role_permissions_boundary = var.iam_role_permissions_boundary 186 | instance_type = null # conflicts with instance_requirements settings 187 | update_launch_template_default_version = true 188 | 189 | use_mixed_instances_policy = true 190 | 191 | instance_requirements = { 192 | allowed_instance_types = ["m7i.4xlarge", "m6a.4xlarge", "m5a.4xlarge"] #this should be adjusted to the appropriate instance family if reserved instances are being utilized 193 | memory_mib = { 194 | min = 64000 195 | } 196 | vcpu_count = { 197 | min = 16 198 | } 199 | } 200 | 201 | placement = { 202 | tenancy = var.eks_worker_tenancy 203 | } 204 | 205 | pre_bootstrap_userdata = <<-EOT 206 | yum install -y amazon-ssm-agent 207 | systemctl enable amazon-ssm-agent && systemctl start amazon-ssm-agent 208 | EOT 209 | 210 | post_userdata = <<-EOT 211 | echo "Bootstrap successfully completed! You can further apply config or install to run after bootstrap if needed" 212 | EOT 213 | 214 | # bootstrap_extra_args used only when you pass custom_ami_id. 
Allows you to change the Container Runtime for Nodes 215 | # e.g., bootstrap_extra_args="--use-max-pods false --container-runtime containerd" 216 | bootstrap_extra_args = "--use-max-pods false" 217 | 218 | iam_role_additional_policies = { 219 | AmazonSSMManagedInstanceCore = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonSSMManagedInstanceCore", 220 | AmazonElasticFileSystemFullAccess = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonElasticFileSystemFullAccess" 221 | } 222 | 223 | # enable discovery of autoscaling groups by cluster-autoscaler 224 | autoscaling_group_tags = merge( 225 | local.tags, 226 | { 227 | "k8s.io/cluster-autoscaler/enabled" : true, 228 | "k8s.io/cluster-autoscaler/${local.cluster_name}" : "owned" 229 | }) 230 | 231 | metadata_options = { 232 | #https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/launch_template#metadata-options 233 | http_endpoint = "enabled" 234 | http_put_response_hop_limit = 2 235 | http_tokens = "optional" # set to "enabled" to enforce IMDSv2, default for upstream terraform-aws-eks module 236 | } 237 | 238 | tags = { 239 | subnet_type = "private", 240 | cluster = local.cluster_name 241 | } 242 | } 243 | 244 | mission_app_self_mg_node_group = { 245 | bigbang_ng = { 246 | subnet_ids = module.vpc.private_subnets 247 | min_size = 2 248 | max_size = 2 249 | desired_size = 2 250 | 251 | block_device_mappings = { 252 | xvda = { 253 | device_name = "/dev/xvda" 254 | ebs = { 255 | volume_size = 50 256 | volume_type = "gp3" 257 | } 258 | } 259 | } 260 | } 261 | } 262 | 263 | keycloak_self_mg_node_group = { 264 | keycloak_ng_sso = { 265 | platform = "bottlerocket" 266 | ami_id = data.aws_ami.eks_default_bottlerocket.id 267 | instance_type = null # conflicts with instance_requirements settings 268 | min_size = 2 269 | max_size = 2 270 | desired_size = 2 271 | key_name = var.keycloak_enabled ? 
module.key_pair[0].key_pair_name : null 272 | 273 | bootstrap_extra_args = <<-EOT 274 | # The admin host container provides SSH access and runs with "superpowers". 275 | # It is disabled by default, but can be enabled explicitly. 276 | [settings.host-containers.admin] 277 | enabled = false 278 | 279 | # The control host container provides out-of-band access via SSM. 280 | # It is enabled by default, and can be disabled if you do not expect to use SSM. 281 | # Disabling it could leave you with no way to access the API and change settings on an existing node! 282 | [settings.host-containers.control] 283 | enabled = true 284 | 285 | # extra args added 286 | [settings.kernel] 287 | lockdown = "integrity" 288 | 289 | [settings.kubernetes.node-labels] 290 | label1 = "sso" 291 | label2 = "bb-core" 292 | 293 | [settings.kubernetes.node-taints] 294 | dedicated = "experimental:PreferNoSchedule" 295 | special = "true:NoSchedule" 296 | EOT 297 | } 298 | } 299 | 300 | self_managed_node_groups = merge( 301 | var.enable_self_managed_nodegroups ? local.mission_app_self_mg_node_group : {}, 302 | var.enable_self_managed_nodegroups && var.keycloak_enabled ?
local.keycloak_self_mg_node_group : {} 303 | ) 304 | 305 | } 306 | 307 | module "ssm_kms_key" { 308 | source = "terraform-aws-modules/kms/aws" 309 | version = "~> 2.0" 310 | 311 | create = var.create_ssm_parameters 312 | 313 | description = "KMS key for SecureString SSM parameters" 314 | 315 | key_administrators = [ 316 | data.aws_iam_session_context.current.issuer_arn 317 | ] 318 | 319 | computed_aliases = { 320 | ssm = { 321 | name = "${local.kms_key_alias_name_prefix}-ssm" 322 | } 323 | } 324 | 325 | key_statements = [ 326 | { 327 | sid = "SSM service access" 328 | effect = "Allow" 329 | principals = [ 330 | { 331 | type = "Service" 332 | identifiers = ["ssm.amazonaws.com"] 333 | } 334 | ] 335 | actions = [ 336 | "kms:Decrypt", 337 | "kms:Encrypt", 338 | "kms:ReEncrypt*", 339 | "kms:GenerateDataKey*", 340 | "kms:DescribeKey", 341 | ] 342 | resources = ["*"] 343 | } 344 | ] 345 | 346 | tags = local.tags 347 | } 348 | 349 | locals { 350 | ssm_parameter_key_arn = var.create_ssm_parameters ? module.ssm_kms_key.key_arn : "" 351 | 352 | unicorn_admin_role = data.aws_iam_session_context.current.issuer_arn != "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:role/unicorn-admin" ? { 353 | unicorn_admin = { 354 | principal_arn = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:role/unicorn-admin" 355 | type = "STANDARD" 356 | policy_associations = { 357 | admin = { 358 | policy_arn = "arn:${data.aws_partition.current.partition}:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy" 359 | access_scope = { 360 | type = "cluster" 361 | } 362 | } 363 | } 364 | } 365 | } : {} 366 | 367 | bastion_role = var.enable_bastion ? 
{ 368 | bastion = { 369 | principal_arn = module.bastion[0].bastion_role_arn 370 | type = "STANDARD" 371 | policy_associations = { 372 | admin = { 373 | policy_arn = "arn:${data.aws_partition.current.partition}:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy" 374 | access_scope = { 375 | type = "cluster" 376 | } 377 | } 378 | } 379 | } 380 | } : {} 381 | 382 | access_entries = merge( 383 | var.access_entries, 384 | local.unicorn_admin_role, 385 | local.bastion_role 386 | ) 387 | } 388 | 389 | module "eks" { 390 | source = "git::https://github.com/defenseunicorns/terraform-aws-eks.git?ref=v0.0.17" 391 | 392 | name = local.cluster_name 393 | aws_region = var.region 394 | azs = module.vpc.azs 395 | vpc_id = module.vpc.vpc_id 396 | private_subnet_ids = module.vpc.private_subnets 397 | control_plane_subnet_ids = module.vpc.private_subnets 398 | iam_role_permissions_boundary = var.iam_role_permissions_boundary 399 | cluster_security_group_additional_rules = local.cluster_security_group_additional_rules 400 | cluster_endpoint_public_access = var.cluster_endpoint_public_access 401 | cluster_endpoint_private_access = true 402 | vpc_cni_custom_subnet = module.vpc.intra_subnets 403 | aws_admin_usernames = var.aws_admin_usernames 404 | cluster_version = var.cluster_version 405 | cidr_blocks = module.vpc.private_subnets_cidr_blocks 406 | eks_use_mfa = var.eks_use_mfa 407 | dataplane_wait_duration = var.dataplane_wait_duration 408 | # authentication 409 | access_entries = local.access_entries 410 | authentication_mode = var.authentication_mode 411 | enable_cluster_creator_admin_permissions = var.enable_cluster_creator_admin_permissions 412 | 413 | ######################## EKS Managed Node Group ################################### 414 | eks_managed_node_group_defaults = local.eks_managed_node_group_defaults 415 | eks_managed_node_groups = local.eks_managed_node_groups 416 | 417 | ######################## Self Managed Node Group ################################### 418 | 
self_managed_node_group_defaults = local.self_managed_node_group_defaults 419 | self_managed_node_groups = local.self_managed_node_groups 420 | 421 | tags = local.tags 422 | 423 | 424 | 425 | #--------------------------------------------------------------- 426 | #"native" EKS Add-Ons 427 | #--------------------------------------------------------------- 428 | 429 | cluster_addons = var.cluster_addons 430 | 431 | #--------------------------------------------------------------- 432 | # EKS Blueprints - blueprints curated helm charts 433 | #--------------------------------------------------------------- 434 | create_kubernetes_resources = var.create_kubernetes_resources 435 | create_ssm_parameters = var.create_ssm_parameters 436 | ssm_parameter_key_arn = local.ssm_parameter_key_arn 437 | 438 | # AWS EKS EBS CSI Driver 439 | enable_amazon_eks_aws_ebs_csi_driver = var.enable_amazon_eks_aws_ebs_csi_driver 440 | enable_gp3_default_storage_class = var.enable_gp3_default_storage_class 441 | storageclass_reclaim_policy = var.storageclass_reclaim_policy 442 | 443 | # AWS EKS EFS CSI Driver 444 | enable_amazon_eks_aws_efs_csi_driver = var.enable_amazon_eks_aws_efs_csi_driver 445 | aws_efs_csi_driver = var.aws_efs_csi_driver 446 | 447 | reclaim_policy = var.reclaim_policy 448 | 449 | # AWS EKS node termination handler 450 | enable_aws_node_termination_handler = var.enable_aws_node_termination_handler 451 | aws_node_termination_handler = var.aws_node_termination_handler 452 | 453 | # k8s Metrics Server 454 | enable_metrics_server = var.enable_metrics_server 455 | metrics_server = var.metrics_server 456 | 457 | # k8s Cluster Autoscaler 458 | enable_cluster_autoscaler = var.enable_cluster_autoscaler 459 | cluster_autoscaler = var.cluster_autoscaler 460 | 461 | # AWS Load Balancer Controller 462 | enable_aws_load_balancer_controller = var.enable_aws_load_balancer_controller 463 | aws_load_balancer_controller = var.aws_load_balancer_controller 464 | 465 | # k8s Secrets Store CSI 
Driver 466 | enable_secrets_store_csi_driver = var.enable_secrets_store_csi_driver 467 | secrets_store_csi_driver = var.secrets_store_csi_driver 468 | } 469 | 470 | #--------------------------------------------------------------- 471 | #Keycloak Self Managed Node Group Dependencies 472 | #--------------------------------------------------------------- 473 | 474 | module "key_pair" { 475 | source = "terraform-aws-modules/key-pair/aws" 476 | version = "~> 2.0" 477 | 478 | count = var.keycloak_enabled ? 1 : 0 479 | 480 | key_name_prefix = local.cluster_name 481 | create_private_key = true 482 | 483 | tags = local.tags 484 | } 485 | 486 | module "ebs_kms_key" { 487 | source = "terraform-aws-modules/kms/aws" 488 | version = "~> 2.0" 489 | 490 | count = var.keycloak_enabled ? 1 : 0 491 | 492 | description = "Customer managed key to encrypt EKS managed node group volumes" 493 | 494 | # Policy 495 | key_administrators = [ 496 | data.aws_iam_session_context.current.issuer_arn 497 | ] 498 | 499 | key_service_roles_for_autoscaling = [ 500 | # required for the ASG to manage encrypted volumes for nodes 501 | "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling", 502 | # required for the cluster / persistentvolume-controller to create encrypted PVCs 503 | module.eks.cluster_iam_role_arn, 504 | ] 505 | 506 | # Aliases 507 | aliases = ["eks/keycloak_ng_sso/ebs"] 508 | aliases_use_name_prefix = true 509 | 510 | tags = local.tags 511 | } 512 | 513 | resource "aws_iam_policy" "additional" { 514 | 515 | count = var.keycloak_enabled ? 
1 : 0 516 | 517 | name = "${local.cluster_name}-additional" 518 | description = "Example usage of node additional policy" 519 | 520 | policy = jsonencode({ 521 | Version = "2012-10-17" 522 | Statement = [ 523 | { 524 | Action = [ 525 | "ec2:Describe*", 526 | ] 527 | Effect = "Allow" 528 | Resource = "*" 529 | }, 530 | ] 531 | }) 532 | 533 | tags = local.tags 534 | } 535 | 536 | ############################################################################ 537 | ##################### Lambda Password Rotation ############################# 538 | 539 | module "password_lambda" { 540 | 541 | count = var.enable_bastion ? 1 : 0 542 | 543 | source = "git::https://github.com/defenseunicorns/terraform-aws-lambda.git//modules/password-rotation?ref=v0.0.3" 544 | region = var.region 545 | random_id = lower(random_id.default.hex) 546 | name_prefix = var.name_prefix 547 | users = var.users 548 | # Add any additional instances you want the function to run against here 549 | instance_ids = [try(module.bastion[0].instance_id)] 550 | cron_schedule_password_rotation = var.cron_schedule_password_rotation 551 | slack_notification_enabled = var.slack_notification_enabled 552 | slack_webhook_url = var.slack_webhook_url 553 | } 554 | -------------------------------------------------------------------------------- /examples/complete/README.md: -------------------------------------------------------------------------------- 1 | # Complete Example: EKS Cluster Deployment with new VPC & Big Bang Dependencies 2 | 3 | This example deploys: 4 | 5 | - A VPC with: 6 | - 3 public subnets with internet gateway 7 | - 3 private subnets with NAT gateway 8 | - An EKS cluster with worker node group(s) 9 | - A Bastion host in one of the private subnets 10 | - Big Bang dependencies: 11 | - KMS key and IAM roles for SOPS and IRSA 12 | - S3 bucket for Loki 13 | - RDS database for Keycloak 14 | - Password rotation lambda module. 15 | - This module can be enabled or disabled. Enabled by default for E2E testing. 
16 | - You can also modify the cron job schedule by changing the `cron_schedule_password_rotation` variable. 17 | - If enabled, ensure you pass instance IDs (as seen in examples/complete/main.tf: `instance_ids = [module.bastion.instance_id]`) and users to the function, e.g. via fixtures.common.tfvars. 18 | - You also have the option of enabling Slack notifications for the password rotation status by setting the `slack_notification_enabled` and `slack_webhook_url` variables. 19 | - Note that this function creates resources outside of Terraform: Secrets Manager and Parameter Store resources are created by the function and must be deleted manually. If Slack notifications are enabled, you will need to create a Slack webhook URL and supply it via the variable. 20 | 21 | > This example has two modes: "insecure" (commercial) and "secure" (govcloud). Insecure mode uses managed nodegroups, default instance tenancy, and enables the public endpoint on the EKS cluster. Secure mode uses self-managed nodegroups, dedicated instance tenancy, and disables the public endpoint on the EKS cluster. Choose the mode by applying either `fixtures.insecure.tfvars` or `fixtures.secure.tfvars` as an overlay on top of `fixtures.common.tfvars`. 22 | 23 | ## Deploy/Destroy 24 | 25 | See the [examples README](../README.md) for instructions on how to deploy/destroy this example. The make targets for this example are either `test-ci-complete-insecure` or `test-release-complete-secure`. 26 | 27 | ## Connect 28 | 29 | ### Insecure mode 30 | 31 | In insecure mode, the EKS cluster has a public endpoint. You can get the kubeconfig you need to connect to the cluster with the following command: 32 | 33 | ```shell 34 | aws eks update-kubeconfig --region <region> --name <cluster-name> --kubeconfig <kubeconfig-path> --alias <alias> 35 | ``` 36 | > Use `aws eks list-clusters --region <region>` to get the name of the cluster. 37 | 38 | ### Secure mode 39 | 40 | In secure mode, the EKS cluster does not have a public endpoint.
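Before tunneling in, it can be worth confirming that the endpoint really is private. A minimal sketch, assuming the AWS CLI is available and `<region>`/`<cluster-name>` are filled in for your deployment:

```shell
# A private-only cluster should report public = false, private = true.
aws eks describe-cluster --region <region> --name <cluster-name> \
  --query 'cluster.resourcesVpcConfig.{public:endpointPublicAccess,private:endpointPrivateAccess}' \
  --output table
```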
To connect to it, you'll need to tunnel through the bastion host. We use `sshuttle` to do this. 41 | 42 | For convenience, we have set up a Make target called `bastion-connect`. Running the target starts a Docker container with `sshuttle` already running and the KUBECONFIG already configured, then drops you into a bash shell inside the container. 43 | 44 | ```shell 45 | aws-vault exec <profile> -- make bastion-connect 46 | 47 | SShuttle is running and KUBECONFIG has been set. Try running kubectl get nodes. 48 | [root@f72f0495c0cd complete]$ kubectl get nodes 49 | NAME STATUS ROLES AGE VERSION 50 | ip-10-200-36-117.us-east-2.compute.internal Ready 22h v1.23.16-eks-48e63af 51 | ip-10-200-41-153.us-east-2.compute.internal Ready 22h v1.23.16-eks-48e63af 52 | ip-10-200-48-31.us-east-2.compute.internal Ready 22h v1.23.16-eks-48e63af 53 | ``` 54 | 55 | To do this manually, you're going to want to run: 56 | 57 | > NOTE: This is not really recommended. Better to use the make target / docker container. If the container doesn't have a tool you need, open an issue [here](https://github.com/defenseunicorns/build-harness) and we'll get it added. 58 | 59 | ```shell 60 | # Switch to the examples/complete directory 61 | cd examples/complete 62 | 63 | # Init Terraform 64 | terraform init 65 | 66 | # Set up the AWS environment. This will drop you into a new shell with the env vars you need. 67 | aws-vault exec <profile> 68 | 69 | # Make sure you have the env vars you need 70 | env | grep AWS 71 | AWS_VAULT= 72 | AWS_DEFAULT_REGION= 73 | AWS_REGION= 74 | AWS_ACCESS_KEY_ID= 75 | AWS_SECRET_ACCESS_KEY= 76 | AWS_SESSION_TOKEN= 77 | AWS_SECURITY_TOKEN= 78 | AWS_SESSION_EXPIRATION= 79 | 80 | # Run sshuttle in the background. Don't forget to kill the background process when you are done. Use 'ps' to get the PID, then use 'kill -15 <PID>' to kill it. 81 | # There's a ton of stuff here.
Here's the breakdown: 82 | # sshpass: pass the bastion's SSH password noninteractively 83 | # -D: Run sshuttle in daemon (background) mode. Don't use '-D' if you'd rather run it in the foreground in a separate terminal window. 84 | # -o CheckHostIP=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null: Don't check the bastion's host key 85 | # -o ProxyCommand="...": Tell SSH to use AWS SSM 86 | sshuttle -D -e 'sshpass -p "my-password" ssh -q -o CheckHostIP=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand="aws ssm --region $(terraform output -raw bastion_region) start-session --target %h --document-name AWS-StartSSHSession --parameters portNumber=%p"' --dns --disable-ipv6 -vr ec2-user@$(terraform output -raw bastion_instance_id) $(terraform output -raw vpc_cidr) 87 | 88 | # Set up the KUBECONFIG 89 | aws eks --region $(terraform output -raw bastion_region) update-kubeconfig --name $(terraform output -raw eks_cluster_name) 90 | 91 | # Test it out 92 | kubectl get nodes 93 | ``` 94 | 95 | 96 | ## Requirements 97 | 98 | | Name | Version | 99 | |------|---------| 100 | | [terraform](#requirement\_terraform) | >= 1.0.0 | 101 | | [archive](#requirement\_archive) | 2.4.2 | 102 | | [aws](#requirement\_aws) | >= 4.62.0 | 103 | | [cloudinit](#requirement\_cloudinit) | >= 2.0.0 | 104 | | [helm](#requirement\_helm) | >= 2.5.1 | 105 | | [http](#requirement\_http) | 2.4.1 | 106 | | [kubernetes](#requirement\_kubernetes) | >= 2.10.0 | 107 | | [local](#requirement\_local) | >= 2.1.0 | 108 | | [null](#requirement\_null) | >= 3.1.0 | 109 | | [random](#requirement\_random) | >= 3.1.0 | 110 | | [time](#requirement\_time) | >= 0.9.1 | 111 | | [tls](#requirement\_tls) | >= 3.0.0 | 112 | 113 | ## Providers 114 | 115 | | Name | Version | 116 | |------|---------| 117 | | [aws](#provider\_aws) | >= 4.62.0 | 118 | | [random](#provider\_random) | >= 3.1.0 | 119 | 120 | ## Modules 121 | 122 | | Name | Source | Version | 123 | |------|--------|---------| 
124 | | [bastion](#module\_bastion) | git::https://github.com/defenseunicorns/terraform-aws-bastion.git | v0.0.13 | 125 | | [ebs\_kms\_key](#module\_ebs\_kms\_key) | terraform-aws-modules/kms/aws | ~> 2.0 | 126 | | [eks](#module\_eks) | git::https://github.com/defenseunicorns/terraform-aws-eks.git | v0.0.17 | 127 | | [key\_pair](#module\_key\_pair) | terraform-aws-modules/key-pair/aws | ~> 2.0 | 128 | | [password\_lambda](#module\_password\_lambda) | git::https://github.com/defenseunicorns/terraform-aws-lambda.git//modules/password-rotation | v0.0.3 | 129 | | [ssm\_kms\_key](#module\_ssm\_kms\_key) | terraform-aws-modules/kms/aws | ~> 2.0 | 130 | | [vpc](#module\_vpc) | git::https://github.com/defenseunicorns/terraform-aws-vpc.git | v0.1.7 | 131 | 132 | ## Resources 133 | 134 | | Name | Type | 135 | |------|------| 136 | | [aws_iam_policy.additional](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource | 137 | | [aws_kms_alias.default](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kms_alias) | resource | 138 | | [aws_kms_key.default](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kms_key) | resource | 139 | | [aws_s3_bucket.access_log_bucket](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket) | resource | 140 | | [aws_s3_bucket_lifecycle_configuration.access_log_bucket](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_lifecycle_configuration) | resource | 141 | | [aws_s3_bucket_notification.access_log_bucket_notification](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_notification) | resource | 142 | | [aws_s3_bucket_public_access_block.access_log_bucket](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_public_access_block) | resource | 143 | | 
[aws_s3_bucket_server_side_encryption_configuration.access_log_bucket](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_server_side_encryption_configuration) | resource | 144 | | [aws_s3_bucket_versioning.access_log_bucket](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_versioning) | resource | 145 | | [aws_sqs_queue.access_log_queue](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sqs_queue) | resource | 146 | | [random_id.default](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/id) | resource | 147 | | [aws_ami.amazonlinux2](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ami) | data source | 148 | | [aws_ami.eks_default_bottlerocket](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ami) | data source | 149 | | [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source | 150 | | [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source | 151 | | [aws_iam_policy_document.kms_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source | 152 | | [aws_iam_session_context.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_session_context) | data source | 153 | | [aws_partition.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/partition) | data source | 154 | 155 | ## Inputs 156 | 157 | | Name | Description | Type | Default | Required | 158 | |------|-------------|------|---------|:--------:| 159 | | [access\_entries](#input\_access\_entries) | Map of access entries to add to the cluster | `any` | `{}` | no | 160 | | 
[access\_log\_expire\_days](#input\_access\_log\_expire\_days) | Number of days to wait before deleting access logs | `number` | `30` | no | 161 | | [authentication\_mode](#input\_authentication\_mode) | The authentication mode for the cluster. Valid values are `CONFIG_MAP`, `API` or `API_AND_CONFIG_MAP` | `string` | `"API"` | no | 162 | | [aws\_admin\_usernames](#input\_aws\_admin\_usernames) | A list of one or more AWS usernames with authorized access to KMS and EKS resources; the user running Terraform is automatically added as an admin | `list(string)` | `[]` | no | 163 | | [aws\_efs\_csi\_driver](#input\_aws\_efs\_csi\_driver) | AWS EFS CSI Driver helm chart config | `any` | `{}` | no | 164 | | [aws\_load\_balancer\_controller](#input\_aws\_load\_balancer\_controller) | AWS Loadbalancer Controller Helm Chart config | `any` | `{}` | no | 165 | | [aws\_node\_termination\_handler](#input\_aws\_node\_termination\_handler) | AWS Node Termination Handler config for aws-ia/eks-blueprints-addon/aws | `any` | `{}` | no | 166 | | [bastion\_instance\_type](#input\_bastion\_instance\_type) | The instance type of the bastion host | `string` | `"m5.xlarge"` | no | 167 | | [bastion\_ssh\_password](#input\_bastion\_ssh\_password) | The SSH password to use for the bastion if SSM authentication is used | `string` | `"my-password"` | no | 168 | | [bastion\_ssh\_user](#input\_bastion\_ssh\_user) | The SSH user to use for the bastion | `string` | `"ec2-user"` | no | 169 | | [bastion\_tenancy](#input\_bastion\_tenancy) | The tenancy of the bastion | `string` | `"default"` | no | 170 | | [cluster\_addons](#input\_cluster\_addons) | Nested map of EKS native add-ons and their associated parameters.<br>
See https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_add-on for supported values.
See https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/complete/main.tf#L44-L60 for upstream example.

to see available eks marketplace addons available for your cluster's version run:
aws eks describe-addon-versions --kubernetes-version $k8s\_cluster\_version --query 'addons[].{MarketplaceProductUrl: marketplaceInformation.productUrl, Name: addonName, Owner: owner, Publisher: publisher, Type: type}' --output table | `any` | `{}` | no | 171 | | [cluster\_autoscaler](#input\_cluster\_autoscaler) | Cluster Autoscaler Helm Chart config | `any` | `{}` | no | 172 | | [cluster\_endpoint\_public\_access](#input\_cluster\_endpoint\_public\_access) | Whether to enable public access to the EKS cluster endpoint | `bool` | `false` | no | 173 | | [cluster\_version](#input\_cluster\_version) | Kubernetes version to use for EKS cluster | `string` | `"1.27"` | no | 174 | | [create\_kubernetes\_resources](#input\_create\_kubernetes\_resources) | If true, kubernetes resources related to non-marketplace addons will be created | `bool` | `true` | no | 175 | | [create\_ssm\_parameters](#input\_create\_ssm\_parameters) | Create SSM parameters for values from eks blueprints addons | `bool` | `true` | no | 176 | | [cron\_schedule\_password\_rotation](#input\_cron\_schedule\_password\_rotation) | Schedule for password change function to run on | `string` | `"cron(0 0 1 * ?
*)"` | no | 177 | | [dataplane\_wait\_duration](#input\_dataplane\_wait\_duration) | The duration to wait for the EKS cluster to be ready before creating the node groups | `string` | `"30s"` | no | 178 | | [eks\_use\_mfa](#input\_eks\_use\_mfa) | Use MFA for auth\_eks\_role | `bool` | n/a | yes | 179 | | [eks\_worker\_tenancy](#input\_eks\_worker\_tenancy) | The tenancy of the EKS worker nodes | `string` | `"default"` | no | 180 | | [enable\_amazon\_eks\_aws\_ebs\_csi\_driver](#input\_enable\_amazon\_eks\_aws\_ebs\_csi\_driver) | Enable EKS Managed AWS EBS CSI Driver add-on | `bool` | `false` | no | 181 | | [enable\_amazon\_eks\_aws\_efs\_csi\_driver](#input\_enable\_amazon\_eks\_aws\_efs\_csi\_driver) | Enable EFS CSI add-on | `bool` | `false` | no | 182 | | [enable\_aws\_load\_balancer\_controller](#input\_enable\_aws\_load\_balancer\_controller) | Enable AWS Loadbalancer Controller add-on | `bool` | `false` | no | 183 | | [enable\_aws\_node\_termination\_handler](#input\_enable\_aws\_node\_termination\_handler) | Enable AWS Node Termination Handler add-on | `bool` | `false` | no | 184 | | [enable\_bastion](#input\_enable\_bastion) | If true, a bastion will be created | `bool` | `true` | no | 185 | | [enable\_cluster\_autoscaler](#input\_enable\_cluster\_autoscaler) | Enable Cluster autoscaler add-on | `bool` | `false` | no | 186 | | [enable\_cluster\_creator\_admin\_permissions](#input\_enable\_cluster\_creator\_admin\_permissions) | Indicates whether or not to add the cluster creator (the identity used by Terraform) as an administrator via access entry | `bool` | `true` | no | 187 | | [enable\_eks\_managed\_nodegroups](#input\_enable\_eks\_managed\_nodegroups) | Enable managed node groups | `bool` | n/a | yes | 188 | | [enable\_gp3\_default\_storage\_class](#input\_enable\_gp3\_default\_storage\_class) | Enable gp3 as default storage class | `bool` | `false` | no | 189 | | [enable\_metrics\_server](#input\_enable\_metrics\_server) | Enable metrics server add-on 
| `bool` | `false` | no | 190 | | [enable\_secrets\_store\_csi\_driver](#input\_enable\_secrets\_store\_csi\_driver) | Enable k8s Secret Store CSI Driver add-on | `bool` | `false` | no | 191 | | [enable\_self\_managed\_nodegroups](#input\_enable\_self\_managed\_nodegroups) | Enable self managed node groups | `bool` | n/a | yes | 192 | | [enable\_sqs\_events\_on\_access\_log\_access](#input\_enable\_sqs\_events\_on\_access\_log\_access) | If true, generates an SQS event whenever an object is created in the Access Log bucket, which happens whenever a server access log is generated by any entity. This will potentially generate a lot of events, so use with caution. | `bool` | `false` | no | 193 | | [iam\_role\_permissions\_boundary](#input\_iam\_role\_permissions\_boundary) | ARN of the policy that is used to set the permissions boundary for IAM roles | `string` | `null` | no | 194 | | [keycloak\_enabled](#input\_keycloak\_enabled) | Enable Keycloak dedicated nodegroup | `bool` | `false` | no | 195 | | [kms\_key\_deletion\_window](#input\_kms\_key\_deletion\_window) | Waiting period for scheduled KMS Key deletion. Can be 7-30 days.
| `number` | `7` | no | 196 | | [metrics\_server](#input\_metrics\_server) | Metrics Server config for aws-ia/eks-blueprints-addon/aws | `any` | `{}` | no | 197 | | [name\_prefix](#input\_name\_prefix) | The prefix to use when naming all resources | `string` | `"ex-complete"` | no | 198 | | [num\_azs](#input\_num\_azs) | The number of AZs to use | `number` | `3` | no | 199 | | [reclaim\_policy](#input\_reclaim\_policy) | Reclaim policy for EFS storage class, valid options are Delete and Retain | `string` | `"Delete"` | no | 200 | | [region](#input\_region) | The AWS region to deploy into | `string` | n/a | yes | 201 | | [secondary\_cidr\_blocks](#input\_secondary\_cidr\_blocks) | A list of secondary CIDR blocks for the VPC | `list(string)` | `[]` | no | 202 | | [secrets\_store\_csi\_driver](#input\_secrets\_store\_csi\_driver) | k8s Secret Store CSI Driver Helm Chart config | `any` | `{}` | no | 203 | | [slack\_notification\_enabled](#input\_slack\_notification\_enabled) | Enable Slack notifications for the password rotation function. If enabled, a Slack webhook URL must also be provided. | `bool` | `false` | no | 204 | | [slack\_webhook\_url](#input\_slack\_webhook\_url) | The Slack webhook URL used for password rotation notifications | `string` | `null` | no | 205 | | [storageclass\_reclaim\_policy](#input\_storageclass\_reclaim\_policy) | Reclaim policy for gp3 storage class, valid options are Delete and Retain | `string` | `"Delete"` | no | 206 | | [tags](#input\_tags) | A map of tags to apply to all resources | `map(string)` | `{}` | no | 207 | | [users](#input\_users) | List of users on your EC2 instances whose passwords need to be rotated.
| `list(string)` | `[]` | no | 208 | | [vpc\_cidr](#input\_vpc\_cidr) | The CIDR block for the VPC | `string` | n/a | yes | 209 | | [zarf\_version](#input\_zarf\_version) | The version of Zarf to use | `string` | `""` | no | 210 | 211 | ## Outputs 212 | 213 | | Name | Description | 214 | |------|-------------| 215 | | [bastion\_instance\_id](#output\_bastion\_instance\_id) | The ID of the bastion host | 216 | | [bastion\_private\_dns](#output\_bastion\_private\_dns) | The private DNS address of the bastion host | 217 | | [bastion\_region](#output\_bastion\_region) | The region that the bastion host was deployed to | 218 | | [efs\_storageclass\_name](#output\_efs\_storageclass\_name) | The name of the EFS storageclass that was created (if var.enable\_amazon\_eks\_aws\_efs\_csi\_driver was set to true) | 219 | | [eks\_cluster\_name](#output\_eks\_cluster\_name) | The name of the EKS cluster | 220 | | [lambda\_password\_function\_arn](#output\_lambda\_password\_function\_arn) | Arn for lambda password function | 221 | | [vpc\_cidr](#output\_vpc\_cidr) | The CIDR block of the VPC | 222 | 223 | --------------------------------------------------------------------------------