├── .tool-versions
├── .github
│   ├── CODEOWNERS
│   └── workflows
│       ├── tfsec-pr-commenter.yml
│       ├── documentation.yml
│       ├── release.yml
│       └── pr-title.yml
├── example-app
│   ├── .tool-versions
│   ├── outputs.tf
│   ├── variables.tf
│   ├── README.md
│   └── main.tf
├── versions.tf
├── modules
│   ├── storage
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   └── main.tf
│   ├── service_account
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── main.tf
│   ├── database
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   └── main.tf
│   ├── networking
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── main.tf
│   ├── cluster
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   └── main.tf
│   └── registry
│       ├── variables.tf
│       ├── outputs.tf
│       └── main.tf
├── .gitignore
├── .pre-commit-config.yaml
├── .releaserc.json
├── outputs.tf
├── variables.tf
├── main.tf
├── CHANGELOG.md
└── README.md
/.tool-versions: -------------------------------------------------------------------------------- 1 | terraform 1.5.3 2 | -------------------------------------------------------------------------------- /.github/CODEOWNERS: -------------------------------------------------------------------------------- 1 | * @jsbroks @dlachasse 2 | -------------------------------------------------------------------------------- /example-app/.tool-versions: -------------------------------------------------------------------------------- 1 | terraform 1.5.3 2 | -------------------------------------------------------------------------------- /versions.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.5.0, < 2.0.0" 3 | } 4 | -------------------------------------------------------------------------------- /modules/storage/outputs.tf: -------------------------------------------------------------------------------- 1 | output "storage_bucket_name" { 2 | value = google_storage_bucket.file_storage.name 3 | } 4 | -------------------------------------------------------------------------------- /example-app/outputs.tf: -------------------------------------------------------------------------------- 1 | output "dagster_user_deployment_manifest" { 2 | value = helm_release.dagster_user_deployment.manifest 3 | } 4 | 5 | output "dagster_service_manifest" { 6 | value = helm_release.dagster_service.manifest 7 | } 8 | -------------------------------------------------------------------------------- /modules/service_account/variables.tf: -------------------------------------------------------------------------------- 1 | variable "project_id" { 2 | description = "Project ID" 3 | type = string 4 | } 5 | 6 | variable "namespace" { 7 | description = "The namespace name used as a prefix for all resources created." 8 | type = string 9 | } 10 | -------------------------------------------------------------------------------- /modules/service_account/outputs.tf: -------------------------------------------------------------------------------- 1 | output "service_account_credentials" { 2 | value = base64decode(google_service_account_key.default.private_key) 3 | description = "The private key of the service account." 4 | } 5 | 6 | output "service_account" { 7 | value = google_service_account.default 8 | description = "The service account."
9 | } 10 | -------------------------------------------------------------------------------- /modules/database/outputs.tf: -------------------------------------------------------------------------------- 1 | output "cloudsql_database" { 2 | value = { 3 | host = google_sql_database_instance.default.private_ip_address 4 | name = google_sql_database.default.name 5 | username = google_sql_user.default.name 6 | password = google_sql_user.default.password 7 | } 8 | description = "Database connection parameters" 9 | } 10 | -------------------------------------------------------------------------------- /modules/networking/variables.tf: -------------------------------------------------------------------------------- 1 | variable "namespace" { 2 | description = "The application name used as a prefix for all resources created." 3 | type = string 4 | } 5 | variable "project" { 6 | description = "Project ID" 7 | type = string 8 | } 9 | 10 | variable "region" { 11 | description = "Google region" 12 | type = string 13 | } 14 | -------------------------------------------------------------------------------- /.github/workflows/tfsec-pr-commenter.yml: -------------------------------------------------------------------------------- 1 | name: tfsec-pr-commenter 2 | on: 3 | pull_request: 4 | jobs: 5 | tfsec: 6 | name: tfsec PR commenter 7 | runs-on: ubuntu-latest 8 | 9 | steps: 10 | - name: Clone repo 11 | uses: actions/checkout@master 12 | - name: tfsec 13 | uses: aquasecurity/tfsec-pr-commenter-action@main 14 | with: 15 | github_token: ${{ github.token }} 16 | -------------------------------------------------------------------------------- /modules/networking/outputs.tf: -------------------------------------------------------------------------------- 1 | output "network" { 2 | value = google_compute_network.default 3 | description = "The network." 4 | } 5 | 6 | output "subnetwork" { 7 | value = google_compute_subnetwork.default 8 | description = "The subnetwork." 9 | } 10 | 11 | output "connection" { 12 | description = "The private connection between the network and GCP services." 13 | value = google_service_networking_connection.default 14 | } 15 | -------------------------------------------------------------------------------- /.github/workflows/documentation.yml: -------------------------------------------------------------------------------- 1 | name: Generate terraform docs 2 | on: 3 | - pull_request 4 | 5 | jobs: 6 | docs: 7 | runs-on: ubuntu-latest 8 | steps: 9 | - uses: actions/checkout@v2 10 | with: 11 | ref: ${{ github.event.pull_request.head.ref }} 12 | 13 | - name: Render terraform docs and push changes back to PR 14 | uses: terraform-docs/gh-actions@main 15 | with: 16 | working-dir: . 
17 | output-file: README.md 18 | output-method: inject 19 | git-push: "true" 20 | -------------------------------------------------------------------------------- /modules/cluster/outputs.tf: -------------------------------------------------------------------------------- 1 | output "cluster_id" { 2 | value = google_container_cluster.default.id 3 | } 4 | 5 | output "cluster_endpoint" { 6 | value = google_container_cluster.default.endpoint 7 | } 8 | 9 | output "cluster_ca_certificate" { 10 | value = google_container_cluster.default.master_auth[0].cluster_ca_certificate 11 | sensitive = true 12 | } 13 | 14 | output "cluster_self_link" { 15 | value = google_container_cluster.default.self_link 16 | } 17 | 18 | output "cluster_node_pool" { 19 | value = google_container_node_pool.default 20 | } 21 | -------------------------------------------------------------------------------- /modules/registry/variables.tf: -------------------------------------------------------------------------------- 1 | variable "namespace" { 2 | description = "The namespace used as a prefix for all resources created." 3 | type = string 4 | } 5 | 6 | variable "location" { 7 | description = "The location to host Artifact registry repository in." 8 | type = string 9 | } 10 | 11 | variable "service_account" { 12 | description = "Service account used to grant registry IAM membership." 13 | type = object({ email = string }) 14 | } 15 | 16 | variable "service_account_credentials" { 17 | description = "Service account json key to grant permissions for registry image pulling." 18 | type = string 19 | } 20 | -------------------------------------------------------------------------------- /modules/storage/variables.tf: -------------------------------------------------------------------------------- 1 | variable "namespace" { 2 | description = "The namespace used as a prefix for all resources created." 3 | type = string 4 | } 5 | 6 | variable "service_account" { 7 | description = "The service account used to manage application services." 8 | type = object({ email = string, account_id = string }) 9 | } 10 | 11 | variable "cloud_storage_bucket_location" { 12 | description = "The location of the bucket" 13 | type = string 14 | } 15 | 16 | variable "deletion_protection" { 17 | description = "If the DB instance should have deletion protection enabled. The database can't be deleted when this value is set to `true`." 
18 | type = bool 19 | } 20 | -------------------------------------------------------------------------------- /.github/workflows/release.yml: -------------------------------------------------------------------------------- 1 | name: Release 2 | 3 | on: 4 | workflow_dispatch: 5 | push: 6 | branches: 7 | - main 8 | 9 | jobs: 10 | release: 11 | name: Release 12 | runs-on: ubuntu-latest 13 | # Skip running release workflow on forks 14 | if: github.repository_owner == 'wandb' 15 | steps: 16 | - name: Checkout 17 | uses: actions/checkout@v2 18 | with: 19 | persist-credentials: false 20 | fetch-depth: 0 21 | 22 | - name: Release 23 | uses: cycjimmy/semantic-release-action@v2 24 | with: 25 | semantic_version: 19.0.2 26 | extra_plugins: | 27 | @semantic-release/changelog@6.0.1 28 | @semantic-release/git@10.0.1 29 | conventional-changelog-conventionalcommits@4.6.3 30 | env: 31 | GITHUB_TOKEN: ${{ secrets.WANDB_RELEASE_TOKEN }} 32 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Local .terraform directories 2 | **/.terraform/* 3 | 4 | # .tfstate files 5 | *.tfstate 6 | *.tfstate.* 7 | 8 | # Crash log files 9 | crash.log 10 | 11 | # Ignore any .tfvars files that are generated automatically for each Terraform run. Most 12 | # .tfvars files are managed as part of configuration and so should be included in 13 | # version control. 14 | # 15 | # example.tfvars 16 | 17 | # Ignore override files as they are usually used to override resources locally and so 18 | # are not checked in 19 | override.tf 20 | override.tf.json 21 | *_override.tf 22 | *_override.tf.json 23 | 24 | # Include override files you do wish to add to version control using negated pattern 25 | # 26 | # !example_override.tf 27 | 28 | # Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan 29 | # example: *tfplan* 30 | 31 | .DS_Store 32 | .idea/ 33 | *.tfvars 34 | 35 | .terraform.lock* 36 | 37 | *.dccache 38 | -------------------------------------------------------------------------------- /example-app/variables.tf: -------------------------------------------------------------------------------- 1 | variable "project_id" { 2 | description = "Project ID" 3 | type = string 4 | } 5 | 6 | variable "region" { 7 | description = "Google region" 8 | type = string 9 | } 10 | 11 | variable "zone" { 12 | description = "Google zone" 13 | type = string 14 | } 15 | 16 | variable "namespace" { 17 | description = "Namespace used as a prefix for all resources" 18 | type = string 19 | } 20 | 21 | variable "dagster_version" { 22 | description = "Version of Dagster to deploy" 23 | type = string 24 | default = "0.14.3" 25 | } 26 | 27 | variable "dagster_deployment_image" { 28 | description = "Image name of user code deployment" 29 | type = string 30 | default = "user-code-example" 31 | } 32 | 33 | variable "dagster_deployment_tag" { 34 | description = "User code deployment tag of Dagster to deploy" 35 | type = string 36 | default = "latest" 37 | } 38 | 39 | variable "domain" { 40 | description = "The domain in which your Google Groups are defined." 41 | type = string 42 | default = "example" 43 | } 44 | -------------------------------------------------------------------------------- /modules/registry/outputs.tf: -------------------------------------------------------------------------------- 1 | output "registry" { 2 | description = "Artifact registry for user code deployment images." 
3 | value = google_artifact_registry_repository.default 4 | } 5 | 6 | output "registry_name" { 7 | description = "Name of provisioned Artifact Registry" 8 | value = google_artifact_registry_repository.default.name 9 | } 10 | 11 | output "registry_location" { 12 | description = "Location of provisioned Artifact Registry" 13 | value = google_artifact_registry_repository.default.location 14 | } 15 | 16 | output "registry_image_path" { 17 | description = "Path to registry repository images (ex: example_repo-docker.pkg.dev/my-project/dagster-images)" 18 | value = "${google_artifact_registry_repository.default.location}-docker.pkg.dev/${google_artifact_registry_repository.default.project}/${google_artifact_registry_repository.default.repository_id}" 19 | } 20 | 21 | output "registry_image_pull_secret" { 22 | description = "Name of Kubernetes secret with Docker config to pull private images from Artifact Registry" 23 | value = kubernetes_secret.image_pull_secret.metadata[0].name 24 | } 25 | -------------------------------------------------------------------------------- /.github/workflows/pr-title.yml: -------------------------------------------------------------------------------- 1 | name: "Validate PR Title" 2 | 3 | on: 4 | pull_request_target: 5 | types: [opened, edited, synchronize] 6 | 7 | jobs: 8 | main: 9 | name: Validate PR title 10 | runs-on: ubuntu-latest 11 | steps: 12 | # https://github.com/amannn/action-semantic-pull-request/releases 13 | - uses: amannn/action-semantic-pull-request@v4.2.0 14 | env: 15 | GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 16 | with: 17 | # Allowed types: https://github.com/commitizen/conventional-commit-types 18 | types: | 19 | fix 20 | feat 21 | docs 22 | style 23 | refactor 24 | perf 25 | test 26 | build 27 | ci 28 | chore 29 | revert 30 | requireScope: false 31 | subjectPattern: ^[A-Z].+$ 32 | subjectPatternError: | 33 | The subject "{subject}" found in the pull request title "{title}" 34 | didn't match the configured pattern. Please ensure that the subject 35 | starts with an uppercase character. 36 | wip: true 37 | validateSingleCommit: true 38 | validateSingleCommitMatchesPrTitle: true 39 | -------------------------------------------------------------------------------- /modules/database/variables.tf: -------------------------------------------------------------------------------- 1 | variable "namespace" { 2 | type = string 3 | description = "The namespace name used as a prefix for all resources created." 4 | } 5 | 6 | variable "network_connection" { 7 | description = "The private service networking connection that will connect database to the network." 8 | type = object({ network = string }) 9 | } 10 | 11 | variable "deletion_protection" { 12 | description = "If the DB instance should have deletion protection enabled. The database can't be deleted when this value is set to `true`." 13 | type = bool 14 | } 15 | 16 | variable "cloudsql_postgres_version" { 17 | description = "The postgres version of the CloudSQL instance." 18 | type = string 19 | } 20 | 21 | variable "cloudsql_tier" { 22 | description = "The CloudSQL machine tier to use." 23 | type = string 24 | } 25 | 26 | variable "cloudsql_availability_type" { 27 | description = "The availability type of the CloudSQL instance." 28 | type = string 29 | } 30 | 31 | variable "cloudsql_edition" { 32 | description = "The edition of the CloudSQL instance." 33 | type = string 34 | } 35 | 36 | variable "cloudsql_query_insights_enabled" { 37 | description = "Whether to enable query insights." 
38 | type = bool 39 | } 40 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | repos: 2 | - repo: https://github.com/compilerla/conventional-pre-commit 3 | rev: fc6b56c3b489ae5b1dccd2ec5c0e62e5c86fb5a0 4 | hooks: 5 | - id: conventional-pre-commit 6 | stages: [commit-msg] 7 | - repo: https://github.com/pre-commit/pre-commit-hooks 8 | rev: v5.0.0 9 | hooks: 10 | - id: end-of-file-fixer 11 | - id: trailing-whitespace 12 | - repo: https://github.com/antonbabenko/pre-commit-terraform 13 | rev: v1.62.3 14 | hooks: 15 | - id: terraform_validate 16 | - id: terraform_fmt 17 | args: 18 | - --args=-diff 19 | - id: terraform_docs 20 | args: 21 | - --hook-config=--path-to-file=README.md 22 | - --hook-config=--add-to-existing-file=true 23 | - --hook-config=--create-file-if-not-exist=false 24 | - id: terraform_tflint 25 | args: 26 | - --args=--enable-rule=terraform_documented_variables 27 | - --args=--enable-rule=terraform_unused_required_providers 28 | - --args=--enable-rule=terraform_unused_declarations 29 | - --args=--enable-rule=terraform_typed_variables 30 | - --args=--enable-rule=terraform_standard_module_structure 31 | - --args=--enable-rule=terraform_naming_convention 32 | - --args=--enable-rule=terraform_comment_syntax 33 | -------------------------------------------------------------------------------- /modules/storage/main.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.5.0, < 2.0.0" 3 | 4 | required_providers { 5 | google = { 6 | source = "hashicorp/google" 7 | version = "~> 6.0" 8 | } 9 | random = { 10 | source = "hashicorp/random" 11 | version = "~> 3.4" 12 | } 13 | } 14 | } 15 | 16 | locals { 17 | member = "serviceAccount:${var.service_account.email}" 18 | } 19 | 20 | resource "random_pet" "suffix" { 21 | length = 1 22 | } 23 | 24 | resource "google_storage_bucket" "file_storage" { 25 | name = "${var.namespace}-storage-${random_pet.suffix.id}" 26 | location = var.cloud_storage_bucket_location 27 | 28 | uniform_bucket_level_access = true 29 | force_destroy = !var.deletion_protection 30 | 31 | cors { 32 | method = ["GET", "HEAD", "PUT"] 33 | response_header = ["ETag"] 34 | max_age_seconds = 3000 35 | } 36 | 37 | lifecycle_rule { 38 | condition { 39 | age = 365 40 | } 41 | action { 42 | type = "Delete" 43 | } 44 | } 45 | } 46 | 47 | resource "google_storage_bucket_iam_member" "file_storage_object_admin" { 48 | bucket = google_storage_bucket.file_storage.name 49 | member = local.member 50 | role = "roles/storage.objectAdmin" 51 | 52 | depends_on = [google_storage_bucket.file_storage] 53 | } 54 | -------------------------------------------------------------------------------- /.releaserc.json: -------------------------------------------------------------------------------- 1 | { 2 | "branches": [ 3 | "main" 4 | ], 5 | "ci": false, 6 | "plugins": [ 7 | [ 8 | "@semantic-release/commit-analyzer", 9 | { 10 | "preset": "conventionalcommits" 11 | } 12 | ], 13 | [ 14 | "@semantic-release/release-notes-generator", 15 | { 16 | "preset": "conventionalcommits" 17 | } 18 | ], 19 | [ 20 | "@semantic-release/github", 21 | { 22 | "successComment": "This ${issue.pull_request ? 
'PR is included' : 'issue has been resolved'} in version ${nextRelease.version} :tada:", 23 | "labels": false, 24 | "releasedLabels": false 25 | } 26 | ], 27 | [ 28 | "@semantic-release/changelog", 29 | { 30 | "changelogFile": "CHANGELOG.md", 31 | "changelogTitle": "# Changelog\n\nAll notable changes to this project will be documented in this file." 32 | } 33 | ], 34 | [ 35 | "@semantic-release/git", 36 | { 37 | "assets": [ 38 | "CHANGELOG.md" 39 | ], 40 | "message": "chore(release): version ${nextRelease.version} [skip ci]\n\n${nextRelease.notes}" 41 | } 42 | ] 43 | ] 44 | } 45 | -------------------------------------------------------------------------------- /modules/cluster/variables.tf: -------------------------------------------------------------------------------- 1 | variable "namespace" { 2 | description = "The namespace name used as a prefix for all resources created." 3 | type = string 4 | } 5 | 6 | variable "project_id" { 7 | description = "GCP project id used to enable a workload identity pool." 8 | type = string 9 | } 10 | 11 | variable "cluster_compute_machine_type" { 12 | description = "Compute machine type to deploy cluster nodes on." 13 | type = string 14 | } 15 | 16 | variable "cluster_node_pool_max_node_count" { 17 | description = "Max number of nodes cluster can scale up to." 18 | type = number 19 | } 20 | 21 | variable "cluster_monitoring_components" { 22 | description = "Components to enable in the GKE monitoring stack." 23 | type = list(string) 24 | default = [] 25 | } 26 | 27 | variable "service_account" { 28 | description = "The service account associated with the GKE cluster instances that host Dagster." 29 | type = object({ email = string }) 30 | } 31 | 32 | variable "network" { 33 | description = "Google Compute Engine network to which the cluster is connected." 34 | type = object({ self_link = string }) 35 | } 36 | 37 | variable "subnetwork" { 38 | description = "Google Compute Engine subnetwork in which the cluster's instances are launched." 39 | type = object({ self_link = string }) 40 | } 41 | 42 | variable "domain" { 43 | description = "The domain in which your Google Groups are defined." 44 | type = string 45 | } 46 | -------------------------------------------------------------------------------- /modules/service_account/main.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.5.0, < 2.0.0" 3 | 4 | required_providers { 5 | google = { 6 | source = "hashicorp/google" 7 | version = "~> 6.0" 8 | } 9 | random = { 10 | source = "hashicorp/random" 11 | version = "~> 3.4" 12 | } 13 | } 14 | } 15 | 16 | resource "random_id" "default" { 17 | # 30 bytes ensures that enough characters are generated to satisfy the service account ID requirements, regardless of 18 | # the prefix. 19 | byte_length = 30 20 | prefix = "${var.namespace}-" 21 | } 22 | 23 | resource "google_service_account" "default" { 24 | # Limit the string used to 30 characters. 25 | account_id = substr(random_id.default.dec, 0, 30) 26 | display_name = "GKE node pool Service Account" 27 | description = "Used by GKE node pools in ${var.namespace}." 
28 | } 29 | 30 | resource "google_service_account_key" "default" { 31 | service_account_id = google_service_account.default.name 32 | 33 | depends_on = [google_service_account.default] 34 | } 35 | 36 | locals { 37 | roles = [ 38 | "roles/monitoring.viewer", 39 | "roles/monitoring.metricWriter", 40 | "roles/logging.logWriter", 41 | "roles/stackdriver.resourceMetadata.writer", 42 | "roles/autoscaling.metricsWriter" 43 | ] 44 | } 45 | 46 | resource "google_project_iam_member" "roles" { 47 | for_each = toset(local.roles) 48 | 49 | project = var.project_id 50 | role = each.key 51 | member = "serviceAccount:${google_service_account.default.email}" 52 | } 53 | -------------------------------------------------------------------------------- /modules/networking/main.tf: -------------------------------------------------------------------------------- 1 | # Creates VPC network 2 | resource "google_compute_network" "default" { 3 | name = "${var.namespace}-vpc" 4 | description = "${var.namespace} VPC Network" 5 | auto_create_subnetworks = false 6 | project = var.project 7 | 8 | # A global routing mode can have an unexpected impact on load balancers; always use a regional mode 9 | routing_mode = "REGIONAL" 10 | } 11 | 12 | # VM instances and other resources to communicate with each other via internal, 13 | # private IP addresses 14 | resource "google_compute_subnetwork" "default" { 15 | name = "${var.namespace}-subnet" 16 | ip_cidr_range = "10.10.0.0/16" 17 | network = google_compute_network.default.self_link 18 | project = var.project 19 | region = var.region 20 | 21 | # When enabled, VMs in this subnetwork without external IP addresses can access Google APIs 22 | # and services by using Private Google Access. 23 | private_ip_google_access = true 24 | 25 | depends_on = [google_compute_network.default] 26 | } 27 | 28 | resource "google_compute_global_address" "private_ip_address" { 29 | name = "${var.namespace}-private-ip-address" 30 | purpose = "VPC_PEERING" 31 | address_type = "INTERNAL" 32 | prefix_length = 16 33 | network = google_compute_network.default.id 34 | project = var.project 35 | 36 | depends_on = [google_compute_network.default] 37 | } 38 | 39 | resource "google_service_networking_connection" "default" { 40 | network = google_compute_network.default.id 41 | service = "servicenetworking.googleapis.com" 42 | reserved_peering_ranges = [google_compute_global_address.private_ip_address.name] 43 | 44 | depends_on = [google_compute_network.default, google_compute_global_address.private_ip_address] 45 | } 46 | -------------------------------------------------------------------------------- /outputs.tf: -------------------------------------------------------------------------------- 1 | output "service_account" { 2 | description = "Service account created to manage and authenticate services." 
3 | value = module.service_account.service_account 4 | } 5 | 6 | output "cluster_endpoint" { 7 | description = "Endpoint of provisioned Kubernetes cluster" 8 | value = module.cluster.cluster_endpoint 9 | } 10 | 11 | output "cluster_id" { 12 | description = "Id of provisioned Kubernetes cluster" 13 | value = module.cluster.cluster_id 14 | } 15 | 16 | output "cluster_ca_certificate" { 17 | description = "Cluster certificate of provisioned Kubernetes cluster" 18 | value = module.cluster.cluster_ca_certificate 19 | sensitive = true 20 | } 21 | 22 | output "storage_bucket_name" { 23 | description = "Name of provisioned Cloud Storage bucket" 24 | value = module.storage.storage_bucket_name 25 | } 26 | 27 | output "cloudsql_database" { 28 | description = "Object containing connection parameters for provisioned CloudSQL database" 29 | value = module.database.cloudsql_database 30 | sensitive = true 31 | } 32 | 33 | output "registry_name" { 34 | description = "Name of provisioned Artifact Registry" 35 | value = module.registry.registry_name 36 | } 37 | 38 | output "registry_location" { 39 | description = "Location of provisioned Artifact Registry" 40 | value = module.registry.registry_location 41 | } 42 | 43 | output "registry_image_path" { 44 | description = "Docker image path of provisioned Artifact Registry" 45 | value = module.registry.registry_image_path 46 | } 47 | 48 | output "registry_image_pull_secret" { 49 | description = "Name of Kubernetes secret containing Docker config with permissions to pull from private Artifact Registry repository" 50 | value = module.registry.registry_image_pull_secret 51 | } 52 | 53 | output "network_name" { 54 | description = "Name of provisioned VPC network" 55 | value = module.networking.network.name 56 | } 57 | -------------------------------------------------------------------------------- /modules/registry/main.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.5.0, < 2.0.0" 3 | 4 | required_providers { 5 | google = { 6 | source = "hashicorp/google" 7 | version = "~> 6.0" 8 | } 9 | kubernetes = { 10 | source = "hashicorp/kubernetes" 11 | version = "~> 2.9" 12 | } 13 | } 14 | } 15 | 16 | resource "google_artifact_registry_repository" "default" { 17 | format = "DOCKER" 18 | location = var.location 19 | repository_id = "${var.namespace}-registry" 20 | 21 | cleanup_policies { 22 | id = "delete-old-images" 23 | action = "DELETE" 24 | condition { 25 | older_than = "365d" 26 | } 27 | } 28 | } 29 | 30 | # Grants pull access for the registry to the project's service account 31 | # This service account will be used by the Kubernetes cluster when accessing 32 | # code deployment images in the private repository. 33 | resource "google_artifact_registry_repository_iam_member" "access" { 34 | project = google_artifact_registry_repository.default.project 35 | location = google_artifact_registry_repository.default.location 36 | repository = google_artifact_registry_repository.default.name 37 | role = "roles/viewer" 38 | member = "serviceAccount:${var.service_account.email}" 39 | 40 | depends_on = [google_artifact_registry_repository.default] 41 | } 42 | 43 | # Grants permissions to pull images from Artifact Registry. 44 | # This must be used as an imagePullSecret to access private images. 
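# As an illustration (not part of this module), a pod pulling from the registry would reference
# the secret created below in its spec, e.g.:
#   imagePullSecrets:
#     - name: artifact-registry
# which is what the example app wires into the Dagster Helm chart via the
# `registry_image_pull_secret` output.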
45 | resource "kubernetes_secret" "image_pull_secret" { 46 | metadata { 47 | name = "artifact-registry" 48 | } 49 | 50 | data = { 51 | ".dockerconfigjson" = jsonencode({ 52 | auths = { 53 | "https://${google_artifact_registry_repository.default.id}-docker.pkg.dev" = { 54 | auth = base64encode("_json_key:${var.service_account_credentials}") 55 | } 56 | } 57 | }) 58 | } 59 | 60 | type = "kubernetes.io/dockerconfigjson" 61 | depends_on = [google_artifact_registry_repository_iam_member.access] 62 | } 63 | -------------------------------------------------------------------------------- /variables.tf: -------------------------------------------------------------------------------- 1 | variable "project_id" { 2 | description = "Project ID" 3 | type = string 4 | } 5 | 6 | variable "region" { 7 | description = "Google region" 8 | type = string 9 | } 10 | 11 | variable "namespace" { 12 | description = "Namespace used as a prefix for all resources" 13 | type = string 14 | } 15 | 16 | variable "deletion_protection" { 17 | description = "Indicates whether or not storage and databases have deletion protection enabled" 18 | type = bool 19 | default = true 20 | } 21 | 22 | variable "cloud_storage_bucket_location" { 23 | description = "Location to create cloud storage bucket in." 24 | type = string 25 | default = "US" 26 | } 27 | 28 | variable "cloudsql_postgres_version" { 29 | description = "The postgres version of the CloudSQL instance." 30 | type = string 31 | default = "POSTGRES_14" 32 | } 33 | 34 | variable "cloudsql_tier" { 35 | description = "The machine type to use" 36 | type = string 37 | default = "db-f1-micro" 38 | } 39 | 40 | variable "cloudsql_availability_type" { 41 | description = "The availability type of the Cloud SQL instance." 42 | type = string 43 | default = "ZONAL" 44 | } 45 | 46 | variable "cloudsql_edition" { 47 | description = "The edition of the CloudSQL instance." 48 | type = string 49 | default = "ENTERPRISE" 50 | } 51 | 52 | variable "cloudsql_query_insights_enabled" { 53 | description = "Whether to enable query insights." 54 | type = bool 55 | default = false 56 | } 57 | 58 | variable "cluster_compute_machine_type" { 59 | description = "Compute machine type to deploy cluster nodes on." 60 | type = string 61 | default = "e2-standard-2" 62 | } 63 | 64 | variable "cluster_node_pool_max_node_count" { 65 | description = "Max number of nodes cluster can scale up to." 66 | type = number 67 | default = 2 68 | } 69 | 70 | variable "cluster_monitoring_components" { 71 | description = "Components to enable in the GKE monitoring stack." 72 | type = list(string) 73 | default = ["SYSTEM_COMPONENTS"] 74 | } 75 | 76 | variable "domain" { 77 | description = "The domain in which your Google Groups are defined." 
78 | type = string 79 | } 80 | -------------------------------------------------------------------------------- /modules/cluster/main.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.5.0, < 2.0.0" 3 | 4 | required_providers { 5 | google = { 6 | source = "hashicorp/google" 7 | version = "~> 6.0" 8 | } 9 | } 10 | } 11 | 12 | resource "google_container_cluster" "default" { 13 | name = "${var.namespace}-cluster" 14 | networking_mode = "VPC_NATIVE" 15 | network = var.network.self_link 16 | subnetwork = var.subnetwork.self_link 17 | 18 | ip_allocation_policy { 19 | cluster_ipv4_cidr_block = "/14" 20 | services_ipv4_cidr_block = "/19" 21 | } 22 | 23 | authenticator_groups_config { 24 | security_group = "gke-security-groups@${var.domain}.com" 25 | } 26 | 27 | workload_identity_config { 28 | workload_pool = "${var.project_id}.svc.id.goog" 29 | } 30 | 31 | release_channel { 32 | channel = "STABLE" 33 | } 34 | 35 | monitoring_config { 36 | enable_components = var.cluster_monitoring_components 37 | } 38 | 39 | remove_default_node_pool = true 40 | initial_node_count = 1 41 | 42 | # Disable client certificate authentication, which reduces the attack surface 43 | # for the cluster by disabling this deprecated feature. It defaults to false, 44 | # but this will make it explicit and quiet some security tooling. 45 | master_auth { 46 | client_certificate_config { 47 | issue_client_certificate = false 48 | } 49 | } 50 | 51 | lifecycle { 52 | ignore_changes = [ 53 | # We are relying on the release channel to maintain version upgrades 54 | node_version 55 | ] 56 | } 57 | } 58 | 59 | resource "google_container_node_pool" "default" { 60 | name = "default-node-pool" 61 | cluster = google_container_cluster.default.id 62 | 63 | autoscaling { 64 | min_node_count = 1 65 | max_node_count = var.cluster_node_pool_max_node_count 66 | } 67 | management { 68 | auto_repair = true 69 | auto_upgrade = true 70 | } 71 | 72 | node_config { 73 | machine_type = var.cluster_compute_machine_type 74 | service_account = var.service_account.email 75 | } 76 | 77 | network_config { 78 | # Isolate nodes from the internet by default. Internet access is granted with NAT. 
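# Note: a Cloud NAT is not defined anywhere in this configuration; if the nodes need general
# outbound internet access you would add one yourself (for example a google_compute_router with
# google_compute_router_nat), while Google APIs remain reachable through Private Google Access,
# which the networking module enables on the subnetwork.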
79 | enable_private_nodes = true 80 | } 81 | 82 | lifecycle { 83 | ignore_changes = [ 84 | location, 85 | version 86 | ] 87 | } 88 | } 89 | -------------------------------------------------------------------------------- /modules/database/main.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.5.0, < 2.0.0" 3 | 4 | required_providers { 5 | google = { 6 | source = "hashicorp/google" 7 | version = "~> 6.0" 8 | } 9 | random = { 10 | source = "hashicorp/random" 11 | version = "~> 3.4" 12 | } 13 | } 14 | } 15 | 16 | resource "random_string" "master_password" { 17 | length = 32 18 | special = false 19 | } 20 | 21 | locals { 22 | database_name = var.namespace 23 | master_username = var.namespace 24 | master_password = random_string.master_password.result 25 | } 26 | 27 | resource "google_sql_database_instance" "default" { 28 | database_version = var.cloudsql_postgres_version 29 | name = local.database_name 30 | deletion_protection = var.deletion_protection 31 | 32 | settings { 33 | tier = var.cloudsql_tier 34 | edition = var.cloudsql_edition 35 | activation_policy = "ALWAYS" 36 | availability_type = var.cloudsql_availability_type 37 | deletion_protection_enabled = var.deletion_protection 38 | 39 | ip_configuration { 40 | # We're giving the Cloud SQL instance a public IP address in order to connect to it with 41 | # Cloud SQL Proxy (which requires IAM authentication). We're not exposing the instance 42 | # to the internet because no external network is authorized. 43 | ipv4_enabled = true 44 | private_network = var.network_connection.network 45 | ssl_mode = "ENCRYPTED_ONLY" 46 | } 47 | 48 | database_flags { 49 | name = "cloudsql.iam_authentication" 50 | value = "on" 51 | } 52 | 53 | database_flags { 54 | name = "max_connections" 55 | value = "1000" 56 | } 57 | 58 | backup_configuration { 59 | backup_retention_settings { 60 | retained_backups = 7 61 | retention_unit = "COUNT" 62 | } 63 | 64 | enabled = true 65 | location = "us" 66 | start_time = "01:00" 67 | transaction_log_retention_days = 7 68 | } 69 | 70 | disk_autoresize = true 71 | disk_autoresize_limit = 0 72 | disk_size = 120 73 | disk_type = "PD_SSD" 74 | 75 | insights_config { 76 | query_insights_enabled = var.cloudsql_query_insights_enabled 77 | } 78 | } 79 | 80 | lifecycle { 81 | # Prevents Terraform from deleting the disk and recreating it when the size changes due to 82 | # auto-resizing. 
83 | ignore_changes = [settings[0].disk_size] 84 | } 85 | } 86 | 87 | resource "google_sql_database" "default" { 88 | name = local.database_name 89 | instance = google_sql_database_instance.default.name 90 | 91 | depends_on = [google_sql_database_instance.default] 92 | } 93 | 94 | resource "google_sql_user" "default" { 95 | instance = google_sql_database_instance.default.name 96 | name = local.master_username 97 | password = local.master_password 98 | 99 | depends_on = [google_sql_database_instance.default] 100 | } 101 | -------------------------------------------------------------------------------- /main.tf: -------------------------------------------------------------------------------- 1 | # Ensures APIs are all enabled in project 2 | module "project_factory_project_services" { 3 | source = "terraform-google-modules/project-factory/google//modules/project_services" 4 | version = "~> 18.0" 5 | 6 | project_id = null 7 | 8 | activate_apis = [ 9 | "iam.googleapis.com", # Service accounts 10 | "logging.googleapis.com", # Logging 11 | "sqladmin.googleapis.com", # Database 12 | "networkmanagement.googleapis.com", # Networking 13 | "servicenetworking.googleapis.com", # Networking 14 | "storage.googleapis.com", # Cloud Storage 15 | "artifactregistry.googleapis.com", # Artifact Registry 16 | "container.googleapis.com", # Kubernetes 17 | "compute.googleapis.com" # Kubernetes 18 | ] 19 | disable_dependent_services = false 20 | disable_services_on_destroy = false 21 | } 22 | 23 | module "service_account" { 24 | source = "./modules/service_account" 25 | project_id = var.project_id 26 | namespace = var.namespace 27 | } 28 | 29 | module "storage" { 30 | source = "./modules/storage" 31 | namespace = var.namespace 32 | deletion_protection = var.deletion_protection 33 | cloud_storage_bucket_location = var.cloud_storage_bucket_location 34 | 35 | service_account = module.service_account.service_account 36 | 37 | depends_on = [module.service_account] 38 | } 39 | 40 | module "networking" { 41 | source = "./modules/networking" 42 | namespace = var.namespace 43 | project = var.project_id 44 | region = var.region 45 | } 46 | 47 | module "cluster" { 48 | source = "./modules/cluster" 49 | namespace = var.namespace 50 | project_id = var.project_id 51 | cluster_compute_machine_type = var.cluster_compute_machine_type 52 | cluster_node_pool_max_node_count = var.cluster_node_pool_max_node_count 53 | domain = var.domain 54 | cluster_monitoring_components = var.cluster_monitoring_components 55 | 56 | network = module.networking.network 57 | subnetwork = module.networking.subnetwork 58 | service_account = module.service_account.service_account 59 | 60 | depends_on = [module.networking, module.service_account] 61 | } 62 | 63 | module "database" { 64 | source = "./modules/database" 65 | namespace = var.namespace 66 | deletion_protection = var.deletion_protection 67 | cloudsql_postgres_version = var.cloudsql_postgres_version 68 | cloudsql_tier = var.cloudsql_tier 69 | cloudsql_availability_type = var.cloudsql_availability_type 70 | cloudsql_edition = var.cloudsql_edition 71 | cloudsql_query_insights_enabled = var.cloudsql_query_insights_enabled 72 | 73 | network_connection = module.networking.connection 74 | 75 | depends_on = [module.networking] 76 | } 77 | 78 | module "registry" { 79 | source = "./modules/registry" 80 | namespace = var.namespace 81 | location = var.region 82 | 83 | service_account = module.service_account.service_account 84 | service_account_credentials = module.service_account.service_account_credentials 
85 | 86 | # Depends on the cluster existing, as a Kubernetes secret containing an imagePullSecret 87 | # with Docker config for the private registry will be created 88 | depends_on = [module.cluster, module.service_account] 89 | } 90 | -------------------------------------------------------------------------------- /example-app/README.md: -------------------------------------------------------------------------------- 1 | # Example Dagster Deployment 2 | 3 | This example uses the official [Dagster Helm chart](https://artifacthub.io/packages/helm/dagster/dagster) on top of the `terraform-google-dagster` (TDG) module. The TDG module provisions all of the infrastructure needed to deploy your Dagster Kubernetes cluster, attaches it to a Cloud SQL database for persistent metadata storage, and creates a Cloud Storage bucket for IO management, log storage, or whatever else you may need it for. 4 | 5 | You'll likely want to enable an ingress (either through the Helm chart or by creating your own Kubernetes ingress) to allow access to the Dagit UI. 6 | 7 | **Important note** 8 | If you are using your own code deployment image with the private Artifact Registry repository as the source, make sure you publish that image after the repository has been created by the TDG module. If the image is not pushed, the Terraform apply will time out because the user code deployment pod will be stuck in `imagePullBackoff`, unable to find the image. 9 | 10 | ## Usage 11 | 12 | Ensure you have Terraform installed ([instructions](https://learn.hashicorp.com/tutorials/terraform/install-cli#install-terraform)) 13 | 14 | Clone the base repository and navigate to the `example-app/` directory: 15 | 16 | ``` 17 | git clone https://github.com/wandb/terraform-google-dagster 18 | cd terraform-google-dagster/example-app 19 | ``` 20 | 21 | Either set your variables in a `terraform.tfvars` file or wait for the prompts when applying the configuration.
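For example, a minimal `terraform.tfvars` for this example might look like the following; every value below is a placeholder to replace with your own project details, and the remaining inputs fall back to their defaults:

```
# Placeholder values only; substitute your own project, region, zone, and namespace.
project_id = "my-gcp-project"
region     = "us-central1"
zone       = "us-central1-a"
namespace  = "dagster-example"
```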
22 | 23 | ``` 24 | terraform init 25 | terraform apply 26 | ``` 27 | 28 | ## Requirements 29 | 30 | | Name | Version | 31 | |------|---------| 32 | | [terraform](#requirement\_terraform) | >= 1.5.0, < 2.0.0 | 33 | | [google](#requirement\_google) | ~> 6.0 | 34 | | [helm](#requirement\_helm) | ~> 2.4 | 35 | | [kubernetes](#requirement\_kubernetes) | ~> 2.9 | 36 | 37 | ## Providers 38 | 39 | | Name | Version | 40 | |------|---------| 41 | | [google](#provider\_google) | 6.18.1 | 42 | | [helm](#provider\_helm) | 2.17.0 | 43 | 44 | ## Modules 45 | 46 | | Name | Source | Version | 47 | |------|--------|---------| 48 | | [dagster\_infra](#module\_dagster\_infra) | ../ | n/a | 49 | 50 | ## Resources 51 | 52 | | Name | Type | 53 | |------|------| 54 | | [helm_release.dagster_service](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource | 55 | | [helm_release.dagster_user_deployment](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource | 56 | | [google_client_config.current](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_config) | data source | 57 | 58 | ## Inputs 59 | 60 | | Name | Description | Type | Default | Required | 61 | |------|-------------|------|---------|:--------:| 62 | | [dagster\_deployment\_image](#input\_dagster\_deployment\_image) | Image name of user code deployment | `string` | `"user-code-example"` | no | 63 | | [dagster\_deployment\_tag](#input\_dagster\_deployment\_tag) | User code deployment tag of Dagster to deploy | `string` | `"latest"` | no | 64 | | [dagster\_version](#input\_dagster\_version) | Version of Dagster to deploy | `string` | `"0.14.3"` | no | 65 | | [domain](#input\_domain) | The domain in which your Google Groups are defined. 
| `string` | `"example"` | no | 66 | | [namespace](#input\_namespace) | Namespace used as a prefix for all resources | `string` | n/a | yes | 67 | | [project\_id](#input\_project\_id) | Project ID | `string` | n/a | yes | 68 | | [region](#input\_region) | Google region | `string` | n/a | yes | 69 | | [zone](#input\_zone) | Google zone | `string` | n/a | yes | 70 | 71 | ## Outputs 72 | 73 | | Name | Description | 74 | |------|-------------| 75 | | [dagster\_service\_manifest](#output\_dagster\_service\_manifest) | n/a | 76 | | [dagster\_user\_deployment\_manifest](#output\_dagster\_user\_deployment\_manifest) | n/a | 77 | 78 | -------------------------------------------------------------------------------- /example-app/main.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | required_version = ">= 1.5.0, < 2.0.0" 3 | 4 | required_providers { 5 | kubernetes = { 6 | source = "hashicorp/kubernetes" 7 | version = "~> 2.9" 8 | } 9 | google = { 10 | source = "hashicorp/google" 11 | version = "~> 6.0" 12 | } 13 | helm = { 14 | source = "hashicorp/helm" 15 | version = "~> 2.4" 16 | } 17 | } 18 | } 19 | 20 | provider "google" { 21 | project = var.project_id 22 | region = var.region 23 | zone = var.zone 24 | } 25 | 26 | # Much more is configurable here, this simply uses the defaults for cloud resources 27 | module "dagster_infra" { 28 | source = "../" 29 | project_id = var.project_id 30 | region = var.region 31 | namespace = var.namespace 32 | domain = var.domain 33 | 34 | # cloud_storage_bucket_location (default US) 35 | # cloudsql_postgres_version (default POSTGRES_14) 36 | # cloudsql_tier (default db-f1-micro) 37 | # cloudsql_availability_type (default ZONAL) 38 | # cluster_compute_machine_type (default e2-standard-2) 39 | # cluster_node_pool_max_node_count (default 2) 40 | # deletion_protection (default true) 41 | } 42 | 43 | data "google_client_config" "current" {} 44 | 45 | provider "kubernetes" { 46 | host = "https://${module.dagster_infra.cluster_endpoint}" 47 | cluster_ca_certificate = base64decode(module.dagster_infra.cluster_ca_certificate) 48 | token = data.google_client_config.current.access_token 49 | } 50 | 51 | provider "helm" { 52 | kubernetes { 53 | host = "https://${module.dagster_infra.cluster_endpoint}" 54 | cluster_ca_certificate = base64decode(module.dagster_infra.cluster_ca_certificate) 55 | token = data.google_client_config.current.access_token 56 | } 57 | } 58 | 59 | locals { 60 | code_deployment_name = "pipelines" 61 | code_deployment_port = 9090 62 | imagePullSecrets = [{ # tflint-ignore: terraform_naming_convention 63 | name = module.dagster_infra.registry_image_pull_secret 64 | }] 65 | user_deployment_values = { 66 | deployments = [ 67 | { 68 | name = local.code_deployment_name 69 | image = { 70 | repository = "${module.dagster_infra.registry_image_path}/${var.dagster_deployment_image}" 71 | tag = var.dagster_deployment_tag 72 | pullPolicy = "Always" 73 | } 74 | dagsterApiGrpcArgs = ["-f", "path/to/repository.py"] 75 | port = local.code_deployment_port 76 | } 77 | ] 78 | imagePullSecrets = local.imagePullSecrets 79 | } 80 | service_values = { 81 | dagsterWebserver = { 82 | workspace = { 83 | enabled = true 84 | servers = [{ 85 | host = local.code_deployment_name 86 | port = local.code_deployment_port 87 | }] 88 | } 89 | } 90 | dagster-user-deployments = { 91 | enableSubchart = false 92 | } 93 | postgresql = { 94 | # Disables postgresql service pod. 
This is non-persistent in that it will be destroyed when infrastructure 95 | # changes or the pod is restarted. It's intended for ephemeral or test usage. 96 | enabled = false 97 | postgresqlHost = module.dagster_infra.cloudsql_database.host 98 | postgresqlUsername = module.dagster_infra.cloudsql_database.username 99 | postgresqlDatabase = module.dagster_infra.cloudsql_database.name 100 | postgresqlPassword = module.dagster_infra.cloudsql_database.password 101 | } 102 | imagePullSecrets = local.imagePullSecrets 103 | } 104 | } 105 | 106 | # IMPORTANT: 107 | # Before your helm release is deployed, make sure your code deployment image has been pushed to 108 | # your private repository. Nothing else will break, but your pod will be stuck in an imagePullBackoff 109 | # state as it won't be able to find your Docker image. 110 | resource "helm_release" "dagster_user_deployment" { 111 | name = "dagster-code" 112 | version = var.dagster_version 113 | repository = "https://dagster-io.github.io/helm" 114 | chart = "dagster-user-deployments" 115 | # Current values can be found here https://artifacthub.io/packages/helm/dagster/dagster-user-deployments?modal=values 116 | values = [yamlencode(local.user_deployment_values)] 117 | 118 | depends_on = [module.dagster_infra] 119 | } 120 | 121 | resource "helm_release" "dagster_service" { 122 | name = "dagster-service" 123 | version = var.dagster_version 124 | repository = "https://dagster-io.github.io/helm" 125 | chart = "dagster" 126 | # Current values can be found here https://artifacthub.io/packages/helm/dagster/dagster?modal=values 127 | values = [yamlencode(local.service_values)] 128 | 129 | depends_on = [module.dagster_infra] 130 | } 131 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # Changelog 2 | 3 | All notable changes to this project will be documented in this file.
4 | 5 | ## [2.12.0](https://github.com/wandb/terraform-google-dagster/compare/v2.11.1...v2.12.0) (2025-08-18) 6 | 7 | 8 | ### Features 9 | 10 | * Better database config ([#27](https://github.com/wandb/terraform-google-dagster/issues/27)) ([8095f02](https://github.com/wandb/terraform-google-dagster/commit/8095f02aef6e36434c797b996f95f4239b6ee81d)) 11 | 12 | ### [2.11.1](https://github.com/wandb/terraform-google-dagster/compare/v2.11.0...v2.11.1) (2025-02-10) 13 | 14 | 15 | ### Bug Fixes 16 | 17 | * CloudSQL auto-resizing ([#24](https://github.com/wandb/terraform-google-dagster/issues/24)) ([c6be161](https://github.com/wandb/terraform-google-dagster/commit/c6be1610ae4c0697a3009f678248c14cf8485593)) 18 | 19 | ## [2.11.0](https://github.com/wandb/terraform-google-dagster/compare/v2.10.0...v2.11.0) (2025-01-31) 20 | 21 | 22 | ### Features 23 | 24 | * Add cleanup policy on the Artifact Registry ([#23](https://github.com/wandb/terraform-google-dagster/issues/23)) ([dd0ed6d](https://github.com/wandb/terraform-google-dagster/commit/dd0ed6d0df81d000f9a720f5aa1bd091f63ca092)) 25 | 26 | ## [2.10.0](https://github.com/wandb/terraform-google-dagster/compare/v2.9.0...v2.10.0) (2024-08-23) 27 | 28 | 29 | ### Features 30 | 31 | * Collect GKE metrics ([#22](https://github.com/wandb/terraform-google-dagster/issues/22)) ([d23d44d](https://github.com/wandb/terraform-google-dagster/commit/d23d44d9da68e1afbc7b096e9c4e91b39bdb8aca)) 32 | 33 | ## [2.9.0](https://github.com/wandb/terraform-google-dagster/compare/v2.8.0...v2.9.0) (2024-08-08) 34 | 35 | 36 | ### Features 37 | 38 | * Explicitly disable query insights ([#21](https://github.com/wandb/terraform-google-dagster/issues/21)) ([443a9ff](https://github.com/wandb/terraform-google-dagster/commit/443a9ff1365b4aeeec0ae1b4ac21fd82770fbdd7)) 39 | 40 | ## [2.8.0](https://github.com/wandb/terraform-google-dagster/compare/v2.7.0...v2.8.0) (2024-07-12) 41 | 42 | 43 | ### Features 44 | 45 | * Private GKE cluster ([#20](https://github.com/wandb/terraform-google-dagster/issues/20)) ([635ac51](https://github.com/wandb/terraform-google-dagster/commit/635ac512e0dbb9ddd189cc4baf1752a79accd4ee)) 46 | 47 | ## [2.7.0](https://github.com/wandb/terraform-google-dagster/compare/v2.6.0...v2.7.0) (2024-01-29) 48 | 49 | 50 | ### Features 51 | 52 | * Increase CloudSQL active connections ([#19](https://github.com/wandb/terraform-google-dagster/issues/19)) ([a26c1b5](https://github.com/wandb/terraform-google-dagster/commit/a26c1b52abe1762fc340b5d86737bb84d18c1b92)) 53 | 54 | ## [2.6.0](https://github.com/wandb/terraform-google-dagster/compare/v2.5.0...v2.6.0) (2023-10-27) 55 | 56 | 57 | ### Features 58 | 59 | * Add lifecycle policy to storage bucket ([#17](https://github.com/wandb/terraform-google-dagster/issues/17)) ([b1f9fb6](https://github.com/wandb/terraform-google-dagster/commit/b1f9fb66c298411dd8fda3da6ce2fcc15005cf9d)) 60 | 61 | ## [2.5.0](https://github.com/wandb/terraform-google-dagster/compare/v2.4.1...v2.5.0) (2023-10-26) 62 | 63 | 64 | ### Features 65 | 66 | * Public IP address ([#16](https://github.com/wandb/terraform-google-dagster/issues/16)) ([2707f50](https://github.com/wandb/terraform-google-dagster/commit/2707f50a18ec5b5325aa12521c03f2f829150989)) 67 | 68 | ### [2.4.1](https://github.com/wandb/terraform-google-dagster/compare/v2.4.0...v2.4.1) (2023-09-27) 69 | 70 | 71 | ### Bug Fixes 72 | 73 | * Dagit has been renamed Dagster webserver ([#15](https://github.com/wandb/terraform-google-dagster/issues/15)) 
([d56202c](https://github.com/wandb/terraform-google-dagster/commit/d56202cf8863c8a509cb23676dd657f09df4600f)) 74 | 75 | ## [2.4.0](https://github.com/wandb/terraform-google-dagster/compare/v2.3.0...v2.4.0) (2023-09-25) 76 | 77 | 78 | ### Features 79 | 80 | * Enable Google Groups for RBAC ([#14](https://github.com/wandb/terraform-google-dagster/issues/14)) ([2bac576](https://github.com/wandb/terraform-google-dagster/commit/2bac57608f4971dc383448a432eaa369f0409bcf)) 81 | 82 | ## [2.3.0](https://github.com/wandb/terraform-google-dagster/compare/v2.2.0...v2.3.0) (2023-08-18) 83 | 84 | 85 | ### Features 86 | 87 | * Add two registry outputs ([#13](https://github.com/wandb/terraform-google-dagster/issues/13)) ([d01fcd8](https://github.com/wandb/terraform-google-dagster/commit/d01fcd8e034728b70ca8900c8e6fd99966460bf9)) 88 | 89 | ## [2.2.0](https://github.com/wandb/terraform-google-dagster/compare/v2.1.0...v2.2.0) (2023-08-14) 90 | 91 | 92 | ### Features 93 | 94 | * Migrate out of legacy module configuration ([#12](https://github.com/wandb/terraform-google-dagster/issues/12)) ([d4e256f](https://github.com/wandb/terraform-google-dagster/commit/d4e256f62027c5fd4be9ccc70be2373e7379c111)) 95 | 96 | ## [2.1.0](https://github.com/wandb/terraform-google-dagster/compare/v2.0.0...v2.1.0) (2023-01-19) 97 | 98 | 99 | ### Features 100 | 101 | * Enable Cloud SQL IAM authentication ([#11](https://github.com/wandb/terraform-google-dagster/issues/11)) ([8b08ebc](https://github.com/wandb/terraform-google-dagster/commit/8b08ebccf42bea6ab74f12c688c34f2698c80bd4)) 102 | 103 | ## [2.0.0](https://github.com/wandb/terraform-google-dagster/compare/v1.1.0...v2.0.0) (2022-10-19) 104 | 105 | 106 | ### ⚠ BREAKING CHANGES 107 | 108 | * Enable cluster auto-scaling, auto-repairs and auto-upgrades (#10) 109 | 110 | ### Features 111 | 112 | * Enable cluster auto-scaling, auto-repairs and auto-upgrades ([#10](https://github.com/wandb/terraform-google-dagster/issues/10)) ([7b6f93e](https://github.com/wandb/terraform-google-dagster/commit/7b6f93e3ee690cabc1f789d7a1d5352ccccdda1f)) 113 | 114 | ## [1.1.0](https://github.com/wandb/terraform-google-dagster/compare/v1.0.1...v1.1.0) (2022-07-29) 115 | 116 | 117 | ### Features 118 | 119 | * Export cluster ID ([#9](https://github.com/wandb/terraform-google-dagster/issues/9)) ([a995e7a](https://github.com/wandb/terraform-google-dagster/commit/a995e7a6bf6e85e7ebbdf3fcf2c6fea18b1854eb)) 120 | 121 | ### [1.0.1](https://github.com/wandb/terraform-google-dagster/compare/v1.0.0...v1.0.1) (2022-07-12) 122 | 123 | 124 | ### Bug Fixes 125 | 126 | * Explicitly disable client certificate auth (deprecated by kubernetes) ([#6](https://github.com/wandb/terraform-google-dagster/issues/6)) ([8c57966](https://github.com/wandb/terraform-google-dagster/commit/8c579669e9b5963f22a41a09546d626d9b134e7d)) 127 | 128 | ## 1.0.0 (2022-04-04) 129 | 130 | 131 | ### Features 132 | 133 | * Any filetype changed can trigger a release ([#5](https://github.com/wandb/terraform-google-dagster/issues/5)) ([d0462e5](https://github.com/wandb/terraform-google-dagster/commit/d0462e5492516be3e5413a24bb553cb3fc299345)) 134 | * enable workload identity on cluster ([2da02c2](https://github.com/wandb/terraform-google-dagster/commit/2da02c28c0f04438da192f68fe345521176392e2)) 135 | * release v1 ([#4](https://github.com/wandb/terraform-google-dagster/issues/4)) ([e3794fc](https://github.com/wandb/terraform-google-dagster/commit/e3794fc31b836f01922c7be53b9d0998394a56fd)) 136 | 137 | 138 | ### Bug Fixes 139 | 140 | * add suffix to cloud 
storage bucket name ([0729d97](https://github.com/wandb/terraform-google-dagster/commit/0729d97138b5337a5191bce61446f5fcc4b29e02)) 141 | * align service account variable names ([72f6c99](https://github.com/wandb/terraform-google-dagster/commit/72f6c99abb7cdfaa2fc8968d5c7a484b736ee4a9)) 142 | * fix syntax error with random generated string ([42c73d7](https://github.com/wandb/terraform-google-dagster/commit/42c73d7c90af910cf4923e033a7757abd6efc43b)) 143 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # terraform-google-dagster 2 | Terraform module to provision the GCP infrastructure required for a Dagster Kubernetes-based deployment. 3 | 4 | **Note: this configuration does not bundle the Dagster application itself, but rather provisions all of its required services. A reference implementation may be found in the `example-app` directory.** 5 | 6 | ## Overview 7 | The `terraform-google-dagster` module makes no assumptions about how your Dagster deployment should look, as this can vary widely, and it does not create a Dagster deployment itself. It _will_ create all of the core foundational components necessary for running a Dagster cluster, which should plug easily into the [Dagster Helm chart](https://artifacthub.io/packages/helm/dagster/dagster) or your own Dagster Kubernetes resources. 8 | 9 | The module will provision: 10 | - **Service account**: This will manage all of the resources associated with your application 11 | - **Private network**: Private network from which your resources connect to one another (specifically Kubernetes and Postgres) 12 | - **CloudSQL Postgres Instance**: A Cloud SQL Postgres instance 13 | - **Kubernetes Cluster**: Primary cluster from which you can run dagit, the dagster-daemon and user code deployments 14 | - **Cloud Storage Bucket**: Can be used as an IOManager, for log storage, asset materializations, or as a staging layer for data 15 | - **Artifact Registry Docker Repository**: This can be used for private code deployment images 16 | 17 | # Example 18 | 19 | You can find an example deployment utilizing the official [Dagster Helm chart](https://artifacthub.io/packages/helm/dagster/dagster) inside of the `example-app/` directory. 20 | 21 | 22 | ## Requirements 23 | 24 | | Name | Version | 25 | |------|---------| 26 | | [terraform](#requirement\_terraform) | >= 1.5.0, < 2.0.0 | 27 | 28 | ## Providers 29 | 30 | No providers. 31 | 32 | ## Modules 33 | 34 | | Name | Source | Version | 35 | |------|--------|---------| 36 | | [cluster](#module\_cluster) | ./modules/cluster | n/a | 37 | | [database](#module\_database) | ./modules/database | n/a | 38 | | [networking](#module\_networking) | ./modules/networking | n/a | 39 | | [project\_factory\_project\_services](#module\_project\_factory\_project\_services) | terraform-google-modules/project-factory/google//modules/project_services | ~> 18.0 | 40 | | [registry](#module\_registry) | ./modules/registry | n/a | 41 | | [service\_account](#module\_service\_account) | ./modules/service_account | n/a | 42 | | [storage](#module\_storage) | ./modules/storage | n/a | 43 | 44 | ## Resources 45 | 46 | No resources. 47 | 48 | ## Inputs 49 | 50 | | Name | Description | Type | Default | Required | 51 | |------|-------------|------|---------|:--------:| 52 | | [cloud\_storage\_bucket\_location](#input\_cloud\_storage\_bucket\_location) | Location to create cloud storage bucket in.
| `string` | `"US"` | no | 53 | | [cloudsql\_availability\_type](#input\_cloudsql\_availability\_type) | The availability type of the Cloud SQL instance. | `string` | `"ZONAL"` | no | 54 | | [cloudsql\_edition](#input\_cloudsql\_edition) | The edition of the CloudSQL instance. | `string` | `"ENTERPRISE"` | no | 55 | | [cloudsql\_postgres\_version](#input\_cloudsql\_postgres\_version) | The postgres version of the CloudSQL instance. | `string` | `"POSTGRES_14"` | no | 56 | | [cloudsql\_query\_insights\_enabled](#input\_cloudsql\_query\_insights\_enabled) | Whether to enable query insights. | `bool` | `false` | no | 57 | | [cloudsql\_tier](#input\_cloudsql\_tier) | The machine type to use | `string` | `"db-f1-micro"` | no | 58 | | [cluster\_compute\_machine\_type](#input\_cluster\_compute\_machine\_type) | Compute machine type to deploy cluster nodes on. | `string` | `"e2-standard-2"` | no | 59 | | [cluster\_monitoring\_components](#input\_cluster\_monitoring\_components) | Components to enable in the GKE monitoring stack. | `list(string)` |
[
"SYSTEM_COMPONENTS"
]
| no | 60 | | [cluster\_node\_pool\_max\_node\_count](#input\_cluster\_node\_pool\_max\_node\_count) | Max number of nodes cluster can scale up to. | `number` | `2` | no | 61 | | [deletion\_protection](#input\_deletion\_protection) | Indicates whether or not storage and databases have deletion protection enabled | `bool` | `true` | no | 62 | | [domain](#input\_domain) | The domain in which your Google Groups are defined. | `string` | n/a | yes | 63 | | [namespace](#input\_namespace) | Namespace used as a prefix for all resources | `string` | n/a | yes | 64 | | [project\_id](#input\_project\_id) | Project ID | `string` | n/a | yes | 65 | | [region](#input\_region) | Google region | `string` | n/a | yes | 66 | 67 | ## Outputs 68 | 69 | | Name | Description | 70 | |------|-------------| 71 | | [cloudsql\_database](#output\_cloudsql\_database) | Object containing connection parameters for provisioned CloudSQL database | 72 | | [cluster\_ca\_certificate](#output\_cluster\_ca\_certificate) | Cluster certificate of provisioned Kubernetes cluster | 73 | | [cluster\_endpoint](#output\_cluster\_endpoint) | Endpoint of provisioned Kubernetes cluster | 74 | | [cluster\_id](#output\_cluster\_id) | Id of provisioned Kubernetes cluster | 75 | | [network\_name](#output\_network\_name) | Name of provisioned VPC network | 76 | | [registry\_image\_path](#output\_registry\_image\_path) | Docker image path of provisioned Artifact Registry | 77 | | [registry\_image\_pull\_secret](#output\_registry\_image\_pull\_secret) | Name of Kubernetes secret containing Docker config with permissions to pull from private Artifact Registry repository | 78 | | [registry\_location](#output\_registry\_location) | Location of provisioned Artifact Registry | 79 | | [registry\_name](#output\_registry\_name) | Name of provisioned Artifact Registry | 80 | | [service\_account](#output\_service\_account) | Service account created to manage and authenticate services. | 81 | | [storage\_bucket\_name](#output\_storage\_bucket\_name) | Name of provisioned Cloud Storage bucket | 82 | 83 | 84 | ## Development 85 | 86 | If you'd like to contribute to this repository, there are a few dependencies you'll need to install before committing. We use `pre-commit` to ensure standards are adhered to by running Terraform validations via git hooks. We specifically use the following packages: 87 | 88 | - `conventional-pre-commit`: No additional dependencies needed for this 89 | - `terraform_validate`: No additional dependencies needed for this 90 | - `terraform_fmt`: No additional dependencies needed for this 91 | - `terraform_docs`: Installation instructions [here](https://github.com/terraform-docs/terraform-docs) 92 | - `terraform_tflint`: Installation instructions [here](https://github.com/terraform-linters/tflint) 93 | 94 | You'll also need to install [`pre-commit`](https://pre-commit.com/#installation). 95 | 96 | Once you have these dependencies installed, you can execute the following: 97 | 98 | ``` 99 | pre-commit install 100 | pre-commit install --hook-type commit-msg # installs the hook for commit messages to enforce conventional commits 101 | pre-commit run -a # this will run pre-commit across all files in the project to validate installation 102 | ``` 103 | 104 | Now, when you create git commits, these hooks will run and ensure your 105 | changes adhere to the project standards.
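The `commit-msg` hook enforces [Conventional Commits](https://www.conventionalcommits.org/), so commit messages need a type prefix such as `feat:` or `fix:`; these prefixes also drive the versioning reflected in the changelog. The messages below are purely hypothetical examples of the expected shape:

```
feat: add support for configuring an additional node pool
fix: correct the default region documentation
```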
In general, we've followed the 106 | guidelines for best practices laid out in [Terraform Best 107 | Practices](https://www.terraform-best-practices.com/), and we recommend 108 | following these guidelines when submitting contributions of your own. 109 |
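## Usage sketch

For reference, here is a minimal sketch of how this module might be consumed from your own configuration. The `source` string, version ref, and all values are illustrative placeholders; only the input and output names are taken from the tables above.

```hcl
module "dagster" {
  # Illustrative source reference; pin to a released tag of this repository.
  source = "github.com/wandb/terraform-google-dagster?ref=v2.4.0"

  # Required inputs (see the Inputs table above); values are placeholders.
  project_id = "my-gcp-project" # hypothetical GCP project ID
  namespace  = "dagster"        # prefix applied to every created resource
  region     = "us-central1"    # Google region to deploy into
  domain     = "example.com"    # domain in which your Google Groups are defined

  # Optional overrides; defaults are listed in the Inputs table.
  cloudsql_tier                    = "db-custom-2-4096"
  cluster_node_pool_max_node_count = 3
}

# Outputs such as the cluster endpoint, CA certificate, database connection
# parameters, and registry image path can then be fed into the Dagster Helm
# release, as the example-app/ reference implementation does.
output "dagster_cluster_endpoint" {
  value = module.dagster.cluster_endpoint
}
```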
--------------------------------------------------------------------------------