├── .gitignore
├── LICENSE
├── README.md
├── deployment
│   ├── README.md
│   ├── deploy.sh
│   ├── service-create-nginx.json
│   ├── service-update-nginx.json
│   ├── td-nginx.json
│   └── td-nginx.template
├── ecs.tf
├── ecs.tfvars
├── ecs_fake_private
├── img
│   ├── alb.png
│   ├── deployment.png
│   ├── ecs-infra.png
│   └── ecs-terraform-modules.png
└── modules
    ├── alb
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    ├── ecs
    │   ├── alb.tf
    │   ├── instance_policy.tf
    │   ├── loadbalancer_policy.tf
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    ├── ecs_events
    │   ├── main.tf
    │   └── variables.tf
    ├── ecs_instances
    │   ├── cloudwatch.tf
    │   ├── main.tf
    │   ├── outputs.tf
    │   ├── templates
    │   │   └── user_data.sh
    │   └── variables.tf
    ├── ecs_roles
    │   ├── aws_caller_identity.json
    │   ├── ecs_default_task.json
    │   ├── main.tf
    │   └── variables.tf
    ├── nat_gateway
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    ├── network
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    ├── subnet
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    ├── users
    │   ├── ecs_deployer.json
    │   ├── main.tf
    │   └── outputs.tf
    └── vpc
        ├── main.tf
        ├── outputs.tf
        └── variables.tf
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
aws.tf
.terraform
terraform.tfstate*
.idea/
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2017 Armin Coralic

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Terraform-AWS-ECS

Terraform modules for creating a production-ready ECS cluster in AWS.

Features:
* High availability (Multi-AZ)
* Load-balanced (ALB)
* Isolated in a VPC
* Private -> Public access (NAT'd)
* Auto-scaling

## Usage:

* Specify the AWS region to create resources in, in **ecs.tfvars**, using the `aws_region` variable.
* Specify the AMI to build your ECS instances from, in **ecs.tfvars**, using the `aws_ecs_ami` variable.
  * Leave it empty to use the latest Amazon Linux 2 ECS-optimized AMI by Amazon.
  * Find the latest recommended Amazon Linux 2 ECS-optimized AMI for the current aws-cli region (a Terraform alternative is sketched right after this list):

    ```
    aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/recommended
    ```

  * Manually find the latest recommended ECS-optimized AMI for any region or OS here: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html
* Specify the aws-cli profile for the account to create resources in, in **ecs.tfvars**, using `aws_profile`.
  * The default location to view your aws-cli profiles is `$HOME/.aws/credentials` on Linux and macOS and `%USERPROFILE%\.aws\credentials` on Windows.
  * There are a number of other options for authenticating with the AWS provider; they can be found here: https://registry.terraform.io/providers/hashicorp/aws/latest/docs. To implement another strategy, replace the `profile` property of the aws provider as appropriate.
* Learn more about the repository, configure the infrastructure to your needs, or create the infrastructure as is, with empty ECS instances.
  * Learn more: [Directory](#directory)
  * Configure: [ECS configuration](#ecs-configuration)
  * Create: [How to create the infrastructure](#create-it)
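If you prefer to resolve the AMI inside Terraform rather than by hand, the same SSM parameter can be read with a data source. This is a sketch of an alternative, not something this repository wires in for you; the data source name is illustrative:

```hcl
# Sketch: look up the latest recommended Amazon Linux 2 ECS-optimized AMI.
# "ecs_ami" is an illustrative name, not used elsewhere in this repository.
data "aws_ssm_parameter" "ecs_ami" {
  name = "/aws/service/ecs/optimized-ami/amazon-linux-2/recommended/image_id"
}

# You could then pass data.aws_ssm_parameter.ecs_ami.value to the ecs module
# in place of var.aws_ecs_ami.
```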
## Directory

* [What is ECS?](#what-is-ecs)
* [ECS infrastructure in AWS](#ecs-infra)
* [ECS Terraform module](#terraform-module)
* [How to create the infrastructure](#create-it)
* [ECS Deployment](deployment/README.md)
* [Things you should know](#must-know)
  * [SSH access to the instances](#ssh-access-to-the-instances)
  * [ECS configuration](#ecs-configuration)
  * [Logging](#logging)
  * [ECS instances](#ecs-instances)
  * [LoadBalancer](#loadbalancer)
  * [Using 'default'](#using-default)
  * [ECS deployment strategies](#ecs-deployment-strategies)
  * [System containers & custom boot commands](#system-containers-and-custom-boot-commands)
  * [EC2 node security and updates](#ec2-node-security-and-updates)
  * [Service discovery](#service-discovery)
  * [ECS detect deployments failure](#ecs-detect-deployments-failure)

## What is ECS

ECS stands for Elastic Container Service and is the AWS platform for running Docker containers. The full documentation about ECS can be found [here](https://aws.amazon.com/ecs/), and the developer guide [here](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/Welcome.html). A more fun read can be found at [The Hitchhiker's Guide to AWS ECS and Docker](http://start.jcolemorrison.com/the-hitchhikers-guide-to-aws-ecs-and-docker/).

To understand ECS it helps to state the obvious differences from competitors like [Kubernetes](https://kubernetes.io/) or [DC/OS Mesos](https://docs.mesosphere.com/). The major differences are that ECS cannot be run on-premises and that it lacks some advanced features. These differences can be seen either as weaknesses or as strengths.

### AWS specific

You cannot run ECS on-premises because it is an AWS service, not installable software. This makes it easier to set up and maintain than hosting your own Kubernetes or Mesos on-premises or in the cloud. Although it is a service, it is not the same as [Google hosted Kubernetes](https://cloud.google.com/container-engine/). Why? Google offers Kubernetes as true SaaS: you don't manage any infrastructure, while ECS requires a cluster and therefore infrastructure.
The difference between running your own Kubernetes or Mesos and running ECS is that with ECS you do not maintain the master nodes. You are only responsible for allowing the EC2 nodes to connect to ECS; ECS does the rest. This makes the instances in an ECS cluster replaceable and keeps maintenance low when you use the standard AWS ECS-optimized OS and other building blocks like auto-scaling.

### Advanced features

Although it misses some advanced features, ECS plays well with other AWS services to provide simple but powerful deployments. This makes the learning curve less steep for DevOps teams running their own infrastructure. You could argue that if you are trying to do complex things in ECS, you are either making it unnecessarily complex or ECS does not fit your needs.

Having said that, ECS can be used more like Kubernetes or Mesos through [Blox](https://blox.github.io/). Blox is essentially a set of tools that gives you more control over the cluster and even more advanced deployment strategies.

## ECS infra

As stated above, ECS needs EC2 instances to run Docker containers on, and those instances need surrounding infrastructure. Here is a production-ready ECS infrastructure diagram.

![ECS infra](img/ecs-infra.png)

What we are creating:

* A VPC with a /16 IP address range and an internet gateway
* We choose a region and a number of availability zones we want to use; for high availability we need at least two
* In every availability zone we create a private and a public subnet with a /24 IP address range
  * Public subnet convention is 10.x.0.x, 10.x.1.x, etc.
  * Private subnet convention is 10.x.50.x, 10.x.51.x, etc.
* In the public subnets we place a NAT gateway and the LoadBalancer
* The private subnets are used by the auto-scaling group, which places instances in them
* We create an ECS cluster that the instances connect to

## Terraform module

We use Terraform to create the infrastructure described above. To allow everyone to use the infrastructure code, this repository contains it as Terraform modules so it can easily be reused by others.

Creating one big module would defeat the purpose of modules; therefore the ECS module itself consists of different modules. This makes it easier for others to make changes, swap modules, or use pieces from this repository even when not setting up ECS.

Details about how a module works, or why it is set up a certain way, are described in the module itself where needed.

The modules need to be composed to create infrastructure. For an example of how to use them to create a working ECS cluster, see **ecs.tf** and **ecs.tfvars**.

**Note:** The configuration uses Terraform 0.12 syntax, so you need Terraform CLI version 0.12 or above.

### Conventions

These are the conventions we have in every module:

* Contains a main.tf where all the Terraform code lives
* If main.tf is too big, we create more *.tf files with proper names
* [Optional] Contains an outputs.tf with the output parameters
* [Optional] Contains a variables.tf, which sets the required attributes
* For grouping in AWS we set the tag "Environment" wherever possible

### Module structure

![Terraform module structure](img/ecs-terraform-modules.png)

## Create it

*You'll need to install the Terraform CLI if you haven't already done so. The instructions can be found here: https://learn.hashicorp.com/tutorials/terraform/install-cli.*
Make sure you've updated **ecs.tfvars** with your AWS profile and region before creating.

```bash
terraform get
terraform plan -input=false -var-file=ecs.tfvars
terraform apply -input=false -var-file=ecs.tfvars
```

### `terraform get`

*The `terraform get` command is used to download and update modules mentioned in the root module (https://www.terraform.io/docs/commands/get.html).*

**Note:** When installing a remote module, Terraform will download it into the .terraform directory in your configuration's root directory. When installing a local module, Terraform will instead refer directly to the source directory. Because of this, Terraform will automatically notice changes to local modules without having to re-run `terraform init` or `terraform get`.

### `terraform plan -input=false -var-file=ecs.tfvars`

*The `terraform plan` command is used to create an execution plan. This command is a convenient way to check whether the execution plan for a set of changes matches your expectations without making any changes to real resources or to the state. For example, `terraform plan` might be run before committing a change to version control, to create confidence that it will behave as expected (https://www.terraform.io/docs/commands/plan.html).*

`-input=false` specifies that we don't want to be prompted for variables that are not directly set.

`-var-file=ecs.tfvars` supplies the values for the variables in our Terraform configuration from the **ecs.tfvars** file.

### `terraform apply -input=false -var-file=ecs.tfvars`

*The `terraform apply` command is used to apply the changes required to reach the desired state of the configuration (https://www.terraform.io/docs/commands/apply.html).*

`-input=false` specifies that we don't want to be prompted for variables that are not directly set.

`-var-file=ecs.tfvars` supplies the values for the variables in our Terraform configuration from the **ecs.tfvars** file.

## Must know

### SSH access to the instances

You should not put your ECS instances directly on the internet, and you should not allow SSH access to the instances directly; use a bastion server for that. Having SSH access to the acceptance environment is fine, but you should not allow SSH access to production instances: you don't want anyone making manual changes in the production environment.

This ECS module lets you use an AWS SSH key to access the instances. For quick-start purposes, ecs.tf creates a new AWS SSH key; the private key can be found in the root of this repository under the name 'ecs_fake_private'.

For a new method see issue [#1](https://github.com/arminc/terraform-ecs/issues/1).

### ECS configuration

ECS is configured through the */etc/ecs/ecs.config* file, as described [here](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-config.html). There are two important settings in this file. One is the ECS cluster name, so the instance can connect to the cluster; this should be set from Terraform because you want it to be variable. The other is Docker Hub credentials, to be able to access private repositories. To handle those safely, use an S3 bucket that contains the Docker Hub configuration. See the *ecs_config* variable in the *ecs_instances* module for an example.
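As an illustration, the two pieces could look like this. The bucket name, cluster name, and auth data below are placeholders, not values used by this repository:

```bash
# Sketch: the kind of command you would pass in via the ecs_config variable
# (the bucket is hypothetical and must contain a prepared ecs.config).
aws s3 cp s3://my-ecs-config-bucket/ecs.config /etc/ecs/ecs.config
```

And the fetched file itself could contain something like:

```
# Example /etc/ecs/ecs.config contents (all values are placeholders)
ECS_CLUSTER=my-cluster
ECS_ENGINE_AUTH_TYPE=dockercfg
ECS_ENGINE_AUTH_DATA={"https://index.docker.io/v1/":{"auth":"BASE64_AUTH_TOKEN","email":"you@example.com"}}
ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","awslogs"]
```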
### Logging

All the default system logs, such as those of Docker or the ECS agent, go to CloudWatch as configured in this repository. The ECS container logs can be pushed to CloudWatch as well, but it is better to push them to a service like [ElasticSearch](https://www.elastic.co/cloud). CloudWatch does support search and alerts, but with ElasticSearch or other log services you get more advanced search and grouping. See issue [#5](https://github.com/arminc/terraform-ecs/issues/5).

The [ECS configuration](#ecs-configuration) described above also allows additional [Docker log drivers](https://docs.docker.com/engine/admin/logging/overview/) to be configured, for example fluentd, as shown in the *ecs_logging* variable in the *ecs_instances* module.

Be aware of CloudWatch log group collisions when creating two clusters in one AWS account; [read the info](modules/ecs_instances/cloudwatch.tf).

### ECS instances

Normally there is only one group of instances, as configured in this repository. But it is possible to use the *ecs_instances* module to add more groups of different instance types that can be used for different deployments. This makes it possible to have multiple instance types with different scaling options.

### LoadBalancer

Both the Application LoadBalancer and the Classic LoadBalancer can be used with this setup. The default configuration is the Application LoadBalancer because that makes more sense in combination with ECS. There is also the concept of an [internal versus external facing LoadBalancer](deployment/README.md#internal-vs-external).

### Using default

The philosophy is that the modules should provide as many sane defaults as possible. That way the modules can be configured quickly, yet still changed when needed. That is also why the name 'default' is used as the default value for some of the components. Another reason is that you don't need to come up with names when you probably have only one cluster in your environment.

Looking at [ecs.tf](ecs.tf) might give you a different impression, but there we configure more things than needed, to show that it can be done.

### ECS deployment strategies

ECS has a lot of different ways to deploy or place a task in the cluster. You can have different placement strategies, like random and binpack; see the full [documentation](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-placement-strategies.html). Besides the placement strategies, it is also possible to specify constraints, as described [here](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-placement-constraints.html). The constraints allow for more fine-grained placement of tasks on specific EC2 nodes, for example by *instance type* or custom attributes.
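For illustration, a placement strategy is expressed in the Service definition. A sketch (this repository's service definitions rely on the scheduler defaults instead):

```json
"placementStrategy": [
  { "type": "spread", "field": "attribute:ecs.availability-zone" },
  { "type": "binpack", "field": "memory" }
]
```

This spreads tasks across availability zones first and then packs them onto the instances with the least available memory.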
What ECS does not have is a way to run a task on every EC2 node on boot; that is where [System containers and custom boot commands](#system-containers-and-custom-boot-commands) come into play.

### System containers and custom boot commands

In some cases it is necessary to have a system 'service' running that does a particular task, like gathering metrics. It is possible to add an OS-specific service when booting an EC2 node, but that means you are not portable. A better option is to have the 'service' run in a container and to run the container as a 'service', also called a system container.

ECS has different [deployment strategies](#ecs-deployment-strategies), but it does not have an option to run a system container on every EC2 node on boot. It is possible to do this via an ECS workaround or via Docker.

#### ECS workaround

The ECS workaround is described in [Running an Amazon ECS Task on Every Instance](https://aws.amazon.com/blogs/compute/running-an-amazon-ecs-task-on-every-instance/). It boils down to using a Task definition and a custom boot script to start and register the task in ECS. This is great because it allows you to see the system container running in the ECS console. The downside is that the container is not restarted when it crashes. It is possible to create a Lambda that listens for state changes/exits of the system container and acts on them, for example by starting it again on the same EC2 node. See issue [#2](https://github.com/arminc/terraform-ecs/issues/2).

#### Docker

It is also possible to do the same thing by simply running a `docker run` command on the EC2 node on boot. To make sure the container keeps running, we tell Docker to restart the container on exit. The great thing about this method is that it is simple, and errors caught in CloudWatch can be used to alert when something bad happens.

**Note:** Both of these methods have one big flaw: to apply changes, you need to change the launch configuration and restart every EC2 node one by one. Most of the time this is not a problem because the system containers don't change that often, but it is still an issue. It is possible to fix this in a better way with [Blox](https://blox.github.io/), but that also introduces more complexity. So it is a choice between simplicity with an explicit update flow, or advanced usage with more complexity.

Regardless of which method you pick, you will need to run a custom command on the EC2 node on boot. This is already available in the *ecs_instances* module through the *custom_userdata* variable. An example for Docker would look like this:

```bash
docker run \
  --name=cadvisor \
  --detach=true \
  --publish=9200:8080 \
  --publish=8080:8080 \
  --memory="300m" \
  --privileged=true \
  --restart=always \
  --volume=/:/rootfs:ro \
  --volume=/cgroup:/cgroup:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker:/var/lib/docker:ro \
  --log-driver=awslogs \
  --log-opt=awslogs-region=eu-west-1 \
  --log-opt=awslogs-group=cadvisor \
  --log-opt=awslogs-stream=${cluster_name}/$container_instance_id \
  google/cadvisor:v0.24.1
```

### EC2 node security and updates

Because we create the EC2 nodes ourselves, we need to make sure they are up to date and secure. It is possible to create your own AMI with your own OS, Docker, ECS agent, and everything else. But it is much easier to use the [ECS optimized AMIs](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html), which are maintained by AWS and come with a hardened Amazon Linux, regular security patches, recommended versions of the ECS agent, Docker, and more.

To know when to update your EC2 nodes, you can subscribe to AWS ECS AMI updates, as described [here](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS-AMI-SubscribeTopic.html). Note: we cannot create a sample module for this because Terraform does not support the email protocol on SNS, but a manual CLI subscription is sketched below.
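A subscription is a one-liner with the AWS CLI. The topic ARN below is the one given in the linked documentation at the time of writing; verify it there before use:

```bash
# Sketch: subscribe an email address to the ECS-optimized AMI update topic.
aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:177427601217:ecs-optimized-amazon-ami-update \
  --protocol email \
  --notification-endpoint you@example.com
```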
If you need to perform an update, you will need to update the information in the *ecs_instances* module and then apply the changes to the cluster. This only creates a new *launch_configuration*; it does not touch the running instances. Therefore you need to replace your instances one by one. There are three ways to do this:

1. Terminate the instances. This may cause disruption to your application users; by terminating an instance, a new one is started with the new *launch_configuration*.
2. Double the size of your cluster and your applications, and scale back down once everything is up and running. This might be a costly operation, and you also need to mark or protect the new instances so that AWS auto scaling does not terminate the new instances instead of the old ones.
3. The best option is to drain the containers from an ECS instance, as described [here](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-instance-draining.html). Then you can terminate the instance without disrupting your application users. This can be done by doubling the EC2 instances in your cluster, or by adding just one and replacing the instances slowly one by one. Currently there is no automated/scripted way to do this; see issue [#3](https://github.com/arminc/terraform-ecs/issues/3). The manual draining step is sketched below.
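The draining step itself is a single CLI call per instance. A sketch, with placeholder cluster name and container instance ID; terminate the instance only after its running task count reaches zero:

```bash
# Sketch: put a container instance into DRAINING so ECS reschedules its tasks.
aws ecs update-container-instances-state \
  --cluster default \
  --container-instances <container-instance-id> \
  --status DRAINING
```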
### Service discovery

ECS allows the use of an [ALB or ELB](deployment/README.md#alb-vs-elb) facing [internally or externally](deployment/README.md#internal-vs-external), which makes for simple but very effective service discovery. If you feel the need to use external tools like Consul, ask yourself the question: am I not making it too complex?

Kubernetes and Mesos act like one big cluster and encourage you to deploy all kinds of things onto it. ECS can do the same, but it makes sense to group your applications into domains or logical groups and to create separate ECS clusters for them. This is easy to do because you are not paying for the master nodes. You can stay in the same AWS account and the same VPC but use a separate cluster with separate instances.

### ECS detect deployments failure

When deploying manually, we can see whether the new container has started or is stuck in a start/stop loop. When deploying automatically this is not visible. To make sure we get alerted when containers start failing, we need to watch for events from ECS that state a container has STOPPED. This can be done with the module [ecs_events](modules/ecs_events/main.tf). The only thing missing from the module is the actual alert; that is because Terraform can't handle the email protocol, and all other protocols for *aws_sns_topic_subscription* are customer-specific.
--------------------------------------------------------------------------------
/deployment/README.md:
--------------------------------------------------------------------------------
# ECS Deployment

* [What is needed for a deployment](#deployment)
* [Initial deployment](#initial-deployment)
* [How to deploy a new version?](#new-version-deployment)
* [Expose the service to the outside world](#alb-vs-elb)
* [Things you should know](#must-know)
  * [Task Definition global](#task-definition-global)
  * [Internal vs External](#internal-vs-external)
  * [Automated deployments](#automated-deployments)
  * [Container secrets](#container-secrets)

## Deployment

The deployment process on ECS consists of two steps. The first step is registering a Task definition, which holds the information about the container you want to start and its requirements, for instance memory, CPU, or ports. The second step is creating or updating a Service definition, which defines a Service that uses the Task definition to start the containers on ECS and keeps them running.

![Deployment](../img/deployment.png)

The Task definition documentation can be found [here](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definitions.html) and the Service definition documentation can be found [here](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/scheduling_tasks.html).

### Deployment user

Most deployments will be automated, which means we need an ECS deployment user. See the [users module](../modules/users/main.tf) for how to create a user with the proper rights. Be careful with *iam:PassRole*, as described in the module: it can be misused to give an ECS task admin rights.

## Initial deployment

[Automated process](#automated-deployments)

### Register a Task definition

To deploy an application to ECS for the first time we need to register a Task definition; you can see an example [here](td-nginx.json):

```json
{
  "family": "nginx",
  "containerDefinitions": [
    {
      "name": "nginx",
      "image": "nginx:alpine",
      "memory": 128,
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ]
    }
  ]
}
```

We are defining an Nginx Docker container with 128 MB of memory, and we are specifying that the container listens on port 80. You can think of the Task definition as a predefinition of the `docker run` command, without actually executing the run. For all possible Task definition parameters, have a look at the [documentation](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html).
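In practice the same container definition usually grows environment variables and a log configuration. A sketch of what that could look like; the variable, log group, and region are illustrative, not part of this repository's templates, and the log group has to exist before the task starts:

```json
{
  "name": "nginx",
  "image": "nginx:alpine",
  "memory": 128,
  "environment": [
    { "name": "APP_ENV", "value": "acc" }
  ],
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "nginx",
      "awslogs-region": "eu-west-1",
      "awslogs-stream-prefix": "nginx"
    }
  },
  "portMappings": [
    { "containerPort": 80, "protocol": "tcp" }
  ]
}
```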
This is the AWS cli command to create the Task definition:

```bash
aws ecs register-task-definition --cli-input-json file://td-nginx.json
```

### Create a Service definition

To actually run the container we need a Service; to create one we need a Service definition like the one [here](service-create-nginx.json):

```json
{
  "cluster": "test",
  "serviceName": "nginx",
  "taskDefinition": "@@TASKDEFINITION_ARN@@",
  "loadBalancers": [
    {
      "targetGroupArn": "@@TARGET_GROUP_ARN@@",
      "containerName": "nginx",
      "containerPort": 80
    }
  ],
  "desiredCount": 1,
  "role": "/ecs/test_ecs_lb_role",
  "deploymentConfiguration": {
    "maximumPercent": 100,
    "minimumHealthyPercent": 0
  }
}
```

While the Task definition is not aware of the environment, the Service definition definitely is. That is why we specify the *cluster* name and the *role*. For the service to know which Task definition to run, we need to specify the *taskDefinition* ARN. This can be found in the AWS console under Task definitions, and it is also returned when [registering a Task definition](#register-a-task-definition).

We also need to provide a *targetGroupArn*, which is used to [expose the service to the outside world](#alb-vs-elb).

For all possible Service definition parameters, have a look at the [documentation](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/service_definition_parameters.html).

This is the AWS cli command to create the Service definition:

```bash
aws ecs create-service --cli-input-json file://service-create-nginx.json
```

## New version deployment

[Automated process](#automated-deployments)

When we want to deploy a new version of a container, we need to update the Task definition and register a new revision. This is done exactly as described in [Register a Task definition](#register-a-task-definition).

### Update a Service definition

Because we already have a service, we cannot create a new one; we need to update it. That means telling the service to move our Task definition from revision X to revision Y. We therefore only need to provide a small set of information, as can be seen [here](service-update-nginx.json):

```json
{
  "cluster": "test",
  "service": "nginx",
  "taskDefinition": "@@TASKDEFINITION_ARN@@",
  "desiredCount": 1,
  "deploymentConfiguration": {
    "maximumPercent": 100,
    "minimumHealthyPercent": 0
  }
}
```

This is the AWS cli command to update the Service definition:

```bash
aws ecs update-service --cli-input-json file://service-update-nginx.json
```

## ALB vs ELB

The goal is not just to deploy an application but to make it accessible to the outside world or to internal services. This can be done with the ALB (Application LoadBalancer) or the ELB (Elastic LoadBalancer). The difference is that the ELB has no knowledge of containers; it just looks at the health of the EC2 node and exposes a predefined port of that node.

The ALB is 'container aware', in the sense that the containers are registered with the ALB and the ALB exposes the containers to the outside world instead of the EC2 node. This also means that you can have multiple containers of the same type on one EC2 node.
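This works through dynamic host port mapping: when the Task definition sets `hostPort` to 0 (or omits it), Docker picks an ephemeral host port per container and ECS registers that port in the target group. The relevant Task definition fragment looks like this:

```json
"portMappings": [
  { "containerPort": 80, "hostPort": 0, "protocol": "tcp" }
]
```

The ephemeral ports fall in the 32768-61000 range, which is exactly the range the *alb_to_ecs* security group rule in *modules/ecs/alb.tf* opens from the ALB to the ECS instances.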
For a full overview, have a look at the [documentation](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-load-balancing.html).

This deployment example uses the ALB because that makes sense in a Docker platform world. It is good to know that the ALB consists of a listener and a target group, as seen in the illustration below. The full documentation can be found [here](http://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html).

![Deployment](../img/alb.png)

The listener is the actual port that is exposed to the outside world. To route traffic to something, the listener matches a *context path* like */api* to a target group. The target group lets containers or other resources register themselves so that they receive the traffic. The target group also checks the health of the containers and decides whether they are healthy.

## Must know

### Task Definition global

The Task definition name is global on AWS. This means that when you create a Task definition with the name *test*, you cannot truly remove it: even after you get rid of it in the UI, the next Task definition you create with the name *test* will get a revision number that is one higher than the previous version.

### Internal vs External

AWS has the concept of an external or internal facing LB (LoadBalancer), as can be read [here](http://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-internal-load-balancers.html). External facing means available on the internet, which is probably the most used option. An internal LB is not available on the internet, only inside your VPC.

The internal facing LB can be very interesting for connecting microservices without using any complicated service discovery.

### Automated deployments

Although it is possible to deploy manually through the AWS console or by executing the above commands yourself, it makes more sense to automate the process. You can use one of the many available deployment scripts, or apply KISS and do something like this (deploy.sh can be found [here](deploy.sh)):

To do an initial deployment:

```bash
CONTAINER_VERSION=nginx:alpine ./deploy.sh create
```

To do an update deployment:

```bash
CONTAINER_VERSION=nginx:alpine ./deploy.sh update
```

### Container secrets

Almost all containers require some form of external values or secrets, like a database password or keys to another service. There are many ways to provide these; the simplest when using ECS is the AWS Parameter Store. Here is a [blog post](http://blog.coralic.nl/2017/03/22/docker-container-secrets-on-aws-ecs/) I wrote that describes the different options and how to use the AWS Parameter Store.

To allow a task to access the Parameter Store, you need a role that you can assign to your task. The [ecs roles module](../modules/ecs_roles/main.tf) can create such a role.
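Once the task runs with such a role, the container can read its secrets at startup. A sketch with the AWS CLI; the parameter name is hypothetical and must fall under the *prefix* the role was created with:

```bash
# Sketch: read a SecureString parameter at container startup.
# /myapp/db_password must have been stored beforehand (aws ssm put-parameter).
DB_PASSWORD=$(aws ssm get-parameters \
  --names /myapp/db_password \
  --with-decryption \
  --query 'Parameters[0].Value' \
  --output text)
```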
--------------------------------------------------------------------------------
/deployment/deploy.sh:
--------------------------------------------------------------------------------
#!/bin/sh -e

# Usage: CONTAINER_VERSION=docker_container_version ./deploy.sh [create|update]

# Register the task definition: render td-nginx.template, substituting the
# requested container version for @VERSION@ (quoted, since the template's
# image field is unquoted).
sed "s,@VERSION@,\"$CONTAINER_VERSION\"," td-nginx.template > TASKDEF.json
aws ecs register-task-definition --cli-input-json file://TASKDEF.json > REGISTERED_TASKDEF.json
TASKDEFINITION_ARN=$( < REGISTERED_TASKDEF.json jq -r .taskDefinition.taskDefinitionArn )

# Create or update the service ($1 is "create" or "update"): inject the new
# task definition ARN into the matching service definition template.
sed "s,@@TASKDEFINITION_ARN@@,$TASKDEFINITION_ARN," "service-$1-nginx.json" > SERVICEDEF.json
aws ecs "$1-service" --cli-input-json file://SERVICEDEF.json | tee SERVICE.json
--------------------------------------------------------------------------------
/deployment/service-create-nginx.json:
--------------------------------------------------------------------------------
{
  "cluster": "test",
  "serviceName": "nginx",
  "taskDefinition": "@@TASKDEFINITION_ARN@@",
  "loadBalancers": [
    {
      "targetGroupArn": "@@TARGET_GROUP_ARN@@",
      "containerName": "nginx",
      "containerPort": 80
    }
  ],
  "desiredCount": 1,
  "role": "/ecs/test_ecs_lb_role",
  "deploymentConfiguration": {
    "maximumPercent": 100,
    "minimumHealthyPercent": 0
  }
}
--------------------------------------------------------------------------------
/deployment/service-update-nginx.json:
--------------------------------------------------------------------------------
{
  "cluster": "test",
  "service": "nginx",
  "taskDefinition": "@@TASKDEFINITION_ARN@@",
  "desiredCount": 1,
  "deploymentConfiguration": {
    "maximumPercent": 100,
    "minimumHealthyPercent": 0
  }
}
--------------------------------------------------------------------------------
/deployment/td-nginx.json:
--------------------------------------------------------------------------------
{
  "family": "nginx",
  "containerDefinitions": [
    {
      "name": "nginx",
      "image": "nginx:alpine",
      "memory": 128,
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ]
    }
  ]
}
--------------------------------------------------------------------------------
/deployment/td-nginx.template:
--------------------------------------------------------------------------------
{
  "family": "nginx",
  "containerDefinitions": [
    {
      "name": "nginx",
      "image": @VERSION@,
      "memory": 128,
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ]
    }
  ]
}
--------------------------------------------------------------------------------
/ecs.tf:
--------------------------------------------------------------------------------
provider "aws" {
  region  = var.aws_region
  profile = var.aws_profile
}

module "ecs" {
  source = "./modules/ecs"

  environment          = var.environment
  cluster              = var.environment
  cloudwatch_prefix    = var.environment # See the ecs_instances module for when to set this and when not!
  vpc_cidr             = var.vpc_cidr
  public_subnet_cidrs  = var.public_subnet_cidrs
  private_subnet_cidrs = var.private_subnet_cidrs
  availability_zones   = var.availability_zones
  max_size             = var.max_size
  min_size             = var.min_size
  desired_capacity     = var.desired_capacity
  key_name             = aws_key_pair.ecs.key_name
  instance_type        = var.instance_type
  ecs_aws_ami          = var.aws_ecs_ami
}

resource "aws_key_pair" "ecs" {
  key_name   = "ecs-key-${var.environment}"
  public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtMljjj0Ccxux5Mssqraa/iHHxheW+m0Rh17fbd8t365y9EwBn00DN/0PjdU2CK6bjxwy8BNGXWoUXiSDDtGqRupH6e9J012yE5kxhpXnnkIcLGjkAiflDBVV4sXS4b3a2LSXL5Dyb93N2GdnJ03FJM4qDJ8lfDQxb38eYHytZkmxW14xLoyW5Hbyr3SXhdHC2/ecdp5nLNRwRWiW6g9OA6jTQ3LgeOZoM6dK4ltJUQOakKjiHsE+jvmO0hJYQN7+5gYOw0HHsM+zmATvSipAWzoWBWcmBxAbcdW0R0KvCwjylCyRVbRMRbSZ/c4idZbFLZXRb7ZJkqNJuy99+ld41 ecs@aws.fake"
}

variable "environment" {
  description = "A name to describe the environment we're creating."
}

variable "aws_profile" {
  description = "The AWS-CLI profile for the account to create resources in."
}

variable "aws_region" {
  description = "The AWS region to create resources in."
}

variable "aws_ecs_ami" {
  description = "The AMI to seed ECS instances with."
}

variable "vpc_cidr" {
  description = "The IP range to attribute to the virtual network."
}

variable "public_subnet_cidrs" {
  description = "The IP ranges to use for the public subnets in your VPC."
  type        = list
}

variable "private_subnet_cidrs" {
  description = "The IP ranges to use for the private subnets in your VPC."
  type        = list
}

variable "availability_zones" {
  description = "The AWS availability zones to create subnets in."
  type        = list
}

variable "max_size" {
  description = "Maximum number of instances in the ECS cluster."
}

variable "min_size" {
  description = "Minimum number of instances in the ECS cluster."
}

variable "desired_capacity" {
  description = "Ideal number of instances in the ECS cluster."
}

variable "instance_type" {
  description = "Size of instances in the ECS cluster."
}

output "default_alb_target_group" {
  value = module.ecs.default_alb_target_group
}
--------------------------------------------------------------------------------
/ecs.tfvars:
--------------------------------------------------------------------------------
# A name to describe the environment we're creating.
environment = "acc"

# The AWS-CLI profile for the account to create resources in.
aws_profile = "default"

# The AWS region to create resources in.
aws_region = "eu-west-1"

# The AMI to seed ECS instances with.
# Leave empty to use the latest Amazon Linux 2 ECS-optimized AMI by Amazon.
aws_ecs_ami = ""

# The IP range to attribute to the virtual network.
# The allowed block size is between a /16 (65,536 addresses) and /28 (16 addresses).
vpc_cidr = "10.0.0.0/16"

# The IP ranges to use for the public subnets in your VPC.
# Must be within the IP range of your VPC.
public_subnet_cidrs = ["10.0.0.0/24", "10.0.1.0/24"]

# The IP ranges to use for the private subnets in your VPC.
# Must be within the IP range of your VPC.
private_subnet_cidrs = ["10.0.50.0/24", "10.0.51.0/24"]

# The AWS availability zones to create subnets in.
# For high-availability, we need at least two.
availability_zones = ["eu-west-1a", "eu-west-1b"]

# Maximum number of instances in the ECS cluster.
max_size = 1

# Minimum number of instances in the ECS cluster.
min_size = 1

# Ideal number of instances in the ECS cluster.
desired_capacity = 1

# Size of instances in the ECS cluster.
instance_type = "t2.micro"
--------------------------------------------------------------------------------
/ecs_fake_private:
--------------------------------------------------------------------------------
-----BEGIN RSA PRIVATE KEY-----
MIIEpQIBAAKCAQEArTJY449AnMbseTLLKq2mv4hx8YXlvptEYde323fLd+ucvRMA
Z9NAzf9D43VNgium48cMvATRl1qFF4kgw7RqkbqR+nvSdNdshOZMYaV555CHCxo5
AIn5QwVVeLF0uG92ti0ly+Q8m/dzdhnZydNxSTOKgyfJXw0MW9/HmB8rWZJsVteM
S6MluR28q90l4XRwtv3nHaeZyzUcEVoluoPTgOo00Ny4HjmaDOnSuJbSVEDmpCo4
h7BPo75jtISWEDe/uYGDsNBx7DPs5gE70oqQFs6FgVnJgcQG3HVtEdCrwsI8pQsk
VW0TEW0mf3OInWWxS2V0W+2SZKjSbsvffpXeNQIDAQABAoIBAE/wohh+YTs7kaAr
Mp0DQ6i56KWqwKzma3yhnan9s/so09JtN820MwAhpwsQdHL1hPUzRYxuyPKMBEwl
rerGlj2nGIO9rRji6aK5zV1wjEC2c65LLy4xgMxPZPDtL0uFnwxc8EoYkWUHpNJJ
Aj3miy5XTMJWldp6Yk7xjeWH1XFkoiXTTSle8aWp5+NSxqvUQ8A+SfsO2m8kpBWB
1PIbdr9Y8DqQqRR8LrakTAsW3gEWtlg0Zr+X6yq2xaEVVbN99Eyc9OeGEhyFqyqh
oC7YaYzcCsAvu6aMrv3RDDtH0r4OjbhnJ4zptUjlXiTgUJZ53cXq66ZA8b1yQJxu
WKDj+kECgYEA4bBbCddJvLoI8yrDvKx0r9crZh3lWuZyCr2LbWdVbS1Eyfkhlc0f
zjpVPVNjk+Qe3ghWeHK4ktRnzeGdRB7oxWTX5vCDXZBsvWv2eJTRSq1WUasb/AIe
CHFLdcN0wWpndGk93dnfMe0VTP48z/1MPu9wfZJmvbMCBom7uxwa8pECgYEAxHUz
tc+wxBivkoawD/4S+gVwvIRcIDkwatcYm4AKZiI5cLy56O+R5OrI5YdvIuqazoTN
HZuInoPhv9gkfj1pWcF2PKpoRBxXMi6wPjba6JhQ8g0lVlpU8lq+xarTiFOfCfMq
CiYQ/OvSNWltuAjk3QnMwEZlAQDdutpQtwog+2UCgYEAhuxzwLJgdt+RMi7Czi0b
pXQxkd8Vmv5h92HR1RoNzDNgCI9UMMZs2VGuW/dadLPQcFTzvRZ4me86D69t4afI
6RmcqYfoQStyltvQgc4WQVrXXAO7uzFY2xtATasIRgliyAmS3uq9sI9YSKtFl+KK
jqV+ztOTcJ1v/JCjFv16bsECgYEAgfOhA/fhTqWhpcQQPFPti5MDsr+/DNWnkFS+
A0ZcGpky877zHwExuYIQ57uBqVAUnN09rQMQCQLl1ngME7FdubB+HL0AAWXZy+kU
TeMNRORUTvihJRDVtgaOwMQx7rCZuAQwX8w0WolHYGtf12eStB/iX6Fw+IvxH8N/
tsQtcv0CgYEAitjJle6599YvnHUhGOlfMacUtFw5GU1fBIuvYEH9mQWzyWUkQ74t
zG4dC0CI4na3OMX/xb/4J9ULpt6RUbpbZC6svXWAMFDiAK0rt4wG1YbmXuff87d/
rJgsYqm7XOjfKp1uV276cga2GLRgvgx7ERuB0KAGh4HP4/QhZb0eHKE=
-----END RSA PRIVATE KEY-----
--------------------------------------------------------------------------------
/img/alb.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/arminc/terraform-ecs/38f562cee06f95048565c557a7b698ab5e31da94/img/alb.png
--------------------------------------------------------------------------------
/img/deployment.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/arminc/terraform-ecs/38f562cee06f95048565c557a7b698ab5e31da94/img/deployment.png
--------------------------------------------------------------------------------
/img/ecs-infra.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/arminc/terraform-ecs/38f562cee06f95048565c557a7b698ab5e31da94/img/ecs-infra.png
--------------------------------------------------------------------------------
/img/ecs-terraform-modules.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/arminc/terraform-ecs/38f562cee06f95048565c557a7b698ab5e31da94/img/ecs-terraform-modules.png
--------------------------------------------------------------------------------
/modules/alb/main.tf:
--------------------------------------------------------------------------------
# Default ALB implementation that can be used to connect ECS instances to it

resource "aws_alb_target_group" "default" {
  name                 = "${var.alb_name}-default"
  port                 = 80
  protocol             = "HTTP"
  vpc_id               = var.vpc_id
  deregistration_delay = var.deregistration_delay

  health_check {
    path     = var.health_check_path
    protocol = "HTTP"
  }

  tags = {
    Environment = var.environment
  }
}

resource "aws_alb" "alb" {
  name            = var.alb_name
  subnets         = var.public_subnet_ids
  security_groups = [aws_security_group.alb.id]

  tags = {
    Environment = var.environment
  }
}

resource "aws_alb_listener" "http" {
  load_balancer_arn = aws_alb.alb.id
  port              = "80"
  protocol          = "HTTP"

  default_action {
    target_group_arn = aws_alb_target_group.default.id
    type             = "forward"
  }
}

resource "aws_security_group" "alb" {
  name   = "${var.alb_name}_alb"
  vpc_id = var.vpc_id

  tags = {
    Environment = var.environment
  }
}

# Allows plain HTTP (port 80) from the given cidr blocks.
resource "aws_security_group_rule" "http_from_anywhere" {
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "TCP"
  cidr_blocks       = var.allow_cidr_block
  security_group_id = aws_security_group.alb.id
}

resource "aws_security_group_rule" "outbound_internet_access" {
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.alb.id
}
--------------------------------------------------------------------------------
/modules/alb/outputs.tf:
--------------------------------------------------------------------------------
output "alb_security_group_id" {
  value = aws_security_group.alb.id
}

output "default_alb_target_group" {
  value = aws_alb_target_group.default.arn
}
--------------------------------------------------------------------------------
/modules/alb/variables.tf:
--------------------------------------------------------------------------------
variable "alb_name" {
  default     = "default"
  description = "The name of the loadbalancer"
}

variable "environment" {
  description = "The name of the environment"
}

variable "public_subnet_ids" {
  type        = list
  description = "List of public subnet ids to place the loadbalancer in"
}

variable "vpc_id" {
  description = "The VPC id"
}

variable "deregistration_delay" {
  default     = "300"
  description = "The default deregistration delay"
}

variable "health_check_path" {
  default     = "/"
  description = "The default health check path"
}

variable "allow_cidr_block" {
  default     = ["0.0.0.0/0"]
  description = "Specify the CIDR blocks that are allowed to access the LoadBalancer"
}
--------------------------------------------------------------------------------
/modules/ecs/alb.tf:
--------------------------------------------------------------------------------
module "alb" {
  source = "../alb"

  environment       = var.environment
  alb_name          = "${var.environment}-${var.cluster}"
"${var.environment}-${var.cluster}" 6 | vpc_id = module.network.vpc_id 7 | public_subnet_ids = module.network.public_subnet_ids 8 | } 9 | 10 | resource "aws_security_group_rule" "alb_to_ecs" { 11 | type = "ingress" 12 | from_port = 32768 13 | to_port = 61000 14 | protocol = "TCP" 15 | source_security_group_id = module.alb.alb_security_group_id 16 | security_group_id = module.ecs_instances.ecs_instance_security_group_id 17 | } 18 | -------------------------------------------------------------------------------- /modules/ecs/instance_policy.tf: -------------------------------------------------------------------------------- 1 | # Why we need ECS instance policies http://docs.aws.amazon.com/AmazonECS/latest/developerguide/instance_IAM_role.html 2 | # ECS roles explained here http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_managed_policies.html 3 | # Some other ECS policy examples http://docs.aws.amazon.com/AmazonECS/latest/developerguide/IAMPolicyExamples.html 4 | 5 | resource "aws_iam_role" "ecs_instance_role" { 6 | name = "${var.environment}_ecs_instance_role" 7 | 8 | assume_role_policy = <> /etc/ecs/ecs.config 13 | 14 | # Inject the CloudWatch Logs configuration file contents 15 | cat > /etc/awslogs/awslogs.conf <<- EOF 16 | [general] 17 | state_file = /var/lib/awslogs/agent-state 18 | 19 | [/var/log/dmesg] 20 | file = /var/log/dmesg 21 | log_group_name = ${cloudwatch_prefix}/var/log/dmesg 22 | log_stream_name = ${cluster_name}/{container_instance_id} 23 | 24 | [/var/log/messages] 25 | file = /var/log/messages 26 | log_group_name = ${cloudwatch_prefix}/var/log/messages 27 | log_stream_name = ${cluster_name}/{container_instance_id} 28 | datetime_format = %b %d %H:%M:%S 29 | 30 | [/var/log/docker] 31 | file = /var/log/docker 32 | log_group_name = ${cloudwatch_prefix}/var/log/docker 33 | log_stream_name = ${cluster_name}/{container_instance_id} 34 | datetime_format = %Y-%m-%dT%H:%M:%S.%f 35 | 36 | [/var/log/ecs/ecs-init.log] 37 | file = /var/log/ecs/ecs-init.log.* 38 | log_group_name = ${cloudwatch_prefix}/var/log/ecs/ecs-init.log 39 | log_stream_name = ${cluster_name}/{container_instance_id} 40 | datetime_format = %Y-%m-%dT%H:%M:%SZ 41 | 42 | [/var/log/ecs/ecs-agent.log] 43 | file = /var/log/ecs/ecs-agent.log.* 44 | log_group_name = ${cloudwatch_prefix}/var/log/ecs/ecs-agent.log 45 | log_stream_name = ${cluster_name}/{container_instance_id} 46 | datetime_format = %Y-%m-%dT%H:%M:%SZ 47 | 48 | [/var/log/ecs/audit.log] 49 | file = /var/log/ecs/audit.log.* 50 | log_group_name = ${cloudwatch_prefix}/var/log/ecs/audit.log 51 | log_stream_name = ${cluster_name}/{container_instance_id} 52 | datetime_format = %Y-%m-%dT%H:%M:%SZ 53 | 54 | EOF 55 | 56 | # Set the region to send CloudWatch Logs data to (the region where the container instance is located) 57 | # Get availability zone where the container instance is located and remove the trailing character to give us the region. 58 | # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 59 | region=$(curl 169.254.169.254/latest/meta-data/placement/availability-zone | sed s'/.$//') 60 | # Replace the default log region with the region where the container instance is located. 
# https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/QuickStartEC2Instance.html#running-ec2-step-2
sed -i -e "s/region = us-east-1/region = $region/g" /etc/awslogs/awscli.conf

# Set the ip address of the node
# Get the ipv4 of the container instance
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html
container_instance_id=$(curl 169.254.169.254/latest/meta-data/local-ipv4)
# Replace "{container_instance_id}" with ipv4 of container instance
sed -i -e "s/{container_instance_id}/$container_instance_id/g" /etc/awslogs/awslogs.conf

cat > /etc/init/awslogjob.conf <<- EOF
#upstart-job
description "Configure and start CloudWatch Logs agent on Amazon ECS container instance"
author "Amazon Web Services"
start on started ecs

script
  exec 2>>/var/log/ecs/cloudwatch-logs-start.log
  set -x

  until curl -s http://localhost:51678/v1/metadata
  do
    sleep 1
  done

  service awslogs start
  chkconfig awslogs on
end script

EOF

start ecs

# Custom userdata script code
${custom_userdata}

echo "Done"
--------------------------------------------------------------------------------
/modules/ecs_instances/variables.tf:
--------------------------------------------------------------------------------
variable "environment" {
  description = "The name of the environment"
}

variable "cloudwatch_prefix" {
  default     = ""
  description = "Specify a prefix if you want to avoid CloudWatch log group collisions or don't want to merge all logs into one log group"
}

variable "cluster" {
  description = "The name of the cluster"
}

variable "instance_group" {
  default     = "default"
  description = "The name of the instances that you consider as a group"
}

variable "vpc_id" {
  description = "The VPC id"
}

variable "aws_ami" {
  description = "The AWS AMI id to use"
}

variable "instance_type" {
  default     = "t2.micro"
  description = "AWS instance type to use"
}

variable "max_size" {
  default     = 1
  description = "Maximum number of nodes in the cluster"
}

variable "min_size" {
  default     = 1
  description = "Minimum number of nodes in the cluster"
}

# For more explanation see http://docs.aws.amazon.com/autoscaling/latest/userguide/WhatIsAutoScaling.html
variable "desired_capacity" {
  default     = 1
  description = "The desired capacity of the cluster"
}

variable "iam_instance_profile_id" {
  description = "The id of the instance profile that should be used for the instances"
}

variable "private_subnet_ids" {
  type        = list
  description = "The list of private subnets to place the instances in"
}

variable "load_balancers" {
  type        = list
  default     = []
  description = "The load balancers to couple to the instances. Only used when NOT using ALB"
}

variable "depends_id" {
  description = "Workaround to wait for the NAT gateway to finish before starting the instances"
}

variable "key_name" {
  description = "SSH key name to be used"
}

variable "custom_userdata" {
  default     = ""
  description = "Inject extra commands into the instance template to be run on boot"
}

variable "ecs_config" {
  default     = "echo '' > /etc/ecs/ecs.config"
  description = "Specify ecs configuration or get it from S3. Example: aws s3 cp s3://some-bucket/ecs.config /etc/ecs/ecs.config"
}

variable "ecs_logging" {
  default     = "[\"json-file\",\"awslogs\"]"
  description = "Adds logging options to ECS that the Docker containers can use. It is possible to add fluentd as well"
}
--------------------------------------------------------------------------------
/modules/ecs_roles/aws_caller_identity.json:
--------------------------------------------------------------------------------
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["ssm:DescribeParameters"],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": ["ssm:GetParameters"],
      "Effect": "Allow",
      "Resource": "arn:aws:ssm:$${aws_region}:$${account_id}:parameter/$${prefix}*"
    }
  ]
}
--------------------------------------------------------------------------------
/modules/ecs_roles/ecs_default_task.json:
--------------------------------------------------------------------------------
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["ecs-tasks.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
--------------------------------------------------------------------------------
/modules/ecs_roles/main.tf:
--------------------------------------------------------------------------------
resource "aws_iam_role" "ecs_default_task" {
  name = "${var.environment}_${var.cluster}_default_task"
  path = "/ecs/"

  assume_role_policy = file("${path.module}/ecs_default_task.json")
}

data "aws_caller_identity" "current" {}

data "aws_region" "current" {}

data "template_file" "policy" {
  template = file("${path.module}/aws_caller_identity.json")

  vars = {
    account_id = data.aws_caller_identity.current.account_id
    prefix     = var.prefix
    aws_region = data.aws_region.current.name
  }
}

resource "aws_iam_policy" "ecs_default_task" {
  name = "${var.environment}_${var.cluster}_ecs_default_task"
  path = "/"

  policy = data.template_file.policy.rendered
}

resource "aws_iam_policy_attachment" "ecs_default_task" {
  name       = "${var.environment}_${var.cluster}_ecs_default_task"
  roles      = [aws_iam_role.ecs_default_task.name]
  policy_arn = aws_iam_policy.ecs_default_task.arn
}
--------------------------------------------------------------------------------
/modules/ecs_roles/variables.tf:
--------------------------------------------------------------------------------
variable "environment" {
  description = "The name of the environment"
}

variable "cluster" {
  default     = "default"
  description = "The name of the ECS cluster"
}

variable "prefix" {
  default     = ""
  description = "The prefix of the parameters this role should be able to access"
}
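# Example usage of this module (a sketch; all values are illustrative):
#
#   module "task_role" {
#     source      = "./modules/ecs_roles"
#     environment = "acc"
#     cluster     = "default"
#     prefix      = "myapp/"
#   }
#
# The resulting role can be passed as the taskRoleArn of a Task definition so
# the task may read SSM parameters whose names start with the given prefix.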
-------------------------------------------------------------------------------- /modules/nat_gateway/main.tf: --------------------------------------------------------------------------------
 1 | # Using the AWS NAT Gateway service instead of a NAT instance; it is more expensive but easier to manage.
 2 | # See the comparison: http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-nat-comparison.html
 3 | 
 4 | resource "aws_nat_gateway" "nat" {
 5 |   allocation_id = element(aws_eip.nat.*.id, count.index)
 6 |   subnet_id     = element(var.subnet_ids, count.index)
 7 |   count         = var.subnet_count
 8 | }
 9 | 
10 | resource "aws_eip" "nat" {
11 |   vpc   = true
12 |   count = var.subnet_count
13 | }
14 | 
-------------------------------------------------------------------------------- /modules/nat_gateway/outputs.tf: --------------------------------------------------------------------------------
1 | output "ids" {
2 |   value = aws_nat_gateway.nat.*.id
3 | }
4 | 
-------------------------------------------------------------------------------- /modules/nat_gateway/variables.tf: --------------------------------------------------------------------------------
1 | variable "subnet_ids" {
2 |   type        = list
3 |   description = "List of subnets in which to place the NAT Gateways"
4 | }
5 | 
6 | variable "subnet_count" {
7 |   description = "The number of entries in subnet_ids. This must be passed in separately because the value of 'count' cannot be computed from a derived list"
8 | }
9 | 
-------------------------------------------------------------------------------- /modules/network/main.tf: --------------------------------------------------------------------------------
 1 | module "vpc" {
 2 |   source = "../vpc"
 3 | 
 4 |   cidr        = var.vpc_cidr
 5 |   environment = var.environment
 6 | }
 7 | 
 8 | module "private_subnet" {
 9 |   source = "../subnet"
10 | 
11 |   name               = "${var.environment}_private_subnet"
12 |   environment        = var.environment
13 |   vpc_id             = module.vpc.id
14 |   cidrs              = var.private_subnet_cidrs
15 |   availability_zones = var.availability_zones
16 | }
17 | 
18 | module "public_subnet" {
19 |   source = "../subnet"
20 | 
21 |   name               = "${var.environment}_public_subnet"
22 |   environment        = var.environment
23 |   vpc_id             = module.vpc.id
24 |   cidrs              = var.public_subnet_cidrs
25 |   availability_zones = var.availability_zones
26 | }
27 | 
28 | module "nat" {
29 |   source = "../nat_gateway"
30 | 
31 |   subnet_ids   = module.public_subnet.ids
32 |   subnet_count = length(var.public_subnet_cidrs)
33 | }
34 | 
35 | resource "aws_route" "public_igw_route" {
36 |   count                  = length(var.public_subnet_cidrs)
37 |   route_table_id         = element(module.public_subnet.route_table_ids, count.index)
38 |   gateway_id             = module.vpc.igw
39 |   destination_cidr_block = var.destination_cidr_block
40 | }
41 | 
42 | resource "aws_route" "private_nat_route" {
43 |   count                  = length(var.private_subnet_cidrs)
44 |   route_table_id         = element(module.private_subnet.route_table_ids, count.index)
45 |   nat_gateway_id         = element(module.nat.ids, count.index)
46 |   destination_cidr_block = var.destination_cidr_block
47 | }
48 | 
49 | # Creating a NAT Gateway takes some time, and some services need internet access (via the NAT Gateway) before proceeding.
50 | # Therefore we need a way to depend on the NAT Gateway in Terraform and wait until it is finished.
51 | # Currently Terraform does not allow waiting on a whole module as a dependency.
52 | # Therefore we use the workaround described here: https://github.com/hashicorp/terraform/issues/1178#issuecomment-207369534
53 | 
54 | resource "null_resource" "dummy_dependency" {
55 |   depends_on = [module.nat]
56 | }
57 | 
-------------------------------------------------------------------------------- /modules/network/outputs.tf: --------------------------------------------------------------------------------
 1 | output "vpc_id" {
 2 |   value = module.vpc.id
 3 | }
 4 | 
 5 | output "vpc_cidr" {
 6 |   value = module.vpc.cidr_block
 7 | }
 8 | 
 9 | output "private_subnet_ids" {
10 |   value = module.private_subnet.ids
11 | }
12 | 
13 | output "public_subnet_ids" {
14 |   value = module.public_subnet.ids
15 | }
16 | 
17 | output "depends_id" {
18 |   value = null_resource.dummy_dependency.id
19 | }
20 | 
-------------------------------------------------------------------------------- /modules/network/variables.tf: --------------------------------------------------------------------------------
 1 | variable "vpc_cidr" {
 2 |   description = "VPC cidr block. Example: 10.0.0.0/16"
 3 | }
 4 | 
 5 | variable "environment" {
 6 |   description = "The name of the environment"
 7 | }
 8 | 
 9 | variable "destination_cidr_block" {
10 |   default     = "0.0.0.0/0"
11 |   description = "The destination CIDR block routed through the Internet Gateway (public subnets) or the NAT Gateway (private subnets) to reach the internet"
12 | }
13 | 
14 | variable "private_subnet_cidrs" {
15 |   type        = list
16 |   description = "List of private cidrs; you need one for every availability zone you want. Example: 10.0.0.0/24 and 10.0.1.0/24"
17 | }
18 | 
19 | variable "public_subnet_cidrs" {
20 |   type        = list
21 |   description = "List of public cidrs; you need one for every availability zone you want. Example: 10.0.0.0/24 and 10.0.1.0/24"
22 | }
23 | 
24 | variable "availability_zones" {
25 |   type        = list
26 |   description = "List of availability zones you want. Example: eu-west-1a and eu-west-1b"
27 | }
28 | 
29 | variable "depends_id" {}
30 | 
-------------------------------------------------------------------------------- /modules/subnet/main.tf: --------------------------------------------------------------------------------
 1 | # Module that allows creating subnets inside a VPC. This module can be used to create
 2 | # either private or public-facing subnets.
 3 | 
 4 | resource "aws_subnet" "subnet" {
 5 |   vpc_id            = var.vpc_id
 6 |   cidr_block        = element(var.cidrs, count.index)
 7 |   availability_zone = element(var.availability_zones, count.index)
 8 |   count             = length(var.cidrs)
 9 | 
10 |   tags = {
11 |     Name        = "${var.name}_${element(var.availability_zones, count.index)}"
12 |     Environment = var.environment
13 |   }
14 | }
15 | 
16 | # We are creating one or more subnets that we want to address as one, therefore we create a
17 | # route table for each subnet and associate the two. This lets us set up routing for all the
18 | # subnets at once, for example when creating a route to the Internet Gateway.
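19 | # For instance, network/main.tf in this repository fans a single route out
20 | # across all of the route tables exposed through the route_table_ids output:
21 | #
22 | #   resource "aws_route" "public_igw_route" {
23 | #     count                  = length(var.public_subnet_cidrs)
24 | #     route_table_id         = element(module.public_subnet.route_table_ids, count.index)
25 | #     gateway_id             = module.vpc.igw
26 | #     destination_cidr_block = var.destination_cidr_block
27 | #   }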
19 | resource "aws_route_table" "subnet" { 20 | vpc_id = var.vpc_id 21 | count = length(var.cidrs) 22 | 23 | tags = { 24 | Name = "${var.name}_${element(var.availability_zones, count.index)}" 25 | Environment = var.environment 26 | } 27 | } 28 | 29 | resource "aws_route_table_association" "subnet" { 30 | subnet_id = element(aws_subnet.subnet.*.id, count.index) 31 | route_table_id = element(aws_route_table.subnet.*.id, count.index) 32 | count = length(var.cidrs) 33 | } 34 | -------------------------------------------------------------------------------- /modules/subnet/outputs.tf: -------------------------------------------------------------------------------- 1 | output "ids" { 2 | value = "${aws_subnet.subnet.*.id}" 3 | } 4 | 5 | output "route_table_ids" { 6 | value = "${aws_route_table.subnet.*.id}" 7 | } 8 | -------------------------------------------------------------------------------- /modules/subnet/variables.tf: -------------------------------------------------------------------------------- 1 | variable "name" { 2 | description = "Name of the subnet, actual name will be, for example: name_eu-west-1a" 3 | } 4 | 5 | variable "environment" { 6 | description = "The name of the environment" 7 | } 8 | 9 | variable "cidrs" { 10 | type = list 11 | description = "List of cidrs, for every availability zone you want you need one. Example: 10.0.0.0/24 and 10.0.1.0/24" 12 | } 13 | 14 | variable "availability_zones" { 15 | type = list 16 | description = "List of availability zones you want. Example: eu-west-1a and eu-west-1b" 17 | } 18 | 19 | variable "vpc_id" { 20 | description = "VPC id to place subnet into" 21 | } 22 | -------------------------------------------------------------------------------- /modules/users/ecs_deployer.json: -------------------------------------------------------------------------------- 1 | { 2 | "Version": "2012-10-17", 3 | "Statement": [ 4 | { 5 | "Effect": "Allow", 6 | "Action": [ 7 | "ecs:RegisterTaskDefinition", 8 | "ecs:DescribeTaskDefinitions", 9 | "ecs:ListTaskDefinitions", 10 | "ecs:CreateService", 11 | "ecs:UpdateService", 12 | "ecs:DescribeServices", 13 | "ecs:ListServices" 14 | ], 15 | "Resource": "*" 16 | }, 17 | { 18 | "Effect": "Allow", 19 | "Action": ["iam:PassRole"], 20 | "Resource": "arn:aws:iam::*:role/ecs/*" 21 | } 22 | ] 23 | } -------------------------------------------------------------------------------- /modules/users/main.tf: -------------------------------------------------------------------------------- 1 | resource "aws_iam_user" "ecs_deployer" { 2 | name = "ecs_deployer" 3 | path = "/ecs/" 4 | } 5 | 6 | # The most important part is the iam:PassRole. With that, this user can give roles to ECS tasks. 7 | # In theory the user can give the task Admin rights. To make sure that does not happen we restrict 8 | # the user and allow him only to hand out roles in /ecs/ path. You still need to be careful not 9 | # to have any roles in there with full admin rights, but no ECS task should have these rights! 
10 | resource "aws_iam_user_policy" "ecs_deployer_policy" { 11 | name = "ecs_deployer_policy" 12 | user = aws_iam_user.ecs_deployer.name 13 | 14 | policy = "${file("ecs_deployer.json")}" 15 | } 16 | 17 | resource "aws_iam_access_key" "ecs_deployer" { 18 | user = aws_iam_user.ecs_deployer.name 19 | } 20 | -------------------------------------------------------------------------------- /modules/users/outputs.tf: -------------------------------------------------------------------------------- 1 | output "ecs_deployer_access_key" { 2 | value = aws_iam_access_key.ecs_deployer.id 3 | } 4 | 5 | output "ecs_deployer_secret_key" { 6 | value = aws_iam_access_key.ecs_deployer.secret 7 | } 8 | -------------------------------------------------------------------------------- /modules/vpc/main.tf: -------------------------------------------------------------------------------- 1 | resource "aws_vpc" "vpc" { 2 | cidr_block = var.cidr 3 | enable_dns_hostnames = true 4 | 5 | tags = { 6 | Name = var.environment 7 | Environment = var.environment 8 | } 9 | } 10 | 11 | resource "aws_internet_gateway" "vpc" { 12 | vpc_id = aws_vpc.vpc.id 13 | 14 | tags = { 15 | Environment = var.environment 16 | } 17 | } 18 | -------------------------------------------------------------------------------- /modules/vpc/outputs.tf: -------------------------------------------------------------------------------- 1 | output "id" { 2 | value = aws_vpc.vpc.id 3 | } 4 | 5 | output "cidr_block" { 6 | value = aws_vpc.vpc.cidr_block 7 | } 8 | 9 | output "igw" { 10 | value = aws_internet_gateway.vpc.id 11 | } 12 | -------------------------------------------------------------------------------- /modules/vpc/variables.tf: -------------------------------------------------------------------------------- 1 | variable "cidr" { 2 | description = "VPC cidr block. Example: 10.0.0.0/16" 3 | } 4 | 5 | variable "environment" { 6 | description = "The name of the environment" 7 | } 8 | --------------------------------------------------------------------------------