├── images
    └── diagram.png
├── CODE_OF_CONDUCT.md
├── LICENSE
├── ecs-stopped-tasks-cwlogs.yaml
├── README.md
├── cw-event.json
└── CONTRIBUTING.md


/images/diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/amazon-ecs-stopped-tasks-cwlogs/HEAD/images/diagram.png

--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

--------------------------------------------------------------------------------
/ecs-stopped-tasks-cwlogs.yaml:
--------------------------------------------------------------------------------
AWSTemplateFormatVersion: 2010-09-09
Description: Deploys the resources needed to store events of Amazon ECS stopped tasks in CloudWatch Logs

Parameters:
  CWLogGroupName:
    Type: String
    Description: The CloudWatch log group name to store the events
    Default: /aws/events/ECSStoppedTasksEvent
  CWLogGroupRetention:
    Type: Number
    Description: The number of days to retain the events in the log group
    AllowedValues: [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 3653]
    Default: 30

Resources:
  EventRule:
    Type: AWS::Events::Rule
    Properties:
      Name: ECSStoppedTasksEvent
      Description: Triggered when an Amazon ECS Task is stopped
      EventPattern:
        source:
          - aws.ecs
        detail-type:
          - ECS Task State Change
        detail:
          desiredStatus:
            - STOPPED
          lastStatus:
            - STOPPED
      State: ENABLED
      Targets:
        - Arn: !GetAtt LogGroup.Arn
          Id: ECSStoppedTasks

  LogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Ref CWLogGroupName
      RetentionInDays: !Ref CWLogGroupRetention

  LogEventsPolicy:
    Type: AWS::Logs::ResourcePolicy
    Properties:
      PolicyName: LogEventsPolicy
      PolicyDocument: !Sub |
        {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "LogEventsPolicy",
              "Effect": "Allow",
              "Principal": {
                "Service": [
                  "delivery.logs.amazonaws.com",
                  "events.amazonaws.com"
                ]
              },
              "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
              ],
              "Resource": "${LogGroup.Arn}"
            }
          ]
        }

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# ECS Stopped Tasks in CloudWatch Logs

## Context

Stopped Amazon ECS tasks are returned for at least 1 hour, as described in the ListTasks API reference.

> Recently stopped tasks might appear in the returned results. Currently, stopped tasks appear in the returned results for at least one hour.

https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ListTasks.html

The same applies to the DescribeTasks API.

This means that if you want to investigate a task that was stopped more than 1 hour ago, you may not be able to retrieve the information. For RCA purposes, information such as the time when the task was stopped, the stopped reason of the task, and the exit code of its containers is crucial.

## Solution

The solution stores in CloudWatch Logs the EventBridge event that is triggered when a task is stopped.

![Diagram](images/diagram.png)

To investigate and analyze the logs, we recommend using CloudWatch Logs Insights.

The solution can be deployed using [this](ecs-stopped-tasks-cwlogs.yaml) CloudFormation template.

#### Amazon EventBridge Event Pattern

Amazon ECS sends three types of events to EventBridge: container instance state change events, task state change events, and service action events. An event is triggered when the state of one of these resources changes.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_cwe_events.html

The following EventBridge event pattern matches only the events emitted when a task is stopped.

```
{
  "detail-type": [
    "ECS Task State Change"
  ],
  "source": [
    "aws.ecs"
  ],
  "detail": {
    "desiredStatus": [
      "STOPPED"
    ],
    "lastStatus": [
      "STOPPED"
    ]
  }
}
```

#### Amazon CloudWatch Logs

An example of the event stored in Amazon CloudWatch Logs can be found [here](cw-event.json).

#### Amazon CloudWatch Logs Insights

With CloudWatch Logs Insights you can easily search and analyze the log data in CloudWatch Logs. The following are sample queries:

- Show a stopped task given a task ID.

  `filter detail.taskArn like ""`

- Show stopped tasks for a given cluster and service name.

  `filter detail.clusterArn like "" and detail.group like ""`

- Show *stoppedReason* and *exitCode* of a container given a task ID.

  `fields detail.stoppedReason, detail.containers.0.name, detail.containers.0.exitCode | filter detail.taskArn like ""`

#### Deploy the solution with AWS CloudFormation

To automate the deployment of this solution, you can use [this](ecs-stopped-tasks-cwlogs.yaml) CloudFormation template.

## License

This library is licensed under the MIT-0 License. See the LICENSE file.
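
## Example: summarizing a stored event

As a rough illustration of how the stored events and the sample queries above can be used for RCA, here is a minimal Python sketch that uses only the standard library. It assumes the event has the shape shown in cw-event.json. The helper names `task_id_from_arn`, `summarize_stopped_task`, and `insights_query_for_task` are hypothetical and not part of this repository, and `SAMPLE_EVENT` is a trimmed copy of the sample event rather than real data from your account.

```python
import json

# Trimmed copy of the sample event in cw-event.json (only the fields used below).
SAMPLE_EVENT = json.loads("""
{
  "detail-type": "ECS Task State Change",
  "source": "aws.ecs",
  "detail": {
    "containers": [
      {"name": "Webapp", "exitCode": 0, "lastStatus": "STOPPED"}
    ],
    "desiredStatus": "STOPPED",
    "lastStatus": "STOPPED",
    "stoppedAt": "2020-06-05T12:17:11.696Z",
    "stoppedReason": "Task stopped by user",
    "stopCode": "UserInitiated",
    "taskArn": "arn:aws:ecs:eu-west-1:0123456789:task/Features/33a7f27282624543abd4a00c3e84d0ee"
  }
}
""")


def task_id_from_arn(task_arn: str) -> str:
    """Return the last path segment of a task ARN, which is the task ID."""
    return task_arn.rsplit("/", 1)[-1]


def summarize_stopped_task(event: dict) -> dict:
    """Collect the RCA-relevant fields of a stopped-task event (hypothetical helper)."""
    detail = event["detail"]
    return {
        "taskId": task_id_from_arn(detail["taskArn"]),
        "stoppedAt": detail.get("stoppedAt"),
        "stoppedReason": detail.get("stoppedReason"),
        "stopCode": detail.get("stopCode"),
        # Use .get() so a container entry without an exitCode yields None
        # instead of raising KeyError.
        "containers": [
            {"name": c.get("name"), "exitCode": c.get("exitCode")}
            for c in detail.get("containers", [])
        ],
    }


def insights_query_for_task(task_id: str) -> str:
    """Fill a task ID into the stoppedReason/exitCode sample query."""
    return (
        "fields detail.stoppedReason, detail.containers.0.name, "
        "detail.containers.0.exitCode "
        f'| filter detail.taskArn like "{task_id}"'
    )


if __name__ == "__main__":
    summary = summarize_stopped_task(SAMPLE_EVENT)
    print(json.dumps(summary, indent=2))
    print(insights_query_for_task(summary["taskId"]))
```

The printed query string can be pasted into the CloudWatch Logs Insights console with the solution's log group selected.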

--------------------------------------------------------------------------------
/cw-event.json:
--------------------------------------------------------------------------------
{
  "version": "0",
  "id": "d005c784-98bd-b9c9-043f-c924cd120f54",
  "detail-type": "ECS Task State Change",
  "source": "aws.ecs",
  "account": "0123456789",
  "time": "2020-06-05T12:17:11Z",
  "region": "eu-west-1",
  "resources": [
    "arn:aws:ecs:eu-west-1:0123456789:task/Features/33a7f27282624543abd4a00c3e84d0ee"
  ],
  "detail": {
    "attachments": [
      {
        "id": "f552fb81-5170-4191-ae84-c1921505ef1c",
        "type": "sdi",
        "status": "DELETED",
        "details": []
      }
    ],
    "availabilityZone": "eu-west-1a",
    "capacityProviderName": "FeaturesASG",
    "clusterArn": "arn:aws:ecs:eu-west-1:0123456789:cluster/Features",
    "containerInstanceArn": "arn:aws:ecs:eu-west-1:0123456789:container-instance/Features/0c304770a74540689f5bf39622b3d781",
    "containers": [
      {
        "containerArn": "arn:aws:ecs:eu-west-1:0123456789:container/0a97d723-d9c4-45d7-8109-471e94afa4af",
        "exitCode": 0,
        "lastStatus": "STOPPED",
        "name": "Webapp",
        "image": "0123456789.dkr.ecr.eu-west-1.amazonaws.com/webapp:d1a3f22",
        "imageDigest": "sha256:48b2652a7a9a104300046dc3be4aba5e3e19a2ead82a593f63846739a7429230",
        "runtimeId": "8a78a445b4e66ee0f2c7624d9b52af4e3685039444a44999d884e1af2283e804",
        "networkBindings": [
          {
            "bindIP": "0.0.0.0",
            "containerPort": 80,
            "hostPort": 32771,
            "protocol": "tcp"
          }
        ],
        "taskArn": "arn:aws:ecs:eu-west-1:0123456789:task/Features/33a7f27282624543abd4a00c3e84d0ee",
        "networkInterfaces": [],
        "cpu": "0",
        "memoryReservation": "128"
      }
    ],
    "createdAt": "2020-06-02T09:15:42.131Z",
    "launchType": "EC2",
    "cpu": "0",
    "memory": "128",
    "desiredStatus": "STOPPED",
    "group": "service:Webapp",
    "lastStatus": "STOPPED",
    "overrides": {
      "containerOverrides": [
        {
          "name": "Webapp"
        }
      ]
    },
    "connectivity": "CONNECTED",
    "connectivityAt": "2020-06-02T09:15:42.131Z",
    "pullStartedAt": "2020-06-02T09:15:42.626Z",
    "startedAt": "2020-06-02T09:16:17.544Z",
    "startedBy": "ecs-svc/5888882704497679006",
    "stoppingAt": "2020-06-05T12:16:57.199Z",
    "stoppedAt": "2020-06-05T12:17:11.696Z",
    "pullStoppedAt": "2020-06-02T09:15:43.626Z",
    "executionStoppedAt": "2020-06-05T12:17:10Z",
    "stoppedReason": "Task stopped by user",
    "stopCode": "UserInitiated",
    "updatedAt": "2020-06-05T12:17:11.696Z",
    "taskArn": "arn:aws:ecs:eu-west-1:0123456789:task/Features/33a7f27282624543abd4a00c3e84d0ee",
    "taskDefinitionArn": "arn:aws:ecs:eu-west-1:0123456789:task-definition/webapp:15",
    "version": 7
  }
}

--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
# Contributing Guidelines

Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
documentation, we greatly value feedback and contributions from our community.

Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
information to effectively respond to your bug report or contribution.


## Reporting Bugs/Feature Requests

We welcome you to use the GitHub issue tracker to report bugs or suggest features.

When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
reported the issue. Please try to include as much information as you can.
Details like these are incredibly useful:

* A reproducible test case or series of steps
* The version of our code being used
* Any modifications you've made relevant to the bug
* Anything unusual about your environment or deployment


## Contributing via Pull Requests
Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

1. You are working against the latest source on the *master* branch.
2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
3. You open an issue to discuss any significant work - we would hate for your time to be wasted.

To send us a pull request, please:

1. Fork the repository.
2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
3. Ensure local tests pass.
4. Commit to your fork using clear commit messages.
5. Send us a pull request, answering any default questions in the pull request interface.
6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.

GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).


## Finding contributions to work on
Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.


## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.


## Security issue notifications
If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.


## Licensing

See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.

We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.

--------------------------------------------------------------------------------