├── .gitignore ├── LICENSE ├── README.md ├── backup-rds.py ├── clean-base-images.py ├── clean-es-indices.py ├── clean-release-images.py ├── cloudtrail-monitor.py ├── ebs-snapshots.py └── infrastructure ├── Makefile ├── src ├── cloudtrail-notifications.py ├── create-ebs-snapshots.py ├── maintenace-lambdas.py └── rds-cross-region-backup.py └── templates ├── cloudtrail-notifications.json ├── create-ebs-snapshots.json ├── maintenace-lambdas.json └── rds-cross-region-backup.json /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | .__pycache__/ 3 | *.pyc 4 | *.pyo 5 | .DS_Store -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2016 Paulina Budzoń 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # aws-maintenance 2 | Collection of scripts and Lambda functions used for maintaining various AWS resources. 
3 | 
4 | ## Table of contents
5 | - [Cross-region RDS backups](#cross-region-rds-backups-backup-rdspy)
6 |   * [Regions](#regions)
7 |   * [Limit to specific RDS instances](#limit-to-specific-rds-instances)
8 |   * [Encryption](#encryption)
9 |   * [Aurora clusters](#aurora-clusters)
10 |   * [Guide](#guide)
11 |     * [How to use for the first time](#how-to-use-for-the-first-time)
12 |     * [How to update to the latest version](#how-to-update-to-the-latest-version)
13 |     * [How to test](#how-to-test)
14 |   * [Related blog posts](#related-blog-posts)
15 | - [Automated EC2 storage backups and retention management](#automated-ec2-storage-backups-and-retention-management-ebs-snapshotspy)
16 |   * [Notes](#notes)
17 |   * [Guide](#guide-1)
18 |     * [How to use for the first time](#how-to-use-for-the-first-time-1)
19 |     * [How to update to the latest version](#how-to-update-to-the-latest-version-1)
20 |     * [How to test](#how-to-test-1)
21 |     * [How to modify names of tags used by code or default retention period](#how-to-modify-names-of-tags-used-by-code-or-default-retention-period)
22 |   * [Related blog posts](#related-blog-posts-1)
23 | - [Monitor CloudTrail events](#monitor-cloudtrail-events-cloudtrail-monitorpy)
24 | - [Other Lambdas](#other-lambdas)
25 |   * [clean-base-images.py and clean-release-images.py](#clean-base-imagespy-and-clean-release-imagespy)
26 |   * [clean-es-indices.py](#clean-es-indicespy)
27 | 
28 | 
29 | 
30 | ## Cross-region RDS backups (backup-rds.py)
31 | 
32 | Lambda function used to copy RDS snapshots from one region to another, so that the database can be restored in case
33 | of region failure. One (latest) copy for each RDS instance is kept in the target region. The provided CloudFormation
34 | template will create a subscription from RDS to Lambda: whenever an automated RDS snapshot is made on any database
35 | in that AWS region, that snapshot will be copied to the target region and all older snapshots for that database
36 | will be removed.
37 | 
38 | ### Regions
39 | You will be asked to specify the target region (where to copy your snapshots) for Lambda to use when creating the
40 | CloudFormation stack. The stack itself needs to be created in the same region as the RDS databases that you want
41 | to use it for.
42 | 
43 | ### Limit to specific RDS instances
44 | You can also limit the function to act only on specific databases - specify the list of names in the "Databases to use
45 | for" parameter when creating the CloudFormation stack. If you leave it empty, Lambda will trigger for all RDS instances
46 | within the source region.
47 | 
48 | ### Encryption
49 | If your RDS instances are encrypted, you need to provide a KMS key ARN in the target region when creating the stack.
50 | 
51 | Since KMS keys are region-specific, when the snapshot is copied into another region, it needs to be re-encrypted
52 | using a key located in that region.
53 | [Create a KMS key](https://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html#create-keys-console) in the
54 | target region, copy its ARN and paste that value into the `KMS Key in target region` parameter when creating the
55 | CloudFormation stack. **If you do not provide that value, the copy operation for encrypted snapshots will fail.**
56 | 
57 | You can also provide that value if your RDS instances are not encrypted - the copied snapshots will be encrypted using
58 | that key.
59 | 
60 | If you don't use encryption and don't want your snapshots to be encrypted, leave the `KMS Key in target region`
61 | parameter empty.
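
If you want to create the key programmatically instead, a minimal boto3 sketch looks like this (the region and description are just examples - adjust to your setup):

```python
import boto3

# Create a KMS key in the target region (eu-central-1 is an example)
kms = boto3.client("kms", region_name="eu-central-1")
key = kms.create_key(Description="Re-encryption key for cross-region RDS snapshot copies")

# Paste this ARN into the "KMS Key in target region" parameter
print(key["KeyMetadata"]["Arn"])
```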
62 | 
63 | ### Aurora clusters
64 | Since Aurora clusters do not offer an event notification for their automated backups, a daily schedule needs to be used
65 | to copy the latest snapshot over to the target region. If you're using clusters, set `Use for Aurora clusters` to 'Yes'
66 | when creating the CloudFormation stack. You can limit which clusters' snapshots are copied by specifying a comma-delimited
67 | list in the `Aurora clusters to use for` parameter.
68 | The snapshots will be copied over once a day, at a time of AWS's choosing (using a CloudWatch Events rule with `rate(1 day)`).
69 | 
70 | ### Guide
71 | 
72 | #### How to use for the first time
73 | 1. Download the [backup-rds.py](https://raw.githubusercontent.com/pbudzon/aws-maintenance/master/backup-rds.py) file
74 | from this repository and zip it into a file called `backup-rds.zip` (for example: `zip backup-rds.zip backup-rds.py`).
75 | 1. Upload the ZIP file to an S3 bucket on your AWS account in the same region where your RDS instances live.
76 | 1. Create a new CloudFormation stack using the template: `infrastructure/templates/rds-cross-region-backup.json`.
77 | 1. CloudFormation will ask you for the following parameters:
78 |    - Required: **Target region** - provide the ID of the AWS region where the copied snapshots should be stored, like
79 |    'eu-central-1'. Those are listed in
80 |    [AWS documentation](https://docs.aws.amazon.com/general/latest/gr/rande.html#rds_region).
81 |    - Required: **Name of S3 bucket** - name of the S3 bucket where you uploaded the ZIP in the earlier step.
82 |    - Required: **Name of ZIP file** - name of the ZIP file you uploaded to the S3 bucket. If you uploaded it into a directory,
83 |    provide the path to the file in S3 (for example `lambda_code/backup-rds.zip`).
84 |    - Required/Optional: **KMS Key in target region** - if your RDS instances are encrypted, provide the ARN of a KMS key
85 |    in the target region. See the Encryption section above.
86 |    - Optional: **Databases to use for** - if you want to limit the functionality to specific RDS instances, provide
87 |    a comma-delimited list of their names.
88 |    - Optional: **Use for Aurora clusters** - select "Yes" if you have any Aurora Clusters that you want this code to work
89 |    with.
90 |    - Optional: **Aurora clusters to use for** (applies only if you select "Yes" above) - if you want to limit the
91 |    functionality to specific Aurora Clusters, provide a comma-delimited list of cluster names.
92 | 
93 | #### How to update to the latest version
94 | Follow the first-time setup steps above, but name the zip file something other than before - for example, if you uploaded `backup-rds.zip`,
95 | upload the new file as `backup-rds-1.zip`. Update your CloudFormation stack with the latest template from this repo,
96 | and provide that new ZIP file name in the *Name of ZIP file* parameter. If you prefer to script this, see the sketch below.
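
A short packaging/upload sketch (the bucket name and key are placeholders):

```python
import zipfile

import boto3

# Package the Lambda source under a new name and upload it to S3
with zipfile.ZipFile("backup-rds-1.zip", "w") as archive:
    archive.write("backup-rds.py")

boto3.client("s3").upload_file("backup-rds-1.zip", "my-lambda-code-bucket", "lambda_code/backup-rds-1.zip")
```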
97 | 
98 | #### How to test
99 | Once all resources are created, you can test your Lambda from the Console by using the following test event:
100 | ```
101 | {
102 |   "Records": [
103 |     {
104 |       "EventVersion": "1.0",
105 |       "EventSubscriptionArn": "arn:aws:sns:EXAMPLE",
106 |       "EventSource": "aws:sns",
107 |       "Sns": {
108 |         "Type": "Notification",
109 |         "MessageId": "abcd",
110 |         "TopicArn": "arn:aws:sns:eu-west-1:123456789012:topic_name",
111 |         "Subject": "RDS Notification Message",
112 |         "Message": "{\"Event Source\":\"db-instance\",\"Event Time\":\"2017-12-26 22:34:07.882\",\"Identifier Link\":\"https://console.aws.amazon.com/rds/home?region=eu-west-1#dbinstance:id=database_name\",\"Source ID\":\"PUT_YOUR_RDS_NAME_HERE\",\"Event ID\":\"http://docs.amazonwebservices.com/AmazonRDS/latest/UserGuide/USER_Events.html#RDS-EVENT-0002\",\"Event Message\":\"Finished DB Instance backup\"}",
113 |         "Timestamp": "2017-12-26T22:35:19.946Z",
114 |         "SignatureVersion": "1",
115 |         "Signature": "xxx",
116 |         "SigningCertURL": "xxx",
117 |         "UnsubscribeURL": "xxx"
118 |       }
119 |     }
120 |   ]
121 | }
122 | ```
123 | Replace the `PUT_YOUR_RDS_NAME_HERE` in the JSON string with the name of one of your RDS instances.
124 | 
125 | For Aurora Clusters, use the event below (no need to change anything):
126 | ```
127 | {
128 |   "version": "0",
129 |   "id": "eb6d8ba9-c5c2-3269-3ac4-9918a9df74d9",
130 |   "detail-type": "Scheduled Event",
131 |   "source": "aws.events",
132 |   "account": "123456789012",
133 |   "time": "2018-01-30T21:11:00Z",
134 |   "region": "eu-west-1",
135 |   "resources": [
136 |     "arn:aws:events:eu-west-1:123456789012:rule/eventName"
137 |   ],
138 |   "detail": {}
139 | }
140 | ```
141 | The code will go through all clusters (or those listed in the *Aurora clusters to use for* parameter).
142 | 
143 | ### Related blog posts
144 | - [Copying RDS snapshot to another region for cross-region recovery](https://mysteriouscode.io/blog/copying-rds-snapshot-to-another-region-for-cross-region-recovery/)
145 | - [Complete code: cross-region RDS recovery](https://mysteriouscode.io/blog/complete-code-cross-region-rds-recovery/)
146 | - [Cross-region RDS recovery: encryption and Aurora support](https://mysteriouscode.io/blog/cross-region-rds-recovery-encryption-and-aurora-support/)
147 | 
148 | 
149 | ## Automated EC2 storage backups and retention management (ebs-snapshots.py)
150 | 
151 | Lambda function which automatically creates daily snapshots of instances tagged with a "Backup" tag (the tag name can be
152 | customized).
153 | The tag should contain the number of days the snapshot should be retained for - after that date, it will be deleted the
154 | next time this Lambda is executed.
155 | 
156 | ### Notes
157 | - Encrypted volumes' snapshots will retain the encryption and use the same encryption key.
158 | - Unencrypted volumes' snapshots will remain unencrypted.
159 | - The default retention period is 7 days (can be changed in the Lambda code, see below).
160 | - The Lambda can be run multiple times a day if needed; it will NOT create duplicate snapshots on the same day.
161 | - Tags from the EC2 instance will be copied to the snapshot (except the "Backup" tag), and a new tag "CreatedBy" will be added
162 | with this Lambda's name.
163 | - If you have a lot of instances to snapshot, you may need to extend the Lambda execution time (or schedule it to be
164 | executed multiple times a day).
165 | 
166 | ### Guide
167 | 
168 | #### How to use for the first time
169 | 1. 
Download the [ebs-snapshots.py](https://raw.githubusercontent.com/pbudzon/aws-maintenance/master/ebs-snapshots.py)
170 | file from this repository and zip it into a file called `ebs-snapshots.zip` (for example: `zip ebs-snapshots.zip
171 | ebs-snapshots.py`).
172 | 1. Upload the ZIP file to an S3 bucket on your AWS account.
173 | 1. Create a new CloudFormation stack using the template: `infrastructure/templates/create-ebs-snapshots.json`.
174 | 1. CloudFormation will ask you for the following parameters:
175 |    - Required: **Name of S3 bucket** - name of the S3 bucket where you uploaded the ZIP in the earlier step.
176 |    - Required: **Name of ZIP file** - name of the ZIP file you uploaded to the S3 bucket. If you uploaded it into a directory,
177 |    provide the path to the file in S3 (for example `lambda_code/ebs-snapshots.zip`).
178 | 1. Create the stack.
179 | 1. Add a tag called "Backup" to some instances, with the number of days (or 0) you want to retain their snapshots for as
180 | the tag's value.
181 | 1. That's it! A CloudWatch Events rule will be created that will trigger the Lambda once a day. You can
182 | also trigger it manually from the Lambda console.
183 | 
184 | #### How to update to the latest version
185 | Follow the first-time setup steps above, but name the zip file something other than before - for example, if you uploaded `ebs-snapshots.zip`,
186 | upload the new file as `ebs-snapshots-1.zip`. Update your CloudFormation stack with the latest template from this repo,
187 | and provide that new ZIP file name in the *Name of ZIP file* parameter.
188 | 
189 | #### How to test
190 | Trigger the Lambda from the console. Any (even empty) input will do; it will be ignored. Output from the Lambda will
191 | list the tagged EC2 instances found and which EBS snapshots were created.
192 | 
193 | #### How to modify names of tags used by code or default retention period
194 | Near the top of the `ebs-snapshots.py` file, the following variables are defined, which you can change as needed:
195 | - `DEFAULT_RETENTION` - number of days the snapshots are retained for if the "Backup" tag value is zero (default: 7).
196 | - `BACKUP_TAG` - name of the tag on EC2 instances the code will look for (default: "Backup").
197 | - `DELETE_ON_TAG` - name of the tag with the deletion date that will be added to snapshots (default: "DeleteOn"). Important:
198 | if you change this AFTER some snapshots were already created with the previous name, those snapshots will not be deleted
199 | when their date is reached. Either update the tag name assigned to them, or delete them manually.
200 | 
201 | After changing those values, follow the update guide above to deploy your new code.
202 | 
203 | ### Related blog posts
204 | - [Complete code: Automated EC2 snapshots and retention management](https://mysteriouscode.io/blog/complete-code-automated-ec2-snapshots-and-retention-management/)
205 | 
206 | ## Monitor CloudTrail events (cloudtrail-monitor.py)
207 | 
208 | Lambda function which monitors CloudTrail logs and sends an SNS notification on the `RunInstances` event.
209 | This can be modified to look for and respond to any AWS API calls as needed.
210 | 
211 | Use the `infrastructure/templates/cloudtrail-notifications.json` CloudFormation template to create the Lambda,
212 | CloudTrail and SNS topics. In the Outputs of the CloudFormation
213 | stack, you'll find the SNS topic to which you can subscribe to receive the notifications.
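
For example, a boto3 sketch that looks that topic up from the stack outputs and subscribes an email address (the stack name and address are placeholders):

```python
import boto3

# Find the SNSNotificationTopic output of the stack (stack name is a placeholder)
stacks = boto3.client("cloudformation").describe_stacks(StackName="cloudtrail-notifications")
outputs = stacks["Stacks"][0]["Outputs"]
topic_arn = next(o["OutputValue"] for o in outputs if o["OutputKey"] == "SNSNotificationTopic")

# Subscribe an email address to receive the alerts
boto3.client("sns").subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="you@example.com")
```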
214 | 
215 | 
216 | ## Other Lambdas
217 | 
218 | The Lambdas below can be created by using the `infrastructure/templates/maintenace-lambdas.json` CloudFormation template.
219 | 
220 | They are provided as examples - review and adjust them to your needs.
221 | 
222 | ### clean-base-images.py and clean-release-images.py
223 | 
224 | Remove AMIs from eu-west-1 (Ireland) and eu-central-1 (Frankfurt) based on different tags.
225 | 
226 | Meant to be used as part of an immutable infrastructure, where each project has a base AMI (tagged with `Type=BaseImage`)
227 | and each release is contained within a new AMI based on it (tagged with `Type=ReleaseImage`).
228 | 
229 | Assumptions:
230 | 
231 | 1. Base images are stored in Ireland. Release images are stored in Ireland and Frankfurt (as backups).
232 | 1. Apart from the `Type` tag, each AMI has a `Project` tag, which can contain any value.
233 | 
234 | These scripts make sure only a certain number of recent images is kept for each project, to limit storage costs.
235 | 
236 | ### clean-es-indices.py
237 | 
238 | Removes old CloudWatch indices inside AWS ElasticSearch Service. Useful when using CloudWatch log streaming into
239 | ElasticSearch.
240 | 
241 | Configure the list of accounts, the ElasticSearch endpoint and the number of recent indices to keep inside the code.
242 | 
--------------------------------------------------------------------------------
/backup-rds.py:
--------------------------------------------------------------------------------
1 | # The MIT License (MIT)
2 | #
3 | # Copyright (c) 2016 Paulina Budzoń
4 | #
5 | # Permission is hereby granted, free of charge, to any person obtaining a copy
6 | # of this software and associated documentation files (the "Software"), to deal
7 | # in the Software without restriction, including without limitation the rights
8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | # copies of the Software, and to permit persons to whom the Software is
10 | # furnished to do so, subject to the following conditions:
11 | #
12 | # The above copyright notice and this permission notice shall be included in all
13 | # copies or substantial portions of the Software.
14 | #
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | # SOFTWARE.
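# Configuration is read from environment variables: SOURCE_REGION and
# TARGET_REGION select the boto3 clients below, KMS_KEY_ID (optional)
# re-encrypts snapshot copies in the target region, and CLUSTERS_TO_USE
# (optional, comma-delimited) limits which Aurora clusters lambda_handler
# processes.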
22 | 
23 | import json
24 | import operator
25 | import os
26 | 
27 | import boto3
28 | import botocore
29 | 
30 | # Env variables
31 | SOURCE_REGION = os.environ.get("SOURCE_REGION")
32 | TARGET_REGION = os.environ.get("TARGET_REGION")
33 | KMS_KEY_ID = os.environ.get("KMS_KEY_ID", "")
34 | 
35 | # Global clients
36 | SOURCE_CLIENT = boto3.client("rds", SOURCE_REGION)
37 | TARGET_CLIENT = boto3.client("rds", TARGET_REGION)
38 | 
39 | 
40 | def get_snapshots_list(response, is_aurora):
41 |     """
42 |     Simplifies list of snapshots by retaining snapshot name and creation time only
43 |     :param response: dict Output from describe_db_snapshots or describe_db_cluster_snapshots
44 |     :param is_aurora: bool True if output is from describe_db_cluster_snapshots, False otherwise
45 |     :return: Dict with snapshot id as key and snapshot creation time as value
46 |     """
47 |     snapshots = {}
48 | 
49 |     response_list_key = "DBClusterSnapshots" if is_aurora else "DBSnapshots"
50 |     identifier_list_key = "DBClusterSnapshotIdentifier" if is_aurora else "DBSnapshotIdentifier"
51 |     for snapshot in response[response_list_key]:
52 |         if snapshot["Status"] != "available":
53 |             continue
54 | 
55 |         snapshots[snapshot[identifier_list_key]] = snapshot["SnapshotCreateTime"]
56 | 
57 |     return snapshots
58 | 
59 | 
60 | def print_encryption_info(source_snapshot_arn, is_aurora):
61 |     """
62 |     Checks encryption settings for the snapshot copy and prints info; raises if the snapshot is encrypted but no KMS key was provided.
63 |     :param source_snapshot_arn: string ARN of the source snapshot
64 |     :param is_aurora: bool True if it's an Aurora cluster snapshot, False otherwise
65 |     :return: None
66 |     """
67 |     if is_aurora:
68 |         snapshot_details = SOURCE_CLIENT.describe_db_cluster_snapshots(
69 |             DBClusterSnapshotIdentifier=source_snapshot_arn,
70 |         )
71 |     else:
72 |         snapshot_details = SOURCE_CLIENT.describe_db_snapshots(
73 |             DBSnapshotIdentifier=source_snapshot_arn,
74 |         )
75 | 
76 |     # No key, but snapshot is encrypted
77 |     if KMS_KEY_ID == "" and (
78 |             (not is_aurora and snapshot_details["DBSnapshots"][0]["Encrypted"]) or
79 |             (is_aurora and snapshot_details["DBClusterSnapshots"][0]["StorageEncrypted"])
80 |     ):
81 |         raise Exception(
82 |             "Snapshot is encrypted, but no encryption key specified for copy! " +
83 |             "Set KMS Key ID parameter in CloudFormation stack")
84 | 
85 |     # Key provided, but snapshot not encrypted (notice only)
86 |     if KMS_KEY_ID != "" and (
87 |             (not is_aurora and not snapshot_details["DBSnapshots"][0]["Encrypted"]) or
88 |             (is_aurora and not snapshot_details["DBClusterSnapshots"][0]["StorageEncrypted"])
89 |     ):
90 |         print("Snapshot is not encrypted, but KMS key specified - copy WILL BE encrypted")
91 | 
92 | 
93 | def get_clusters(clusters_to_use):
94 |     """
95 |     Gets a list of Aurora clusters and matches it against the CLUSTERS_TO_USE env variable (if provided).
96 |     :param clusters_to_use: List of cluster names
97 |     :return: List of Aurora cluster names that match CLUSTERS_TO_USE (or all, if CLUSTERS_TO_USE is empty)
98 |     """
99 |     clusters = []
100 |     clusters_list = SOURCE_CLIENT.describe_db_clusters()
101 |     for cluster in clusters_list['DBClusters']:
102 |         if (clusters_to_use and cluster['DBClusterIdentifier'] in clusters_to_use) or (not clusters_to_use):
103 |             clusters.append(cluster['DBClusterIdentifier'])
104 | 
105 |     return clusters
106 | 
107 | 
108 | def copy_latest_snapshot(account_id, instance_name, is_aurora):
109 |     """
110 |     Finds the latest snapshot for a given RDS instance/Aurora cluster and copies it to the target region.
111 |     :param account_id: int ID of the current AWS account
112 |     :param instance_name: string Name of the instance/cluster
113 |     :param is_aurora: bool True if instance_name is the name of an Aurora cluster, False otherwise
114 |     :return: None
115 |     :raises Exception if instance/cluster has no automated snapshots or the copy operation fails
116 |     """
117 | 
118 |     # Get a list of automated snapshots for this database
119 |     if is_aurora:
120 |         response = SOURCE_CLIENT.describe_db_cluster_snapshots(
121 |             DBClusterIdentifier=instance_name,
122 |             SnapshotType="automated"
123 |         )
124 |         if len(response["DBClusterSnapshots"]) == 0:
125 |             raise Exception("No automated snapshots found for cluster " + instance_name)
126 |     else:
127 |         response = SOURCE_CLIENT.describe_db_snapshots(
128 |             DBInstanceIdentifier=instance_name,
129 |             SnapshotType="automated"
130 |         )
131 | 
132 |         if len(response["DBSnapshots"]) == 0:
133 |             raise Exception("No automated snapshots found for database " + instance_name)
134 | 
135 |     # Order the list of snapshots by creation time
136 |     snapshots = get_snapshots_list(response, is_aurora)
137 | 
138 |     # Get the latest snapshot
139 |     snapshot_name, snapshot_time = sorted(snapshots.items(), key=operator.itemgetter(1)).pop()
140 |     print("Latest snapshot found: '{}' from {}".format(snapshot_name, snapshot_time))
141 |     copy_name = "{}-{}-{}".format(instance_name, SOURCE_REGION, snapshot_name.replace(":", "-"))
142 |     print("Checking if '{}' exists in target region".format(copy_name))
143 | 
144 |     # Look for the copy_name snapshot in target region
145 |     try:
146 |         if is_aurora:
147 |             TARGET_CLIENT.describe_db_cluster_snapshots(
148 |                 DBClusterSnapshotIdentifier=copy_name
149 |             )
150 |         else:
151 |             TARGET_CLIENT.describe_db_snapshots(
152 |                 DBSnapshotIdentifier=copy_name
153 |             )
154 | 
155 |         print("{} is already copied to {}".format(copy_name, TARGET_REGION))
156 |     except botocore.exceptions.ClientError as e:
157 |         if e.response["Error"]["Code"] in ("DBSnapshotNotFound", "DBClusterSnapshotNotFoundFault"):
158 |             snapshot_arn_name = "cluster-snapshot" if is_aurora else "snapshot"
159 |             source_snapshot_arn = "arn:aws:rds:{}:{}:{}:{}".format(SOURCE_REGION, account_id, snapshot_arn_name,
160 |                                                                    snapshot_name)
161 | 
162 |             print_encryption_info(source_snapshot_arn, is_aurora)
163 | 
164 |             # Trigger a copy operation
165 |             if is_aurora:
166 |                 response_list_key = "DBClusterSnapshot"
167 |                 response = TARGET_CLIENT.copy_db_cluster_snapshot(
168 |                     SourceDBClusterSnapshotIdentifier=source_snapshot_arn,
169 |                     TargetDBClusterSnapshotIdentifier=copy_name,
170 |                     CopyTags=True,
171 |                     KmsKeyId=KMS_KEY_ID,
172 |                     SourceRegion=SOURCE_REGION
173 |                 )
174 |             else:
175 |                 response_list_key = "DBSnapshot"
176 |                 response = TARGET_CLIENT.copy_db_snapshot(
177 |                     SourceDBSnapshotIdentifier=source_snapshot_arn,
178 |                     TargetDBSnapshotIdentifier=copy_name,
179 |                     CopyTags=True,
180 |                     KmsKeyId=KMS_KEY_ID,
181 |                     SourceRegion=SOURCE_REGION  # Ref: https://github.com/boto/botocore/issues/1273
182 |                 )
183 | 
184 |             # Check the status of the copy
185 |             if response[response_list_key]["Status"] not in ("pending", "available", "copying"):
186 |                 raise Exception("Copy operation for {} failed!".format(copy_name))
187 | 
188 |             print("Copied {} to {}".format(copy_name, TARGET_REGION))
189 |             return
190 |         else:  # Another error happened, re-raise
191 |             raise e
192 | 
193 | 
194 | def remove_old_snapshots(instance_name, is_aurora):
195 |     """
196 |     Finds previously-copied snapshots for a given RDS instance/Aurora cluster in the target region and keeps only the latest one.
197 | :param instance_name: string Name of the instance/cluster 198 | :param is_aurora: bool True if instance_name is name of Aurora cluster, False otherwise 199 | :return: None 200 | :raises Exception if instance/cluster has no snapshots in target region 201 | """ 202 | 203 | # Get a list of all snapshots for this database in target region 204 | if is_aurora: 205 | response = TARGET_CLIENT.describe_db_cluster_snapshots( 206 | SnapshotType="manual", 207 | DBClusterIdentifier=instance_name 208 | ) 209 | 210 | if len(response["DBClusterSnapshots"]) == 0: 211 | raise Exception("No snapshots for cluster {} found in target region".format(instance_name)) 212 | else: 213 | response = TARGET_CLIENT.describe_db_snapshots( 214 | SnapshotType="manual", 215 | DBInstanceIdentifier=instance_name 216 | ) 217 | 218 | if len(response["DBSnapshots"]) == 0: 219 | raise Exception("No snapshots for database {} found in target region".format(instance_name)) 220 | 221 | # List the snapshots by time created 222 | snapshots = get_snapshots_list(response, is_aurora) 223 | 224 | # Sort snapshots by time and get all other than the latest one 225 | if len(snapshots) > 1: 226 | sorted_snapshots = sorted(snapshots.items(), key=operator.itemgetter(1), reverse=True) 227 | snapshots_to_remove = [i[0] for i in sorted_snapshots[1:]] 228 | print("Found {} snapshot(s) to remove".format(len(snapshots_to_remove))) 229 | 230 | # Remove the snapshots 231 | for snapshot in snapshots_to_remove: 232 | print("Removing {}".format(snapshot)) 233 | if is_aurora: 234 | TARGET_CLIENT.delete_db_cluster_snapshot( 235 | DBClusterSnapshotIdentifier=snapshot 236 | ) 237 | else: 238 | TARGET_CLIENT.delete_db_snapshot( 239 | DBSnapshotIdentifier=snapshot 240 | ) 241 | else: 242 | print("No old snapshots to remove in target region") 243 | 244 | 245 | def lambda_handler(event, context): 246 | account_id = context.invoked_function_arn.split(":")[4] 247 | 248 | # Scheduled event for Aurora 249 | if 'source' in event and event['source'] == "aws.events": 250 | clusters_to_use = os.environ.get("CLUSTERS_TO_USE", None) 251 | if clusters_to_use: 252 | clusters_to_use = clusters_to_use.split(",") 253 | clusters = get_clusters(clusters_to_use) 254 | 255 | if len(clusters) == 0: 256 | raise Exception("No matching clusters found") 257 | 258 | for cluster in clusters: 259 | copy_latest_snapshot(account_id, cluster, True) 260 | remove_old_snapshots(cluster, True) 261 | 262 | else: # Assume SNS about instance backup 263 | message = json.loads(event["Records"][0]["Sns"]["Message"]) 264 | 265 | # Check that event reports backup has finished 266 | event_id = message["Event ID"].split("#") 267 | if event_id[1] == "RDS-EVENT-0002": 268 | copy_latest_snapshot(account_id, message["Source ID"], False) 269 | remove_old_snapshots(message["Source ID"], False) 270 | -------------------------------------------------------------------------------- /clean-base-images.py: -------------------------------------------------------------------------------- 1 | import boto3 2 | import operator 3 | 4 | 5 | def lambda_handler(event, context): 6 | LIMIT = 10 7 | client = boto3.client('ec2', 'eu-west-1') 8 | 9 | response = client.describe_images( 10 | Owners=['self'], 11 | Filters=[{'Name': 'tag:Type', 'Values': ['BaseImage']}] 12 | ) 13 | 14 | if len(response['Images']) == 0: 15 | raise Exception('no AMIs with Type=BaseImage tag found') 16 | 17 | images = {} 18 | for image in response['Images']: 19 | for tag in image['Tags']: 20 | if tag['Key'] == "Project": 21 | if tag['Value'] not in 
images.keys(): 22 | images[tag['Value']] = {} 23 | images[tag['Value']][image['ImageId']] = image['CreationDate'] 24 | break 25 | 26 | to_remove = [] 27 | for project in images: 28 | sorted_x = sorted(images[project].items(), key=operator.itemgetter(1), reverse=True) 29 | if len(sorted_x) > LIMIT: 30 | to_remove = to_remove + [i[0] for i in sorted_x[LIMIT:]] 31 | 32 | if len(to_remove) == 0: 33 | print("Nothing to do") 34 | return 0 35 | 36 | print("Will remove " + str(len(to_remove)) + " images") 37 | 38 | for ami in to_remove: 39 | print("Removing: " + ami) 40 | client.deregister_image(ImageId=ami) 41 | 42 | 43 | if __name__ == '__main__': 44 | lambda_handler(None, None) 45 | -------------------------------------------------------------------------------- /clean-es-indices.py: -------------------------------------------------------------------------------- 1 | import os 2 | import datetime 3 | import hashlib 4 | import hmac 5 | import urllib2 6 | import json 7 | 8 | ENDPOINTS_ACCOUNTS = { 9 | 'account-1': 'elastic-search-endpoint', 10 | 'account-2': 'elastic-search-endpoint', 11 | } 12 | 13 | THRESHOLD_ACCOUNTS = { 14 | 'account-1': 20, 15 | 'account-2': 60 16 | } 17 | 18 | 19 | def sign(key, msg): 20 | return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest() 21 | 22 | 23 | def getSignatureKey(key, dateStamp, regionName, serviceName): 24 | kDate = sign(('AWS4' + key).encode('utf-8'), dateStamp) 25 | kRegion = sign(kDate, regionName) 26 | kService = sign(kRegion, serviceName) 27 | kSigning = sign(kService, 'aws4_request') 28 | return kSigning 29 | 30 | 31 | def get_signature(endpoint, method, canonical_uri): 32 | region = 'eu-west-1' 33 | service = 'es' 34 | access_key = os.environ.get('AWS_ACCESS_KEY_ID') 35 | secret_key = os.environ.get('AWS_SECRET_ACCESS_KEY') 36 | session_key = os.environ.get('AWS_SESSION_TOKEN') 37 | t = datetime.datetime.utcnow() 38 | amzdate = t.strftime('%Y%m%dT%H%M%SZ') 39 | datestamp = t.strftime('%Y%m%d') 40 | canonical_querystring = '' 41 | canonical_headers = 'host:' + endpoint + '\nx-amz-date:' + amzdate + '\nx-amz-security-token:' + session_key + "\n" 42 | signed_headers = 'host;x-amz-date;x-amz-security-token' 43 | payload_hash = hashlib.sha256('').hexdigest() 44 | canonical_request = method + '\n' + canonical_uri + '\n' + canonical_querystring + '\n' + canonical_headers + '\n' + signed_headers + '\n' + payload_hash 45 | algorithm = 'AWS4-HMAC-SHA256' 46 | credential_scope = datestamp + '/' + region + '/' + service + '/' + 'aws4_request' 47 | string_to_sign = algorithm + '\n' + amzdate + '\n' + credential_scope + '\n' + hashlib.sha256( 48 | canonical_request).hexdigest() 49 | signing_key = getSignatureKey(secret_key, datestamp, region, service) 50 | signature = hmac.new(signing_key, (string_to_sign).encode('utf-8'), hashlib.sha256).hexdigest() 51 | authorization_header = algorithm + ' ' + 'Credential=' + access_key + '/' + credential_scope + ', ' + 'SignedHeaders=' + signed_headers + ', ' + 'Signature=' + signature 52 | headers = {'x-amz-date': amzdate, 'x-amz-security-token': session_key, 'Authorization': authorization_header} 53 | request_url = 'https://' + endpoint + canonical_uri + '?' 
+ canonical_querystring
54 | 
55 |     return {'url': request_url, 'headers': headers}
56 | 
57 | 
58 | def lambda_handler(event, context):
59 |     INDEXPREFIX = 'cwl-'
60 | 
61 |     if 'account' in event:
62 |         if event['account'] not in ENDPOINTS_ACCOUNTS.keys():
63 |             raise Exception("No endpoint configured for account " + str(event['account']))
64 |         ENDPOINT = ENDPOINTS_ACCOUNTS[event['account']]
65 |         TOLEAVE = THRESHOLD_ACCOUNTS[event['account']]
66 |     else:
67 |         raise Exception("No account specified in event")
68 | 
69 |     response = json.loads(get_index_list(ENDPOINT))
70 |     indexes = []
71 |     for index in response:
72 |         if index.startswith(INDEXPREFIX):
73 |             indexes.append(index)
74 | 
75 |     indexes.sort(reverse=True)
76 |     to_remove = indexes[TOLEAVE:]
77 |     for index in to_remove:
78 |         print("Removing " + index)
79 |         delete_index(ENDPOINT, index)
80 | 
81 | 
82 | def delete_index(endpoint, index):
83 |     info = get_signature(endpoint, 'DELETE', '/' + index)
84 | 
85 |     opener = urllib2.build_opener(urllib2.HTTPHandler)
86 |     request = urllib2.Request(info['url'], headers=info['headers'])
87 |     request.get_method = lambda: 'DELETE'
88 | 
89 |     r = opener.open(request)
90 |     if r.getcode() != 200:
91 |         raise Exception("Non 200 response when calling, got: " + str(r.getcode()))
92 | 
93 | 
94 | def get_index_list(endpoint):
95 |     info = get_signature(endpoint, 'GET', '/_aliases')
96 | 
97 |     request = urllib2.Request(info['url'], headers=info['headers'])
98 |     r = urllib2.urlopen(request)
99 |     if r.getcode() != 200:
100 |         raise Exception("Non 200 response when calling, got: " + str(r.getcode()))
101 | 
102 |     return r.read()
103 | 
104 | 
105 | if __name__ == '__main__':
106 |     lambda_handler({'account': 'account-1'}, None)
107 | 
--------------------------------------------------------------------------------
/clean-release-images.py:
--------------------------------------------------------------------------------
1 | import boto3
2 | import operator
3 | 
4 | 
5 | def clean_images(region, limit):
6 |     client = boto3.client('ec2', region)
7 | 
8 |     response = client.describe_images(
9 |         Owners=['self'],
10 |         Filters=[{'Name': 'tag:Type', 'Values': ['ReleaseImage']}]
11 |     )
12 | 
13 |     if len(response['Images']) == 0:
14 |         raise Exception('no AMIs with Type=ReleaseImage tag found')
15 | 
16 |     images = {}
17 |     for image in response['Images']:
18 |         for tag in image['Tags']:
19 |             if tag['Key'] == "Project":
20 |                 if tag['Value'] not in images.keys():
21 |                     images[tag['Value']] = {}
22 |                 images[tag['Value']][image['ImageId']] = image['CreationDate']
23 |                 break
24 | 
25 |     to_remove = []
26 |     for project in images:
27 |         sorted_x = sorted(images[project].items(), key=operator.itemgetter(1), reverse=True)
28 |         if len(sorted_x) > limit:
29 |             to_remove = to_remove + [i[0] for i in sorted_x[limit:]]
30 | 
31 |     if len(to_remove) == 0:
32 |         print("Nothing to do")
33 |         return 0
34 | 
35 |     print("Will remove " + str(len(to_remove)) + " images")
36 | 
37 |     for ami in to_remove:
38 |         print("Removing: " + ami)
39 |         client.deregister_image(ImageId=ami)
40 | 
41 | 
42 | def lambda_handler(event, context):
43 |     clean_images('eu-west-1', 50)
44 |     clean_images('eu-central-1', 1)
45 | 
46 | 
47 | if __name__ == '__main__':
48 |     lambda_handler(None, None)
49 | 
--------------------------------------------------------------------------------
/cloudtrail-monitor.py:
--------------------------------------------------------------------------------
1 | import json
2 | import boto3
3 | import gzip
4 | 
5 | 
6 | def lambda_handler(event, context):
7 |     sns_topic = None
8 | 
9 |     info = 
boto3.client('lambda').get_function( 10 | FunctionName=context.function_name 11 | ) 12 | 13 | iam = boto3.client('iam') 14 | role_name = info['Configuration']['Role'].split('/')[1] 15 | 16 | policies = iam.list_role_policies( 17 | RoleName=role_name 18 | ) 19 | 20 | for policy in policies['PolicyNames']: 21 | details = iam.get_role_policy( 22 | RoleName=role_name, 23 | PolicyName=policy 24 | ) 25 | 26 | for statement in details['PolicyDocument']['Statement']: 27 | for action in statement['Action']: 28 | if action == 'sns:publish': 29 | sns_topic = statement['Resource'] 30 | break 31 | 32 | if sns_topic is None: 33 | raise Exception("Could not find SNS topic for notifications!") 34 | 35 | sns = boto3.client('sns') 36 | 37 | if 'Records' not in event: 38 | raise Exception("Invalid message received!") 39 | 40 | for record in event['Records']: 41 | if 'Message' not in record['Sns']: 42 | print(record) 43 | raise Exception("Invalid record!") 44 | 45 | message = json.loads(record['Sns']['Message']) 46 | 47 | if 's3Bucket' not in message or 's3ObjectKey' not in message: 48 | raise Exception("s3Bucket or s3ObjectKey missing from Message!") 49 | 50 | s3 = boto3.resource('s3') 51 | 52 | for s3key in message['s3ObjectKey']: 53 | s3.meta.client.download_file(message['s3Bucket'], s3key, '/tmp/s3file.json.gz') 54 | 55 | with gzip.open('/tmp/s3file.json.gz', 'rb') as f: 56 | file_content = json.loads(f.read()) 57 | for record in file_content['Records']: 58 | if record['eventSource'] == "ec2.amazonaws.com" and record['eventName'] == 'RunInstances': 59 | print(record) 60 | for topic in sns_topic: 61 | sns.publish( 62 | TopicArn=topic, 63 | Message=json.dumps(record), 64 | Subject="RunInstances invoked at " + record['eventTime'] 65 | ) 66 | 67 | 68 | if __name__ == '__main__': 69 | lambda_handler({ 70 | "Records": [{ 71 | "Sns": { 72 | "Message": "{\"s3Bucket\":\"cloudtrail-xxx\",\"s3ObjectKey\":[\"AWSLogs/xxx/CloudTrail/ap-northeast-1/2016/06/15/abc.json.gz\"]}" 73 | } 74 | }] 75 | }, None) 76 | -------------------------------------------------------------------------------- /ebs-snapshots.py: -------------------------------------------------------------------------------- 1 | # The MIT License (MIT) 2 | # 3 | # Copyright (c) 2016 Paulina Budzoń 4 | # 5 | # Permission is hereby granted, free of charge, to any person obtaining a copy 6 | # of this software and associated documentation files (the "Software"), to deal 7 | # in the Software without restriction, including without limitation the rights 8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | # copies of the Software, and to permit persons to whom the Software is 10 | # furnished to do so, subject to the following conditions: 11 | # 12 | # The above copyright notice and this permission notice shall be included in all 13 | # copies or substantial portions of the Software. 14 | # 15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | # SOFTWARE. 
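# Overview: find EC2 instances carrying the BACKUP_TAG tag, snapshot each of
# their attached EBS volumes at most once per day, copy the instance tags onto
# the snapshot together with a DELETE_ON_TAG deletion date, and remove any
# snapshots whose deletion date has passed.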
22 | 
23 | import datetime
24 | 
25 | import boto3
26 | 
27 | EC2_CLIENT = boto3.client("ec2")
28 | EC2_RESOURCE = boto3.resource("ec2")
29 | TODAY = datetime.date.today()
30 | 
31 | # How long to keep backups for by default
32 | DEFAULT_RETENTION = 7
33 | # Name of the tag indicating which instances to backup
34 | BACKUP_TAG = "Backup"
35 | # Name of the tag indicating deletion date for snapshots
36 | DELETE_ON_TAG = "DeleteOn"
37 | 
38 | 
39 | def get_retention_period(instance):
40 |     """
41 |     Finds the "Backup" tag in the list of tags or returns the default period (7 days)
42 |     :param instance: dict Dictionary output with instance details from describe_instances call
43 |     :return: Retention period for that instance
44 |     """
45 |     for tag in instance["Tags"]:
46 |         if tag["Key"] == BACKUP_TAG:
47 |             days = int(tag["Value"])
48 |             if days > 0:
49 |                 return days
50 |             else:
51 |                 print("Retention period of {} makes no sense, using default".format(days))
52 | 
53 |     return DEFAULT_RETENTION  # default
54 | 
55 | 
56 | def find_delete_tag(tags):
57 |     """
58 |     Finds the "DeleteOn" tag within the list of tags and returns it as a date object
59 |     :param tags: List of tags from a describe_instances call
60 |     :return: None if the tag was not found, or a date object
61 |     """
62 |     delete_date = None
63 |     if tags:
64 |         for tag in tags:
65 |             if tag["Key"] == DELETE_ON_TAG:
66 |                 delete_date = datetime.datetime.strptime(tag["Value"], "%Y-%m-%d").date()
67 | 
68 |     return delete_date
69 | 
70 | 
71 | def is_already_snapshoted(volume):
72 |     """
73 |     Checks if the volume already had a snapshot created by us today
74 |     :param volume: ec2.Volume object from boto3 for the volume in question
75 |     :return: True/False whether the snapshot exists
76 |     """
77 |     snapshots = volume.snapshots.all()
78 |     for snapshot in snapshots:
79 |         if snapshot.start_time.date() == TODAY and snapshot.state in ("pending", "completed") and \
80 |                 find_delete_tag(snapshot.tags) is not None:
81 |             return True
82 | 
83 |     return False
84 | 
85 | 
86 | def create_snapshots(context):
87 |     """
88 |     Find instances to backup and create their snapshots
89 |     :param context: Lambda context object
90 |     """
91 |     paginator = EC2_CLIENT.get_paginator("describe_instances")
92 | 
93 |     response_iterator = paginator.paginate(
94 |         Filters=[
95 |             {"Name": "tag-key", "Values": [BACKUP_TAG]},
96 |         ]
97 |     )
98 | 
99 |     for instances in response_iterator:
100 |         for reservations in instances["Reservations"]:
101 |             for instance in reservations["Instances"]:
102 |                 for device in instance["BlockDeviceMappings"]:
103 |                     # Look at every EBS volume attached to this instance
104 |                     if "Ebs" in device:
105 |                         # Get volume and check if snapshot already exists
106 |                         volume = EC2_RESOURCE.Volume(device["Ebs"]["VolumeId"])
107 |                         if is_already_snapshoted(volume):
108 |                             print("Already done today: volume {} on instance {}, skipping".format(volume.id, instance[
109 |                                 "InstanceId"]))
110 |                             continue
111 | 
112 |                         print("Found EBS volume {} on instance {}".format(volume.id, instance["InstanceId"]))
113 | 
114 |                         # Create the snapshot
115 |                         snapshot = volume.create_snapshot(
116 |                             Description="Snapshot from instance {}".format(instance["InstanceId"])
117 |                         )
118 | 
119 |                         # Get how many days we should keep this snapshot for
120 |                         retention_days = get_retention_period(instance)
121 | 
122 |                         # Get instance tags and remove the "Backup" tag
123 |                         tags = instance["Tags"]
124 |                         for tag in tags:
125 |                             if tag["Key"] == BACKUP_TAG:
126 |                                 tags.remove(tag)
127 |                                 break
128 | 
129 |                         # Find date when to delete and add the tag to the list
130 |                         delete_date
= datetime.date.today() + datetime.timedelta(days=retention_days) 131 | tags.append( 132 | { 133 | "Key": DELETE_ON_TAG, 134 | "Value": delete_date.strftime("%Y-%m-%d") 135 | } 136 | ) 137 | # Add function name to the tags for reference who created the snapshot 138 | tags.append( 139 | { 140 | "Key": "CreatedBy", 141 | "Value": context.function_name 142 | } 143 | ) 144 | 145 | # Apply all those tags to the snapshot 146 | snapshot.create_tags(Tags=tags) 147 | 148 | print("Retaining snapshot {} of volume {} from instance {} until {}".format( 149 | snapshot.id, volume.id, instance["InstanceId"], delete_date 150 | )) 151 | 152 | 153 | def remove_snapshots(): 154 | """ 155 | Find our old snapshots and remove as needed (when DeleteOn is today or earlier) 156 | """ 157 | paginator = EC2_CLIENT.get_paginator("describe_snapshots") 158 | response_iterator = paginator.paginate( 159 | Filters=[ 160 | {"Name": "tag-key", "Values": [DELETE_ON_TAG]}, 161 | ], 162 | ) 163 | 164 | for snapshots in response_iterator: 165 | for snapshot in snapshots["Snapshots"]: 166 | delete_date = find_delete_tag(snapshot["Tags"]) 167 | 168 | if delete_date is not None and delete_date <= TODAY: 169 | print("Deleting old snapshot: {}".format(snapshot["SnapshotId"])) 170 | EC2_CLIENT.delete_snapshot( 171 | SnapshotId=snapshot["SnapshotId"], 172 | ) 173 | 174 | 175 | def lambda_handler(event, context): 176 | create_snapshots(context) 177 | remove_snapshots() 178 | -------------------------------------------------------------------------------- /infrastructure/Makefile: -------------------------------------------------------------------------------- 1 | SOURCES := $(shell echo src/*.py) 2 | TARGETS := $(patsubst src/%.py,templates/%.json,$(SOURCES)) 3 | 4 | all: $(TARGETS) 5 | 6 | clean: 7 | rm -f $(TARGETS) 8 | 9 | templates/%.json: src/%.py 10 | python3 $< > $@ -------------------------------------------------------------------------------- /infrastructure/src/cloudtrail-notifications.py: -------------------------------------------------------------------------------- 1 | from troposphere import Template, GetAtt, Ref, Parameter, Join, Output 2 | from troposphere.iam import Role 3 | from troposphere.iam import Policy as IAMPolicy 4 | from troposphere.awslambda import Function, Code, Permission 5 | from troposphere.sns import Subscription, Topic, TopicPolicy 6 | from troposphere.cloudtrail import Trail 7 | from troposphere.s3 import Bucket, BucketPolicy 8 | from troposphere.cloudwatch import Alarm, MetricDimension 9 | from awacs.aws import Allow, Statement, Action, Principal, Policy, Condition, StringEquals, ArnEquals 10 | from awacs.sts import AssumeRole 11 | import os 12 | 13 | t = Template() 14 | 15 | t.add_description('Lambda function monitoring cloudtrail logs') 16 | 17 | notificationTopic = t.add_resource(Topic( 18 | "NotifcationTopic", 19 | DisplayName="CloudTrail Monitor Alerts" 20 | )) 21 | 22 | bucket = t.add_resource(Bucket( 23 | "Bucket", 24 | AccessControl="Private", 25 | BucketName=Join("-", [Ref("AWS::StackName"), Ref("AWS::AccountId")]), 26 | DeletionPolicy="Retain" 27 | )) 28 | 29 | bucket_policy = t.add_resource(BucketPolicy( 30 | "BucketPolicy", 31 | Bucket=Ref(bucket), 32 | PolicyDocument=Policy( 33 | Statement=[ 34 | Statement( 35 | Sid="AWSCloudTrailAclCheck", 36 | Effect=Allow, 37 | Action=[Action("s3", "GetBucketAcl")], 38 | Principal=Principal( 39 | "Service", ["cloudtrail.amazonaws.com"] 40 | ), 41 | Resource=[Join("", ["arn:aws:s3:::", Ref(bucket)])] 42 | ), 43 | Statement( 44 | 
Sid="AWSCloudTrailWrite", 45 | Effect=Allow, 46 | Action=[Action("s3", "PutObject")], 47 | Principal=Principal( 48 | "Service", ["cloudtrail.amazonaws.com"] 49 | ), 50 | Resource=[Join("", ["arn:aws:s3:::", Ref(bucket), "/AWSLogs/", Ref("AWS::AccountId"), "/*"])], 51 | Condition=Condition( 52 | StringEquals('s3:x-amz-acl', 'bucket-owner-full-control') 53 | ) 54 | ) 55 | ] 56 | ) 57 | )) 58 | 59 | lambda_role = t.add_resource(Role( 60 | "LambdaRole", 61 | AssumeRolePolicyDocument=Policy( 62 | Statement=[ 63 | Statement( 64 | Effect=Allow, Action=[AssumeRole], 65 | Principal=Principal( 66 | "Service", ["lambda.amazonaws.com"] 67 | ) 68 | ) 69 | ] 70 | ), 71 | Policies=[IAMPolicy( 72 | "LambdaPolicy", 73 | PolicyName="LambdaCloudtrailPolicy", 74 | PolicyDocument=Policy(Statement=[ 75 | Statement( 76 | Effect=Allow, 77 | Action=[ 78 | Action('s3', 'GetObject'), 79 | ], 80 | Resource=[Join("", ['arn:aws:s3:::', Ref(bucket), '/*'])] 81 | ), 82 | Statement( 83 | Effect=Allow, 84 | Action=[ 85 | Action('logs', 'CreateLogGroup'), 86 | Action('logs', 'CreateLogStream'), 87 | Action('logs', 'PutLogEvents'), 88 | ], 89 | Resource=['arn:aws:logs:*:*:*'] 90 | ), 91 | Statement( 92 | Effect=Allow, 93 | Action=[ 94 | Action('lambda', 'GetFunction'), 95 | ], 96 | Resource=['*'] # todo: limit this to the function itself 97 | ), 98 | Statement( 99 | Effect=Allow, 100 | Action=[ 101 | Action('sns', 'publish') 102 | ], 103 | Resource=[Ref(notificationTopic)] 104 | ), 105 | Statement( 106 | Effect=Allow, 107 | Action=[ 108 | Action('iam', 'ListRolePolicies'), 109 | Action('iam', 'GetRolePolicy') 110 | ], 111 | Resource=['*'] 112 | ), 113 | ]) 114 | )] 115 | )) 116 | 117 | source_file = os.path.realpath(__file__ + '/../../../cloudtrail-monitor.py') 118 | with open(source_file, 'r') as content_file: 119 | content = content_file.read() 120 | 121 | if len(content) > 4096: 122 | raise Exception("Function too long!") 123 | 124 | function = t.add_resource(Function( 125 | 'LambdaFunction', 126 | Description='Monitors CloudTrail', 127 | Code=Code( 128 | ZipFile=content 129 | ), 130 | Handler='index.lambda_handler', 131 | MemorySize=128, 132 | Role=GetAtt(lambda_role, 'Arn'), 133 | Runtime='python2.7', 134 | Timeout=10 135 | )) 136 | 137 | cloudtrail_topic = t.add_resource(Topic( 138 | "CloudtrailTopic", 139 | Subscription=[ 140 | Subscription( 141 | Endpoint=GetAtt(function, "Arn"), 142 | Protocol="lambda" 143 | ) 144 | ] 145 | )) 146 | 147 | lambda_permission = t.add_resource(Permission( 148 | "LambdaPermission", 149 | Action="lambda:InvokeFunction", 150 | FunctionName=Ref(function), 151 | Principal="sns.amazonaws.com", 152 | SourceAccount=Ref("AWS::AccountId"), 153 | SourceArn=Ref(cloudtrail_topic) 154 | )) 155 | 156 | t.add_resource(TopicPolicy( 157 | "CloudtrailTopicPolicy", 158 | Topics=[Ref(cloudtrail_topic)], 159 | PolicyDocument=Policy( 160 | Statement=[ 161 | Statement( 162 | Sid="AWSCloudTrailSNSPolicy", 163 | Effect=Allow, 164 | Principal=Principal( 165 | "Service", ["cloudtrail.amazonaws.com"] 166 | ), 167 | Action=[Action("sns", "publish")], 168 | Resource=[Ref(cloudtrail_topic)] 169 | ) 170 | ] 171 | ) 172 | )) 173 | 174 | cloudtrail = t.add_resource(Trail( 175 | "CloudTrail", 176 | IncludeGlobalServiceEvents=True, 177 | IsLogging=True, 178 | IsMultiRegionTrail=True, 179 | S3BucketName=Ref(bucket), 180 | SnsTopicName=Ref(cloudtrail_topic), 181 | DependsOn="BucketPolicy" 182 | )) 183 | 184 | t.add_resource(Alarm( 185 | "LambdaErrorsAlarm", 186 | ComparisonOperator='GreaterThanThreshold', 187 | 
EvaluationPeriods=1, 188 | MetricName='Errors', 189 | Namespace='AWS/Lambda', 190 | Dimensions=[ 191 | MetricDimension( 192 | Name='FunctionName', 193 | Value=Ref(function) 194 | ) 195 | ], 196 | Period=300, 197 | Statistic='Maximum', 198 | Threshold='0', 199 | AlarmActions=[Ref(notificationTopic)] 200 | )) 201 | 202 | t.add_resource(Alarm( 203 | "LambdaThrottlesAlarm", 204 | ComparisonOperator='GreaterThanThreshold', 205 | EvaluationPeriods=1, 206 | MetricName='Throttles', 207 | Namespace='AWS/Lambda', 208 | Dimensions=[ 209 | MetricDimension( 210 | Name='FunctionName', 211 | Value=Ref(function) 212 | ) 213 | ], 214 | Period=300, 215 | Statistic='Maximum', 216 | Threshold='0', 217 | AlarmActions=[Ref(notificationTopic)] 218 | )) 219 | 220 | 221 | t.add_output(Output( 222 | "SNSNotificationTopic", 223 | Description="SNS topic to which the alerts will be send", 224 | Value=Ref(notificationTopic) 225 | )) 226 | 227 | print(t.to_json()) 228 | -------------------------------------------------------------------------------- /infrastructure/src/create-ebs-snapshots.py: -------------------------------------------------------------------------------- 1 | from awacs import aws, sts 2 | from troposphere import Template, GetAtt, Ref, Parameter 3 | from troposphere import awslambda, iam, events 4 | 5 | template = Template() 6 | 7 | template.add_description("Automated EBS snapshots and retention management") 8 | 9 | s3_bucket_parameter = template.add_parameter(Parameter( 10 | "S3BucketParameter", 11 | Type="String", 12 | Description="Name of the S3 bucket where you uploaded the source code zip", 13 | )) 14 | 15 | source_zip_parameter = template.add_parameter(Parameter( 16 | "SourceZipParameter", 17 | Type="String", 18 | Default="ebs-snapshots.zip", 19 | Description="Name of the zip file inside the S3 bucket", 20 | )) 21 | 22 | template.add_metadata({ 23 | "AWS::CloudFormation::Interface": { 24 | "ParameterGroups": [ 25 | { 26 | "Label": { 27 | "default": "Basic configuration" 28 | }, 29 | "Parameters": [ 30 | "S3BucketParameter", 31 | "SourceZipParameter", 32 | ] 33 | }, 34 | ], 35 | "ParameterLabels": { 36 | "S3BucketParameter": {"default": "Name of S3 bucket"}, 37 | "SourceZipParameter": {"default": "Name of ZIP file"}, 38 | } 39 | } 40 | }) 41 | 42 | # Role for Lambda 43 | lambda_role = template.add_resource(iam.Role( 44 | "LambdaRole", 45 | AssumeRolePolicyDocument=aws.Policy( 46 | Statement=[ 47 | aws.Statement( 48 | Effect=aws.Allow, 49 | Action=[sts.AssumeRole], 50 | Principal=aws.Principal( 51 | "Service", ["lambda.amazonaws.com"] 52 | ) 53 | ) 54 | ] 55 | ), 56 | Policies=[iam.Policy( 57 | "LambdaBackupRDSPolicy", 58 | PolicyName="AccessToEC2Snapshots", 59 | PolicyDocument=aws.Policy(Statement=[ 60 | aws.Statement( 61 | Effect=aws.Allow, 62 | Action=[ 63 | aws.Action("ec2", "Describe*"), 64 | aws.Action("ec2", "CreateSnapshot"), 65 | aws.Action("ec2", "DeleteSnapshot"), 66 | aws.Action("ec2", "CreateTags"), 67 | aws.Action("ec2", "ModifySnapshotAttribute"), 68 | aws.Action("ec2", "ResetSnapshotAttribute"), 69 | ], 70 | Resource=["*"] 71 | ), 72 | aws.Statement( 73 | Effect=aws.Allow, 74 | Action=[ 75 | aws.Action("logs", "CreateLogGroup"), 76 | aws.Action("logs", "CreateLogStream"), 77 | aws.Action("logs", "PutLogEvents"), 78 | ], 79 | Resource=["arn:aws:logs:*:*:*"] 80 | ), 81 | ]) 82 | )] 83 | )) 84 | 85 | lambda_function = template.add_resource(awslambda.Function( 86 | "LambdaFunction", 87 | Description="Maintains EBS snapshots of tagged instances", 88 | Code=awslambda.Code( 89 | 
S3Bucket=Ref(s3_bucket_parameter), 90 | S3Key=Ref(source_zip_parameter), 91 | ), 92 | Handler="ebs-snapshots.lambda_handler", 93 | MemorySize=128, 94 | Role=GetAtt(lambda_role, "Arn"), 95 | Runtime="python3.6", 96 | Timeout=30 97 | )) 98 | 99 | schedule_event = template.add_resource(events.Rule( 100 | "LambdaTriggerRule", 101 | Description="Trigger EBS snapshot Lambda", 102 | ScheduleExpression="rate(1 day)", 103 | State="ENABLED", 104 | Targets=[ 105 | events.Target( 106 | Arn=GetAtt(lambda_function, "Arn"), 107 | Id="ebs-snapshot-lambda" 108 | ) 109 | ] 110 | )) 111 | 112 | # Permission for CloudWatch Events to trigger the Lambda 113 | template.add_resource(awslambda.Permission( 114 | "EventsPermissionForLambda", 115 | Action="lambda:invokeFunction", 116 | FunctionName=Ref(lambda_function), 117 | Principal="events.amazonaws.com", 118 | SourceArn=GetAtt(schedule_event, "Arn") 119 | )) 120 | 121 | print(template.to_json()) 122 | -------------------------------------------------------------------------------- /infrastructure/src/maintenace-lambdas.py: -------------------------------------------------------------------------------- 1 | from troposphere import Template, GetAtt, Ref, Parameter 2 | from troposphere.iam import Role 3 | from troposphere.iam import Policy as IAMPolicy 4 | from troposphere.awslambda import Function, Code 5 | from troposphere.cloudwatch import Alarm, MetricDimension 6 | from troposphere.sns import Subscription, Topic 7 | from awacs.aws import Allow, Statement, Action, Principal, Policy 8 | from awacs.sts import AssumeRole 9 | import os 10 | 11 | t = Template() 12 | 13 | t.add_description('Stack with Lambda function performing maintenance tasks') 14 | 15 | param_alarm_email = t.add_parameter(Parameter( 16 | "AlarmEmail", 17 | Description="Email where Lambda errors alarms should be sent to", 18 | Default="contact@example.com", 19 | Type="String", 20 | )) 21 | 22 | ec_images_role = t.add_resource(Role( 23 | "LambdaCleanImagesRole", 24 | AssumeRolePolicyDocument=Policy( 25 | Statement=[ 26 | Statement( 27 | Effect=Allow, Action=[AssumeRole], 28 | Principal=Principal( 29 | "Service", ["lambda.amazonaws.com"] 30 | ) 31 | ) 32 | ] 33 | ), 34 | Policies=[IAMPolicy( 35 | "LambdaCleanBaseImagesPolicy", 36 | PolicyName="LambdaCleanBaseImagesPolicy", 37 | PolicyDocument=Policy(Statement=[ 38 | Statement( 39 | Effect=Allow, 40 | Action=[ 41 | Action('ec2', 'DescribeImages'), 42 | Action('ec2', 'DeregisterImage'), 43 | ], 44 | Resource=['*'] 45 | ), 46 | Statement( 47 | Effect=Allow, 48 | Action=[ 49 | Action('logs', 'CreateLogGroup'), 50 | Action('logs', 'CreateLogStream'), 51 | Action('logs', 'PutLogEvents'), 52 | ], 53 | Resource=['arn:aws:logs:*:*:*'] 54 | ) 55 | ]) 56 | )] 57 | )) 58 | 59 | es_exec_role = t.add_resource(Role( 60 | "LambdaESExecRole", 61 | AssumeRolePolicyDocument=Policy( 62 | Statement=[ 63 | Statement( 64 | Effect=Allow, Action=[AssumeRole], 65 | Principal=Principal( 66 | "Service", ["lambda.amazonaws.com"] 67 | ) 68 | ) 69 | ] 70 | ), 71 | Policies=[IAMPolicy( 72 | "LambdaCleanBaseImagesPolicy", 73 | PolicyName="LambdaCleanBaseImagesPolicy", 74 | PolicyDocument=Policy(Statement=[ 75 | Statement( 76 | Effect=Allow, 77 | Action=[ 78 | Action('logs', 'CreateLogGroup'), 79 | Action('logs', 'CreateLogStream'), 80 | Action('logs', 'PutLogEvents'), 81 | ], 82 | Resource=['arn:aws:logs:*:*:*'] 83 | ), 84 | Statement( 85 | Effect=Allow, 86 | Action=[ 87 | Action('es', '*'), 88 | ], 89 | Resource=['arn:aws:es:*:*:*'] 90 | ) 91 | ]) 92 | )] 93 | )) 94 | 95 | 
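# Lambda code embedded inline through CloudFormation's ZipFile property is
# limited to 4096 bytes, hence the length checks before each Function below.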
source_file = os.path.realpath(__file__ + '/../../../clean-base-images.py') 96 | with open(source_file, 'r') as content_file: 97 | content = content_file.read() 98 | 99 | if len(content) > 4096: 100 | raise Exception("Base function too long!") 101 | 102 | base_function = t.add_resource(Function( 103 | 'LambdaBaseFunction', 104 | Description='Clears Base AMI images', 105 | Code=Code( 106 | ZipFile=content 107 | ), 108 | Handler='index.lambda_handler', 109 | MemorySize=128, 110 | Role=GetAtt(ec_images_role, 'Arn'), 111 | Runtime='python2.7', 112 | Timeout=10 113 | )) 114 | 115 | source_file = os.path.realpath(__file__ + '/../../../clean-release-images.py') 116 | with open(source_file, 'r') as content_file: 117 | content = content_file.read() 118 | 119 | if len(content) > 4096: 120 | raise Exception("Release function too long!") 121 | 122 | release_function = t.add_resource(Function( 123 | 'LambdaReleaseFunction', 124 | Description='Clears Release AMI images', 125 | Code=Code( 126 | ZipFile=content 127 | ), 128 | Handler='index.lambda_handler', 129 | MemorySize=128, 130 | Role=GetAtt(ec_images_role, 'Arn'), 131 | Runtime='python2.7', 132 | Timeout=10 133 | )) 134 | 135 | source_file = os.path.realpath(__file__ + '/../../../clean-es-indices.py') 136 | with open(source_file, 'r') as content_file: 137 | content = content_file.read() 138 | 139 | if len(content) > 4096: 140 | raise Exception("Clean ES function too long! Has " + str(len(content))) 141 | 142 | clea_es_function = t.add_resource(Function( 143 | 'LambdaCleanESFunction', 144 | Description='Removes old ElasticSearch indexes', 145 | Code=Code( 146 | ZipFile=content 147 | ), 148 | Handler='index.lambda_handler', 149 | MemorySize=128, 150 | Role=GetAtt(es_exec_role, 'Arn'), 151 | Runtime='python2.7', 152 | Timeout=60 153 | )) 154 | 155 | alarm_topic = t.add_resource(Topic( 156 | 'LambdaErrorTopic', 157 | Subscription=[Subscription( 158 | Protocol="email", 159 | Endpoint=Ref(param_alarm_email) 160 | )] 161 | )) 162 | 163 | t.add_resource(Alarm( 164 | "LambdaBaseErrorsAlarm", 165 | ComparisonOperator='GreaterThanThreshold', 166 | EvaluationPeriods=1, 167 | MetricName='Errors', 168 | Namespace='AWS/Lambda', 169 | Dimensions=[ 170 | MetricDimension( 171 | Name='FunctionName', 172 | Value=Ref(base_function) 173 | ) 174 | ], 175 | Period=300, 176 | Statistic='Maximum', 177 | Threshold='0', 178 | AlarmActions=[Ref(alarm_topic)] 179 | )) 180 | 181 | t.add_resource(Alarm( 182 | "LambdaReleaseErrorsAlarm", 183 | ComparisonOperator='GreaterThanThreshold', 184 | EvaluationPeriods=1, 185 | MetricName='Errors', 186 | Namespace='AWS/Lambda', 187 | Dimensions=[ 188 | MetricDimension( 189 | Name='FunctionName', 190 | Value=Ref(release_function) 191 | ) 192 | ], 193 | Period=300, 194 | Statistic='Maximum', 195 | Threshold='0', 196 | AlarmActions=[Ref(alarm_topic)] 197 | )) 198 | 199 | t.add_resource(Alarm( 200 | "LambdaCleanESErrorsAlarm", 201 | ComparisonOperator='GreaterThanThreshold', 202 | EvaluationPeriods=1, 203 | MetricName='Errors', 204 | Namespace='AWS/Lambda', 205 | Dimensions=[ 206 | MetricDimension( 207 | Name='FunctionName', 208 | Value=Ref(clea_es_function) 209 | ) 210 | ], 211 | Period=300, 212 | Statistic='Maximum', 213 | Threshold='0', 214 | AlarmActions=[Ref(alarm_topic)] 215 | )) 216 | 217 | t.add_resource(Alarm( 218 | "LambdaCleanEESThrottlesAlarm", 219 | ComparisonOperator='GreaterThanThreshold', 220 | EvaluationPeriods=1, 221 | MetricName='Throttles', 222 | Namespace='AWS/Lambda', 223 | Dimensions=[ 224 | MetricDimension( 225 | 
Name='FunctionName', 226 | Value=Ref(clean_es_function) 227 | ) 228 | ], 229 | Period=300, 230 | Statistic='Maximum', 231 | Threshold='0', 232 | AlarmActions=[Ref(alarm_topic)] 233 | )) 234 | 235 | print(t.to_json()) 236 | -------------------------------------------------------------------------------- /infrastructure/src/rds-cross-region-backup.py: -------------------------------------------------------------------------------- 1 | from awacs import aws, sts 2 | from troposphere import Template, GetAtt, Join, Ref, Parameter, Equals, If, AWS_NO_VALUE, AWS_REGION 3 | from troposphere import awslambda, iam, sns, rds, events 4 | 5 | template = Template() 6 | 7 | template.add_description('Resources copying RDS backups to another region') 8 | 9 | target_region_parameter = template.add_parameter(Parameter( 10 | "TargetRegionParameter", 11 | Type="String", 12 | Description="Region in which to store the copies of snapshots (for example: eu-central-1)", 13 | AllowedPattern="^[a-z]+-[a-z]+-[0-9]+$", 14 | ConstraintDescription="The target region needs to be a valid AWS region, for example: us-east-1" 15 | )) 16 | 17 | databases_to_use_parameter = template.add_parameter(Parameter( 18 | "DatabasesToUse", 19 | Type="CommaDelimitedList", 20 | Description="Optional: comma-delimited list of RDS instance names (not Aurora clusters!) to use. Leave empty to use for all instances in the source region." 21 | )) 22 | 23 | include_aurora_clusters_parameter = template.add_parameter(Parameter( 24 | "IncludeAuroraClusters", 25 | Type="String", 26 | AllowedValues=["Yes", "No"], 27 | Default="No", 28 | Description="Choose 'Yes' if you have Aurora clusters that you want to use this for; this will add a daily schedule." 29 | )) 30 | 31 | clusters_to_use_parameter = template.add_parameter(Parameter( 32 | "ClustersToUse", 33 | Type="String", 34 | Default="", 35 | Description="Optional: if including Aurora clusters - comma-delimited list of Aurora clusters to use. Leave empty to use for all clusters in the source region." 36 | )) 37 | 38 | kms_key_parameter = template.add_parameter(Parameter( 39 | "KMSKeyParameter", 40 | Type="String", 41 | Description="KMS Key ARN in target region. Required if using encrypted RDS instances, optional otherwise.",
42 | )) 43 | 44 | s3_bucket_parameter = template.add_parameter(Parameter( 45 | "S3BucketParameter", 46 | Type="String", 47 | Description="Name of the S3 bucket where you uploaded the source code zip", 48 | )) 49 | 50 | source_zip_parameter = template.add_parameter(Parameter( 51 | "SourceZipParameter", 52 | Type="String", 53 | Default="backup-rds.zip", 54 | Description="Name of the zip file inside the S3 bucket", 55 | )) 56 | 57 | 58 | template.add_condition("UseAllDatabases", Equals(Join("", Ref(databases_to_use_parameter)), "")) 59 | template.add_condition("UseEncryption", Equals(Ref(kms_key_parameter), ""))  # NB: despite the name, this is true when NO KMS key was provided - see its use in the role policy below 60 | template.add_condition("IncludeAurora", Equals(Ref(include_aurora_clusters_parameter), "Yes")) 61 | 62 | template.add_metadata({ 63 | "AWS::CloudFormation::Interface": { 64 | "ParameterGroups": [ 65 | { 66 | "Label": { 67 | "default": "Basic configuration" 68 | }, 69 | "Parameters": [ 70 | "TargetRegionParameter", 71 | "S3BucketParameter", 72 | "SourceZipParameter", 73 | ] 74 | }, 75 | { 76 | "Label": { 77 | "default": "Encryption - see https://github.com/pbudzon/aws-maintenance#encryption for details" 78 | }, 79 | "Parameters": [ 80 | "KMSKeyParameter", 81 | ] 82 | }, 83 | { 84 | "Label": { 85 | "default": "Optional: limit to specific RDS database(s)" 86 | }, 87 | "Parameters": [ 88 | "DatabasesToUse", 89 | ] 90 | }, 91 | { 92 | "Label": { 93 | "default": "Optional: Aurora support" 94 | }, 95 | "Parameters": [ 96 | "IncludeAuroraClusters", 97 | "ClustersToUse" 98 | ] 99 | }, 100 | ], 101 | "ParameterLabels": { 102 | "TargetRegionParameter": {"default": "Target region"}, 103 | "DatabasesToUse": {"default": "Databases to use for"}, 104 | "KMSKeyParameter": {"default": "KMS Key in target region"}, 105 | "IncludeAuroraClusters": {"default": "Use for Aurora clusters"}, 106 | "ClustersToUse": {"default": "Aurora clusters to use for"}, 107 | "S3BucketParameter": {"default": "Name of S3 bucket"}, 108 | "SourceZipParameter": {"default": "Name of ZIP file"}, 109 | } 110 | } 111 | }) 112 | 113 | # Role for Lambda 114 | backup_rds_role = template.add_resource(iam.Role( 115 | "LambdaBackupRDSRole", 116 | AssumeRolePolicyDocument=aws.Policy( 117 | Statement=[ 118 | aws.Statement( 119 | Effect=aws.Allow, 120 | Action=[sts.AssumeRole], 121 | Principal=aws.Principal( 122 | "Service", ["lambda.amazonaws.com"] 123 | ) 124 | ) 125 | ] 126 | ), 127 | Policies=[iam.Policy( 128 | "LambdaBackupRDSPolicy", 129 | PolicyName="AccessToRDSAndLogs", 130 | PolicyDocument=aws.Policy(Statement=[ 131 | aws.Statement( 132 | Effect=aws.Allow, 133 | Action=[ 134 | aws.Action('rds', 'DescribeDbSnapshots'), 135 | aws.Action('rds', 'CopyDbSnapshot'), 136 | aws.Action('rds', 'DeleteDbSnapshot'), 137 | aws.Action('rds', 'DeleteDbClusterSnapshot'), 138 | aws.Action('rds', 'DescribeDbClusters'), 139 | aws.Action('rds', 'DescribeDbClusterSnapshots'), 140 | aws.Action('rds', 'CopyDBClusterSnapshot'), 141 | ], 142 | Resource=['*'] 143 | ), 144 | aws.Statement( 145 | Effect=aws.Allow, 146 | Action=[ 147 | aws.Action('logs', 'CreateLogGroup'), 148 | aws.Action('logs', 'CreateLogStream'), 149 | aws.Action('logs', 'PutLogEvents'), 150 | ], 151 | Resource=['arn:aws:logs:*:*:*'] 152 | ), 153 | If( 154 | "UseEncryption", 155 | Ref(AWS_NO_VALUE), 156 | aws.Statement( 157 | Effect=aws.Allow, 158 | Action=[ 159 | aws.Action('kms', 'Create*'), # Don't ask me why this is needed... 
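# (Presumably kms:CreateGrant, which 'Create*' covers: copying an encrypted snapshot to another region re-encrypts it with the target-region key, so RDS needs to create a grant on that key.)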
160 | aws.Action('kms', 'DescribeKey'), 161 | ], 162 | Resource=[Ref(kms_key_parameter)] 163 | ), 164 | ), 165 | ]) 166 | )] 167 | )) 168 | 169 | backup_rds_function = template.add_resource(awslambda.Function( 170 | 'LambdaBackupRDSFunction', 171 | Description='Copies RDS backups to another region', 172 | Code=awslambda.Code( 173 | S3Bucket=Ref(s3_bucket_parameter), 174 | S3Key=Ref(source_zip_parameter), 175 | ), 176 | Handler='backup-rds.lambda_handler', 177 | MemorySize=128, 178 | Role=GetAtt(backup_rds_role, 'Arn'), 179 | Runtime='python3.6', 180 | Timeout=30, 181 | Environment=awslambda.Environment( 182 | Variables={ 183 | 'SOURCE_REGION': Ref(AWS_REGION), 184 | 'TARGET_REGION': Ref(target_region_parameter), 185 | 'KMS_KEY_ID': Ref(kms_key_parameter), 186 | 'CLUSTERS_TO_USE': Ref(clusters_to_use_parameter) 187 | } 188 | ) 189 | )) 190 | 191 | # SNS topic for event subscriptions 192 | rds_topic = template.add_resource(sns.Topic( 193 | 'RDSBackupTopic', 194 | Subscription=[sns.Subscription( 195 | Protocol="lambda", 196 | Endpoint=GetAtt(backup_rds_function, 'Arn'), 197 | )] 198 | )) 199 | 200 | # Event subscription - RDS will notify SNS when backup is started and finished 201 | template.add_resource(rds.EventSubscription( 202 | "RDSBackupEvent", 203 | Enabled=True, 204 | EventCategories=["backup"], 205 | SourceType="db-instance", 206 | SnsTopicArn=Ref(rds_topic), 207 | SourceIds=If("UseAllDatabases", Ref(AWS_NO_VALUE), Ref(databases_to_use_parameter)) 208 | )) 209 | 210 | # Permission for SNS to trigger the Lambda 211 | template.add_resource(awslambda.Permission( 212 | "SNSPermissionForLambda", 213 | Action="lambda:invokeFunction", 214 | FunctionName=Ref(backup_rds_function), 215 | Principal="sns.amazonaws.com", 216 | SourceArn=Ref(rds_topic) 217 | )) 218 | 219 | schedule_event = template.add_resource(events.Rule( 220 | "AuroraBackupEvent", 221 | Condition="IncludeAurora", 222 | Description="Copy Aurora clusters to another region", 223 | ScheduleExpression="rate(1 day)", 224 | State="ENABLED", 225 | Targets=[ 226 | events.Target( 227 | Arn=GetAtt(backup_rds_function, "Arn"), 228 | Id="backup_rds_function" 229 | ) 230 | ] 231 | )) 232 | 233 | # Permission for CloudWatch Events to trigger the Lambda 234 | template.add_resource(awslambda.Permission( 235 | "EventsPermissionForLambda", 236 | Condition="IncludeAurora", 237 | Action="lambda:invokeFunction", 238 | FunctionName=Ref(backup_rds_function), 239 | Principal="events.amazonaws.com", 240 | SourceArn=GetAtt(schedule_event, "Arn") 241 | )) 242 | 243 | print(template.to_json()) 244 | -------------------------------------------------------------------------------- /infrastructure/templates/cloudtrail-notifications.json: -------------------------------------------------------------------------------- 1 | { 2 | "Description": "Lambda function monitoring CloudTrail logs", 3 | "Outputs": { 4 | "SNSNotificationTopic": { 5 | "Description": "SNS topic to which the alerts will be sent", 6 | "Value": { 7 | "Ref": "NotifcationTopic" 8 | } 9 | } 10 | }, 11 | "Resources": { 12 | "Bucket": { 13 | "DeletionPolicy": "Retain", 14 | "Properties": { 15 | "AccessControl": "Private", 16 | "BucketName": { 17 | "Fn::Join": [ 18 | "-", 19 | [ 20 | { 21 | "Ref": "AWS::StackName" 22 | }, 23 | { 24 | "Ref": "AWS::AccountId" 25 | } 26 | ] 27 | ] 28 | } 29 | }, 30 | "Type": "AWS::S3::Bucket" 31 | }, 32 | "BucketPolicy": { 33 | "Properties": { 34 | "Bucket": { 35 | "Ref": "Bucket" 36 | }, 37 | "PolicyDocument": { 38 | "Statement": [ 39 | { 40 | "Action": [ 41 | 
"s3:GetBucketAcl" 42 | ], 43 | "Effect": "Allow", 44 | "Principal": { 45 | "Service": [ 46 | "cloudtrail.amazonaws.com" 47 | ] 48 | }, 49 | "Resource": [ 50 | { 51 | "Fn::Join": [ 52 | "", 53 | [ 54 | "arn:aws:s3:::", 55 | { 56 | "Ref": "Bucket" 57 | } 58 | ] 59 | ] 60 | } 61 | ], 62 | "Sid": "AWSCloudTrailAclCheck" 63 | }, 64 | { 65 | "Action": [ 66 | "s3:PutObject" 67 | ], 68 | "Condition": { 69 | "StringEquals": { 70 | "s3:x-amz-acl": "bucket-owner-full-control" 71 | } 72 | }, 73 | "Effect": "Allow", 74 | "Principal": { 75 | "Service": [ 76 | "cloudtrail.amazonaws.com" 77 | ] 78 | }, 79 | "Resource": [ 80 | { 81 | "Fn::Join": [ 82 | "", 83 | [ 84 | "arn:aws:s3:::", 85 | { 86 | "Ref": "Bucket" 87 | }, 88 | "/AWSLogs/", 89 | { 90 | "Ref": "AWS::AccountId" 91 | }, 92 | "/*" 93 | ] 94 | ] 95 | } 96 | ], 97 | "Sid": "AWSCloudTrailWrite" 98 | } 99 | ] 100 | } 101 | }, 102 | "Type": "AWS::S3::BucketPolicy" 103 | }, 104 | "CloudTrail": { 105 | "DependsOn": "BucketPolicy", 106 | "Properties": { 107 | "IncludeGlobalServiceEvents": "true", 108 | "IsLogging": "true", 109 | "IsMultiRegionTrail": "true", 110 | "S3BucketName": { 111 | "Ref": "Bucket" 112 | }, 113 | "SnsTopicName": { 114 | "Ref": "CloudtrailTopic" 115 | } 116 | }, 117 | "Type": "AWS::CloudTrail::Trail" 118 | }, 119 | "CloudtrailTopic": { 120 | "Properties": { 121 | "Subscription": [ 122 | { 123 | "Endpoint": { 124 | "Fn::GetAtt": [ 125 | "LambdaFunction", 126 | "Arn" 127 | ] 128 | }, 129 | "Protocol": "lambda" 130 | } 131 | ] 132 | }, 133 | "Type": "AWS::SNS::Topic" 134 | }, 135 | "CloudtrailTopicPolicy": { 136 | "Properties": { 137 | "PolicyDocument": { 138 | "Statement": [ 139 | { 140 | "Action": [ 141 | "sns:publish" 142 | ], 143 | "Effect": "Allow", 144 | "Principal": { 145 | "Service": [ 146 | "cloudtrail.amazonaws.com" 147 | ] 148 | }, 149 | "Resource": [ 150 | { 151 | "Ref": "CloudtrailTopic" 152 | } 153 | ], 154 | "Sid": "AWSCloudTrailSNSPolicy" 155 | } 156 | ] 157 | }, 158 | "Topics": [ 159 | { 160 | "Ref": "CloudtrailTopic" 161 | } 162 | ] 163 | }, 164 | "Type": "AWS::SNS::TopicPolicy" 165 | }, 166 | "LambdaErrorsAlarm": { 167 | "Properties": { 168 | "AlarmActions": [ 169 | { 170 | "Ref": "NotifcationTopic" 171 | } 172 | ], 173 | "ComparisonOperator": "GreaterThanThreshold", 174 | "Dimensions": [ 175 | { 176 | "Name": "FunctionName", 177 | "Value": { 178 | "Ref": "LambdaFunction" 179 | } 180 | } 181 | ], 182 | "EvaluationPeriods": 1, 183 | "MetricName": "Errors", 184 | "Namespace": "AWS/Lambda", 185 | "Period": 300, 186 | "Statistic": "Maximum", 187 | "Threshold": "0" 188 | }, 189 | "Type": "AWS::CloudWatch::Alarm" 190 | }, 191 | "LambdaFunction": { 192 | "Properties": { 193 | "Code": { 194 | "ZipFile": "import json\nimport boto3\nimport gzip\n\n\ndef lambda_handler(event, context):\n sns_topic = None\n\n info = boto3.client('lambda').get_function(\n FunctionName=context.function_name\n )\n\n iam = boto3.client('iam')\n role_name = info['Configuration']['Role'].split('/')[1]\n\n policies = iam.list_role_policies(\n RoleName=role_name\n )\n\n for policy in policies['PolicyNames']:\n details = iam.get_role_policy(\n RoleName=role_name,\n PolicyName=policy\n )\n\n for statement in details['PolicyDocument']['Statement']:\n for action in statement['Action']:\n if action == 'sns:publish':\n sns_topic = statement['Resource']\n break\n\n if sns_topic is None:\n raise Exception(\"Could not find SNS topic for notifications!\")\n\n sns = boto3.client('sns')\n\n if 'Records' not in event:\n raise Exception(\"Invalid message 
received!\")\n\n for record in event['Records']:\n if 'Message' not in record['Sns']:\n print(record)\n raise Exception(\"Invalid record!\")\n\n message = json.loads(record['Sns']['Message'])\n\n if 's3Bucket' not in message or 's3ObjectKey' not in message:\n raise Exception(\"s3Bucket or s3ObjectKey missing from Message!\")\n\n s3 = boto3.resource('s3')\n\n for s3key in message['s3ObjectKey']:\n s3.meta.client.download_file(message['s3Bucket'], s3key, '/tmp/s3file.json.gz')\n\n with gzip.open('/tmp/s3file.json.gz', 'rb') as f:\n file_content = json.loads(f.read())\n for record in file_content['Records']:\n if record['eventSource'] == \"ec2.amazonaws.com\" and record['eventName'] == 'RunInstances':\n print(record)\n for topic in sns_topic:\n sns.publish(\n TopicArn=topic,\n Message=json.dumps(record),\n Subject=\"RunInstances invoked at \" + record['eventTime']\n )\n\n\nif __name__ == '__main__':\n lambda_handler({\n \"Records\": [{\n \"Sns\": {\n \"Message\": \"{\\\"s3Bucket\\\":\\\"cloudtrail-xxx\\\",\\\"s3ObjectKey\\\":[\\\"AWSLogs/xxx/CloudTrail/ap-northeast-1/2016/06/15/abc.json.gz\\\"]}\"\n }\n }]\n }, None)\n" 195 | }, 196 | "Description": "Monitors CloudTrail", 197 | "Handler": "index.lambda_handler", 198 | "MemorySize": 128, 199 | "Role": { 200 | "Fn::GetAtt": [ 201 | "LambdaRole", 202 | "Arn" 203 | ] 204 | }, 205 | "Runtime": "python2.7", 206 | "Timeout": 10 207 | }, 208 | "Type": "AWS::Lambda::Function" 209 | }, 210 | "LambdaPermission": { 211 | "Properties": { 212 | "Action": "lambda:InvokeFunction", 213 | "FunctionName": { 214 | "Ref": "LambdaFunction" 215 | }, 216 | "Principal": "sns.amazonaws.com", 217 | "SourceAccount": { 218 | "Ref": "AWS::AccountId" 219 | }, 220 | "SourceArn": { 221 | "Ref": "CloudtrailTopic" 222 | } 223 | }, 224 | "Type": "AWS::Lambda::Permission" 225 | }, 226 | "LambdaRole": { 227 | "Properties": { 228 | "AssumeRolePolicyDocument": { 229 | "Statement": [ 230 | { 231 | "Action": [ 232 | "sts:AssumeRole" 233 | ], 234 | "Effect": "Allow", 235 | "Principal": { 236 | "Service": [ 237 | "lambda.amazonaws.com" 238 | ] 239 | } 240 | } 241 | ] 242 | }, 243 | "Policies": [ 244 | { 245 | "PolicyDocument": { 246 | "Statement": [ 247 | { 248 | "Action": [ 249 | "s3:GetObject" 250 | ], 251 | "Effect": "Allow", 252 | "Resource": [ 253 | { 254 | "Fn::Join": [ 255 | "", 256 | [ 257 | "arn:aws:s3:::", 258 | { 259 | "Ref": "Bucket" 260 | }, 261 | "/*" 262 | ] 263 | ] 264 | } 265 | ] 266 | }, 267 | { 268 | "Action": [ 269 | "logs:CreateLogGroup", 270 | "logs:CreateLogStream", 271 | "logs:PutLogEvents" 272 | ], 273 | "Effect": "Allow", 274 | "Resource": [ 275 | "arn:aws:logs:*:*:*" 276 | ] 277 | }, 278 | { 279 | "Action": [ 280 | "lambda:GetFunction" 281 | ], 282 | "Effect": "Allow", 283 | "Resource": [ 284 | "*" 285 | ] 286 | }, 287 | { 288 | "Action": [ 289 | "sns:publish" 290 | ], 291 | "Effect": "Allow", 292 | "Resource": [ 293 | { 294 | "Ref": "NotifcationTopic" 295 | } 296 | ] 297 | }, 298 | { 299 | "Action": [ 300 | "iam:ListRolePolicies", 301 | "iam:GetRolePolicy" 302 | ], 303 | "Effect": "Allow", 304 | "Resource": [ 305 | "*" 306 | ] 307 | } 308 | ] 309 | }, 310 | "PolicyName": "LambdaCloudtrailPolicy" 311 | } 312 | ] 313 | }, 314 | "Type": "AWS::IAM::Role" 315 | }, 316 | "LambdaThrottlesAlarm": { 317 | "Properties": { 318 | "AlarmActions": [ 319 | { 320 | "Ref": "NotifcationTopic" 321 | } 322 | ], 323 | "ComparisonOperator": "GreaterThanThreshold", 324 | "Dimensions": [ 325 | { 326 | "Name": "FunctionName", 327 | "Value": { 328 | "Ref": "LambdaFunction" 329 | } 
330 | } 331 | ], 332 | "EvaluationPeriods": 1, 333 | "MetricName": "Throttles", 334 | "Namespace": "AWS/Lambda", 335 | "Period": 300, 336 | "Statistic": "Maximum", 337 | "Threshold": "0" 338 | }, 339 | "Type": "AWS::CloudWatch::Alarm" 340 | }, 341 | "NotifcationTopic": { 342 | "Properties": { 343 | "DisplayName": "CloudTrail Monitor Alerts" 344 | }, 345 | "Type": "AWS::SNS::Topic" 346 | } 347 | } 348 | } 349 | -------------------------------------------------------------------------------- /infrastructure/templates/create-ebs-snapshots.json: -------------------------------------------------------------------------------- 1 | { 2 | "Description": "Automated EBS snapshots and retention management", 3 | "Metadata": { 4 | "AWS::CloudFormation::Interface": { 5 | "ParameterGroups": [ 6 | { 7 | "Label": { 8 | "default": "Basic configuration" 9 | }, 10 | "Parameters": [ 11 | "S3BucketParameter", 12 | "SourceZipParameter" 13 | ] 14 | } 15 | ], 16 | "ParameterLabels": { 17 | "S3BucketParameter": { 18 | "default": "Name of S3 bucket" 19 | }, 20 | "SourceZipParameter": { 21 | "default": "Name of ZIP file" 22 | } 23 | } 24 | } 25 | }, 26 | "Parameters": { 27 | "S3BucketParameter": { 28 | "Description": "Name of the S3 bucket where you uploaded the source code zip", 29 | "Type": "String" 30 | }, 31 | "SourceZipParameter": { 32 | "Default": "ebs-snapshots.zip", 33 | "Description": "Name of the zip file inside the S3 bucket", 34 | "Type": "String" 35 | } 36 | }, 37 | "Resources": { 38 | "EventsPermissionForLambda": { 39 | "Properties": { 40 | "Action": "lambda:invokeFunction", 41 | "FunctionName": { 42 | "Ref": "LambdaFunction" 43 | }, 44 | "Principal": "events.amazonaws.com", 45 | "SourceArn": { 46 | "Fn::GetAtt": [ 47 | "LambdaTriggerRule", 48 | "Arn" 49 | ] 50 | } 51 | }, 52 | "Type": "AWS::Lambda::Permission" 53 | }, 54 | "LambdaFunction": { 55 | "Properties": { 56 | "Code": { 57 | "S3Bucket": { 58 | "Ref": "S3BucketParameter" 59 | }, 60 | "S3Key": { 61 | "Ref": "SourceZipParameter" 62 | } 63 | }, 64 | "Description": "Maintains EBS snapshots of tagged instances", 65 | "Handler": "ebs-snapshots.lambda_handler", 66 | "MemorySize": 128, 67 | "Role": { 68 | "Fn::GetAtt": [ 69 | "LambdaRole", 70 | "Arn" 71 | ] 72 | }, 73 | "Runtime": "python3.6", 74 | "Timeout": 30 75 | }, 76 | "Type": "AWS::Lambda::Function" 77 | }, 78 | "LambdaRole": { 79 | "Properties": { 80 | "AssumeRolePolicyDocument": { 81 | "Statement": [ 82 | { 83 | "Action": [ 84 | "sts:AssumeRole" 85 | ], 86 | "Effect": "Allow", 87 | "Principal": { 88 | "Service": [ 89 | "lambda.amazonaws.com" 90 | ] 91 | } 92 | } 93 | ] 94 | }, 95 | "Policies": [ 96 | { 97 | "PolicyDocument": { 98 | "Statement": [ 99 | { 100 | "Action": [ 101 | "ec2:Describe*", 102 | "ec2:CreateSnapshot", 103 | "ec2:DeleteSnapshot", 104 | "ec2:CreateTags", 105 | "ec2:ModifySnapshotAttribute", 106 | "ec2:ResetSnapshotAttribute" 107 | ], 108 | "Effect": "Allow", 109 | "Resource": [ 110 | "*" 111 | ] 112 | }, 113 | { 114 | "Action": [ 115 | "logs:CreateLogGroup", 116 | "logs:CreateLogStream", 117 | "logs:PutLogEvents" 118 | ], 119 | "Effect": "Allow", 120 | "Resource": [ 121 | "arn:aws:logs:*:*:*" 122 | ] 123 | } 124 | ] 125 | }, 126 | "PolicyName": "AccessToEC2Snapshots" 127 | } 128 | ] 129 | }, 130 | "Type": "AWS::IAM::Role" 131 | }, 132 | "LambdaTriggerRule": { 133 | "Properties": { 134 | "Description": "Trigger EBS snapshot Lambda", 135 | "ScheduleExpression": "rate(1 day)", 136 | "State": "ENABLED", 137 | "Targets": [ 138 | { 139 | "Arn": { 140 | "Fn::GetAtt": [ 141 | 
"LambdaFunction", 142 | "Arn" 143 | ] 144 | }, 145 | "Id": "ebs-snapshot-lambda" 146 | } 147 | ] 148 | }, 149 | "Type": "AWS::Events::Rule" 150 | } 151 | } 152 | } 153 | -------------------------------------------------------------------------------- /infrastructure/templates/maintenace-lambdas.json: -------------------------------------------------------------------------------- 1 | { 2 | "Description": "Stack with Lambda function performing maintenance tasks", 3 | "Parameters": { 4 | "AlarmEmail": { 5 | "Default": "contact@example.com", 6 | "Description": "Email where Lambda errors alarms should be sent to", 7 | "Type": "String" 8 | } 9 | }, 10 | "Resources": { 11 | "LambdaBaseErrorsAlarm": { 12 | "Properties": { 13 | "AlarmActions": [ 14 | { 15 | "Ref": "LambdaErrorTopic" 16 | } 17 | ], 18 | "ComparisonOperator": "GreaterThanThreshold", 19 | "Dimensions": [ 20 | { 21 | "Name": "FunctionName", 22 | "Value": { 23 | "Ref": "LambdaBaseFunction" 24 | } 25 | } 26 | ], 27 | "EvaluationPeriods": 1, 28 | "MetricName": "Errors", 29 | "Namespace": "AWS/Lambda", 30 | "Period": 300, 31 | "Statistic": "Maximum", 32 | "Threshold": "0" 33 | }, 34 | "Type": "AWS::CloudWatch::Alarm" 35 | }, 36 | "LambdaBaseFunction": { 37 | "Properties": { 38 | "Code": { 39 | "ZipFile": "import boto3\nimport operator\n\n\ndef lambda_handler(event, context):\n LIMIT = 10\n client = boto3.client('ec2', 'eu-west-1')\n\n response = client.describe_images(\n Owners=['self'],\n Filters=[{'Name': 'tag:Type', 'Values': ['BaseImage']}]\n )\n\n if len(response['Images']) == 0:\n raise Exception('no AMIs with Type=BaseImage tag found')\n\n images = {}\n for image in response['Images']:\n for tag in image['Tags']:\n if tag['Key'] == \"Project\":\n if tag['Value'] not in images.keys():\n images[tag['Value']] = {}\n images[tag['Value']][image['ImageId']] = image['CreationDate']\n break\n\n to_remove = []\n for project in images:\n sorted_x = sorted(images[project].items(), key=operator.itemgetter(1), reverse=True)\n if len(sorted_x) > LIMIT:\n to_remove = to_remove + [i[0] for i in sorted_x[LIMIT:]]\n\n if len(to_remove) == 0:\n print(\"Nothing to do\")\n return 0\n\n print(\"Will remove \" + str(len(to_remove)) + \" images\")\n\n for ami in to_remove:\n print(\"Removing: \" + ami)\n client.deregister_image(ImageId=ami)\n\n\nif __name__ == '__main__':\n lambda_handler(None, None)\n" 40 | }, 41 | "Description": "Clears Base AMI images", 42 | "Handler": "index.lambda_handler", 43 | "MemorySize": 128, 44 | "Role": { 45 | "Fn::GetAtt": [ 46 | "LambdaCleanImagesRole", 47 | "Arn" 48 | ] 49 | }, 50 | "Runtime": "python2.7", 51 | "Timeout": 10 52 | }, 53 | "Type": "AWS::Lambda::Function" 54 | }, 55 | "LambdaCleanEESThrottlesAlarm": { 56 | "Properties": { 57 | "AlarmActions": [ 58 | { 59 | "Ref": "LambdaErrorTopic" 60 | } 61 | ], 62 | "ComparisonOperator": "GreaterThanThreshold", 63 | "Dimensions": [ 64 | { 65 | "Name": "FunctionName", 66 | "Value": { 67 | "Ref": "LambdaCleanESFunction" 68 | } 69 | } 70 | ], 71 | "EvaluationPeriods": 1, 72 | "MetricName": "Throttles", 73 | "Namespace": "AWS/Lambda", 74 | "Period": 300, 75 | "Statistic": "Maximum", 76 | "Threshold": "0" 77 | }, 78 | "Type": "AWS::CloudWatch::Alarm" 79 | }, 80 | "LambdaCleanESErrorsAlarm": { 81 | "Properties": { 82 | "AlarmActions": [ 83 | { 84 | "Ref": "LambdaErrorTopic" 85 | } 86 | ], 87 | "ComparisonOperator": "GreaterThanThreshold", 88 | "Dimensions": [ 89 | { 90 | "Name": "FunctionName", 91 | "Value": { 92 | "Ref": "LambdaCleanESFunction" 93 | } 94 | } 95 | ], 96 | 
"EvaluationPeriods": 1, 97 | "MetricName": "Errors", 98 | "Namespace": "AWS/Lambda", 99 | "Period": 300, 100 | "Statistic": "Maximum", 101 | "Threshold": "0" 102 | }, 103 | "Type": "AWS::CloudWatch::Alarm" 104 | }, 105 | "LambdaCleanESFunction": { 106 | "Properties": { 107 | "Code": { 108 | "ZipFile": "import os\nimport datetime\nimport hashlib\nimport hmac\nimport urllib2\nimport json\n\nENDPOINTS_ACCOUNTS = {\n 'account-1': 'elastic-search-endpoint',\n 'account-2': 'elastic-search-endpoint',\n}\n\nTHRESHOLD_ACCOUNTS = {\n 'account-1': 20,\n 'account-2': 60\n}\n\n\ndef sign(key, msg):\n return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()\n\n\ndef getSignatureKey(key, dateStamp, regionName, serviceName):\n kDate = sign(('AWS4' + key).encode('utf-8'), dateStamp)\n kRegion = sign(kDate, regionName)\n kService = sign(kRegion, serviceName)\n kSigning = sign(kService, 'aws4_request')\n return kSigning\n\n\ndef get_signature(endpoint, method, canonical_uri):\n region = 'eu-west-1'\n service = 'es'\n access_key = os.environ.get('AWS_ACCESS_KEY_ID')\n secret_key = os.environ.get('AWS_SECRET_ACCESS_KEY')\n session_key = os.environ.get('AWS_SESSION_TOKEN')\n t = datetime.datetime.utcnow()\n amzdate = t.strftime('%Y%m%dT%H%M%SZ')\n datestamp = t.strftime('%Y%m%d')\n canonical_querystring = ''\n canonical_headers = 'host:' + endpoint + '\\nx-amz-date:' + amzdate + '\\nx-amz-security-token:' + session_key + \"\\n\"\n signed_headers = 'host;x-amz-date;x-amz-security-token'\n payload_hash = hashlib.sha256('').hexdigest()\n canonical_request = method + '\\n' + canonical_uri + '\\n' + canonical_querystring + '\\n' + canonical_headers + '\\n' + signed_headers + '\\n' + payload_hash\n algorithm = 'AWS4-HMAC-SHA256'\n credential_scope = datestamp + '/' + region + '/' + service + '/' + 'aws4_request'\n string_to_sign = algorithm + '\\n' + amzdate + '\\n' + credential_scope + '\\n' + hashlib.sha256(\n canonical_request).hexdigest()\n signing_key = getSignatureKey(secret_key, datestamp, region, service)\n signature = hmac.new(signing_key, (string_to_sign).encode('utf-8'), hashlib.sha256).hexdigest()\n authorization_header = algorithm + ' ' + 'Credential=' + access_key + '/' + credential_scope + ', ' + 'SignedHeaders=' + signed_headers + ', ' + 'Signature=' + signature\n headers = {'x-amz-date': amzdate, 'x-amz-security-token': session_key, 'Authorization': authorization_header}\n request_url = 'https://' + endpoint + canonical_uri + '?' 
+ canonical_querystring\n\n return {'url': request_url, 'headers': headers}\n\n\ndef lambda_handler(event, context):\n INDEXPREFIX = 'cwl-'\n\n if 'account' in event:\n if event['account'] not in ENDPOINTS_ACCOUNTS.keys():\n raise Exception(\"No endpoint configured for account \" + str(event['account']))\n ENDPOINT = ENDPOINTS_ACCOUNTS[event['account']]\n TOLEAVE = THRESHOLD_ACCOUNTS[event['account']]\n else:\n raise Exception(\"No account specified in event\")\n\n response = json.loads(get_index_list(ENDPOINT))\n indexes = []\n for index in response:\n if index.startswith(INDEXPREFIX):\n indexes.append(index)\n\n indexes.sort(reverse=True)\n to_remove = indexes[TOLEAVE:]\n for index in to_remove:\n print(\"Removing \" + index)\n delete_index(ENDPOINT, index)\n\n\ndef delete_index(endpoint, index):\n info = get_signature(endpoint, 'DELETE', '/' + index)\n\n opener = urllib2.build_opener(urllib2.HTTPHandler)\n request = urllib2.Request(info['url'], headers=info['headers'])\n request.get_method = lambda: 'DELETE'\n\n r = opener.open(request)\n if r.getcode() != 200:\n raise Exception(\"Non 200 response when calling, got: \" + str(r.getcode()))\n\n\ndef get_index_list(endpoint):\n info = get_signature(endpoint, 'GET', '/_aliases')\n\n request = urllib2.Request(info['url'], headers=info['headers'])\n r = urllib2.urlopen(request)\n if r.getcode() != 200:\n raise Exception(\"Non 200 response when calling, got: \" + str(r.getcode()))\n\n return r.read()\n\n\nif __name__ == '__main__':\n lambda_handler({'account': 'account-1'}, None)\n" 109 | }, 110 | "Description": "Removes old ElasticSearch indexes", 111 | "Handler": "index.lambda_handler", 112 | "MemorySize": 128, 113 | "Role": { 114 | "Fn::GetAtt": [ 115 | "LambdaESExecRole", 116 | "Arn" 117 | ] 118 | }, 119 | "Runtime": "python2.7", 120 | "Timeout": 60 121 | }, 122 | "Type": "AWS::Lambda::Function" 123 | }, 124 | "LambdaCleanImagesRole": { 125 | "Properties": { 126 | "AssumeRolePolicyDocument": { 127 | "Statement": [ 128 | { 129 | "Action": [ 130 | "sts:AssumeRole" 131 | ], 132 | "Effect": "Allow", 133 | "Principal": { 134 | "Service": [ 135 | "lambda.amazonaws.com" 136 | ] 137 | } 138 | } 139 | ] 140 | }, 141 | "Policies": [ 142 | { 143 | "PolicyDocument": { 144 | "Statement": [ 145 | { 146 | "Action": [ 147 | "ec2:DescribeImages", 148 | "ec2:DeregisterImage" 149 | ], 150 | "Effect": "Allow", 151 | "Resource": [ 152 | "*" 153 | ] 154 | }, 155 | { 156 | "Action": [ 157 | "logs:CreateLogGroup", 158 | "logs:CreateLogStream", 159 | "logs:PutLogEvents" 160 | ], 161 | "Effect": "Allow", 162 | "Resource": [ 163 | "arn:aws:logs:*:*:*" 164 | ] 165 | } 166 | ] 167 | }, 168 | "PolicyName": "LambdaCleanBaseImagesPolicy" 169 | } 170 | ] 171 | }, 172 | "Type": "AWS::IAM::Role" 173 | }, 174 | "LambdaESExecRole": { 175 | "Properties": { 176 | "AssumeRolePolicyDocument": { 177 | "Statement": [ 178 | { 179 | "Action": [ 180 | "sts:AssumeRole" 181 | ], 182 | "Effect": "Allow", 183 | "Principal": { 184 | "Service": [ 185 | "lambda.amazonaws.com" 186 | ] 187 | } 188 | } 189 | ] 190 | }, 191 | "Policies": [ 192 | { 193 | "PolicyDocument": { 194 | "Statement": [ 195 | { 196 | "Action": [ 197 | "logs:CreateLogGroup", 198 | "logs:CreateLogStream", 199 | "logs:PutLogEvents" 200 | ], 201 | "Effect": "Allow", 202 | "Resource": [ 203 | "arn:aws:logs:*:*:*" 204 | ] 205 | }, 206 | { 207 | "Action": [ 208 | "es:*" 209 | ], 210 | "Effect": "Allow", 211 | "Resource": [ 212 | "arn:aws:es:*:*:*" 213 | ] 214 | } 215 | ] 216 | }, 217 | "PolicyName": "LambdaCleanBaseImagesPolicy" 
218 | } 219 | ] 220 | }, 221 | "Type": "AWS::IAM::Role" 222 | }, 223 | "LambdaErrorTopic": { 224 | "Properties": { 225 | "Subscription": [ 226 | { 227 | "Endpoint": { 228 | "Ref": "AlarmEmail" 229 | }, 230 | "Protocol": "email" 231 | } 232 | ] 233 | }, 234 | "Type": "AWS::SNS::Topic" 235 | }, 236 | "LambdaReleaseErrorsAlarm": { 237 | "Properties": { 238 | "AlarmActions": [ 239 | { 240 | "Ref": "LambdaErrorTopic" 241 | } 242 | ], 243 | "ComparisonOperator": "GreaterThanThreshold", 244 | "Dimensions": [ 245 | { 246 | "Name": "FunctionName", 247 | "Value": { 248 | "Ref": "LambdaReleaseFunction" 249 | } 250 | } 251 | ], 252 | "EvaluationPeriods": 1, 253 | "MetricName": "Errors", 254 | "Namespace": "AWS/Lambda", 255 | "Period": 300, 256 | "Statistic": "Maximum", 257 | "Threshold": "0" 258 | }, 259 | "Type": "AWS::CloudWatch::Alarm" 260 | }, 261 | "LambdaReleaseFunction": { 262 | "Properties": { 263 | "Code": { 264 | "ZipFile": "import boto3\nimport operator\n\n\ndef clean_images(region, limit):\n client = boto3.client('ec2', region)\n\n response = client.describe_images(\n Owners=['self'],\n Filters=[{'Name': 'tag:Type', 'Values': ['ReleaseImage']}]\n )\n\n if len(response['Images']) == 0:\n raise Exception('no AMIs with Type=BaseImage tag found')\n\n images = {}\n for image in response['Images']:\n for tag in image['Tags']:\n if tag['Key'] == \"Project\":\n if tag['Value'] not in images.keys():\n images[tag['Value']] = {}\n images[tag['Value']][image['ImageId']] = image['CreationDate']\n break\n\n to_remove = [];\n for project in images:\n sorted_x = sorted(images[project].items(), key=operator.itemgetter(1), reverse=True)\n if len(sorted_x) > limit:\n to_remove = to_remove + [i[0] for i in sorted_x[limit:]]\n\n if len(to_remove) == 0:\n print(\"Nothing to do\")\n return 0\n\n print(\"Will remove \" + str(len(to_remove)) + \" images\")\n\n for ami in to_remove:\n print(\"Removing: \" + ami)\n client.deregister_image(ImageId=ami)\n\n\ndef lambda_handler(event, context):\n clean_images('eu-west-1', 50)\n clean_images('eu-central-1', 1)\n\n\nif __name__ == '__main__':\n lambda_handler(None, None)\n" 265 | }, 266 | "Description": "Clears Release AMI images", 267 | "Handler": "index.lambda_handler", 268 | "MemorySize": 128, 269 | "Role": { 270 | "Fn::GetAtt": [ 271 | "LambdaCleanImagesRole", 272 | "Arn" 273 | ] 274 | }, 275 | "Runtime": "python2.7", 276 | "Timeout": 10 277 | }, 278 | "Type": "AWS::Lambda::Function" 279 | } 280 | } 281 | } 282 | -------------------------------------------------------------------------------- /infrastructure/templates/rds-cross-region-backup.json: -------------------------------------------------------------------------------- 1 | { 2 | "Conditions": { 3 | "IncludeAurora": { 4 | "Fn::Equals": [ 5 | { 6 | "Ref": "IncludeAuroraClusters" 7 | }, 8 | "Yes" 9 | ] 10 | }, 11 | "UseAllDatabases": { 12 | "Fn::Equals": [ 13 | { 14 | "Fn::Join": [ 15 | "", 16 | { 17 | "Ref": "DatabasesToUse" 18 | } 19 | ] 20 | }, 21 | "" 22 | ] 23 | }, 24 | "UseEncryption": { 25 | "Fn::Equals": [ 26 | { 27 | "Ref": "KMSKeyParameter" 28 | }, 29 | "" 30 | ] 31 | } 32 | }, 33 | "Description": "Resources copying RDS backups to another region", 34 | "Metadata": { 35 | "AWS::CloudFormation::Interface": { 36 | "ParameterGroups": [ 37 | { 38 | "Label": { 39 | "default": "Basic configuration" 40 | }, 41 | "Parameters": [ 42 | "TargetRegionParameter", 43 | "S3BucketParameter", 44 | "SourceZipParameter" 45 | ] 46 | }, 47 | { 48 | "Label": { 49 | "default": "Encryption - see 
https://github.com/pbudzon/aws-maintenance#encryption for details" 50 | }, 51 | "Parameters": [ 52 | "KMSKeyParameter" 53 | ] 54 | }, 55 | { 56 | "Label": { 57 | "default": "Optional: limit to specific RDS database(s)" 58 | }, 59 | "Parameters": [ 60 | "DatabasesToUse" 61 | ] 62 | }, 63 | { 64 | "Label": { 65 | "default": "Optional: Aurora support" 66 | }, 67 | "Parameters": [ 68 | "IncludeAuroraClusters", 69 | "ClustersToUse" 70 | ] 71 | } 72 | ], 73 | "ParameterLabels": { 74 | "ClustersToUse": { 75 | "default": "Aurora clusters to use for" 76 | }, 77 | "DatabasesToUse": { 78 | "default": "Databases to use for" 79 | }, 80 | "IncludeAuroraClusters": { 81 | "default": "Use for Aurora clusters" 82 | }, 83 | "KMSKeyParameter": { 84 | "default": "KMS Key in target region" 85 | }, 86 | "S3BucketParameter": { 87 | "default": "Name of S3 bucket" 88 | }, 89 | "SourceZipParameter": { 90 | "default": "Name of ZIP file" 91 | }, 92 | "TargetRegionParameter": { 93 | "default": "Target region" 94 | } 95 | } 96 | } 97 | }, 98 | "Parameters": { 99 | "ClustersToUse": { 100 | "Default": "", 101 | "Description": "Optional: if including Aurora clusters - comma-delimited list of Aurora clusters to use. Leave empty to use for all clusters in the source region.", 102 | "Type": "String" 103 | }, 104 | "DatabasesToUse": { 105 | "Description": "Optional: comma-delimited list of RDS instance names (not Aurora clusters!) to use. Leave empty to use for all instances in the source region.", 106 | "Type": "CommaDelimitedList" 107 | }, 108 | "IncludeAuroraClusters": { 109 | "AllowedValues": [ 110 | "Yes", 111 | "No" 112 | ], 113 | "Default": "No", 114 | "Description": "Choose 'Yes' if you have Aurora clusters that you want to use this for; this will add a daily schedule.", 115 | "Type": "String" 116 | }, 117 | "KMSKeyParameter": { 118 | "Description": "KMS Key ARN in target region. Required if using encrypted RDS instances, optional otherwise.",
119 | "Type": "String" 120 | }, 121 | "S3BucketParameter": { 122 | "Description": "Name of the S3 bucket where you uploaded the source code zip", 123 | "Type": "String" 124 | }, 125 | "SourceZipParameter": { 126 | "Default": "backup-rds.zip", 127 | "Description": "Name of the zip file inside the S3 bucket", 128 | "Type": "String" 129 | }, 130 | "TargetRegionParameter": { 131 | "AllowedPattern": "^[a-z]+-[a-z]+-[0-9]+$", 132 | "ConstraintDescription": "The target region needs to be a valid AWS region, for example: us-east-1", 133 | "Description": "Region in which to store the copies of snapshots (for example: eu-central-1)", 134 | "Type": "String" 135 | } 136 | }, 137 | "Resources": { 138 | "AuroraBackupEvent": { 139 | "Condition": "IncludeAurora", 140 | "Properties": { 141 | "Description": "Copy Aurora clusters to another region", 142 | "ScheduleExpression": "rate(1 day)", 143 | "State": "ENABLED", 144 | "Targets": [ 145 | { 146 | "Arn": { 147 | "Fn::GetAtt": [ 148 | "LambdaBackupRDSFunction", 149 | "Arn" 150 | ] 151 | }, 152 | "Id": "backup_rds_function" 153 | } 154 | ] 155 | }, 156 | "Type": "AWS::Events::Rule" 157 | }, 158 | "EventsPermissionForLambda": { 159 | "Condition": "IncludeAurora", 160 | "Properties": { 161 | "Action": "lambda:invokeFunction", 162 | "FunctionName": { 163 | "Ref": "LambdaBackupRDSFunction" 164 | }, 165 | "Principal": "events.amazonaws.com", 166 | "SourceArn": { 167 | "Fn::GetAtt": [ 168 | "AuroraBackupEvent", 169 | "Arn" 170 | ] 171 | } 172 | }, 173 | "Type": "AWS::Lambda::Permission" 174 | }, 175 | "LambdaBackupRDSFunction": { 176 | "Properties": { 177 | "Code": { 178 | "S3Bucket": { 179 | "Ref": "S3BucketParameter" 180 | }, 181 | "S3Key": { 182 | "Ref": "SourceZipParameter" 183 | } 184 | }, 185 | "Description": "Copies RDS backups to another region", 186 | "Environment": { 187 | "Variables": { 188 | "CLUSTERS_TO_USE": { 189 | "Ref": "ClustersToUse" 190 | }, 191 | "KMS_KEY_ID": { 192 | "Ref": "KMSKeyParameter" 193 | }, 194 | "SOURCE_REGION": { 195 | "Ref": "AWS::Region" 196 | }, 197 | "TARGET_REGION": { 198 | "Ref": "TargetRegionParameter" 199 | } 200 | } 201 | }, 202 | "Handler": "backup-rds.lambda_handler", 203 | "MemorySize": 128, 204 | "Role": { 205 | "Fn::GetAtt": [ 206 | "LambdaBackupRDSRole", 207 | "Arn" 208 | ] 209 | }, 210 | "Runtime": "python3.6", 211 | "Timeout": 30 212 | }, 213 | "Type": "AWS::Lambda::Function" 214 | }, 215 | "LambdaBackupRDSRole": { 216 | "Properties": { 217 | "AssumeRolePolicyDocument": { 218 | "Statement": [ 219 | { 220 | "Action": [ 221 | "sts:AssumeRole" 222 | ], 223 | "Effect": "Allow", 224 | "Principal": { 225 | "Service": [ 226 | "lambda.amazonaws.com" 227 | ] 228 | } 229 | } 230 | ] 231 | }, 232 | "Policies": [ 233 | { 234 | "PolicyDocument": { 235 | "Statement": [ 236 | { 237 | "Action": [ 238 | "rds:DescribeDbSnapshots", 239 | "rds:CopyDbSnapshot", 240 | "rds:DeleteDbSnapshot", 241 | "rds:DeleteDbClusterSnapshot", 242 | "rds:DescribeDbClusters", 243 | "rds:DescribeDbClusterSnapshots", 244 | "rds:CopyDBClusterSnapshot" 245 | ], 246 | "Effect": "Allow", 247 | "Resource": [ 248 | "*" 249 | ] 250 | }, 251 | { 252 | "Action": [ 253 | "logs:CreateLogGroup", 254 | "logs:CreateLogStream", 255 | "logs:PutLogEvents" 256 | ], 257 | "Effect": "Allow", 258 | "Resource": [ 259 | "arn:aws:logs:*:*:*" 260 | ] 261 | }, 262 | { 263 | "Fn::If": [ 264 | "UseEncryption", 265 | { 266 | "Ref": "AWS::NoValue" 267 | }, 268 | { 269 | "Action": [ 270 | "kms:Create*", 271 | 
"kms:DescribeKey" 272 | ], 273 | "Effect": "Allow", 274 | "Resource": [ 275 | { 276 | "Ref": "KMSKeyParameter" 277 | } 278 | ] 279 | } 280 | ] 281 | } 282 | ] 283 | }, 284 | "PolicyName": "AccessToRDSAndLogs" 285 | } 286 | ] 287 | }, 288 | "Type": "AWS::IAM::Role" 289 | }, 290 | "RDSBackupEvent": { 291 | "Properties": { 292 | "Enabled": "true", 293 | "EventCategories": [ 294 | "backup" 295 | ], 296 | "SnsTopicArn": { 297 | "Ref": "RDSBackupTopic" 298 | }, 299 | "SourceIds": { 300 | "Fn::If": [ 301 | "UseAllDatabases", 302 | { 303 | "Ref": "AWS::NoValue" 304 | }, 305 | { 306 | "Ref": "DatabasesToUse" 307 | } 308 | ] 309 | }, 310 | "SourceType": "db-instance" 311 | }, 312 | "Type": "AWS::RDS::EventSubscription" 313 | }, 314 | "RDSBackupTopic": { 315 | "Properties": { 316 | "Subscription": [ 317 | { 318 | "Endpoint": { 319 | "Fn::GetAtt": [ 320 | "LambdaBackupRDSFunction", 321 | "Arn" 322 | ] 323 | }, 324 | "Protocol": "lambda" 325 | } 326 | ] 327 | }, 328 | "Type": "AWS::SNS::Topic" 329 | }, 330 | "SNSPermissionForLambda": { 331 | "Properties": { 332 | "Action": "lambda:invokeFunction", 333 | "FunctionName": { 334 | "Ref": "LambdaBackupRDSFunction" 335 | }, 336 | "Principal": "sns.amazonaws.com", 337 | "SourceArn": { 338 | "Ref": "RDSBackupTopic" 339 | } 340 | }, 341 | "Type": "AWS::Lambda::Permission" 342 | } 343 | } 344 | } 345 | --------------------------------------------------------------------------------