├── .gitignore
├── LICENSE
├── README.md
├── aws-autoscaling-rollout
│   ├── README.md
│   └── aws-autoscaling-rollout.py
├── aws-choose-profile
│   ├── README.md
│   ├── aws-choose-profile-helper.py
│   ├── aws-choose-profile.bash
│   ├── aws-choose-profile.fish
│   └── demo.png
├── aws-env-vars-into-pbcopy.sh
├── aws-iam-require-mfa-allow-self-service
│   ├── README.md
│   └── policy.json
├── aws-image-ourself
│   └── aws-image-ourself
├── aws-mfa-login
│   ├── README.md
│   └── aws-mfa-login
├── aws-push-cloudwatch-instance-metrics
│   ├── README.md
│   └── aws-push-cloudwatch-instance-metrics.py
├── cleanup-packer-aws-resources
│   ├── README.md
│   ├── aws-permissions.json
│   ├── aws-permissions.txt
│   └── cleanup-packer-aws-resources.py
├── ec2-metadata
│   ├── README.md
│   └── ec2-metadata
└── shared
    └── python_aws_helpers.py

/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 | 
6 | # C extensions
7 | *.so
8 | 
9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | 
27 | # PyInstaller
28 | # Usually these files are written by a python script from a template
29 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
30 | *.manifest
31 | *.spec
32 | 
33 | # Installer logs
34 | pip-log.txt
35 | pip-delete-this-directory.txt
36 | 
37 | # Unit test / coverage reports
38 | htmlcov/
39 | .tox/
40 | .coverage
41 | .coverage.*
42 | .cache
43 | nosetests.xml
44 | coverage.xml
45 | *,cover
46 | .hypothesis/
47 | 
48 | # Translations
49 | *.mo
50 | *.pot
51 | 
52 | # Django stuff:
53 | *.log
54 | local_settings.py
55 | 
56 | # Flask stuff:
57 | instance/
58 | .webassets-cache
59 | 
60 | # Scrapy stuff:
61 | .scrapy
62 | 
63 | # Sphinx documentation
64 | docs/_build/
65 | 
66 | # PyBuilder
67 | target/
68 | 
69 | # IPython Notebook
70 | .ipynb_checkpoints
71 | 
72 | # pyenv
73 | .python-version
74 | 
75 | # celery beat schedule file
76 | celerybeat-schedule
77 | 
78 | # dotenv
79 | .env
80 | 
81 | # virtualenv
82 | venv/
83 | ENV/
84 | 
85 | # Spyder project settings
86 | .spyderproject
87 | 
88 | # Rope project settings
89 | .ropeproject
90 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 | 
3 | Copyright (c) 2016 Farley
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # AWS Missing Tools
2 | Random tools and helpers I've written that come in handy and are generally missing from the Amazon CLI tools. These tools all extend/expand on the Amazon AWS CLI tools, which must be installed for any of these to work.
3 | 
4 | # [aws-choose-profile](https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/aws-choose-profile)
5 | A bash/fish script that scans for profiles defined in ~/.aws/credentials and in ~/.aws/config and asks you to choose one of them, then sets the AWS_PROFILE and AWS_DEFAULT_PROFILE environment variables for you from the chosen profile. Great for sysadmins/devs that manage more than one AWS account via command-line based tools.
6 | 
7 | # [aws-mfa-login](https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/aws-mfa-login)
8 | A bash script that allows you to log in with a virtual MFA (2FA) device for a CLI access key, or assume into a role via 2FA. This makes it so you don't need to be so paranoid about having access/secret keys on your employee laptops, or worry about them leaking to GitHub. It allows you to skip having to set up complex client-side systems for your employees, such as [AWS Vault](https://github.com/99designs/aws-vault), to try to encrypt your devs' credentials on every computer.
9 | 
10 | # [aws-autoscaling-rollout](https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/aws-autoscaling-rollout)
11 | Performs a zero-downtime rolling deploy of servers in an autoscaling group. Very loosely based on the "aws-ha-release" tool from colinbjohnson, combined with a tool I wrote back in 2009 that had been pending open-sourcing forever. This is currently in use at a dozen places I know of, where I engineered their CI/CD pipelines.
12 | 
13 | # [aws-iam-require-mfa-allow-self-service](https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/aws-iam-require-mfa-allow-self-service)
14 | A "best-practice" IAM policy, ideally attached to an IAM role/group that is assigned to all your users, to ensure/guarantee that all users use 2FA constantly.
15 | 
16 | # [ec2-metadata](https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/ec2-metadata)
17 | An Amazon-written helper to query the instance metadata, which IMHO should really be installed automatically on every instance. It is easy enough to do on your own via curl, but it comes in handy to have a helper as well.
18 | 
19 | # [aws-push-cloudwatch-instance-metrics](https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/aws-push-cloudwatch-instance-metrics)
20 | This helper, intended to be run from cron every minute, pushes memory and disk usage to CloudWatch. If this server is part of an autoscaling group, it also pushes against that dimension (within the EC2 namespace) to be able to query the autoscaler's combined memory & disk usage.
21 | 
22 | # [cleanup-packer-aws-resources](https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/cleanup-packer-aws-resources)
23 | This performs a cleanup of packer resources.
Sometimes packer dies, horribly, and leaves instances and/or keys and/or security groups lying around. This script goes through every region and cleans them up (once they are 24 hours old). It can be very useful installed as a Lambda with the right permissions, run every day or so.
24 | 
25 | ## Contributing
26 | Feel free to contribute a tool you wrote that is missing in AWS via a pull request, or fix a bug in one of mine!
27 | 
28 | *NOTE*: This repository is always a work in progress.
29 | 
--------------------------------------------------------------------------------
/aws-autoscaling-rollout/README.md:
--------------------------------------------------------------------------------
1 | # AWS Autoscaling Rollout
2 | 
3 | ## Introduction:
4 | aws-autoscaling-rollout allows the high-availability / no-downtime replacement of all EC2 Instances in an Auto Scaling Group (which can be attached to an Elastic Load Balancer). It does this in a "rolling" fashion, one server at a time. This script supports BOTH Application Load Balancers (ALB) and the "Classic" Load Balancers (CLB) from Amazon, but does **NOT currently support ECS**.
5 | 
6 | ## Potential Use:
7 | Some potential uses for aws-autoscaling-rollout are listed below:
8 | 
9 | 1. Delivery of new code or application logic: typically an autoscaler will be changed to launch from a new launch configuration (eg: via Terraform), and running this script will then terminate the old EC2 instances in order to release new code in a high-availability fashion without incurring any downtime.
10 | 1. To rotate and refresh your instances to a fresh/vanilla state, all older EC2 instances can be replaced with newer EC2 instances. This can help reset your instances in case logs or temporary files have filled up your instance, or your application has consumed all available RAM/disk resources.
11 | 
12 | ## Directions For Use:
13 | `AWS_DEFAULT_REGION=us-east-1 aws-autoscaling-rollout -a my-scaling-group`
14 | 
15 | 
16 | ## Script options:
17 | 
18 | There are various options you can pass to aws-autoscaling-rollout to tweak its logic slightly to fit your deployment pattern, use-case, etc. These have all been added based on various environments' needs to support their use-cases. If there's a use-case that isn't handled by an option, perhaps submit a Github bug/feature request and I'll add it. Or implement it yourself, and send me a Pull Request. The current options are...
19 | 
20 | ### --force
21 | If specified, then we want to force this deployment by skipping health pre-checks, and we will ignore and reset currently suspended processes. NOTE: This will NOT skip the external health check commands or the wait-for-seconds option if you specify them. This is to help deploy against
22 | an environment which is currently unhealthy, down, or is having issues (eg: a few servers are unhealthy in an ELB but a few are healthy, and you want to just push this rotation through to try to
23 | move things ahead).
24 | 
25 | ### --skip-elb-health-check
26 | If specified, this script will skip the ELB health check of new instances as they come up (often used with --force; --force above merely skips checking the CURRENT instances, not newly scaled-up instances). Warning: with --force and this option specified, your environment may go down, thus it won't be as "HA" as you might like. This can be useful to deploy to a development environment
27 | which may not be very stable in nature.
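For example, a forced rollout against an unstable development environment (the autoscaler name here is hypothetical) might look like:
```
AWS_DEFAULT_REGION=us-east-1 aws-autoscaling-rollout.py -a my-dev-scaling-group --force --skip-elb-health-check
```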
28 | 
29 | ### --wait-for-seconds
30 | The number of extra seconds to wait in-between instance terminations (0 by default to disable). This can be helpful if your instances need some time to cache or be "more" healthy after being marked healthy in an ELB. This can be especially helpful if your autoscaler is not attached to an ELB. Note: --force does NOT override this.
31 | 
32 | ### --check-if-new-server-is-up-command
33 | This allows you to specify a custom one-liner shell command which can do an external health check against your newly created instance, to verify a new instance is healthy before continuing deployment. This should be a valid 'shell' command that can run on this server; it can include pipes to be able to run multiple commands. This command supports _simple_ templating in the form of string replacing NEW_INSTANCE_ID, NEW_INSTANCE_PRIVATE_IP_ADDRESS, NEW_INSTANCE_PUBLIC_IP_ADDRESS. Often used to do custom health checks when an autoscaler is not attached to an ELB, or to check that a new server joined a cluster properly before continuing (eg: Consul). This could also be used to add this server to an external monitoring system (Nagios, Pingdom) or could be used to add ECS support with a little creativity. When this command returns a retval of 0 then the deployment considers this server healthy and continues.
34 | 
35 | ### --run-before-server-going-down-command
36 | This allows you to run an external command right before a server goes down; this is run BEFORE the wait-for-seconds (if provided). This should be a valid 'shell' command that can run on this server. This command supports _simple_ templating in the form of string replacing OLD_INSTANCE_ID, OLD_INSTANCE_PRIVATE_IP_ADDRESS, OLD_INSTANCE_PUBLIC_IP_ADDRESS. Often used to do stuff like pull a server out of a cluster (eg: to force-leave a cluster, or to remove it from a monitoring system). This feature could also be used to add ECS support with a little creativity. This command MUST return a retval of 0, otherwise this deployment will halt.
37 | 
38 | ### --run-after-server-going-down-command
39 | This is an external command that will run after the server gets sent the terminate command. **WARNING**: Due to possible delays in Amazon's API and other factors it is not guaranteed that the server will be completely down when this command is run. This should be a valid 'shell' command that can run on this server. This command supports _simple_ templating in the form of string replacing OLD_INSTANCE_ID, OLD_INSTANCE_PRIVATE_IP_ADDRESS, OLD_INSTANCE_PUBLIC_IP_ADDRESS. Often used to do stuff like pull a server out of a custom monitoring system (eg: Zabbix/Nagios). This command MUST return a retval of 0, otherwise this deployment will halt.
40 | 
41 | ### --check-if-instances-need-to-be-terminated
42 | Given an autoscaling group that is partially updated, i.e. some instances are already running with the current configuration, we can skip such instances with this option specified.
43 | 
44 | 
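As an illustration of the command templating described above (the health endpoint and SSH user here are hypothetical), a rollout with custom up/down hook commands might look like:
```
aws-autoscaling-rollout.py -a my-scaling-group \
  --check-if-new-server-is-up-command 'curl -sf http://NEW_INSTANCE_PRIVATE_IP_ADDRESS/health' \
  --run-before-server-going-down-command 'ssh admin@OLD_INSTANCE_PRIVATE_IP_ADDRESS "consul leave"'
```
The NEW_/OLD_ placeholders are string-replaced with each instance's details before the command runs, and the up-check is re-run every 10 seconds until it returns 0.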
45 | ## Detailed Description:
46 | This script does a rollout of an autoscaling group gradually, while waiting/checking
47 | that whatever Elastic Load Balancer (ELB) it is attached to is healthy before
48 | continuing (if attached). This applies to both Classic ELBs (CLB) and Application Load Balancers (ALBs).
49 | 
50 | This script is written in Python and requires a Python interpreter. It heavily leverages
51 | boto3, so you will likely need to install boto3 with `pip install boto3`. **NOTE:** Same as the
52 | AWS cli utilities, there is no option to set the AWS region or credentials in this script.
53 | Boto automatically reads from the typical AWS environment variables/profiles, so to set the
54 | region/profile/credentials please use the typical aws cli methods to do so. Eg:
55 | 
56 | ```
57 | AWS_DEFAULT_PROFILE=client_name AWS_DEFAULT_REGION=us-east-1 aws-autoscaling-rollout.py -a autoscalername
58 | ```
59 | 
60 | **WARNING:** This script does NOT work (yet) for doing rollouts of autoscaled groups that are
61 | attached to ALBs that are used in an ECS cluster. That's a WHOLE other beast,
62 | that I would love for this script to handle one day... but alas, it does not yet.
63 | If you try to use this script against an autoscaler that is used in an ECS cluster
64 | it will have unexpected and most likely undesired results. So be warned!!!!!!!
65 | 
66 | Pieces of logic in this script are loosely based on (but intended to replace) the
67 | now-abandoned ["aws-ha-release"](https://github.com/colinbjohnson/aws-missing-tools/tree/master/aws-ha-release) tool from Colin Johnson.
68 | 
69 | This tool is also based on AWS deployment code pieces from numerous deployment scripts,
70 | bits and pieces I have written over the years, refactored/improved to add ALB support and unify
71 | the logic into a single rollout script.
72 | 
73 | ## Installation:
74 | I recommend you symlink this into your user or system bin folder.
75 | 
76 | ### Installation Examples:
77 | 
78 | Symlink into place, so you can "git pull" from where you cloned this to update this command from time to time:
79 | ```
80 | ln -s $(pwd)/aws-autoscaling-rollout.py /usr/local/bin/
81 | ```
82 | or install as a super-user into your /usr/local/bin folder, depending on your preference:
83 | ```
84 | sudo cp -a aws-autoscaling-rollout.py /usr/local/bin/
85 | ```
86 | 
87 | ## Simplified Logic Walkthrough:
88 | 
89 | Described below is the step-by-step logic this script takes, for anyone who wants to know in detail what this script will do to your system, and/or for anyone who wishes to possibly contribute feedback/patches to it.
90 | 1. _(pre)_ Check if this autoscaler name is valid
91 | 1. _(pre)_ (if not --force) Check that this autoscaler has no bad suspended processes
92 | 1. _(pre)_ Wait for the autoscaler to "settle" (in case it's mid-scaling-activity)
93 | 1. _(pre)_ (if not --force) Check that every instance of the autoscaler is healthy on whatever CLB/ALBs it's associated with
94 | 1. _(pre)_ (if the desired capacity == max capacity) Scale up the max capacity by one
95 | 1. _(pre)_ Suspend various autoscaling processes so things like alarms or scheduled actions won't interrupt this deployment
96 | 1. _(pre)_ Scale up the desired capacity by one, and wait for the autoscaler to show the new server as healthy (in the autoscaler)
97 | 1. _(main-loop)_ Wait for the number of healthy servers on the autoscaler to equal the number of desired servers
98 | 1. _(main-loop)_ (if not --skip-elb-health-check) Wait for the new server to get healthy in all attached CLB/TGs
99 | 1. _(main-loop)_ (if --check-if-new-server-is-up-command) Run the specified command every 10 seconds until it returns a retval of 0
100 | 1. _(main-loop)_ Detach one of the old instances from all attached CLB/TGs
101 | 1. _(main-loop)_ Wait for the old instance to fully detach from all attached CLB/TGs (waits for connection draining and autoscaling detachment hooks)
102 | 1. _(main-loop)_ (if --run-before-server-going-down-command) Run the specified command before terminating; it must return a retval of 0
103 | 1. _(main-loop)_ (if --wait-for-seconds) Wait for --wait-for-seconds number of seconds before continuing
104 | 1. _(main-loop)_ Terminate the old instance
105 | 1. _(main-loop)_ (if --run-after-server-going-down-command) Run the specified command after terminating; it must return a retval of 0
106 | 1. _(main-loop)_ Jump to the start of the main loop and repeat until all old instances are replaced
107 | 1. _(cleanup)_ (if we changed the max capacity above) Shrink the max capacity by one
108 | 1. _(cleanup)_ Un-suspend the suspended autoscaling processes
109 | 1. **Profit / Success!**
110 | 
111 | 
112 | ## Todo:
113 | * Implement a max-timeout feature, so you know when this script has failed and it won't loop infinitely on a bad deploy.
114 | * Implement a check-interval feature, and use it script-wide to know how often to re-check on the status of things. Right now most intervals are hardcoded to 10 seconds.
115 | * Support instances that are hosting ECS containers that are attached to an ALB
116 | * Implement the old (PHP-based Farley/internal) deploy-servers sexy-CLI output so people are in awe of this awesome script
117 | * Move all the "debug/verbose" output behind a --verbose argument to clean the output up, but still allow people who want to debug or provide feedback to be able to provide logs.
118 | * Implement a retry mechanism to (try to) prevent errors if Amazon's API is being slow
119 | 
120 | ## Additional Information:
121 | - Author(s): Farley farley@neonsurge.com / farley@olindata.com
122 | - First Published: 24-06-2016
123 | - Last Updated: 11-06-2017
124 | - Version 1.0.1
125 | - License Type: MIT
126 | 
--------------------------------------------------------------------------------
/aws-autoscaling-rollout/aws-autoscaling-rollout.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | ##########################################################################################
3 | #
4 | # This script does a rollout of an autoscaling group gradually, while waiting/checking
5 | # that whatever elastic load balancer or target group it is attached to is healthy before
6 | # continuing (if attached)
7 | #
8 | # This script leverages boto3 and the python aws scripts helper. There is no option to
9 | # set the AWS region or credentials in this script, but boto reads from typical AWS
10 | # environment variables/profiles so to set the region, please use the typical aws-cli
11 | # methods to do so, same goes for the AWS credentials. Eg:
12 | #
13 | #   AWS_DEFAULT_PROFILE=client_name AWS_DEFAULT_REGION=us-east-1 aws-autoscaling-rollout.py -a autoscalername
14 | #
15 | # WARNING: This script does NOT work (yet) for doing rollouts of autoscaled groups that are
16 | #          attached to ALBs that are used in an ECS cluster. That's a WHOLE other beast,
17 | #          that I would love for this script to handle one day... but alas, it does not yet.
18 | #          If you try to use this script against an autoscaler that is used in an ECS cluster
19 | #          it will have unexpected and most likely undesired results. So be warned!!!!!!!
20 | # 21 | # The latest version of this code and more documentation can be found at: 22 | # https://github.com/DevOps-Nirvana/aws-missing-tools 23 | # 24 | # Author: 25 | # Farley 26 | # 27 | ########################################################################################## 28 | 29 | ###################### 30 | # Libraries and instantiations of libraries 31 | ###################### 32 | import boto3 33 | import time 34 | # For debugging 35 | from pprint import pprint 36 | # For CLI Parsing of args 37 | from optparse import OptionParser 38 | # This is for the pre/post external health check feature 39 | from subprocess import call 40 | try: 41 | elb = boto3.client('elb') 42 | autoscaling = boto3.client('autoscaling') 43 | ec2 = boto3.client('ec2') 44 | elbv2 = boto3.client('elbv2') 45 | except: 46 | elb = boto3.client('elb', region_name='eu-west-1') 47 | autoscaling = boto3.client('autoscaling', region_name='eu-west-1') 48 | ec2 = boto3.client('ec2', region_name='eu-west-1') 49 | elbv2 = boto3.client('elbv2', region_name='eu-west-1') 50 | 51 | 52 | ###################### 53 | # CLI Argument handling 54 | ###################### 55 | usage = "usage: %prog -a autoscaler" 56 | parser = OptionParser(usage=usage) 57 | parser.add_option("-a", "--autoscaler", 58 | dest="autoscaler", 59 | default="", 60 | help="Autoscaler to rollout", 61 | metavar="as-name") 62 | parser.add_option("-f", "--force", 63 | dest="force", 64 | action="store_true", 65 | help="If we want to force-deployment by skipping health pre-checks, and will ignore and reset currently suspended processes. NOTE: This will NOT skip the external health check commands or wait for seconds") 66 | parser.add_option("-s", "--skip-elb-health-check", 67 | dest="skip", 68 | action="store_true", 69 | help="If we want to skip the ELB health check of new instances as they come up (often used with --force)") 70 | parser.add_option("-w", "--wait-for-seconds", 71 | dest="waitforseconds", 72 | default="0", 73 | type="int", 74 | help="The number of extra seconds to wait in-between instance terminations (0 to disable)", 75 | metavar="seconds") 76 | parser.add_option("-u", "--check-if-new-server-is-up-command", 77 | dest="checkifnewserverisupcommand", 78 | default="", 79 | help="An external health check command to run to verify a new instance is healthy before continuing deployment. This should be a valid 'shell' command that can run on this server. This command supports _simple_ templating in the form of string replacing NEW_INSTANCE_ID, NEW_INSTANCE_PRIVATE_IP_ADDRESS, NEW_INSTANCE_PUBLIC_IP_ADDRESS. Often used to do custom health checks when an autoscaler is not attached to an ELB. This feature could also be used to add ECS support with a little creativity. When this command returns retval of 0 then the deployment continues", 80 | metavar="command") 81 | parser.add_option("-b", "--run-before-server-going-down-command", 82 | dest="runbeforeserverdowncommand", 83 | default="", 84 | help="An external command to run before a server goes down, this is run BEFORE the wait-for-seconds. This should be a valid 'shell' command that can run on this server. This command supports _simple_ templating in the form of string replacing OLD_INSTANCE_ID, OLD_INSTANCE_PRIVATE_IP_ADDRESS, OLD_INSTANCE_PUBLIC_IP_ADDRESS. Often used to do stuff like pull a server out of a cluster (eg: to force-leave Consul). This feature could also be used to add ECS support with a little creativity. 
This command MUST return a retval of 0 otherwise this deployment will halt.",
85 |                   metavar="command")
86 | parser.add_option("-d", "--run-after-server-going-down-command",
87 |                   dest="runafterserverdowncommand",
88 |                   default="",
89 |                   help="An external command to run after a server gets sent the terminate command; note it is NOT guaranteed the server is completely down when this runs. This should be a valid 'shell' command that can run on this server. This command supports _simple_ templating in the form of string replacing OLD_INSTANCE_ID, OLD_INSTANCE_PRIVATE_IP_ADDRESS, OLD_INSTANCE_PUBLIC_IP_ADDRESS. Often used to do stuff like pull a server out of a custom monitoring system (eg: Zabbix/Nagios). This command MUST return a retval of 0 otherwise this deployment will halt.",
90 |                   metavar="command")
91 | parser.add_option("-c", "--check-if-instances-need-to-be-terminated",
92 |                   dest="checkifinstancesneedtobeterminated",
93 |                   action="store_true",
94 |                   help="Check if an instance's launch configuration or launch template is already updated, and if so skip it. This is useful in case a rollout fails and leaves an Auto Scaling Group with a lot of instances partially updated.")
95 | (options, args) = parser.parse_args()
96 | 
97 | # Startup simple checks...
98 | if options.autoscaler == "":
99 |     print("ERROR: You MUST specify the autoscaler with -a")
100 |     parser.print_usage()
101 |     exit(1)
102 | if options.force:
103 |     print("ALERT: We are force-deploying this autoscaler, which may cause downtime under some circumstances")
104 | if options.skip:
105 |     print("ALERT: We are skipping ELB health checks of new instances as they come up, this will probably cause downtime")
106 | 
107 | 
108 | ######################
109 | # Helper functions
110 | ######################
111 | 
112 | 
113 | # Get a load balancer
114 | def get_load_balancer(loadbalancer_name):
115 |     try:
116 |         fetched_data = elb.describe_load_balancers(
117 |             LoadBalancerNames=[
118 |                 loadbalancer_name,
119 |             ],
120 |             PageSize=1
121 |         )
122 | 
123 |         if len(fetched_data['LoadBalancerDescriptions']) > 0:
124 |             return fetched_data['LoadBalancerDescriptions'][0]
125 |     except Exception as e:
126 |         raise Exception("Error searching for loadbalancer with name [{}]".format(loadbalancer_name), e)
127 |     raise Exception("No loadbalancer found with name [{}]".format(loadbalancer_name))
128 | 
129 | 
130 | # Get an application load balancer
131 | def get_application_load_balancer( loadbalancer_name ):
132 |     try:
133 |         fetched_data = elbv2.describe_load_balancers(
134 |             Names=[
135 |                 loadbalancer_name,
136 |             ],
137 |         )
138 |         if len(fetched_data['LoadBalancers']) > 0:
139 |             return fetched_data['LoadBalancers'][0]
140 |     except Exception as e:
141 |         raise Exception("Error searching for loadbalancer with name [{}]".format(loadbalancer_name), e)
142 |     raise Exception("No loadbalancer found with name [{}]".format(loadbalancer_name))
143 | 
144 | # Describe launch configuration
145 | def describe_launch_configuration( launch_configuration_name ):
146 |     try:
147 |         fetched_data = autoscaling.describe_launch_configurations(
148 |             LaunchConfigurationNames=[
149 |                 launch_configuration_name,
150 |             ],
151 |         )
152 |         if len(fetched_data['LaunchConfigurations']) > 0:
153 |             return fetched_data['LaunchConfigurations'][0]
154 |     except Exception as e:
155 |         raise Exception("Error searching for launch configuration with name [{}]".format(launch_configuration_name), e)
156 |     raise Exception("No launch configuration found with name [{}]".format(launch_configuration_name))
157 | 
158 | # Update auto scaling group max size
159 | def update_auto_scaling_group_max_size( autoscaling_group_name, max_size ):
160 |     response = autoscaling.update_auto_scaling_group(
161 |         AutoScalingGroupName=autoscaling_group_name,
162 |         MaxSize=max_size
163 |     )
164 |     if response['ResponseMetadata']['HTTPStatusCode'] == 200:
165 |         return True
166 |     else:
167 |         print("ERROR: Unable to set max autoscaling group size on '" + autoscaling_group_name + "'")
168 |         return False
169 | 
170 | # Get target group
171 | def get_target_group( target_group_name ):
172 |     try:
173 |         fetched_data = elbv2.describe_target_groups(
174 |             Names=[
175 |                 target_group_name
176 |             ],
177 |             PageSize=1
178 |         )
179 |         if len(fetched_data['TargetGroups']) > 0:
180 |             return fetched_data['TargetGroups'][0]
181 |     except Exception as e:
182 |         raise Exception("Error searching for target group with name [{}]".format(target_group_name), e)
183 |     raise Exception("No target group found with name [{}]".format(target_group_name))
184 | 
185 | 
186 | # Get an autoscaling group
187 | def get_autoscaling_group( autoscaling_group_name ):
188 |     try:
189 |         fetched_data = autoscaling.describe_auto_scaling_groups(
190 |             AutoScalingGroupNames=[
191 |                 autoscaling_group_name,
192 |             ],
193 |             MaxRecords=1
194 |         )
195 |         if len(fetched_data['AutoScalingGroups']) > 0:
196 |             return fetched_data['AutoScalingGroups'][0]
197 |     except Exception as e:
198 |         raise Exception("Error searching for autoscaling group with name [{}]".format(autoscaling_group_name), e)
199 |     raise Exception("No autoscaling group found with name [{}]".format(autoscaling_group_name))
200 | 
201 | 
202 | # Get all autoscaling groups
203 | def get_all_autoscaling_groups( ):
204 |     try:
205 |         fetched_data = autoscaling.describe_auto_scaling_groups(
206 |             MaxRecords=100
207 |         )
208 | 
209 |         if 'AutoScalingGroups' in fetched_data:
210 |             return fetched_data['AutoScalingGroups']
211 |     except Exception as e:
212 |         raise Exception("Error getting all autoscaling groups", e)
213 |     raise Exception("Error getting all autoscaling groups")
214 | 
215 | 
216 | # Get autoscaling group configuration
217 | def get_autoscaling_group_configuration(autoscaler):
218 |     configuration = autoscaler.get('LaunchConfigurationName', None)
219 |     if not configuration:
220 |         configuration = autoscaler.get('MixedInstancesPolicy', None)
221 |         if configuration:
222 |             configuration = configuration['LaunchTemplate']['LaunchTemplateSpecification']['LaunchTemplateName']
223 |         else:
224 |             raise Exception(
225 |                 "Error searching configuration for autoscaling group with name [{}]".format(autoscaler['AutoScalingGroupName']))
226 |     return configuration
227 | 
228 | 
229 | # Get instance configuration
230 | def get_instance_configuration(instance):
231 |     configuration = instance.get('LaunchConfigurationName', None)
232 |     if not configuration:
233 |         configuration = instance.get('LaunchTemplate', None)
234 |         if configuration:
235 |             configuration = configuration['LaunchTemplateName']
236 |         else:
237 |             raise Exception(
238 |                 "Error searching configuration for instance with id [{}]".format(instance['InstanceId']))
239 |     return configuration
240 | 
241 | 
242 | # Return a list of instances to skip
243 | def get_instances_to_skip(instances, autoscaler):
244 |     output = []
245 | 
246 |     for instance in instances:
247 |         if get_autoscaling_group_configuration(autoscaler) == get_instance_configuration(instance):
248 |             output.append(instance)
249 | 
250 |     return output
251 | 
252 | 
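# Note: get_instances_to_skip() above compares launch configuration / launch
# template *names* only; a new launch template *version* published under the
# same name would not be detected as a change, so such instances would still
# be skipped when --check-if-instances-need-to-be-terminated is used.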
253 | # Gets the suspended processes for an autoscaling group (by name or predefined to save API calls)
254 | def get_suspended_processes( autoscaling_group_name_or_definition ):
255 |     if type(autoscaling_group_name_or_definition) is str:
256 |         autoscaling_group = get_autoscaling_group( autoscaling_group_name_or_definition )
257 |     else:
258 |         autoscaling_group = autoscaling_group_name_or_definition
259 | 
260 |     output = []
261 |     for item in autoscaling_group['SuspendedProcesses']:
262 |         output.append(item['ProcessName'])
263 | 
264 |     return output
265 | 
266 | # Gets a single instance's details
267 | def describe_instance(instance_id):
268 |     # Get detailed instance information from the instances attached to the autoscaler
269 |     instances = ec2.describe_instances(InstanceIds=[instance_id])
270 |     for reservation in instances["Reservations"]:
271 |         for instance in reservation["Instances"]:
272 |             return instance
273 | 
274 | 
275 | # Suspends the specified scaling processes on an autoscaling group
276 | def suspend_processes( autoscaling_group_name, processes_to_suspend ):
277 |     response = autoscaling.suspend_processes(
278 |         AutoScalingGroupName=autoscaling_group_name,
279 |         ScalingProcesses=processes_to_suspend
280 |     )
281 |     if response['ResponseMetadata']['HTTPStatusCode'] == 200:
282 |         return True
283 |     else:
284 |         print("ERROR: Unable to suspend_processes on '" + autoscaling_group_name + "'")
285 |         return False
286 | 
287 | 
288 | # Resumes the specified scaling processes on an autoscaling group
289 | def resume_processes( autoscaling_group_name, processes_to_resume ):
290 |     response = autoscaling.resume_processes(
291 |         AutoScalingGroupName=autoscaling_group_name,
292 |         ScalingProcesses=processes_to_resume
293 |     )
294 |     if response['ResponseMetadata']['HTTPStatusCode'] == 200:
295 |         return True
296 |     else:
297 |         print("ERROR: Unable to resume_processes on '" + autoscaling_group_name + "'")
298 |         return False
299 | 
300 | 
301 | def resume_all_processes( autoscaling_group_name ):
302 |     response = autoscaling.resume_processes(
303 |         AutoScalingGroupName=autoscaling_group_name
304 |     )
305 |     if response['ResponseMetadata']['HTTPStatusCode'] == 200:
306 |         return True
307 |     else:
308 |         print("ERROR: Unable to resume_all_processes on '" + autoscaling_group_name + "'")
309 |         return False
310 | 
311 | 
312 | # Check if an autoscaler is currently performing a scaling activity
313 | def check_if_autoscaler_is_scaling( autoscaling_group_name ):
314 |     # Get the autoscaling group
315 |     autoscaler = autoscaling.describe_auto_scaling_groups(
316 |         AutoScalingGroupNames=[
317 |             autoscaling_group_name,
318 |         ],
319 |         MaxRecords=1
320 |     )
321 | 
322 |     # Quick error checking
323 |     if len(autoscaler['AutoScalingGroups']) != 1:
324 |         print("ERROR: Unable to describe autoscaling group: " + autoscaling_group_name)
325 |         exit(1)
326 |     autoscaler = autoscaler['AutoScalingGroups'][0]
327 | 
328 |     # Check if our healthy instance count matches our desired capacity
329 |     healthy_instance_count = get_number_of_autoscaler_healthy_instances( autoscaler )
330 |     if healthy_instance_count != autoscaler['DesiredCapacity']:
331 |         print("INFO: Our autoscaler must be scaling, desired " + str(autoscaler['DesiredCapacity']) + ", healthy instances " + str(healthy_instance_count))
332 |         return True
333 | 
334 |     return False
335 | 
336 | 
337 | def deregister_instance_from_load_balancer( instance_id, loadbalancer_name ):
338 |     response = elb.deregister_instances_from_load_balancer(
339 |         LoadBalancerName=loadbalancer_name,
340 |         Instances=[
341 |             {
342 |                 'InstanceId': instance_id
343 |             },
344 |         ]
345 |     )
346 |     if response['ResponseMetadata']['HTTPStatusCode'] == 200:
347 |         return True
348 |     else:
349 |         print("ERROR: Unable to deregister instance '" + instance_id + "' from load balancer '" + loadbalancer_name + "'")
350 |         return False
351 | 
352 | 
353 | def deregister_instance_from_target_group( instance_id, target_group_arn ):
354 |     response = elbv2.deregister_targets(
355 |         TargetGroupArn=target_group_arn,
356 |         Targets=[
357 |             {
358 |                 'Id': instance_id,
359 |             },
360 |         ]
361 |     )
362 |     if response['ResponseMetadata']['HTTPStatusCode'] == 200:
363 |         return True
364 |     else:
365 |         print("ERROR: Unable to deregister instance '" + instance_id + "' from target group '" + target_group_arn + "'")
366 |         return False
367 | 
368 | 
369 | def wait_for_autoscaler_to_have_healthy_desired_instances( autoscaling_group_name_or_definition ):
370 |     if type(autoscaling_group_name_or_definition) is str:
371 |         autoscaler_description = get_autoscaling_group( autoscaling_group_name_or_definition )
372 |     else:
373 |         autoscaler_description = autoscaling_group_name_or_definition
374 |     # Get our desired capacity
375 |     desired_capacity = int(autoscaler_description['DesiredCapacity'])
376 | 
377 |     while True:
378 |         healthy_instance_count = int(get_number_of_autoscaler_healthy_instances( autoscaler_description['AutoScalingGroupName'] ))
379 |         if desired_capacity != healthy_instance_count:
380 |             print("WARNING: We have " + str(healthy_instance_count) + " healthy instances on the autoscaler but we want " + str(desired_capacity))
381 |         elif check_if_autoscaler_is_scaling( autoscaler_description['AutoScalingGroupName'] ):
382 |             print("WARNING: We are currently performing some autoscaling, we should wait...")
383 |         else:
384 |             print("SUCCESS: We currently have desired capacity of " + str(desired_capacity) + " on this autoscaler")
385 |             break
386 |         print("Waiting for 5 seconds...")
387 |         time.sleep(5)
388 | 
389 | 
390 | # Get the number of healthy instances from the autoscaling group definition
391 | def get_number_of_autoscaler_healthy_instances( autoscaler_description ):
392 |     return len(get_autoscaler_healthy_instances( autoscaler_description ))
393 | 
394 | 
395 | # Get the healthy instances from the autoscaling group definition or name
396 | def get_autoscaler_healthy_instances( autoscaling_group_name_or_definition ):
397 |     if type(autoscaling_group_name_or_definition) is str:
398 |         autoscaler_description = get_autoscaling_group( autoscaling_group_name_or_definition )
399 |     else:
400 |         autoscaler_description = autoscaling_group_name_or_definition
401 | 
402 |     healthy_instances = []
403 |     for instance in autoscaler_description['Instances']:
404 |         if instance['HealthStatus'] == 'Healthy':
405 |             healthy_instances.append(instance)
406 |     return healthy_instances
407 | 
408 | 
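# Note: ShouldDecrementDesiredCapacity=False below makes the autoscaler launch
# a replacement for each terminated instance; the main loop at the bottom only
# passes decrement_capacity=True for the very last old instance, so the group
# ends back at its original desired capacity.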
autoscaler '" + autoscaling_group_name + "' from the load balancer '" + loadbalancer_name) 428 | exit(1) 429 | 430 | 431 | def set_desired_capacity( autoscaling_group_name, desired_capacity ): 432 | print("DEBUG: Setting desired capacity of '" + autoscaling_group_name + "' to '" + str(desired_capacity) + "'...") 433 | response = autoscaling.set_desired_capacity( 434 | AutoScalingGroupName=autoscaling_group_name, 435 | DesiredCapacity=desired_capacity, 436 | HonorCooldown=False 437 | ) 438 | 439 | # Check if this executed okay... 440 | if response['ResponseMetadata']['HTTPStatusCode'] == 200: 441 | print("DEBUG: Executed okay") 442 | return True 443 | else: 444 | print("ERROR: Unable to set_desired_capacity on '" + autoscaling_group_name + "'") 445 | exit(1) 446 | 447 | 448 | def get_instance_ids_of_target_group( target_group_arn ): 449 | 450 | response = elbv2.describe_target_health( 451 | TargetGroupArn=target_group_arn 452 | ) 453 | 454 | output = [] 455 | for target in response['TargetHealthDescriptions']: 456 | output.append(target['Target']['Id']) 457 | return output 458 | 459 | 460 | def get_instance_ids_of_load_balancer( loadbalancer_name_or_definition ): 461 | if type(loadbalancer_name_or_definition) is str: 462 | loadbalancer = get_load_balancer( loadbalancer_name_or_definition ) 463 | else: 464 | loadbalancer = loadbalancer_name_or_definition 465 | 466 | output = [] 467 | for instance in loadbalancer['Instances']: 468 | output.append(instance['InstanceId']) 469 | return output 470 | 471 | 472 | def wait_for_complete_targetgroup_autoscaler_attachment( target_group_arn, autoscaling_group_name ): 473 | 474 | print("DEBUG: Waiting for attachment of autoscaler " + autoscaling_group_name + " to target_group_arn: " + target_group_arn) 475 | 476 | while True: 477 | # Get instances from target group 478 | print("DEBUG: Getting target group instances") 479 | target_group = elbv2.describe_target_health( 480 | TargetGroupArn=target_group_arn 481 | ) 482 | 483 | # Get healthy instance ids from target group 484 | print("DEBUG: Getting instance ids from load balancer") 485 | instance_health_flat = [] 486 | for instance in target_group['TargetHealthDescriptions']: 487 | if (instance['TargetHealth']['State'] == 'healthy'): 488 | instance_health_flat.append(instance['Target']['Id']) 489 | 490 | # Get our healthy instances from our autoscaler 491 | print("DEBUG: Getting healthy instances on our autoscaler") 492 | autoscaler = get_autoscaling_group( autoscaling_group_name ) 493 | as_instances = get_autoscaler_healthy_instances( autoscaler ) 494 | 495 | successes = 0 496 | for instance in as_instances: 497 | if instance['InstanceId'] in instance_health_flat: 498 | print("DEBUG: SUCCESS - Instance " + instance['InstanceId'] + " is healthy in our target group") 499 | successes = successes + 1 500 | else: 501 | print("DEBUG: FAIL - Instance " + instance['InstanceId'] + " is unhealthy or not present in our target group") 502 | 503 | if successes >= len(as_instances): 504 | if int(autoscaler['DesiredCapacity']) == successes: 505 | print("DEBUG: We have " + str(successes) + " healthy instances on the target group and on the ASG") 506 | break 507 | else: 508 | print("DEBUG: FAIL - We have " + str(successes) + " healthy instances on the target group but we have desired instances set to " + str(autoscaler['DesiredCapacity']) + " on the ASG") 509 | else: 510 | print("WAIT: Found " + str(successes) + " healthy instances on the target group from the ASG " + str(autoscaler['DesiredCapacity']) + " to continue. 
Waiting 10 seconds...") 511 | 512 | time.sleep( 10 ) 513 | 514 | 515 | def wait_for_instances_to_detach_from_loadbalancer( instance_ids, loadbalancer_name ): 516 | print("DEBUG: Waiting for detachment of instance_ids ") 517 | print(instance_ids) 518 | print(" from load balancer:" + loadbalancer_name) 519 | 520 | while True: 521 | loadbalancer = get_load_balancer(loadbalancer_name) 522 | lb_instances = get_instance_ids_of_load_balancer(loadbalancer) 523 | 524 | failures = 0 525 | for instance in instance_ids: 526 | print(" DEBUG: Checking if " + instance + " is attached to load balancer...") 527 | if instance in lb_instances: 528 | print(" ERROR: Currently attached to the load balancer...") 529 | failures = failures + 1 530 | else: 531 | print(" SUCCESS: Instance is not attached to the load balancer") 532 | 533 | if failures == 0: 534 | print("SUCCESS: Done waiting for detachment of instance ids") 535 | break 536 | 537 | print("DEBUG: Waiting for 10 seconds and trying again...") 538 | time.sleep( 10 ) 539 | 540 | print("DEBUG: DONE waiting for detachment of instances from " + loadbalancer_name) 541 | 542 | 543 | 544 | def wait_for_instances_to_detach_from_target_group( instance_ids, target_group_arn ): 545 | print("DEBUG: Waiting for detachment of instance_ids ") 546 | print(instance_ids) 547 | print(" from target group:" + target_group_arn) 548 | 549 | while True: 550 | print("DEBUG: Getting target group instances") 551 | target_group = elbv2.describe_target_health( 552 | TargetGroupArn=target_group_arn 553 | ) 554 | 555 | # Get healthy instance ids from target group 556 | print("DEBUG: Getting instance ids from load balancer") 557 | instance_health_flat = [] 558 | for instance in target_group['TargetHealthDescriptions']: 559 | instance_health_flat.append(instance['Target']['Id']) 560 | 561 | failures = 0 562 | for instance in instance_ids: 563 | print(" DEBUG: Checking if " + instance + " is attached to target group...") 564 | if instance in instance_health_flat: 565 | print(" ERROR: Currently attached to the target group...") 566 | failures = failures + 1 567 | else: 568 | print(" SUCCESS: Instance is not attached to the target group") 569 | 570 | if failures == 0: 571 | print("SUCCESS: Done waiting for detachment of instance ids") 572 | break 573 | 574 | print("DEBUG: Waiting for 10 seconds and trying again...") 575 | time.sleep( 10 ) 576 | 577 | print("DEBUG: DONE waiting for detachment of instances from " + target_group_arn) 578 | 579 | 580 | 581 | def wait_for_complete_targetgroup_autoscaler_detachment( target_group_arn, autoscaling_group_name ): 582 | 583 | print("DEBUG: Waiting for detachment of autoscaler " + autoscaling_group_name + " from target_group_arn:" + target_group_arn) 584 | 585 | while True: 586 | # Get instances from target group 587 | print("DEBUG: Getting target group instances") 588 | target_group = elbv2.describe_target_health( 589 | TargetGroupArn=target_group_arn 590 | ) 591 | 592 | # Get healthy instance ids from target group 593 | print("DEBUG: Getting instance ids from load balancer") 594 | instance_health_flat = [] 595 | for instance in target_group['TargetHealthDescriptions']: 596 | instance_health_flat.append(instance['Target']['Id']) 597 | 598 | # Get our healthy instances from our autoscaler 599 | print("DEBUG: Getting healthy instances on our autoscaler") 600 | as_instances = get_autoscaler_healthy_instances( autoscaling_group_name ) 601 | 602 | failures = 0 603 | for instance in as_instances: 604 | if instance['InstanceId'] in instance_health_flat: 605 
| print("DEBUG: FAIL - Instance " + instance['InstanceId'] + " from our autoscaler is still in our target group") 606 | failures = failures + 1 607 | else: 608 | print("DEBUG: Success - Instance " + instance['InstanceId'] + " from our autoscaler is not in our target group") 609 | 610 | if failures == 0: 611 | print("DEBUG: SUCCESS - We have no instances from the autoscaling group on this target group...") 612 | break 613 | else: 614 | print("WAIT: Found " + str(failures) + " instances still on the target group from the ASG. Waiting 10 seconds...") 615 | 616 | time.sleep( 10 ) 617 | 618 | 619 | 620 | def flatten_instance_health_array_from_loadbalancer( input_instance_array ): 621 | output = [] 622 | for instance in input_instance_array: 623 | output.append(instance['InstanceId']) 624 | return output 625 | 626 | 627 | 628 | def flatten_instance_health_array_from_loadbalancer_only_healthy( input_instance_array ): 629 | output = [] 630 | for instance in input_instance_array: 631 | if instance['State'] == 'InService': 632 | output.append(instance['InstanceId']) 633 | 634 | return output 635 | 636 | 637 | def wait_for_complete_loadbalancer_autoscaler_attachment( loadbalancer_name, autoscaling_group_name ): 638 | print("DEBUG: Waiting for attachment of autoscaler " + autoscaling_group_name + " to load balancer:" + loadbalancer_name) 639 | 640 | while True: 641 | # Get instances from load balancer 642 | print("DEBUG: Getting load balancer") 643 | loadbalancer = get_load_balancer(loadbalancer_name) 644 | 645 | # Get instance ids from load balancer 646 | print("DEBUG: Getting instance ids from load balancer") 647 | temptwo = get_instance_ids_of_load_balancer(loadbalancer) 648 | 649 | # Get their healths (on the ELB) 650 | print("DEBUG: Getting instance health on the load balancer") 651 | instance_health = elb.describe_instance_health( 652 | LoadBalancerName=loadbalancer_name, 653 | Instances=loadbalancer['Instances'] 654 | ) 655 | instance_health = instance_health['InstanceStates'] 656 | 657 | # Put it into a flat array so we can check "in" it 658 | instance_health_flat = flatten_instance_health_array_from_loadbalancer_only_healthy(instance_health) 659 | 660 | # Get our healthy instances from our autoscaler 661 | print("DEBUG: Getting healthy instances on our autoscaler") 662 | autoscaler = get_autoscaling_group( autoscaling_group_name ) 663 | as_instances = get_autoscaler_healthy_instances( autoscaler ) 664 | 665 | successes = 0 666 | for instance in as_instances: 667 | if instance['InstanceId'] in instance_health_flat: 668 | print("DEBUG: SUCCESS - Instance " + instance['InstanceId'] + " is healthy in our ELB") 669 | successes = successes + 1 670 | else: 671 | print("DEBUG: FAIL - Instance " + instance['InstanceId'] + " is unhealthy or not present in our ELB") 672 | 673 | if successes >= len(as_instances): 674 | if int(autoscaler['DesiredCapacity']) == successes: 675 | print("DEBUG: We have " + str(successes) + " healthy instances on the elb and on the ASG") 676 | break 677 | else: 678 | print("WAIT: Found " + str(successes) + " healthy instances on the elb from the ASG " + str(autoscaler['DesiredCapacity']) + " to continue. Waiting 10 seconds...") 679 | else: 680 | print("WAIT: Found " + str(successes) + " healthy instances on the elb from the ASG " + str(autoscaler['DesiredCapacity']) + " to continue. 
Waiting 10 seconds...")
681 | 
682 |         time.sleep( 10 )
683 | 
684 | 
685 | 
686 | 
687 | 
688 | 
689 | 
690 | 
691 | 
692 | 
693 | 
694 | 
695 | 
696 | ######################
697 | # Core application logic
698 | ######################
699 | 
700 | # Verify/get our autoscaling group
701 | print("Ensuring that \"" + options.autoscaler + "\" is a valid autoscaler in the current region...")
702 | autoscaler = get_autoscaling_group(options.autoscaler)
703 | if autoscaler is False:
704 |     print("ERROR: '" + options.autoscaler + "' is NOT a valid autoscaler, exiting...")
705 |     parser.print_usage()
706 |     exit(1)
707 | 
708 | # Grab some variables we need to use/save/reuse below
709 | autoscaler_old_max_size = int(autoscaler['MaxSize'])
710 | autoscaler_old_desired_capacity = int(autoscaler['DesiredCapacity'])
711 | 
712 | # Check if we need to increase our max size
713 | print("Checking if our current desired size is equal to our max size (if so we have to increase max size to deploy)...")
714 | if autoscaler_old_max_size == autoscaler_old_desired_capacity:
715 |     print("Updating max size of autoscaler by one from " + str(autoscaler_old_max_size))
716 |     if update_auto_scaling_group_max_size(options.autoscaler, (autoscaler_old_max_size + 1) ) is True:
717 |         print("Successfully expanded autoscaler's max size temporarily for deployment...")
718 |     else:
719 |         print("Failed expanding max-size, we will be unable to deploy (until someone implements a different mechanism to deploy)")
720 |         exit(1)
721 | 
722 | # Letting the user know what this autoscaler is attached to...
723 | if len(autoscaler['LoadBalancerNames']) > 0:
724 |     print("This autoscaler is attached to the following Elastic Load Balancers (ELBs): ")
725 |     for name in autoscaler['LoadBalancerNames']:
726 |         print("  ELB: " + name)
727 | else:
728 |     print("This autoscaler is not attached to any ELBs")
729 | 
730 | if len(autoscaler['TargetGroupARNs']) > 0:
731 |     print("This autoscaler is attached to the following Target Groups (for ALBs): ")
732 |     for name in autoscaler['TargetGroupARNs']:
733 |         print("  TG: " + name)
734 | else:
735 |     print("This autoscaler is not attached to any Target Groups")
736 | 
737 | if (options.force):
738 |     print("ALERT: We are force-deploying so we're going to skip checking for and setting suspended processes...")
739 |     resume_all_processes( options.autoscaler )
740 | else:
741 |     print("Ensuring that we don't have certain suspended processes which would prevent us from proceeding...")
742 |     required_processes = ['Terminate','Launch','HealthCheck','AddToLoadBalancer']
743 |     suspended = get_suspended_processes(autoscaler)
744 |     succeed = True
745 |     for process in required_processes:
746 |         if process in suspended:
747 |             print("Error: This autoscaler currently has a required process suspended: " + process)
748 |             succeed = False
749 |     if succeed == False:
750 |         exit(1)
751 | 
752 | # Suspending processes so things on an autoscaler can settle
753 | print("Suspending processes so everything can settle on ELB/ALB/TGs: ")
754 | suspend_new_processes = ['ScheduledActions', 'AlarmNotification', 'AZRebalance']
755 | suspend_processes( options.autoscaler, suspend_new_processes )
756 | 
757 | print("Waiting 3 seconds so the autoscaler can settle from the above change...")
758 | time.sleep(3)
759 | 
760 | # Get our autoscaler info again... just in case something changed on it before doing the below health-check logic...
761 | autoscaler = get_autoscaling_group(options.autoscaler)
762 | 
763 | # Wait to have healthy == desired instances on the autoscaler
764 | print("Ensuring that we have the right number of instances on the autoscaler")
765 | wait_for_autoscaler_to_have_healthy_desired_instances(autoscaler)
766 | 
767 | # Only if we want to not force-deploy do we check if the instances get healthy on their respective load balancers/target groups
768 | if (not options.force):
769 |     # Wait to have healthy instances on the load balancers
770 |     if len(autoscaler['LoadBalancerNames']) > 0:
771 |         print("Ensuring that these instances are healthy on the load balancer(s)")
772 |         for name in autoscaler['LoadBalancerNames']:
773 |             print("Waiting for all instances to be healthy in " + name + "...")
774 |             wait_for_complete_loadbalancer_autoscaler_attachment( name, options.autoscaler )
775 | 
776 |     # Wait to have healthy instances on the target groups
777 |     if len(autoscaler['TargetGroupARNs']) > 0:
778 |         print("Ensuring that these instances are healthy on the target group(s)")
779 |         for name in autoscaler['TargetGroupARNs']:
780 |             print("Waiting for all instances to be healthy in " + name + "...")
781 |             wait_for_complete_targetgroup_autoscaler_attachment( name, options.autoscaler )
782 | 
783 | print("====================================================")
784 | print("Performing rollout...")
785 | print("====================================================")
786 | 
787 | # Get our autoscaler info _one_ last time, to make sure we have the instances that we'll be rolling out of service...
788 | autoscaler = get_autoscaling_group(options.autoscaler)
789 | 
790 | # Gather the instances we need to kill...
791 | instances_to_kill = get_autoscaler_healthy_instances(autoscaler)
792 | if options.checkifinstancesneedtobeterminated:
793 |     print("INFO: Checking if there are instances to skip")
794 |     instances_to_skip = get_instances_to_skip(instances_to_kill, autoscaler)
795 |     for instance in instances_to_skip:
796 |         print("DEBUG: Skipping instance " + instance['InstanceId'])
797 |         instances_to_kill.remove(instance)
798 | 
799 | # Keep a tally of current instances...
800 | current_instance_list = get_autoscaler_healthy_instances(autoscaler)
801 | 
802 | def find_aws_instances_in_first_list_but_not_in_second( array_one, array_two ):
803 |     output = []
804 |     for instance_array_one in array_one:
805 |         # print("Found " + instance_array_one['InstanceId'] + " in array one...")
806 |         found = False
807 |         for instance_array_two in array_two:
808 |             if instance_array_two['InstanceId'] == instance_array_one['InstanceId']:
809 |                 # print("Found " + instance_array_two['InstanceId'] + " in array two also")
810 |                 found = True
811 | 
812 |         if (not found):
813 |             # print("Did not find instance in array two, returning this...")
814 |             output.append(instance_array_one)
815 | 
816 |     return output
817 | 
818 | # Increase our desired size by one so a new instance will be started (usually from a new launch configuration)
819 | # Don't increase desired capacity if there is no instance to kill
820 | if len(instances_to_kill) > 0:
821 |     print("Increasing desired capacity by one from " + str(autoscaler['DesiredCapacity']) + " to " + str(autoscaler['DesiredCapacity'] + 1))
822 |     set_desired_capacity( options.autoscaler, autoscaler['DesiredCapacity'] + 1 )
823 | 
824 | downscaled = False
825 | 
826 | 
827 | for i, instance in enumerate(instances_to_kill):
828 | 
829 |     # Sleep a little bit every loop, just in case...
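    # (Each pass through this loop retires exactly one old instance: desired
    # capacity was already bumped above, so a replacement should be launching.)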
830 | print("Sleeping for 3 seconds so the autoscaler can catch-up...") 831 | time.sleep(3) 832 | 833 | # This is used in the external "down" helper below, but we need to do this here before we start shutting down this instance 834 | old_instance_details = describe_instance(instance['InstanceId']) 835 | 836 | # Wait to have healthy == desired instances on the autoscaler 837 | print("Ensuring that we have the right number of instances on the autoscaler") 838 | wait_for_autoscaler_to_have_healthy_desired_instances( options.autoscaler ) 839 | 840 | # Wait for new instances to spin up... 841 | while True: 842 | print("Waiting for new instance(s) to spin up...") 843 | # Lets figure out what the new instance ID(s) are here... 844 | new_current_instance_list = get_autoscaler_healthy_instances(options.autoscaler) 845 | new_instances = find_aws_instances_in_first_list_but_not_in_second(new_current_instance_list, current_instance_list) 846 | if len(new_instances) == 0: 847 | print("There are no new instances yet... waiting 10 seconds...") 848 | time.sleep(10) 849 | else: 850 | break; 851 | 852 | # Only if we instructed that we want to not skip the health checks on the way up 853 | if (not options.skip): 854 | # Wait to have healthy instances on the load balancers 855 | if len(autoscaler['LoadBalancerNames']) > 0: 856 | print("Ensuring that these instances are healthy on the load balancer(s)") 857 | for name in autoscaler['LoadBalancerNames']: 858 | print("Waiting for all instances to be healthy in " + name + "...") 859 | wait_for_complete_loadbalancer_autoscaler_attachment( name, options.autoscaler ) 860 | 861 | # Wait to have healthy instances on the target groups 862 | if len(autoscaler['TargetGroupARNs']) > 0: 863 | print("Ensuring that these instances are healthy on the target group(s)") 864 | for name in autoscaler['TargetGroupARNs']: 865 | print("Waiting for all instances to be healthy in " + name + "...") 866 | wait_for_complete_targetgroup_autoscaler_attachment( name, options.autoscaler ) 867 | 868 | # Wait for instance to get healthy (custom handler) if desired... 
869 | if (options.checkifnewserverisupcommand): 870 | print("Running external health up check upon request...") 871 | while True: 872 | succeeded_health_up_check = True 873 | # String replacing the instance ID and/or the instance IP address into the external script 874 | for new_instance in new_instances: 875 | try: 876 | instance_details = describe_instance(new_instance['InstanceId']) 877 | private_ip_address = instance_details['PrivateIpAddress'] 878 | if 'PublicIpAddress' in instance_details: 879 | public_ip_address = instance_details['PublicIpAddress'] 880 | print("Found new instance " + new_instance['InstanceId'] + " with private IP address " + private_ip_address + " and public IP " + public_ip_address) 881 | else: 882 | print("Found new instance " + new_instance['InstanceId'] + " with private IP address " + private_ip_address + " and NO public IP address") 883 | 884 | tmpcommand = str(options.checkifnewserverisupcommand) 885 | tmpcommand = tmpcommand.replace('NEW_INSTANCE_ID',new_instance['InstanceId']) 886 | tmpcommand = tmpcommand.replace('NEW_INSTANCE_PRIVATE_IP_ADDRESS', private_ip_address) 887 | if 'PublicIpAddress' in instance_details: 888 | tmpcommand = tmpcommand.replace('NEW_INSTANCE_PUBLIC_IP_ADDRESS', public_ip_address) 889 | print("Executing external health shell command: " + tmpcommand) 890 | retval = call(tmpcommand, shell=True) 891 | # print "Got return value " + str(retval) 892 | if (retval != 0): 893 | succeeded_health_up_check = False 894 | except: 895 | print("WARNING: Failed trying to figure out if new instance is healthy") 896 | 897 | if succeeded_health_up_check: 898 | print("SUCCESS: We are done checking instances with a custom command") 899 | break 900 | else: 901 | print("FAIL: We are done checking instances with a custom command, but (at least one) has failed, re-trying in 10 seconds...") 902 | time.sleep(10) 903 | 904 | print("Should de-register instance " + instance['InstanceId'] + " from ALB/ELBs if attached...") 905 | 906 | # If we have load balancers... 907 | if len(autoscaler['LoadBalancerNames']) > 0: 908 | for name in autoscaler['LoadBalancerNames']: 909 | print("De-registering " + instance['InstanceId'] + " from load balancer " + name + "...") 910 | deregister_instance_from_load_balancer( instance['InstanceId'], name ) 911 | 912 | # If we have target groups... 913 | if len(autoscaler['TargetGroupARNs']) > 0: 914 | for name in autoscaler['TargetGroupARNs']: 915 | print("De-registering " + instance['InstanceId'] + " from target group " + name + "...") 916 | deregister_instance_from_target_group( instance['InstanceId'], name ) 917 | 918 | # If we have load balancers... 919 | if len(autoscaler['LoadBalancerNames']) > 0: 920 | for name in autoscaler['LoadBalancerNames']: 921 | while True: 922 | instance_ids = get_instance_ids_of_load_balancer( name ) 923 | print("Got instance ids...") 924 | pprint(instance_ids) 925 | if instance['InstanceId'] in instance_ids: 926 | print("Instance ID is still in load balancer, sleeping for 10 seconds...") 927 | time.sleep(10) 928 | else: 929 | print("Instance ID is removed from load balancer, continuing...") 930 | break 931 | 932 | # If we have target groups... 
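# ---------------------------------------------------------------------------
# Aside (editor's sketch, not part of the original script): the `while True`
# polling loops used here never time out, so a deregistration that never
# completes will spin forever. A bounded variant with the same 10-second poll
# interval could look like:
#
#     def poll_until(predicate, interval=10, timeout=1800):
#         """Poll predicate() until it returns True or `timeout` seconds pass."""
#         deadline = time.time() + timeout
#         while time.time() < deadline:
#             if predicate():
#                 return True
#             time.sleep(interval)
#         return False
#
#     # e.g.: poll_until(lambda: instance['InstanceId'] not in get_instance_ids_of_target_group(name))
# ---------------------------------------------------------------------------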
933 |     if len(autoscaler['TargetGroupARNs']) > 0:
934 |         for name in autoscaler['TargetGroupARNs']:
935 |             while True:
936 |                 instance_ids = get_instance_ids_of_target_group( name )
937 |                 if instance['InstanceId'] in instance_ids:
938 |                     print("Instance ID is still in target group, sleeping for 10 seconds...")
939 |                     time.sleep(10)
940 |                 else:
941 |                     print("Instance ID is removed from target group, continuing...")
942 |                     break
943 | 
944 |     # Run a command before the server goes down, if desired...
945 |     if (options.runbeforeserverdowncommand):
946 |         print("Running external before-server-down command...")
947 |         # String replacing the instance ID and/or the instance IP address into the external script
948 |         old_private_ip_address = old_instance_details['PrivateIpAddress']
949 |         if 'PublicIpAddress' in old_instance_details:
950 |             old_public_ip_address = old_instance_details['PublicIpAddress']
951 | 
952 |         tmpcommand = str(options.runbeforeserverdowncommand)
953 |         tmpcommand = tmpcommand.replace('OLD_INSTANCE_ID', old_instance_details['InstanceId'])
954 |         tmpcommand = tmpcommand.replace('OLD_INSTANCE_PRIVATE_IP_ADDRESS', old_private_ip_address)
955 |         if 'PublicIpAddress' in old_instance_details:
956 |             tmpcommand = tmpcommand.replace('OLD_INSTANCE_PUBLIC_IP_ADDRESS', old_public_ip_address)
957 |         print("Executing before server down command: " + tmpcommand)
958 |         retval = call(tmpcommand, shell=True)
959 |         # print("Got return value " + str(retval))
960 |         if (retval != 0):
961 |             print("WARNING: Before-server-down command returned retval of " + str(retval))
962 | 
963 |     # If the user specified they want to wait
964 |     if (options.waitforseconds > 0):
965 |         print("User requested to wait for {0} seconds before terminating instances...".format(options.waitforseconds))
966 |         time.sleep(options.waitforseconds)
967 | 
968 |     # Re-get our current instance list, for the custom health check script
969 |     time.sleep(2)
970 |     current_instance_list = get_autoscaler_healthy_instances(options.autoscaler)
971 | 
972 |     # Now terminate our instance in our autoscaling group...
973 |     # If this is our last time in this loop then we want to decrement the capacity along with it
974 |     if (i + 1) == len(instances_to_kill):
975 |         terminate_instance_in_auto_scaling_group( instance['InstanceId'], options.autoscaler, True )
976 |         downscaled = True
977 |     # Otherwise, simply kill this server and wait for it to be replaced, keeping the desired capacity
978 |     else:
979 |         terminate_instance_in_auto_scaling_group( instance['InstanceId'], options.autoscaler )
980 | 
981 |     # Run a command after the server goes down, if desired... 
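# ---------------------------------------------------------------------------
# Aside (editor's note, not part of the original script): the
# terminate_instance_in_auto_scaling_group() helper used above presumably
# wraps the boto3 Auto Scaling call of the same name, roughly:
#
#     client = boto3.client('autoscaling')
#     client.terminate_instance_in_auto_scaling_group(
#         InstanceId=instance['InstanceId'],
#         # True on the final instance, so the group shrinks back to its
#         # original desired capacity instead of launching a replacement
#         ShouldDecrementDesiredCapacity=should_decrement,
#     )
# ---------------------------------------------------------------------------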
982 |     if (options.runafterserverdowncommand):
983 |         print("Running external after-server-down command...")
984 |         time.sleep(2)
985 |         # String replacing the instance ID and/or the instance IP address into the external script
986 |         old_private_ip_address = old_instance_details['PrivateIpAddress']
987 |         if 'PublicIpAddress' in old_instance_details:
988 |             old_public_ip_address = old_instance_details['PublicIpAddress']
989 | 
990 |         tmpcommand = str(options.runafterserverdowncommand)
991 |         tmpcommand = tmpcommand.replace('OLD_INSTANCE_ID', old_instance_details['InstanceId'])
992 |         tmpcommand = tmpcommand.replace('OLD_INSTANCE_PRIVATE_IP_ADDRESS', old_private_ip_address)
993 |         if 'PublicIpAddress' in old_instance_details:
994 |             tmpcommand = tmpcommand.replace('OLD_INSTANCE_PUBLIC_IP_ADDRESS', old_public_ip_address)
995 |         print("Executing after server down command: " + tmpcommand)
996 |         retval = call(tmpcommand, shell=True)
997 |         # print("Got return value " + str(retval))
998 |         if (retval != 0):
999 |             print("WARNING: After-server-down command returned retval of " + str(retval))
1000 | 
1001 | instances_to_kill_flat = flatten_instance_health_array_from_loadbalancer( instances_to_kill )
1002 | 
1003 | # Before exiting, just in case, let's wait for proper detachment of the Classic ELBs (wait for: idle timeout / connection draining to finish)
1004 | if (not options.force):
1005 |     if len(autoscaler['LoadBalancerNames']) > 0:
1006 |         print("Ensuring that these instances are fully detached from the load balancer(s)")
1007 |         for name in autoscaler['LoadBalancerNames']:
1008 |             print("Waiting for complete detachment of old instances from load balancer '" + name + "'...")
1009 |             wait_for_instances_to_detach_from_loadbalancer( instances_to_kill_flat, name )
1010 | 
1011 |     # Before exiting, just in case, let's wait for proper detachment of the TGs (wait for: idle timeout / connection draining to finish)
1012 |     if len(autoscaler['TargetGroupARNs']) > 0:
1013 |         print("Ensuring that these instances are fully detached from the target group(s)")
1014 |         for name in autoscaler['TargetGroupARNs']:
1015 |             print("Waiting for complete detachment of old instances from target group '" + name + "'...")
1016 |             wait_for_instances_to_detach_from_target_group( instances_to_kill_flat, name )
1017 | 
1018 | # Restore the original desired capacity if the loop above never did so (e.g. there was nothing to roll, or it exited early)
1019 | if downscaled == False:
1020 |     print("Manually decreasing desired capacity back to " + str(autoscaler_old_desired_capacity))
1021 |     set_desired_capacity( options.autoscaler, autoscaler_old_desired_capacity )
1022 | 
1023 | # Resume our processes... 
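# ---------------------------------------------------------------------------
# Aside (editor's sketch, not part of the original script): the before/after
# server-down blocks above duplicate the same placeholder substitution; a
# shared helper (names invented here) could template and run either command:
#
#     def run_lifecycle_command(template, details):
#         """Substitute OLD_INSTANCE_* placeholders into `template`, then run it."""
#         command = str(template)
#         command = command.replace('OLD_INSTANCE_ID', details['InstanceId'])
#         command = command.replace('OLD_INSTANCE_PRIVATE_IP_ADDRESS', details['PrivateIpAddress'])
#         if 'PublicIpAddress' in details:
#             command = command.replace('OLD_INSTANCE_PUBLIC_IP_ADDRESS', details['PublicIpAddress'])
#         return call(command, shell=True)  # returns the command's exit code
# ---------------------------------------------------------------------------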
1024 | if (options.force):
1025 |     print("ALERT: Resuming all autoscaling processes because of --force...")
1026 |     resume_all_processes( options.autoscaler )
1027 | else:
1028 |     print("Resuming suspended processes...")
1029 |     resume_processes(options.autoscaler, suspend_new_processes)
1030 | 
1031 | # Check if we need to decrease our max size back to what it was
1032 | print("Checking if we changed our max size, if so, shrink it again...")
1033 | if autoscaler_old_max_size == autoscaler_old_desired_capacity:
1034 |     print("Updating max size of autoscaler back down to " + str(autoscaler_old_max_size))
1035 |     if update_auto_scaling_group_max_size(options.autoscaler, autoscaler_old_max_size ) is True:
1036 |         print("Successfully shrunk the autoscaler's max size back to its old value")
1037 |     else:
1038 |         print("Failed shrinking max-size for some reason")
1039 |         exit(1)
1040 | else:
1041 |     print("Didn't need to shrink our max size")
1042 | 
1043 | print("Successfully zero-downtime deployed!")
1044 | exit(0)
1045 | 
--------------------------------------------------------------------------------
/aws-choose-profile/README.md:
--------------------------------------------------------------------------------
1 | # AWS Choose Profile, bash/fish + python
2 | 
3 | ![Demo of aws-choose-profile](https://raw.githubusercontent.com/DevOps-Nirvana/aws-missing-tools/master/aws-choose-profile/demo.png "Demo of AWS Choose Profile helper")
4 | 
5 | aws-choose-profile is a shell script (for bash/fish so far) that scans for profiles defined in ~/.aws/credentials and in ~/.aws/config and asks you to choose one of them, and then sets the AWS_PROFILE and AWS_DEFAULT_PROFILE environment variables for you from the chosen profile. This is ONLY
6 | possible if you `source` this program (due to the way shell environments work).
7 | 
8 | If you do not source it, this script will detect this state and warn you about it; no harm done.
9 | 
10 | ## Installation:
11 | I recommend you symlink this into your user or system bin folder. NOTE: if you choose to "install" this, you must also install
12 | the file "aws-choose-profile-helper.py" alongside it, which has the actual profile selection logic, since mangling arrays and managing data is difficult in bash alone.
13 | 
14 | ### Installation Examples:
15 | 
16 | ```
17 | # Desired, symlink in place, so you can "git pull" and update this command from time to time
18 | # FOR Bash users
19 | ln -s $(pwd)/aws-choose-profile.bash /usr/local/bin/aws-choose-profile
20 | # FOR Fish users (from a fish shell)
21 | ln -s (pwd)/aws-choose-profile.fish /usr/local/bin/aws-choose-profile
22 | ```
23 | or copying it into place with...
24 | ```
25 | # Not desired, but possible depending on your preference
26 | # For Bash users
27 | cp aws-choose-profile.bash /usr/local/bin/aws-choose-profile
28 | # For Fish users
29 | cp aws-choose-profile.fish /usr/local/bin/aws-choose-profile
30 | # And also for all users
31 | cp aws-choose-profile-helper.py /usr/local/bin/
32 | ```
33 | 
34 | ## Directions For Use:
35 | ```
36 | source aws-choose-profile
37 | ```
38 | or even shorter with...
39 | ```
40 | . aws-choose-profile
41 | ```
42 | 
43 | ## Potential Use:
44 | For sysadmins and geeks who manage more than one AWS-based client and/or have multiple accounts with consolidated billing, this script helps wrangle the local credentials for those various AWS accounts.
45 | 
46 | 
47 | ## Todo:
48 | If you'd like to help contribute (or when the author is bored) there are some features that could be added... 
49 | - Add ability to also set AWS_DEFAULT_REGION automatically based on the profile selected if that profile has a default region
50 | - Add another "wrapper" besides bash/fish-based for use in other shells, or powershell? Anyone interested? Or request a feature and I'll add it.
51 | - Others? Submit feature requests as a bug in Github
52 | - If desired, add support/debug/confirm working within other shells (sh, csh, ksh)
53 | 
54 | ## Additional Information:
55 | - Author(s): Farley farley@neonsurge.com
56 | - First Published: 2016-06-11
57 | - Version 0.0.1
58 | - License Type: MIT
59 | 
--------------------------------------------------------------------------------
/aws-choose-profile/aws-choose-profile-helper.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | #
3 | ###############################################################################
4 | #
5 | #  aws-choose-profile-helper.py        Written by Farley
6 | #
7 | # This script does the actual scanning of the aws config files and asking the
8 | # user for their selection.  Please see more detailed description and info
9 | # in the parent script, aws-choose-profile
10 | #
11 | ###############################################################################
12 | 
13 | # Libraries
14 | # Config parsing
15 | import configparser
16 | Config = configparser.ConfigParser()
17 | # For homedir and isfile
18 | import os
19 | home_dir = os.path.expanduser("~")
20 | # For argv handling
21 | import sys
22 | 
23 | ###################
24 | # Globals helpers #
25 | ###################
26 | def contains_value(array, string):
27 |     for item in array:
28 |         if string == item:
29 |             return True
30 |     return False
31 | 
32 | def represents_int(s):
33 |     try:
34 |         int(s)
35 |         return True
36 |     except ValueError:
37 |         return False
38 | ###################
39 | 
40 | # If we want to print debug information along with running this
41 | debug_output = False
42 | write_output_to_file = False
43 | if debug_output:
44 |     print("Starting up...")
45 | 
46 | # Check for CLI args (used to determine where to write the output/result)
47 | if len(sys.argv) >= 2:
48 |     write_output_to_file = sys.argv.pop(1)
49 |     if debug_output:
50 |         print("Writing chosen profile to: " + write_output_to_file)
51 |     if os.path.isfile(write_output_to_file):
52 |         if debug_output:
53 |             print("Removing chosen profile temp file")
54 |         os.remove(write_output_to_file)
55 | else:
56 |     if debug_output:
57 |         print("Echoing out chosen profile to screen")
58 | 
59 | # We always have a "default" profile
60 | profiles = []
61 | profiles.append('default')
62 | 
63 | # Read our credentials file
64 | if os.path.isfile(home_dir + "/.aws/credentials"):
65 |     if debug_output:
66 |         print("Reading from " + home_dir + "/.aws/credentials...")
67 |     Config.read(home_dir + "/.aws/credentials")
68 | 
69 |     # Debug output
70 |     if debug_output:
71 |         print("Got the following profiles...")
72 |         print(Config.sections())
73 | 
74 |     for item in Config.sections():
75 |         if not contains_value(profiles, item):
76 |             profiles.append(item)
77 | else:
78 |     if debug_output:
79 |         print("No file to read from at " + home_dir + "/.aws/credentials")
80 | 
81 | # Read our config file
82 | if os.path.isfile(home_dir + "/.aws/config"):
83 |     if debug_output:
84 |         print("Reading from " + home_dir + "/.aws/config...")
85 |     Config.read(home_dir + "/.aws/config")
86 | 
87 |     # Debug output
88 |     if debug_output:
89 |         print("Got the following profiles...")
90 |         print(Config.sections())
91 | 
92 |     for item in 
Config.sections():
93 |         # First, cleanse our "profile " prefix
94 |         cleanitem = item.replace("profile ", "")
95 |         if not contains_value(profiles, cleanitem):
96 |             profiles.append(cleanitem)
97 | else:
98 |     if debug_output: print("No file to read from at " + home_dir + "/.aws/config")
99 | 
100 | # Finally sort alphabetically
101 | profiles.sort(key=str.lower)
102 | # And remove and re-insert "default" to make sure it's always first on the list
103 | profiles.remove('default')
104 | profiles.insert(0, 'default')
105 | if debug_output:
106 |     print(profiles)
107 | 
108 | # Print profiles available
109 | print("===============================")
110 | print("     Profiles available")
111 | print("===============================")
112 | count = 1
113 | for profile in profiles:
114 |     print(str(count) + ". " + profile)
115 |     count = count + 1
116 | print("===============================")
117 | count = count - 1
118 | 
119 | # Ask the user to choose a profile infinitely
120 | while True:
121 |     var = input("Choose a profile number: [1-" + str(count) + "]: ")
122 | 
123 |     if represents_int(var) and int(var) > 0 and int(var) <= count:
124 |         break
125 |     else:
126 |         print("Invalid input")
127 | var = str(int(var) - 1)
128 | 
129 | # Take the chosen profile and print or write it to a file
130 | chosen = profiles.pop(int(var))
131 | if debug_output:
132 |     print("You chose: " + chosen)
133 | if write_output_to_file == False:
134 |     if debug_output:
135 |         print("Printing to screen")
136 |     print(chosen)
137 | else:
138 |     print("Writing output to file " + write_output_to_file)
139 |     text_file = open(write_output_to_file, "w")
140 |     text_file.write(chosen)
141 |     text_file.close()
142 | 
143 | exit(0)
144 | 
--------------------------------------------------------------------------------
/aws-choose-profile/aws-choose-profile.bash:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #
3 | ###############################################################################
4 | #
5 | #  aws-choose-profile.bash        Written by Farley
6 | #
7 | # This helper scans for profiles defined in ~/.aws/credentials and in
8 | # ~/.aws/config and asks you to choose one of them, and then sets the
9 | # AWS_PROFILE and AWS_DEFAULT_PROFILE environment variables. This is ONLY
10 | # possible if you `source` this program in the `bash` shell, for other
11 | # shell wrappers, see the other files with different extensions, if you don't
12 | # see a shell you want, ask me and I'll add it!
13 | #
14 | # Usage example:
15 | #   source aws-choose-profile
16 | # or
17 | #   . 
aws-choose-profile 18 | # 19 | # If you do not source it, this script will detect this state 20 | # and warn you about it, and not allow you to choose (since it's) 21 | # useless 22 | # 23 | # I recommend you symlink this into your user or system bin folder 24 | # Please note: if you choose to install (aka cp) this, you must also install 25 | # the file "aws-choose-profile-helper.py" along side it, which has the the 26 | # aws profile selection logic 27 | # 28 | ############################################################################### 29 | 30 | # Get actual folder this script is in, resolving all symlinks to files/folders 31 | SOURCE="${BASH_SOURCE[0]}" 32 | while [ -h "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink 33 | DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )" 34 | SOURCE="$(readlink "$SOURCE")" 35 | [[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located 36 | done 37 | DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )" 38 | 39 | # Always print our current profile 40 | if [[ $AWS_DEFAULT_PROFILE != '' ]]; then 41 | echo "Current AWS Profile: $AWS_DEFAULT_PROFILE" 42 | else 43 | echo "Current AWS Profile: none" 44 | fi 45 | 46 | # Simple check if sourced... 47 | if [ "$0" = "$BASH_SOURCE" ]; then 48 | echo "ERROR: Not sourced, please run source $0" 49 | else 50 | unlink /tmp/aws-choose-profile.temp 2>/dev/null 51 | $DIR/aws-choose-profile-helper.py /tmp/aws-choose-profile.temp 52 | CHOSENPROFILE=`tail -n1 /tmp/aws-choose-profile.temp 2>/dev/null` 53 | unlink /tmp/aws-choose-profile.temp 54 | if [[ $CHOSENPROFILE = '' ]]; then 55 | echo "" 56 | else 57 | echo "Chosen Profile: $CHOSENPROFILE" 58 | export AWS_DEFAULT_PROFILE=$CHOSENPROFILE 59 | export AWS_PROFILE=$CHOSENPROFILE 60 | echo "Exported AWS_PROFILE and AWS_DEFAULT_PROFILE to $CHOSENPROFILE" 61 | echo "" 62 | fi 63 | fi 64 | 65 | 66 | 67 | -------------------------------------------------------------------------------- /aws-choose-profile/aws-choose-profile.fish: -------------------------------------------------------------------------------- 1 | #!/usr/local/bin/fish 2 | # 3 | ############################################################################### 4 | # 5 | # aws-choose-profile.fish Written by Farley 6 | # 7 | # This helper scans for profiles defined in ~/.aws/credentials and in 8 | # ~/.aws/config and asks you to choose one of them, and then sets the 9 | # AWS_PROFILE and AWS_DEFAULT_PROFILE environment variables. This is ONLY 10 | # possible if you `source` this program in the `fish` shell, for other 11 | # shell wrappers, see the other files with different extensions, if you don't 12 | # see a shell you want, ask me and I'll add it! 13 | # 14 | # Usage example: 15 | # source aws-choose-profile 16 | # or 17 | # . 
aws-choose-profile 18 | # 19 | # If you do not source it, this script will detect this state 20 | # and warn you about it, and not allow you to choose (since it's) 21 | # useless 22 | # 23 | # I recommend you symlink this into your user or system bin folder 24 | # Please note: if you choose to install (aka cp) this, you must also install 25 | # the file "aws-choose-profile-helper.py" along side it, which has the the 26 | # aws profile selection logic 27 | # 28 | ############################################################################### 29 | 30 | # Get actual full file path of this script 31 | set -l SCRIPTNAME (status -f) 32 | # If our path is a full path already, don't prefix with pwd when figuring out our directory 33 | if [ (string sub --start 1 -l 1 $SCRIPTNAME) = '/' ] 34 | set DIR (dirname (fish_realpath (status -f))) 35 | else 36 | set DIR (dirname (fish_realpath (pwd)/(status -f))) 37 | end 38 | 39 | # Always print our current profile 40 | if [ $AWS_DEFAULT_PROFILE ] 41 | echo "Current AWS Profile: $AWS_DEFAULT_PROFILE" 42 | else 43 | echo "Current AWS Profile: none" 44 | end 45 | 46 | 47 | # Simple check if we are sourced 48 | set -l result (status -t) 49 | switch "$result" 50 | case '*sourcing*' 51 | # If we were caught here then COOL, we are "sourced" 52 | case '*' 53 | echo "ERROR: Not sourced, please run source " (status -f) 54 | exit 1 55 | end 56 | 57 | # Now do the magic... 58 | eval unlink /tmp/aws-choose-profile.temp 2> /dev/null 59 | eval $DIR/aws-choose-profile-helper.py /tmp/aws-choose-profile.temp 60 | set -l CHOSEN_PROFILE (tail -n1 /tmp/aws-choose-profile.temp 2>/dev/null) 61 | 62 | # Remove our temp file 63 | eval unlink /tmp/aws-choose-profile.temp 2>/dev/null 64 | 65 | # Always set our environment variables if the user said to 66 | if [ $CHOSEN_PROFILE ] 67 | echo "Chosen Profile: $CHOSEN_PROFILE" 68 | set -x AWS_DEFAULT_PROFILE $CHOSEN_PROFILE 69 | set -x AWS_PROFILE $CHOSEN_PROFILE 70 | echo "Exported AWS_PROFILE and AWS_DEFAULT_PROFILE to $CHOSEN_PROFILE" 71 | exit 0 72 | else 73 | echo "No profile chosen, so no profile set" 74 | exit 1 75 | end 76 | -------------------------------------------------------------------------------- /aws-choose-profile/demo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DevOps-Nirvana/aws-missing-tools/fdec7507c71143cbb6dda3e05cdf29f4a3db785e/aws-choose-profile/demo.png -------------------------------------------------------------------------------- /aws-env-vars-into-pbcopy.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | BACKUPFILE="/tmp/backup-env-vars" 3 | env | grep AWS_ | grep -v _PROFILE > $BACKUPFILE 4 | # echo 'export PS1="$(echo $PS1)"' # >> $BACKUPFILE 5 | # Set the export up 6 | sed -i -e 's/^/export /' $BACKUPFILE 7 | echo "cd $PWD" >> $BACKUPFILE 8 | cat $BACKUPFILE | pbcopy 9 | rm -f $BACKUPFILE || true 10 | -------------------------------------------------------------------------------- /aws-iam-require-mfa-allow-self-service/README.md: -------------------------------------------------------------------------------- 1 | # IAM Require MFA - Allow Self Service 2 | 3 | ## Introduction: 4 | This is a simple / example profile that uses Amazon's base profile, but fills in a lot of gaps to allow self management. See [Amazon's Official MFA Self-Manage Profile Here](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_iam_mfa-selfmanage.html). 
It is recommended to apply this to all users which are actual users (aka, not service users) to mandate that they use 2FA. 5 | 6 | This combos nicely with the [AWSCLI Login Virtual MFA Script](../aws-mfa-login) 7 | -------------------------------------------------------------------------------- /aws-iam-require-mfa-allow-self-service/policy.json: -------------------------------------------------------------------------------- 1 | { 2 | "Version": "2012-10-17", 3 | "Statement": [ 4 | { 5 | "Sid": "AllowListUsersAllIfMFA", 6 | "Effect": "Allow", 7 | "Action": [ 8 | "iam:ListUsers", 9 | "iam:ListPolicies", 10 | "iam:ListGroups", 11 | "iam:ListGroupPolicies", 12 | "iam:ListAttachedGroupPolicies", 13 | "iam:GetServiceLastAccessedDetails", 14 | "iam:GetPolicyVersion", 15 | "iam:GetGroup", 16 | "access-analyzer:ListPolicyGenerations" 17 | ], 18 | "Resource": "*", 19 | "Condition": { 20 | "Bool": { 21 | "aws:MultiFactorAuthPresent": [ 22 | "true" 23 | ] 24 | } 25 | } 26 | }, 27 | { 28 | "Sid": "AllowIndividualUserToDescribeTheirOwnMFAAndSecurityObjects", 29 | "Effect": "Allow", 30 | "Action": [ 31 | "iam:getUser", 32 | "iam:ResyncMFADevice", 33 | "iam:ListVirtualMFADevices", 34 | "iam:ListUserTags", 35 | "iam:ListUserPolicies", 36 | "iam:ListSigningCertificates", 37 | "iam:ListServiceSpecificCredentials", 38 | "iam:ListSSHPublicKeys", 39 | "iam:ListPoliciesGrantingServiceAccess", 40 | "iam:ListMFADevices", 41 | "iam:ListGroupsForUser", 42 | "iam:ListAttachedUserPolicies", 43 | "iam:ListAccessKeys", 44 | "iam:GetSSHPublicKey", 45 | "iam:GenerateServiceLastAccessedDetails", 46 | "iam:EnableMFADevice", 47 | "iam:CreateVirtualMFADevice" 48 | ], 49 | "Resource": [ 50 | "arn:aws:iam::*:user/${aws:username}", 51 | "arn:aws:iam::*:mfa/${aws:username}" 52 | ] 53 | }, 54 | { 55 | "Sid": "AllowIndividualUserToManageTheirOwnMFAWhenUsingMFA", 56 | "Effect": "Allow", 57 | "Action": [ 58 | "iam:UploadSigningCertificate", 59 | "iam:UploadSSHPublicKey", 60 | "iam:UpdateSigningCertificate", 61 | "iam:UpdateServiceSpecificCredential", 62 | "iam:UpdateAccessKey", 63 | "iam:ResetServiceSpecificCredential", 64 | "iam:DeleteVirtualMFADevice", 65 | "iam:DeleteSigningCertificate", 66 | "iam:DeleteServiceSpecificCredential", 67 | "iam:DeleteSSHPublicKey", 68 | "iam:DeleteAccessKey", 69 | "iam:DeactivateMFADevice", 70 | "iam:CreateServiceSpecificCredential", 71 | "iam:CreateAccessKey" 72 | ], 73 | "Resource": [ 74 | "arn:aws:iam::*:user/${aws:username}", 75 | "arn:aws:iam::*:mfa/${aws:username}" 76 | ], 77 | "Condition": { 78 | "Bool": { 79 | "aws:MultiFactorAuthPresent": [ 80 | "true" 81 | ] 82 | } 83 | } 84 | }, 85 | { 86 | "Sid": "BlockMostAccessUnlessSignedInWithMFA", 87 | "Effect": "Deny", 88 | "NotAction": [ 89 | "iam:getUser", 90 | "iam:ResyncMFADevice", 91 | "iam:ListVirtualMFADevices", 92 | "iam:ListSigningCertificates", 93 | "iam:ListServiceSpecificCredentials", 94 | "iam:ListSSHPublicKeys", 95 | "iam:ListMFADevices", 96 | "iam:ListAccessKeys", 97 | "iam:EnableMFADevice", 98 | "iam:CreateVirtualMFADevice", 99 | "iam:ChangePassword" 100 | ], 101 | "Resource": "*", 102 | "Condition": { 103 | "BoolIfExists": { 104 | "aws:MultiFactorAuthPresent": [ 105 | "false" 106 | ] 107 | } 108 | } 109 | } 110 | ] 111 | } 112 | -------------------------------------------------------------------------------- /aws-image-ourself/aws-image-ourself: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # TODO ADD COMMENTS 4 | # Very simple helper to make an AMI of the server we're 
currently on
5 | # TODO: Migrate the existing, more advanced PHP-based logic to replace this simple logic below
6 | #
7 | EC2_INSTANCE_ID=$(curl --max-time 2 --silent http://169.254.169.254/latest/meta-data/instance-id)
8 | 
9 | # Check if EC2_INSTANCE_ID is empty!  AKA, not on Amazon or no networking stack
10 | 
11 | if [ $# -ge 1 ]
12 | then
13 |     IMAGE_NAME=$1_$(date +%Y%m%d_%H%M%S)
14 | else
15 |     IMAGE_NAME=$(echo $EC2_INSTANCE_ID)_$(date +%Y%m%d_%H%M%S)
16 | fi
17 | 
18 | 
19 | echo "Our instance id: $EC2_INSTANCE_ID"
20 | echo "Creating image: $IMAGE_NAME"
21 | echo -n "Please wait... "
22 | aws ec2 create-image --no-reboot --name $IMAGE_NAME --instance-id $EC2_INSTANCE_ID
23 | 
--------------------------------------------------------------------------------
/aws-mfa-login/README.md:
--------------------------------------------------------------------------------
1 | # AWS MFA Login
2 | 
3 | ## Introduction:
4 | This script performs an AWS Access Key / Secret Key login via 2FA on your CLI, or assumes into a role via 2FA.
5 | 
6 | * Allows you to log in with 2FA via the CLI for an existing Access/Secret key
7 | * Allows you to assume into a role on the same or a different AWS account
8 | * Preserves your region if set on the source role/account
9 | * Currently only supports the Virtual MFA, send a PR if you wish to add others
10 | * Prevents potentially huge security leaks from accidentally committed access/secret keys in SCM/Git
11 | * _**NOTE:** When combined with [AWS IAM Require MFA Allow Self Service Profile](#example-user-with-mandatory-2fa)_
12 | * Allows you to ensure better/best security practices on AWS, making all your roles mandatory 2FA to assume into them.
13 | * _**NOTE:** See [example below](#example-role-with-mandatory-2fa) for roles with mandatory 2FA on them_
14 | 
15 | This script assumes you already have a working AWS CLI and profile. If you are using AWS CLI profiles, make sure you set your profile before running this script with eg: `export AWS_DEFAULT_PROFILE=mycompany` which is the same name you used to `aws configure --profile mycompany`.
16 | 
17 | For a helper for this, see: https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/aws-choose-profile
18 | 
19 | OR...
20 | 
21 | ## Enable 2FA on an Access/Secret Key
22 | 
23 | ```bash
24 | # First, try this command (replace mycompany with your awscli profile name and the name of your aws account alias or company name)
25 | AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=mycompany aws-mfa-login
26 | 
27 | # And if that works, you have to check which shell you are using. The default since recently on OS-X is (annoyingly) zsh, so we'll have instructions for zsh/bash
28 | echo $SHELL
29 | 
30 | # If $SHELL has "bash" in it, you'll do the following...
31 | # #1: make an alias in your ~/.bash_profile like this...
32 | echo "alias mycompany_aws_2fa='AWS_DEFAULT_PROFILE=mycompany-aws-root ASSUME_ROLE_LABEL=mycompany aws-mfa-login'" >> ~/.bash_profile
33 | # #2: begin using it instantly
34 | source ~/.bash_profile
35 | 
36 | # OR If $SHELL has "zsh" in it, you'll do the following...
37 | # #1: make an alias in your ~/.zshrc like this...
38 | echo "alias mycompany_aws_2fa='AWS_DEFAULT_PROFILE=mycompany-aws-root ASSUME_ROLE_LABEL=mycompany aws-mfa-login'" >> ~/.zshrc
39 | # #2: begin using it instantly
40 | source ~/.zshrc
41 | 
42 | # then run it with... 
43 | mycompany_aws_2fa
44 | ```
45 | 
46 | ## Assume into another role with 2FA
47 | 
48 | ```bash
49 | # First, try this command (replace the values below, similar to above, but add the ASSUME_ROLE_ARN with the ARN you want to assume)
50 | AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=my_role_name_here ASSUME_ROLE_ARN=arn:aws:iam::1231231234:role/role_name_here aws-mfa-login
51 | 
52 | # If you get an error about invalid session duration (defaults to 12h) lower it to 1h (3600 seconds) with the following...
53 | AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=my_role_name_here ASSUME_ROLE_ARN=arn:aws:iam::1231231234:role/role_name_here SESSION_DURATION=3600 aws-mfa-login
54 | 
55 | # And if that works, similarly to above you have to check which shell you are using. The default since recently on OS-X is (annoyingly) zsh, so we'll have instructions for zsh/bash
56 | echo $SHELL
57 | 
58 | # If $SHELL has "bash" in it, you'll do the following...
59 | # #1: make an alias in your ~/.bash_profile like this...
60 | echo "alias my_other_company_role_aws_2fa='AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=my_role_name_here ASSUME_ROLE_ARN=arn:aws:iam::1231231234:role/role_name_here aws-mfa-login'" >> ~/.bash_profile
61 | # #2: begin using it instantly
62 | source ~/.bash_profile
63 | 
64 | # OR If $SHELL has "zsh" in it, you'll do the following...
65 | # #1: make an alias in your ~/.zshrc like this...
66 | echo "alias my_other_company_role_aws_2fa='AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=my_role_name_here ASSUME_ROLE_ARN=arn:aws:iam::1231231234:role/role_name_here aws-mfa-login'" >> ~/.zshrc
67 | # #2: begin using it instantly
68 | source ~/.zshrc
69 | 
70 | # then run it with...
71 | my_other_company_role_aws_2fa
72 | ```
73 | 
74 | ## Example: User with Mandatory 2FA
75 | 
76 | The aws-mfa-login tool combos nicely with the below self-service IAM profile, which enforces 2FA for AWS IAM Access Keys and thereby makes them inherently less sensitive. This would prevent a potential security/privacy leak in your organization if someone accidentally committed their access/secret keys somewhere (like Github).
77 | 
78 | This allows you to skip having to setup complex client-side systems for your employees such as [AWS Vault](https://github.com/99designs/aws-vault) to try to encrypt your IAM access-keys/secrets, instead leveraging industry-standard 2FA on top of your existing Access/Secret credential pairs.
79 | 
80 | Below is the JSON copied from [AWS IAM Require MFA Allow Self Service Profile](../aws-iam-require-mfa-allow-self-service/) for your reference. This is an ideal, battle-tested configuration: it allows a user with no MFA to self-manage just enough to enable it, and then allows the standard read/list operations most users would appreciate from the get-go, preventing a ton of errors in the AWS Console. It's a combination of a few recommended permissions from various AWS articles.
81 | 
82 | A recommended setup would be to assign this Profile to an AWS Group (eg: called "mandatory-2fa") and then assign it to all your _actual_ users (NOT service account users, eg: an SES email user) to ensure org-wide mandatory 2FA policies, as sketched below. 
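For illustration, wiring that up with boto3 might look like the following sketch (the group name, policy name, and user name are placeholder assumptions, not from this repo); the policy JSON itself follows:

```python
import boto3

iam = boto3.client("iam")

# Hypothetical names -- adjust to your org's conventions
group_name = "mandatory-2fa"
with open("policy.json") as f:  # the policy JSON shown below
    policy_document = f.read()

iam.create_group(GroupName=group_name)
iam.put_group_policy(
    GroupName=group_name,
    PolicyName="require-mfa-allow-self-service",
    PolicyDocument=policy_document,
)
# Repeat for each actual human user (not service accounts)
iam.add_user_to_group(GroupName=group_name, UserName="some.human.user")
```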
83 | 84 | ```json 85 | { 86 | "Version": "2012-10-17", 87 | "Statement": [ 88 | { 89 | "Sid": "AllowListUsersAllIfMFA", 90 | "Effect": "Allow", 91 | "Action": [ 92 | "iam:ListUsers", 93 | "iam:ListPolicies", 94 | "iam:ListGroups", 95 | "iam:ListGroupPolicies", 96 | "iam:ListAttachedGroupPolicies", 97 | "iam:GetServiceLastAccessedDetails", 98 | "iam:GetPolicyVersion", 99 | "iam:GetGroup", 100 | "access-analyzer:ListPolicyGenerations" 101 | ], 102 | "Resource": "*", 103 | "Condition": { 104 | "Bool": { 105 | "aws:MultiFactorAuthPresent": [ 106 | "true" 107 | ] 108 | } 109 | } 110 | }, 111 | { 112 | "Sid": "AllowIndividualUserToDescribeTheirOwnMFAAndSecurityObjects", 113 | "Effect": "Allow", 114 | "Action": [ 115 | "iam:getUser", 116 | "iam:ResyncMFADevice", 117 | "iam:ListVirtualMFADevices", 118 | "iam:ListUserTags", 119 | "iam:ListUserPolicies", 120 | "iam:ListSigningCertificates", 121 | "iam:ListServiceSpecificCredentials", 122 | "iam:ListSSHPublicKeys", 123 | "iam:ListPoliciesGrantingServiceAccess", 124 | "iam:ListMFADevices", 125 | "iam:ListGroupsForUser", 126 | "iam:ListAttachedUserPolicies", 127 | "iam:ListAccessKeys", 128 | "iam:GetSSHPublicKey", 129 | "iam:GenerateServiceLastAccessedDetails", 130 | "iam:EnableMFADevice", 131 | "iam:CreateVirtualMFADevice" 132 | ], 133 | "Resource": [ 134 | "arn:aws:iam::*:user/${aws:username}", 135 | "arn:aws:iam::*:mfa/${aws:username}" 136 | ] 137 | }, 138 | { 139 | "Sid": "AllowIndividualUserToManageTheirOwnMFAWhenUsingMFA", 140 | "Effect": "Allow", 141 | "Action": [ 142 | "iam:UploadSigningCertificate", 143 | "iam:UploadSSHPublicKey", 144 | "iam:UpdateSigningCertificate", 145 | "iam:UpdateServiceSpecificCredential", 146 | "iam:UpdateAccessKey", 147 | "iam:ResetServiceSpecificCredential", 148 | "iam:DeleteVirtualMFADevice", 149 | "iam:DeleteSigningCertificate", 150 | "iam:DeleteServiceSpecificCredential", 151 | "iam:DeleteSSHPublicKey", 152 | "iam:DeleteAccessKey", 153 | "iam:DeactivateMFADevice", 154 | "iam:CreateServiceSpecificCredential", 155 | "iam:CreateAccessKey" 156 | ], 157 | "Resource": [ 158 | "arn:aws:iam::*:user/${aws:username}", 159 | "arn:aws:iam::*:mfa/${aws:username}" 160 | ], 161 | "Condition": { 162 | "Bool": { 163 | "aws:MultiFactorAuthPresent": [ 164 | "true" 165 | ] 166 | } 167 | } 168 | }, 169 | { 170 | "Sid": "BlockMostAccessUnlessSignedInWithMFA", 171 | "Effect": "Deny", 172 | "NotAction": [ 173 | "iam:getUser", 174 | "iam:ResyncMFADevice", 175 | "iam:ListVirtualMFADevices", 176 | "iam:ListSigningCertificates", 177 | "iam:ListServiceSpecificCredentials", 178 | "iam:ListSSHPublicKeys", 179 | "iam:ListMFADevices", 180 | "iam:ListAccessKeys", 181 | "iam:EnableMFADevice", 182 | "iam:CreateVirtualMFADevice", 183 | "iam:ChangePassword" 184 | ], 185 | "Resource": "*", 186 | "Condition": { 187 | "BoolIfExists": { 188 | "aws:MultiFactorAuthPresent": [ 189 | "false" 190 | ] 191 | } 192 | } 193 | } 194 | ] 195 | } 196 | ``` 197 | 198 | 199 | ## Example: Role with Mandatory 2FA 200 | 201 | Below is the JSON for the "Trust Relationship" for an IAM role which will make MFA/2FA mandatory, ensuring users can't skip 2FA to get into the role. 
202 | 203 | ```json 204 | { 205 | "Version": "2012-10-17", 206 | "Statement": [ 207 | { 208 | "Sid": "AllowUsersToAssumeFromAccountWithMFA", 209 | "Effect": "Allow", 210 | "Principal": { 211 | "AWS": "arn:aws:iam::123123123123:root" 212 | }, 213 | "Action": "sts:AssumeRole", 214 | "Condition": { 215 | "Bool": { 216 | "aws:MultiFactorAuthPresent": "true" 217 | } 218 | } 219 | } 220 | ] 221 | } 222 | ``` 223 | -------------------------------------------------------------------------------- /aws-mfa-login/aws-mfa-login: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | ##################################### 3 | # This script logs in via 2FA for an IAM Access/Secret Key. 4 | # OR 5 | # This script allows assuming into an role via 2FA if ASSUME_ROLE_ARN is set 6 | # 7 | # This script assumes you're already having a working AWS CLI and profile 8 | # If you are using AWS CLI profiles, make sure you set your profile before running 9 | # this script with eg: `export AWS_DEFAULT_PROFILE=mycompany` which is the same 10 | # name you used to `aws configure --profile mycompany`. 11 | # 12 | # For a helper for this, see: https://github.com/DevOps-Nirvana/aws-missing-tools/tree/master/aws-choose-profile 13 | # OR... 14 | # Make some aliases in your .bash_profile which instantly choose the org and label. For example try this command... 15 | # AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=mycompany aws-mfa-login 16 | # and if that works, make an alias in your ~/.bash_profile like this... 17 | # echo "alias mycompany_aws_2fa='AWS_DEFAULT_PROFILE=mycompany-aws-root ASSUME_ROLE_LABEL=mycompany aws-mfa-login'" >> ~/.bash_profile 18 | # or if using zsh, make an alias in your ~/.zshrc like this... 19 | # echo "alias mycompany_aws_2fa='AWS_DEFAULT_PROFILE=mycompany-aws-root ASSUME_ROLE_LABEL=mycompany aws-mfa-login'" >> ~/.zshrc 20 | # begin using it instantly 21 | # source ~/.bash_profile 22 | # or 23 | # source ~/.zshrc 24 | # then run it with... 25 | # mycompany_aws_2fa 26 | # 27 | # For use to assume into an role (eg: in the same or another account) you'll need one more variable to pass in... 28 | # ASSUME_ROLE_ARN=arn:aws:iam::1231231234:role/role_name_here AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=mycompany aws-mfa-login 29 | # If you get an error about "session duration" you may need to lower your duration by setting SESSION_DURATION which is set to 43200 (12 hours) by default down to 3600 (1 hour) 30 | # SESSION_DURATION=3600 ASSUME_ROLE_ARN=arn:aws:iam::1231231234:role/role_name_here AWS_DEFAULT_PROFILE=mycompany ASSUME_ROLE_LABEL=mycompany aws-mfa-login 31 | ##################################### 32 | 33 | # First, try to get our username so we know our MFA device path from it, this also checks if our CLI is functional 34 | echo "Checking if you have a function awscli / active profile" 35 | 36 | # This script uses JSON output, so we'll set this here forcibly 37 | export AWS_DEFAULT_OUTPUT="json" 38 | 39 | echo "aws sts get-caller-identity" 40 | data=$(aws sts get-caller-identity) 41 | export AWS_ACCOUNT_ID=$(echo -- "$data" | sed -n 's!.*"Account": "\(.*\)".*!\1!p') 42 | export AWS_USER_ARN=$(echo -- "$data" | sed -n 's!.*"Arn": "\(.*\)".*!\1!p') 43 | export AWS_MFA_DEVICE_ARN=$(echo "$AWS_USER_ARN" | sed 's|user/|mfa/|') 44 | echo "Using MFA: $AWS_MFA_DEVICE_ARN" 45 | if [ ! 
-n "$AWS_MFA_DEVICE_ARN" ]; then 46 | echo "" 47 | echo "Error, the call to 'aws sts get-caller-identity' failed, you either: " 48 | echo "" 49 | echo " * Never setup your CLI with 'aws configure', or " 50 | echo " * Your subshell session expired, aka you ran this and your session expired" 51 | echo " if so, simply type 'exit' and re-try this command" 52 | echo " * You forgot to choose a profile by running..." 53 | echo " export AWS_DEFAULT_PROFILE=mycompany && $0" 54 | echo "" 55 | exit 1 56 | fi 57 | 58 | echo "aws configure get region" 59 | AUTODETECTED_REGION=$(aws configure get region) 60 | if [ -z "$AUTODETECTED_REGION" ]; then 61 | echo "No default region detected, not setting any region as default, likely to use us-east-1 as default" 62 | else 63 | echo "Using region: $AUTODETECTED_REGION" 64 | fi 65 | 66 | # If you need to override this so you have a sex(ier) name for your organization please run... 67 | # ASSUME_ROLE_LABEL=mycompanyname aws-mfa-login 68 | export ASSUME_ROLE_LABEL=${ASSUME_ROLE_LABEL-$AWS_ACCOUNT_ID-2FA} 69 | 70 | # Output for debugging / informational purposes 71 | echo "Found AWS Account ID: $AWS_ACCOUNT_ID" 72 | echo "Found AWS User ARN: $AWS_USER_ARN" 73 | echo "Using Assume Role Label: $ASSUME_ROLE_LABEL" 74 | 75 | # Detect if we want to assume an specific role instead of just 2FA as our user 76 | if [ -z "$ASSUME_ROLE_ARN" ]; then 77 | echo "No ASSUME_ROLE_ARN found, we are 2FA-ing for our current user" 78 | HRS="12" 79 | else 80 | echo "Found Assume Role: $ASSUME_ROLE_ARN" 81 | SESSION_DURATION=${SESSION_DURATION-43200} # In seconds, by default we allow 12 hours 82 | HRS=`expr ${SESSION_DURATION} / 3600` 83 | echo "Using SESSION_DURATION of: $HRS hours" 84 | fi 85 | 86 | # Prompt user for 2FA code 87 | while true; do 88 | read -p "Please type in your MFA code: " MFA_CODE 89 | if [ "MFA_CODE" == "" ]; then 90 | echo "Error, no input found..." 91 | else 92 | break 93 | fi 94 | done 95 | 96 | # Assume role 97 | if [ -z "$ASSUME_ROLE_ARN" ]; then 98 | echo "Trying to process 2FA for to $ASSUME_ROLE_LABEL" 99 | echo "aws sts get-session-token --serial-number $AWS_MFA_DEVICE_ARN --token-code $MFA_CODE" 100 | tokens=$(aws sts get-session-token --serial-number $AWS_MFA_DEVICE_ARN --token-code $MFA_CODE) 101 | else 102 | echo "Trying to assume role with 2FA into: $ASSUME_ROLE_ARN" 103 | echo "aws sts assume-role --role-arn $ASSUME_ROLE_ARN --serial-number $AWS_MFA_DEVICE_ARN --token-code $MFA_CODE --role-session-name cli-2fa-assume-role --duration-seconds $SESSION_DURATION" 104 | tokens=$(aws sts assume-role --role-arn $ASSUME_ROLE_ARN --serial-number $AWS_MFA_DEVICE_ARN --token-code $MFA_CODE --role-session-name cli-2fa-assume-role --duration-seconds $SESSION_DURATION) 105 | fi 106 | 107 | # Validate the tokens are there 108 | if [ "$?" -ne 0 ]; then 109 | echo $tokens 110 | echo "Error while trying to enable 2FA and/or assume role" 111 | exit 1 112 | fi 113 | 114 | echo "Parsing credentials..." 115 | export AWS_ACCESS_KEY_ID=$(echo -- "$tokens" | sed -n 's!.*"AccessKeyId": "\(.*\)".*!\1!p') 116 | export AWS_SECRET_ACCESS_KEY=$(echo -- "$tokens" | sed -n 's!.*"SecretAccessKey": "\(.*\)".*!\1!p') 117 | export AWS_SESSION_TOKEN=$(echo -- "$tokens" | sed -n 's!.*"SessionToken": "\(.*\)".*!\1!p') 118 | export AWS_SESSION_EXPIRATION_TIME=$(echo -- "$tokens" | sed -n 's!.*"Expiration": "\(.*\)".*!\1!p') 119 | echo "Session expires at: $AWS_SESSION_EXPIRATION_TIME UTC (in $HRS hours)" 120 | 121 | if [ ! 
-z "$AUTODETECTED_REGION" ]; then 122 | echo "Setting region to: $AUTODETECTED_REGION" 123 | export AWS_DEFAULT_REGION=$AUTODETECTED_REGION 124 | fi 125 | 126 | # Enter a subshell with permissions of assumed role 127 | echo "Entering subshell..." 128 | 129 | # Alternatively to $SHELL, get parent shell by comparing $PPID to ps aux output and grabbing process name 130 | # typeset parentshell=$( 131 | # ps aux | while IFS= read -r line; do 132 | # shellpid=$(echo $line | awk '{print $2}') 133 | # if [[ "$shellpid" -eq "$PPID" ]]; then 134 | # echo $line | awk '{print $11}' 135 | # fi 136 | # done) 137 | # Get shell for this user's defaults 138 | 139 | # Spawn subshell based on parent shell process name 140 | if [[ $SHELL == *"zsh"* ]]; then 141 | mkdir ~/.zshrc_1 2> /dev/null 142 | echo 'source ~/.zshrc' > ~/.zshrc_1/.zshrc 143 | echo 'unset AWS_PROFILE AWS_DEFAULT_PROFILE && PS1="$ASSUME_ROLE_LABEL $PS1"' >> ~/.zshrc_1/.zshrc 144 | ZDOTDIR=~/.zshrc_1 zsh 2> /dev/null 145 | elif [[ $SHELL == *"bash"* ]]; then 146 | touch ~/.bashrc 147 | touch ~/.bash_profile 148 | bash --rcfile <(echo 'source ~/.bashrc && source ~/.bash_profile && unset AWS_PROFILE AWS_DEFAULT_PROFILE && PS1="$ASSUME_ROLE_LABEL $PS1"') 149 | else 150 | echo "Unknown or invalid shell: $parentshell" 151 | exit 1 152 | fi 153 | 154 | echo "Exiting subshell..." 155 | -------------------------------------------------------------------------------- /aws-push-cloudwatch-instance-metrics/README.md: -------------------------------------------------------------------------------- 1 | # AWS Push Cloudwatch Instance Metrics 2 | 3 | aws-push-cloudwatch-instance-metrics.py is a python script that grabs metrics about this instance including memory usage, swap usage, and disk usage, and pushes them to CloudWatch against our EC2 instance. If this server is part of an autoscaling group, it also pushes against that dimension (within the EC2 namespace) 4 | 5 | This script is loosely based around the old boto2 version of this which is no longer available, which used to be at... 6 | https://github.com/colinbjohnson/aws-missing-tools 7 | 8 | ## Installation: 9 | I recommend you symlink this into your user or system bin folder 10 | 11 | ### Installation Examples: 12 | 13 | ``` 14 | # You first need to have 'boto3' installed, install it with, you may not have 15 | # pip yet, if not, please install pip first usually... 16 | # apt-get install python-pip or yum install python-pip 17 | pip install boto3 18 | 19 | # Then if desired, symlink in place, so you can "git pull" and update this command from time to time 20 | ln -s $(pwd)/aws-push-cloudwatch-instance-metrics.py /usr/local/bin/ 21 | ``` 22 | or copying it into place with... 23 | ``` 24 | # Not desired, but possible depending on your preference 25 | cp -a aws-push-cloudwatch-instance-metrics.py /usr/local/bin/ 26 | ``` 27 | 28 | ## Directions For Use: 29 | ``` 30 | ./aws-push-cloudwatch-instance-metrics.py 31 | ``` 32 | 33 | Intended to be used via cron... 34 | ``` 35 | # Every minute 36 | * * * * * /usr/local/bin/aws-push-cloudwatch-instance-metrics.py >> /var/log/aws-push-cloudwatch-instance-metrics.py.log 37 | # or every 5 minutes 38 | */5 * * * * /usr/local/bin/aws-push-cloudwatch-instance-metrics.py >> /var/log/aws-push-cloudwatch-instance-metrics.py.log 39 | ``` 40 | 41 | ## Potential Use: 42 | For getting useful EC2 metrics into cloudwatch which you can set alarms for and react accordingly, including setting alarms with automatic autoscaling actions. 
43 | 44 | 45 | ## Todo: 46 | If you'd like to help contribute (or when the author is bored) there are some features that could be added... 47 | - Probably needs a lot of testing, but I've been using it for a number of clients for years in some iteration 48 | - Others? Submit feature requests as a bug in Github 49 | 50 | 51 | ## Additional Information: 52 | - Author(s): Farley farley@neonsurge.com 53 | - First Published: 2016-06-20 54 | - Version 0.0.1 55 | - License Type: MIT 56 | -------------------------------------------------------------------------------- /aws-push-cloudwatch-instance-metrics/aws-push-cloudwatch-instance-metrics.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | ''' 3 | Send memory usage metrics to Amazon CloudWatch 4 | 5 | This is intended to run on an Amazon EC2 instance and requires an IAM 6 | role allowing to write CloudWatch metrics. Alternatively, you can create 7 | a boto credentials file and rely on it instead. 8 | 9 | Original idea based on https://github.com/colinbjohnson/aws-missing-tools 10 | modified to detect if we are under an autoscaler and then send those metrics 11 | to the autoscaler also 12 | 13 | NOTE: You may need to install boto3 on a default Ubuntu/CentOS install before this script will work 14 | 15 | Current incarnation by: Farley 16 | ''' 17 | 18 | import sys 19 | import re 20 | import datetime 21 | import subprocess 22 | import boto3 23 | from pprint import pprint 24 | 25 | def get_region(): 26 | import urllib2 27 | try: 28 | region = urllib2.urlopen('http://169.254.169.254/latest/meta-data/placement/availability-zone').read() 29 | return region[0:-1] 30 | except: 31 | return False 32 | 33 | 34 | 35 | def get_instance_id(): 36 | import urllib2 37 | try: 38 | instance_id = urllib2.urlopen('http://169.254.169.254/latest/meta-data/instance-id').read() 39 | return instance_id 40 | except: 41 | return False 42 | 43 | 44 | def collect_memory_usage(): 45 | meminfo = {} 46 | pattern = re.compile('([\w\(\)]+):\s*(\d+)(:?\s*(\w+))?') 47 | with open('/proc/meminfo') as f: 48 | for line in f: 49 | match = pattern.match(line) 50 | if match: 51 | # For now we don't care about units (match.group(3)) 52 | meminfo[match.group(1)] = float(match.group(2)) 53 | return meminfo 54 | 55 | 56 | def get_root_disk_usage_percentage(): 57 | df = subprocess.Popen(["df", "/"], stdout=subprocess.PIPE) 58 | output = df.communicate()[0] 59 | device, size, used, available, percent, mountpoint = output.split("\n")[1].split() 60 | return percent[0:-1] 61 | 62 | 63 | 64 | region = get_region() 65 | cloudwatch = boto3.client('cloudwatch', region_name=region) 66 | 67 | 68 | def send_cloud_metrics_against_instance_and_autoscaler(instance_id, region, metrics, namespace="EC2", unit='Percent'): 69 | ''' 70 | Send EC2 metrics to CloudWatch 71 | metrics is expected to be a map of key -> value pairs of metrics 72 | ''' 73 | 74 | # First push all our metrics against our instance id dimension 75 | for key, metric in metrics.iteritems(): 76 | # print(" key " + key + " metric " + str(metric)) 77 | cloudwatch.put_metric_data( 78 | Namespace=namespace, 79 | MetricData=[ 80 | { 81 | 'MetricName': key, 82 | 'Dimensions': [{ 83 | 'Name': 'InstanceId', 84 | 'Value': instance_id 85 | }], 86 | 'Timestamp': datetime.datetime.now(), 87 | 'Value': float(metric), 88 | 'Unit': unit 89 | } 90 | ] 91 | ) 92 | 93 | autoscaling = boto3.client('autoscaling', region_name=region) 94 | autoscalers = autoscaling.describe_auto_scaling_instances( 95 | 
InstanceIds= [ 96 | instance_id 97 | ])['AutoScalingInstances'] 98 | for autoscaler in autoscalers: 99 | for key, metric in metrics.iteritems(): 100 | # print(" key " + key + " metric " + str(metric)) 101 | cloudwatch.put_metric_data( 102 | Namespace=namespace, 103 | MetricData=[ 104 | { 105 | 'MetricName': key, 106 | 'Dimensions': [{ 107 | 'Name': 'AutoScalingGroupName', 108 | 'Value': autoscaler['AutoScalingGroupName'] 109 | }], 110 | 'Timestamp': datetime.datetime.now(), 111 | 'Value': float(metric), 112 | 'Unit': unit 113 | } 114 | ] 115 | ) 116 | 117 | if __name__ == '__main__': 118 | instance_id = get_instance_id() 119 | mem_usage = collect_memory_usage() 120 | mem_free = mem_usage['MemFree'] + mem_usage['Buffers'] + mem_usage['Cached'] 121 | mem_used = mem_usage['MemTotal'] - mem_free 122 | if mem_usage['SwapTotal'] != 0 : 123 | swap_used = mem_usage['SwapTotal'] - mem_usage['SwapFree'] - mem_usage['SwapCached'] 124 | swap_percent = swap_used / mem_usage['SwapTotal'] * 100 125 | else: 126 | swap_percent = 0 127 | disk_usage = get_root_disk_usage_percentage() 128 | 129 | metrics = {'MemUsage': mem_used / mem_usage['MemTotal'] * 100, 130 | 'SwapUsage': swap_percent, 131 | 'DiskUsage': disk_usage } 132 | 133 | result = send_cloud_metrics_against_instance_and_autoscaler(instance_id, region, metrics) 134 | 135 | print str(datetime.datetime.now()) + ": sent metrics (" + instance_id + ": " + str(metrics) + ") to CloudWatch" 136 | -------------------------------------------------------------------------------- /cleanup-packer-aws-resources/README.md: -------------------------------------------------------------------------------- 1 | # AWS Cleanup Packer Resources 2 | 3 | TODO 4 | -------------------------------------------------------------------------------- /cleanup-packer-aws-resources/aws-permissions.json: -------------------------------------------------------------------------------- 1 | { 2 | "Version": "2012-10-17", 3 | "Statement": [ 4 | { 5 | "Effect": "Allow", 6 | "Action": [ 7 | "logs:CreateLogGroup", 8 | "logs:CreateLogStream", 9 | "logs:PutLogEvents" 10 | ], 11 | "Resource": "arn:aws:logs:*:*:*" 12 | }, 13 | { 14 | "Effect": "Allow", 15 | "Action": [ 16 | "ec2:DescribeInstances", 17 | "ec2:DescribeKeyPairs", 18 | "ec2:DescribeSecurityGroups", 19 | "ec2:deleteKeyPair", 20 | "ec2:deleteSecurityGroup", 21 | "ec2:describeRegions" 22 | ], 23 | "Resource": "*" 24 | }, 25 | { 26 | "Effect": "Allow", 27 | "Action": [ 28 | "ec2:TerminateInstances" 29 | ], 30 | "Condition": { 31 | "StringEquals": { 32 | "ec2:ResourceTag/Name": "Packer Builder" 33 | } 34 | }, 35 | "Resource": [ 36 | "arn:aws:ec2:*" 37 | ] 38 | } 39 | ] 40 | } -------------------------------------------------------------------------------- /cleanup-packer-aws-resources/aws-permissions.txt: -------------------------------------------------------------------------------- 1 | { 2 | "Version": "2012-10-17", 3 | "Statement": [ 4 | { 5 | "Effect": "Allow", 6 | "Action": [ 7 | "logs:CreateLogGroup", 8 | "logs:CreateLogStream", 9 | "logs:PutLogEvents" 10 | ], 11 | "Resource": "arn:aws:logs:*:*:*" 12 | }, 13 | { 14 | "Effect": "Allow", 15 | "Action": [ 16 | "ec2:DescribeInstances", 17 | "ec2:DescribeKeyPairs", 18 | "ec2:DescribeSecurityGroups", 19 | "ec2:deleteKeyPair", 20 | "ec2:deleteSecurityGroup", 21 | "ec2:describeRegions" 22 | ], 23 | "Resource": "*" 24 | }, 25 | { 26 | "Effect": "Allow", 27 | "Action": [ 28 | "ec2:TerminateInstances" 29 | ], 30 | 31 | "Condition": { 32 | "StringEquals": { 33 | 
"ec2:ResourceTag/Name":"Packer Builder" 34 | } 35 | }, 36 | "Resource": [ 37 | "arn:aws:ec2:*" 38 | ] 39 | } 40 | ] 41 | } -------------------------------------------------------------------------------- /cleanup-packer-aws-resources/cleanup-packer-aws-resources.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | ############################################################################### 4 | # 5 | # cleanup-packer-aws-resources.py Written by Farley 6 | # 7 | # Packer when used on an AWS account from tools like Jenkins or Rundeck, often 8 | # leaves remnants of its existance such as instances running, security groups, 9 | # and SSH keys. This script scans all regions of AWS for leftover packer 10 | # resources and removes them. 11 | # 12 | # This script can be run on the command-line standalone or ideally put into packer 13 | # and run via cloudwatch scheduled events like once a day or so 14 | # 15 | # This is from Farley's AWS missing tools 16 | # https://github.com/DevOps-Nirvana/aws-missing-tools/ 17 | # 18 | ############################################################################### 19 | # 20 | # Minimm AWS Permissions Necessary to run this script 21 | # 22 | # NOTE: The lambda:InvokeFunction is only needed if you want to run this from AWS Lambda 23 | # Similar to the logs:* functions are only needed if you want to run from lambda and if you want logging 24 | # 25 | # { 26 | # "Version": "2012-10-17", 27 | # "Statement": [ 28 | # { 29 | # "Effect": "Allow", 30 | # "Action": [ 31 | # "logs:CreateLogGroup", 32 | # "logs:CreateLogStream", 33 | # "logs:DescribeLogGroups", 34 | # "logs:DescribeLogStreams", 35 | # "logs:PutLogEvents", 36 | # "ec2:DescribeRegions", 37 | # "ec2:DescribeInstances", 38 | # "ec2:DescribeKeyPairs", 39 | # "ec2:DescribeSecurityGroups", 40 | # "ec2:TerminateInstances", 41 | # "ec2:DeleteKeyPair", 42 | # "ec2:DeleteSecurityGroup" 43 | # ], 44 | # "Resource": "*" 45 | # }, 46 | # { 47 | # "Action": "lambda:InvokeFunction", 48 | # "Effect": "Allow", 49 | # "Resource": "*" 50 | # } 51 | # ] 52 | # } 53 | # 54 | ############################################################################### 55 | 56 | from __future__ import print_function 57 | 58 | # For AWS 59 | import boto3 60 | # For pretty-print 61 | from pprint import pprint 62 | from datetime import datetime 63 | import calendar 64 | # For checking runtime context 65 | import os 66 | 67 | # The maximum age (in seconds) of a packer instance before we terminate it 68 | # 86400 == 1 day 69 | # 21600 == 6 hours 70 | # 10800 == 3 hours 71 | # 3600 == 1 hour 72 | max_age = 21600 73 | 74 | # Whether or not to output debug info as it does things 75 | debug = bool(os.environ.get('LOG_DEBUG')) 76 | 77 | # Our AWS regions, we'll call the AWS API to get the list of regions, so this is always up to date 78 | ec2 = boto3.client('ec2', region_name='us-west-1') 79 | regions = [] 80 | awsregions = ec2.describe_regions()['Regions'] 81 | for region in awsregions: 82 | regions.append(region['RegionName']) 83 | del ec2, awsregions 84 | 85 | # Helper to convert datetime with TZ to Unix time 86 | def dt2ts(dt): 87 | return calendar.timegm(dt.utctimetuple()) 88 | 89 | # Helper to convert seconds to a sexy format "x hours, x minutes, x seconds" etc 90 | def display_time(seconds, granularity=2): 91 | intervals = ( 92 | ('months', 18144000), # 60 * 60 * 24 * 7 * 30 (roughly) 93 | ('weeks', 604800), # 60 * 60 * 24 * 7 94 | ('days', 86400), # 60 * 60 * 24 95 | ('hours', 
111 | # Get instances from AWS from all regions that...
112 | #   #1: Are currently running
113 | #   #2: Have an SSH key name starting with "packer_" (i.e. were launched by Packer)
114 | #   #3: Have been alive longer than our specified limit
115 | def get_zombie_packer_instances(regions, maximum_age):
116 |     global debug
117 |     output = {}
118 | 
119 |     # Get our "now" timestamp for knowing how long ago instances were launched
120 |     utc_now = datetime.utcnow()
121 |     utc_now_ts = dt2ts(utc_now)
122 | 
123 |     for region in regions:
124 |         regionoutput = []
125 |         if debug is True:
126 |             print(f"Scanning region {region} for instances")
127 | 
128 |         # Create our EC2 handler
129 |         ec2 = boto3.client('ec2', region_name=region)
130 | 
131 |         response = ec2.describe_instances(  # NOTE: only the first page is scanned; see the paginator sketch below
132 |             MaxResults=1000
133 |         )
134 | 
135 |         for reservation in response['Reservations']:
136 |             for instance in reservation['Instances']:
137 | 
138 |                 if debug is True:
139 |                     print(f"Found instance: {instance['InstanceId']}")
140 |                 if instance['State']['Name'] == "running":
141 |                     if debug is True:
142 |                         print("  Instance is currently running")
143 |                 else:
144 |                     if debug is True:
145 |                         print("  Instance is not currently running, skipping...")
146 |                     continue
147 | 
148 |                 if instance.get('KeyName', '').startswith('packer_'):
149 |                     if debug is True:
150 |                         print("  Instance is a packer builder")
151 |                 else:
152 |                     if debug is True:
153 |                         print("  Instance is NOT a packer builder, skipping...")
154 |                     continue
155 | 
156 |                 if debug is True:
157 |                     print(f"  Found packer instance: {instance['InstanceId']}")
158 |                 launched_at = dt2ts(instance['LaunchTime'])
159 |                 if debug is True:
160 |                     print(f"  Instance started {display_time(utc_now_ts - launched_at)} ago")
161 |                 if (utc_now_ts - launched_at) > maximum_age:
162 |                     if debug is True:
163 |                         print("  Instance is older than the maximum age, marking for termination")
164 |                     regionoutput.append({
165 |                         "region": region,
166 |                         "instance_id": instance['InstanceId'],
167 |                         "keyname": instance['KeyName'],
168 |                         "security_groups": instance['SecurityGroups']
169 |                     })
170 |                 else:
171 |                     if debug is True:
172 |                         print("  Instance is too new to be terminated")
173 | 
174 |         output[region] = regionoutput
175 |     return output
176 | 
177 | def get_zombie_packer_keys(regions):
178 |     global debug
179 |     output = {}
180 |     for region in regions:
181 |         regionoutput = []
182 |         if debug is True:
183 |             print(f"Scanning region {region} for keys")
184 | 
185 |         # Create our EC2 handler
186 |         ec2 = boto3.client('ec2', region_name=region)
187 | 
188 |         response = ec2.describe_key_pairs(
189 |             Filters=[
190 |                 {
191 |                     'Name': 'key-name',
192 |                     'Values': ['packer_*'],
193 |                 },
194 |             ]
195 |         )
196 | 
197 |         for pair in response['KeyPairs']:
198 |             regionoutput.append(pair['KeyName'])
199 | 
200 |         output[region] = regionoutput
201 |     return output
202 | 
203 | 
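# A minimal sketch (an assumption, not part of the original script) of how the
# instance scan in get_zombie_packer_instances() could use a boto3 paginator so
# that regions with more than 1000 running instances are fully covered:
#
#   paginator = boto3.client('ec2', region_name=region).get_paginator('describe_instances')
#   pages = paginator.paginate(
#       Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
#   )
#   for page in pages:
#       for reservation in page['Reservations']:
#           for instance in reservation['Instances']:
#               if instance.get('KeyName', '').startswith('packer_'):
#                   pass  # same age check as above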
204 | def get_zombie_packer_security_groups(regions):
205 |     global debug
206 |     output = {}
207 |     for region in regions:
208 |         regionoutput = []
209 |         if debug is True:
210 |             print(f"Scanning region {region} for security groups")
211 | 
212 |         # Create our EC2 handler
213 |         ec2 = boto3.client('ec2', region_name=region)
214 | 
215 |         response = ec2.describe_security_groups(
216 |             Filters=[
217 |                 {
218 |                     'Name': 'group-name',
219 |                     'Values': ['packer_*'],
220 |                 },
221 |             ]
222 |         )
223 | 
224 |         # NOTE: Detecting stale security groups reliably doesn't seem to work,
225 |         # so we simply attempt each deletion and let any failure be caught below
226 |         for group in response['SecurityGroups']:
227 |             regionoutput.append(group['GroupName'])
228 | 
229 |         output[region] = regionoutput
230 |     return output
231 | 
232 | def lambda_handler(event, context):
233 |     global regions, max_age
234 | 
235 |     print(f"Scanning {len(regions)} AWS regions for zombie packer instances...")
236 | 
237 |     zombies = get_zombie_packer_instances(regions, max_age)
238 | 
239 |     for region, instances in zombies.items():
240 |         if len(instances) == 0:
241 |             print(f"Found NO zombie instances in {region}, skipping...")
242 |             continue
243 | 
244 |         print(f"Found {len(instances)} zombie packer instances in {region}, now terminating...")
245 |         ec2 = boto3.client('ec2', region_name=region)
246 | 
247 |         instance_ids = []
248 |         for instance in instances:
249 |             instance_ids.append(instance['instance_id'])
250 | 
251 |         try:
252 |             response = ec2.terminate_instances(
253 |                 InstanceIds=instance_ids
254 |             )
255 |             print("Successfully terminated instances")
256 |         except Exception as e:
257 |             print(f"ERROR: Unable to terminate some or all instances: {e}")
258 | 
259 |     print(f"Scanning {len(regions)} AWS regions for zombie packer keys...")
260 | 
261 |     zombies = get_zombie_packer_keys(regions)
262 |     for region, keynames in zombies.items():
263 |         if len(keynames) == 0:
264 |             print(f"Found NO zombie keys in {region}, skipping...")
265 |             continue
266 | 
267 |         print(f"Found {len(keynames)} zombie packer keys in {region}, now deleting...")
268 |         ec2 = boto3.client('ec2', region_name=region)
269 | 
270 |         for keyname in keynames:
271 |             try:
272 |                 print(f"Deleting key {keyname}")
273 |                 response = ec2.delete_key_pair(
274 |                     KeyName=keyname
275 |                 )
276 |                 print("Deleted")
277 |             except Exception as e:
278 |                 print(f"ERROR: Unable to delete key {keyname}: {e}")
279 | 
280 | 
281 |     print(f"Scanning {len(regions)} AWS regions for zombie packer security groups...")
282 | 
283 |     zombies = get_zombie_packer_security_groups(regions)
284 |     for region, security_groups in zombies.items():
285 |         if len(security_groups) == 0:
286 |             print(f"Found NO zombie security groups in {region}, skipping...")
287 |             continue
288 | 
289 |         print(f"Found {len(security_groups)} zombie security groups in {region}, now deleting...")
290 |         ec2 = boto3.client('ec2', region_name=region)
291 | 
292 |         for security_group in security_groups:
293 |             try:
294 |                 print(f"Deleting security group {security_group}")
295 |                 response = ec2.delete_security_group(
296 |                     GroupName=security_group
297 |                 )
298 |                 print("Deleted")
299 |             except Exception as e:
300 |                 print(f"ERROR: Unable to delete security group {security_group}: {e}")
301 | 
302 | # References:
303 | #   https://unbiased-coder.com/detect-aws-env-python-nodejs/
304 | #   https://docs.aws.amazon.com/lambda/latest/dg/configuration-envvars.html
305 | def is_aws_env():
306 |     return os.environ.get('AWS_LAMBDA_FUNCTION_NAME') or os.environ.get('AWS_EXECUTION_ENV')
307 | 
308 | if not is_aws_env():
309 |     lambda_handler({}, {})
310 | 
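# A hedged sketch (not part of the original script) of scheduling this cleanup
# daily with CloudWatch/EventBridge once it has been deployed to Lambda; the
# function ARN below is a made-up placeholder:
#
#   events = boto3.client('events', region_name='us-east-1')
#   events.put_rule(Name='cleanup-packer-daily', ScheduleExpression='rate(1 day)')
#   events.put_targets(
#       Rule='cleanup-packer-daily',
#       Targets=[{'Id': '1', 'Arn': 'arn:aws:lambda:us-east-1:123456789012:function:cleanup-packer-aws-resources'}]
#   )
#   # ...plus a lambda:InvokeFunction permission for events.amazonaws.com on the function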
--------------------------------------------------------------------------------
/ec2-metadata/README.md:
--------------------------------------------------------------------------------
1 | # EC2 Metadata - bash
2 | 
3 | ## Introduction:
4 | ec2-metadata - A simple bash script that uses curl to query the EC2 instance metadata from within a running EC2 instance.
5 | 
6 | This helper was released a long time ago. It is NOT written by me, but I use it so often that I keep it in my toolkit. The original is from...
7 | https://aws.amazon.com/code/1825
8 | and
9 | http://s3.amazonaws.com/ec2metadata/ec2-metadata
10 | 
11 | ## Installation:
12 | This script never changes and is installed on EVERY AWS-based instance I manage, typically via a packer/ansible/chef/puppet script that auto-downloads it from "http://s3.amazonaws.com/ec2metadata/ec2-metadata"
13 | 
14 | 
15 | ### Installation Example:
16 | 
17 | ```
18 | curl http://s3.amazonaws.com/ec2metadata/ec2-metadata > /usr/local/bin/ec2-metadata && chmod +x /usr/local/bin/ec2-metadata
19 | ```
20 | or from here, in case that disappears...
21 | ```
22 | curl https://raw.githubusercontent.com/DevOps-Nirvana/aws-missing-tools/master/ec2-metadata/ec2-metadata > /usr/local/bin/ec2-metadata && chmod +x /usr/local/bin/ec2-metadata
23 | ```
24 | 
25 | ## Directions For Use:
26 | 
27 | ```
28 | ec2-metadata
29 | ```
30 | 
--------------------------------------------------------------------------------
/ec2-metadata/ec2-metadata:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #
3 | #########################################################################
4 | #This software code is made available "AS IS" without warranties of any #
5 | #kind. You may copy, display, modify and redistribute the software      #
6 | #code either by itself or as incorporated into your code; provided that #
7 | #you do not remove any proprietary notices. Your use of this software   #
8 | #code is at your own risk and you waive any claim against Amazon        #
9 | #Digital Services, Inc. or its affiliates with respect to your use of   #
10 | #this software code. (c) 2006-2007 Amazon Digital Services, Inc. or its #
11 | #affiliates.                                                            #
12 | #########################################################################
13 | 
14 | function print_help()
15 | {
16 | 	echo "ec2-metadata v0.1.1
17 | Use to retrieve EC2 instance metadata from within a running EC2 instance.
18 | e.g. to retrieve instance id: ec2-metadata -i
19 |      to retrieve ami id: ec2-metadata -a
20 |      to get help: ec2-metadata --help
21 | For more information on Amazon EC2 instance meta-data, refer to the documentation at
22 | http://docs.amazonwebservices.com/AWSEC2/2008-05-05/DeveloperGuide/AESDG-chapter-instancedata.html
23 | 
24 | Usage: ec2-metadata