├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.txt
├── NOTICE.txt
├── README.md
├── deployment
│   ├── build-s3-dist.sh
│   ├── genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template
│   └── run-unit-tests.sh
└── source
    ├── GenomicsLearningCode
    │   ├── awscli_test.sh
    │   ├── code_cfn.yml
    │   ├── copyresources_buildspec.yml
    │   ├── resources
    │   │   ├── notebooks
    │   │   │   ├── variant_classifier-autopilot.ipynb
    │   │   │   └── variant_predictor.ipynb
    │   │   └── scripts
    │   │       └── process_clinvar.py
    │   └── setup
    │       ├── crhelper-2.0.6.dist-info
    │       │   ├── INSTALLER
    │       │   ├── LICENSE
    │       │   ├── METADATA
    │       │   ├── NOTICE
    │       │   ├── RECORD
    │       │   ├── WHEEL
    │       │   └── top_level.txt
    │       ├── crhelper
    │       │   ├── __init__.py
    │       │   ├── __pycache__
    │       │   │   ├── __init__.cpython-38.pyc
    │       │   │   ├── log_helper.cpython-38.pyc
    │       │   │   ├── resource_helper.cpython-38.pyc
    │       │   │   └── utils.cpython-38.pyc
    │       │   ├── log_helper.py
    │       │   ├── resource_helper.py
    │       │   └── utils.py
    │       ├── lambda.py
    │       ├── requirements.txt
    │       └── tests
    │           ├── __init__.py
    │           ├── __pycache__
    │           │   ├── __init__.cpython-38.pyc
    │           │   ├── test_log_helper.cpython-38.pyc
    │           │   ├── test_resource_helper.cpython-38.pyc
    │           │   └── test_utils.cpython-38.pyc
    │           ├── test_log_helper.py
    │           ├── test_resource_helper.py
    │           ├── test_utils.py
    │           └── unit
    │               ├── __init__.py
    │               └── __pycache__
    │                   └── __init__.cpython-38.pyc
    ├── GenomicsLearningPipe
    │   └── pipe_cfn.yml
    ├── GenomicsLearningZone
    │   └── zone_cfn.yml
    ├── setup.sh
    ├── setup_cfn.yml
    └── teardown.sh

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | # Changelog
2 | All notable changes to this project will be documented in this file.
3 |
4 | The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
5 | and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6 |
7 | ## [1.0.0] - 2020-08-03
8 | ### Added
9 | - Initial public release
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing Guidelines
2 |
3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
4 | documentation, we greatly value feedback and contributions from our community.
5 |
6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
7 | information to effectively respond to your bug report or contribution.
8 |
9 |
10 | ## Reporting Bugs/Feature Requests
11 |
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 |
14 | When filing an issue, please check [existing open](https://github.com/awslabs/genomics-learning/issues), or [recently closed](https://github.com/awslabs/genomics-learning/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aclosed%20), issues to make sure somebody else hasn't already
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 |
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 |
22 |
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 |
26 | 1. You are working against the latest source on the *master* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 |
30 | To send us a pull request, please:
31 |
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 |
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 |
42 |
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. As our projects use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/awslabs/genomics-learning/labels/help%20wanted) issues is a great place to start.
45 |
46 |
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 |
52 |
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 |
56 |
57 | ## Licensing
58 |
59 | See the [LICENSE](https://github.com/awslabs/genomics-learning/blob/master/LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 |
61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
62 |
--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
1 |
2 |                                  Apache License
3 |                            Version 2.0, January 2004
4 |                         http://www.apache.org/licenses/
5 |
6 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 |
8 |    1. Definitions.
9 |
10 |       "License" shall mean the terms and conditions for use, reproduction,
11 |       and distribution as defined by Sections 1 through 9 of this document.
12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 
135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. -------------------------------------------------------------------------------- /NOTICE.txt: -------------------------------------------------------------------------------- 1 | Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker 2 | Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 3 | Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except 4 | in compliance with the License. A copy of the License is located at http://www.apache.org/licenses/ 5 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, 6 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the 7 | specific language governing permissions and limitations under the License. 
8 | 9 | ********************** 10 | THIRD PARTY COMPONENTS 11 | ********************** 12 | This software includes third party software subject to the following copyrights: 13 | 14 | AWS SDK under the Apache License Version 2.0 15 | AWS Custom Resource Helper under the Apache License Version 2.0 16 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Deprecation Notice 2 | 3 | This AWS Solution has been archived and is no longer maintained by AWS. A new version of the solution is here: [Guidance for Multi-Omics and Multi-Modal Data Integration and Analysis on AWS](https://aws.amazon.com/solutions/guidance/multi-omics-and-multi-modal-data-integration-and-analysis/). To discover other solutions, please visit the [AWS Solutions Library](https://aws.amazon.com/solutions/). 4 | 5 | # Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker 6 | 7 | 8 | 9 | The Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker solution creates a scalable environment in AWS to develop machine learning models using genomics data, generate predictions, and evaluate model performance. This solution demonstrates how to 1) automate the preparation of a genomics machine learning training dataset, 2) develop genomics machine learning model training and deployment pipelines and, 3) generate predictions and evaluate model performance using test data. 10 | 11 | ## Standard deployment 12 | 13 | To deploy this solution in your account use the "Launch in the AWS Console" button found on the [solution landing page](https://aws.amazon.com/solutions/implementations/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/?did=sl_card&trk=sl_card). 14 | 15 | We recommend deploying the solution this way for most use cases. 16 | 17 | ## Customized deployment 18 | 19 | A fully customized solution can be deployed for the following use cases: 20 | 21 | * Modifying or adding additional resources deployed during installation 22 | * Modifying the "Landing Zone" of the solution - e.g. adding additional artifacts or customizing the "Pipe" CodePipeline 23 | 24 | Fully customized solutions need to be self-hosted in your own AWS account, and you will be responsible for any costs incurred in doing so. 25 | 26 | To deploy and self-host a fully customized solution use the instructions below. 27 | 28 | _Note_: All commands assume a `bash` shell. 29 | 30 | ### Customize 31 | 32 | Clone the repository, and make desired changes 33 | 34 | #### File Structure 35 | 36 | ``` 37 | . 
38 | ├── CHANGELOG.md
39 | ├── CODE_OF_CONDUCT.md
40 | ├── CONTRIBUTING.md
41 | ├── LICENSE.txt
42 | ├── NOTICE.txt
43 | ├── README.md
44 | ├── buildspec.yml
45 | ├── deploy.sh
46 | ├── deployment
47 | │   ├── build-open-source-dist.sh
48 | │   ├── build-s3-dist.sh
49 | │   └── run-unit-tests.sh
50 | └── source
51 |     ├── GenomicsLearningCode
52 |     │   ├── awscli_test.sh
53 |     │   ├── code_cfn.yml
54 |     │   ├── copyresources_buildspec.yml
55 |     │   ├── resources
56 |     │   │   ├── notebooks
57 |     │   │   │   ├── variant_classifier-autopilot.ipynb
58 |     │   │   │   └── variant_predictor.ipynb
59 |     │   │   └── scripts
60 |     │   │       └── process_clinvar.py
61 |     │   └── setup
62 |     │       ├── lambda.py
63 |     │       └── requirements.txt
64 |     ├── GenomicsLearningPipe
65 |     │   └── pipe_cfn.yml
66 |     ├── GenomicsLearningZone
67 |     │   └── zone_cfn.yml
68 |     ├── setup.sh
69 |     ├── setup_cfn.yml
70 |     └── teardown.sh
71 |
72 | ```
73 |
74 | | Path | Description |
75 | | :- | :- |
76 | | deployment | Scripts for building and deploying a customized distributable |
77 | | deployment/build-s3-dist.sh | Shell script for packaging distribution assets |
78 | | deployment/run-unit-tests.sh | Shell script for executing unit tests |
79 | | source | Source code for the solution |
80 | | source/setup_cfn.yml | CloudFormation template used to install the solution |
81 | | source/GenomicsLearningZone/ | Source code for the solution landing zone - location for common assets and artifacts used by the solution |
82 | | source/GenomicsLearningPipe/ | Source code for the solution deployment pipeline - the CI/CD pipeline that builds and deploys the solution codebase |
83 | | source/GenomicsLearningCode/ | Source code for the solution codebase - source code for the training job and ML notebooks |
84 |
85 | ### Run unit tests
86 |
87 | ```bash
88 | cd ./deployment
89 | chmod +x ./run-unit-tests.sh
90 | ./run-unit-tests.sh
91 | ```
92 |
93 | ### Build and deploy
94 |
95 | #### Create deployment buckets
96 |
97 | The solution requires two buckets for deployment (a scripted sketch for creating them follows the list):
98 |
99 | 1. `<bucket-name>` for the solution's primary CloudFormation template
100 | 2. `<bucket-name>-<region>` for additional artifacts and assets that the solution requires - these are stored regionally to reduce latency during installation and avoid inter-regional transfer costs
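If you prefer to script bucket creation, a minimal `boto3` sketch is shown below. The bucket name and region are illustrative placeholders, not values the solution defines:

```python
# Sketch only: replace the placeholder bucket name and region with your own values.
import boto3

bucket_name = "my-solution-dist"   # placeholder for <bucket-name>
region = "us-west-2"               # placeholder for your deployment region

s3 = boto3.client("s3", region_name=region)

for name in (bucket_name, f"{bucket_name}-{region}"):
    if region == "us-east-1":
        # us-east-1 rejects an explicit LocationConstraint
        s3.create_bucket(Bucket=name)
    else:
        s3.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    print(f"created s3://{name}")
```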
101 |
102 | #### Configure and build the distributable
103 |
104 | ```bash
105 | export DIST_OUTPUT_BUCKET=<bucket-name>
106 | export SOLUTION_NAME=<solution-name>
107 | export VERSION=<version>
108 |
109 | chmod +x ./build-s3-dist.sh
110 | ./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION
111 | ```
112 |
113 | #### Deploy the distributable
114 |
115 | _Note:_ you must have the AWS Command Line Interface (CLI) installed for this step. Learn more about the AWS CLI [here](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html).
116 |
117 | ```bash
118 | cd ./deployment
119 |
120 | # deploy global assets
121 | # this only needs to be done once
122 | aws s3 cp \
123 |   ./global-s3-assets/ s3://<bucket-name>/$SOLUTION_NAME/$VERSION \
124 |   --recursive \
125 |   --acl bucket-owner-full-control
126 |
127 | # deploy regional assets
128 | # repeat this step for as many regions as needed
129 | aws s3 cp \
130 |   ./regional-s3-assets/ s3://<bucket-name>-<region>/$SOLUTION_NAME/$VERSION \
131 |   --recursive \
132 |   --acl bucket-owner-full-control
133 | ```
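After the copies complete, an optional `boto3` listing sketch can confirm the assets landed under the expected prefix; the bucket and prefix below are placeholders:

```python
# Optional verification sketch - substitute your real bucket name, solution name, and version.
import boto3

bucket = "my-solution-dist-us-west-2"   # placeholder for <bucket-name>-<region>
prefix = "my-solution-name/v1.0.0"      # placeholder for $SOLUTION_NAME/$VERSION

s3 = boto3.client("s3")
response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```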
134 |
135 | ### Install the customized solution
136 |
137 | The link to the primary CloudFormation template will look something like:
138 |
139 | ```text
140 | https://<bucket-name>.s3-<region>.amazonaws.com/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template
141 | ```
142 |
143 | Use this link to install the customized solution into your AWS account in a specific region using the [AWS CloudFormation Console](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/create/template).
144 |
145 | ---
146 |
147 | This solution collects anonymous operational metrics to help AWS improve the
148 | quality of features of the solution. For more information, including how to disable
149 | this capability, please see the [implementation guide](https://docs.aws.amazon.com/solutions/latest/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/appendix-f.html).
150 |
151 | ---
152 |
153 | Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
154 |
155 | Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
156 |
157 | http://www.apache.org/licenses/
158 |
159 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and limitations under the License.
--------------------------------------------------------------------------------
/deployment/build-s3-dist.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #
3 | # This assumes all of the OS-level configuration has been completed and git repo has already been cloned
4 | #
5 | # This script should be run from the repo's deployment directory
6 | # cd deployment
7 | # ./build-s3-dist.sh source-bucket-base-name solution-name version-code
8 | #
9 | # Parameters:
10 | #  - source-bucket-base-name: Name for the S3 bucket location where the template will source the Lambda
11 | #    code from. The template will append '-[region_name]' to this bucket name.
12 | #    For example: ./build-s3-dist.sh solutions my-solution v1.0.0
13 | #    The template will then expect the source code to be located in the solutions-[region_name] bucket
14 | #
15 | #  - solution-name: name of the solution for consistency
16 | #
17 | #  - version-code: version of the package
18 |
19 | # Check to see if input has been provided:
20 | if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
21 |     echo "Please provide the base source bucket name, trademark approved solution name and version where the lambda code will eventually reside."
22 |     echo "For example: ./build-s3-dist.sh solutions trademarked-solution-name v1.0.0"
23 |     exit 1
24 | fi
25 |
26 | # Get reference for all important folders
27 | template_dir="$PWD"
28 | template_dist_dir="$template_dir/global-s3-assets"
29 | build_dist_dir="$template_dir/regional-s3-assets"
30 | source_dir="$template_dir/../source"
31 |
32 | cp $source_dir/setup_cfn.yml $template_dir/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template
33 |
34 | echo "------------------------------------------------------------------------------"
35 | echo "[Init] Clean old dist"
36 | echo "------------------------------------------------------------------------------"
37 | echo "rm -rf $template_dist_dir"
38 | rm -rf $template_dist_dir
39 | echo "mkdir -p $template_dist_dir"
40 | mkdir -p $template_dist_dir
41 | echo "rm -rf $build_dist_dir"
42 | rm -rf $build_dist_dir
43 | echo "mkdir -p $build_dist_dir"
44 | mkdir -p $build_dist_dir
45 |
46 | echo "------------------------------------------------------------------------------"
47 | echo "[Packing] Templates"
48 | echo "------------------------------------------------------------------------------"
49 | echo "cp $template_dir/*.template $template_dist_dir/"
50 | cp $template_dir/*.template $template_dist_dir/
51 | echo "copy yaml templates and rename"
52 | cp $template_dir/*.yaml $template_dist_dir/
53 | cd $template_dist_dir
54 | # Rename all *.yaml to *.template
55 | for f in *.yaml; do
56 |     mv -- "$f" "${f%.yaml}.template"
57 | done
58 |
59 | cd ..
60 | echo "Updating code source bucket in template with $1"
61 | replace="s/%%BUCKET_NAME%%/$1/g"
62 | echo "sed -i '' -e $replace $template_dist_dir/*.template"
63 | sed -i '' -e $replace $template_dist_dir/*.template
64 | replace="s/%%SOLUTION_NAME%%/$2/g"
65 | echo "sed -i '' -e $replace $template_dist_dir/*.template"
66 | sed -i '' -e $replace $template_dist_dir/*.template
67 | replace="s/%%VERSION%%/$3/g"
68 | echo "sed -i '' -e $replace $template_dist_dir/*.template"
69 | sed -i '' -e $replace $template_dist_dir/*.template
70 |
71 | mkdir $build_dist_dir/annotation
72 | mkdir $build_dist_dir/annotation/clinvar/
73 |
74 | echo "------------------------------------------------------------------------------"
75 | echo "[Rebuild] Solution"
76 | echo "------------------------------------------------------------------------------"
77 |
78 | cd $source_dir
79 |
80 | bundle_dir="$source_dir/../bundle"
81 | mkdir -p $bundle_dir
82 |
83 | # create the lambda function deployment package for the solution setup
84 | cd $source_dir/GenomicsLearningCode/setup
85 | pip install -t . crhelper
86 | zip -r $bundle_dir/SolutionSetup.zip .
87 |
88 | # package the solution
89 | cd $source_dir
90 | zip -r $bundle_dir/Solution.zip .
91 |
92 | cd $bundle_dir
93 | cp Solution.zip $template_dist_dir/
94 | cp SolutionSetup.zip $template_dist_dir/
95 | cp Solution.zip $build_dist_dir/
96 | cp SolutionSetup.zip $build_dist_dir/
97 |
--------------------------------------------------------------------------------
/deployment/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: 2010-09-09
2 |
3 | Description: |
4 |   (SO0078) - The Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker solution creates a scalable environment
5 |   in AWS to develop machine learning models using genomics data, generate predictions, and evaluate model performance.
6 | This solution demonstrates how to 1) automate the preparation of a genomics machine learning training dataset, 7 | 2) develop genomics machine learning model training and deployment pipelines and, 8 | 3) generate predictions and evaluate model performance using test data. 9 | 10 | Mappings: 11 | Send: 12 | AnonymousUsage: 13 | Data: Yes 14 | SourceCode: 15 | General: 16 | S3Bucket: '%%BUCKET_NAME%%' 17 | KeyPrefix: '%%SOLUTION_NAME%%/%%VERSION%%' 18 | 19 | Parameters: 20 | Project: 21 | Type: String 22 | Description: > 23 | The project name for this solution. The project name will be used to prefix resources created by this solution. Project names should be unique to a project. 24 | AllowedPattern: "[a-zA-Z0-9-]{3,24}" 25 | ConstraintDescription: > 26 | Project name should be unique, 3-24 characters in length, and only have alphanumeric characters and hyphens ([a-zA-Z0-9-]{3,32}). 27 | Default: GenomicsLearning 28 | 29 | Resources: 30 | Setup: 31 | Type: Custom::Setup 32 | DependsOn: 33 | - CodeBuild 34 | Version: 1.0 35 | Properties: 36 | ServiceToken: !Sub ${SetupLambda.Arn} 37 | CodeBuildProjectName: !Sub ${CodeBuild} 38 | 39 | SetupLambda: 40 | Type: AWS::Lambda::Function 41 | DependsOn: 42 | - SetupLambdaRole 43 | Properties: 44 | Handler: lambda.handler 45 | Runtime: python3.8 46 | FunctionName: !Sub ${Project}Setup 47 | Code: 48 | S3Bucket: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]] 49 | S3Key: !Join ["", [!FindInMap ["SourceCode", "General", "KeyPrefix"], "/SolutionSetup.zip"]] 50 | Role: !Sub ${SetupLambdaRole.Arn} 51 | Timeout: 600 52 | Metadata: 53 | cfn_nag: 54 | rules_to_suppress: 55 | - id: W58 56 | reason: Bug in CfnNag. Lambda functions require permission to write CloudWatch Logs. Looking for PutLogEvent instead of PutLogEvents 57 | 58 | SetupLambdaRole: 59 | Type: AWS::IAM::Role 60 | DependsOn: 61 | - CodeBuild 62 | Properties: 63 | AssumeRolePolicyDocument: 64 | Version: 2012-10-17 65 | Statement: 66 | - Action: 67 | - sts:AssumeRole 68 | Effect: Allow 69 | Principal: 70 | Service: 71 | - lambda.amazonaws.com 72 | Path: / 73 | Policies: 74 | - PolicyName: LogsAccess 75 | PolicyDocument: 76 | Statement: 77 | - Effect: Allow 78 | Action: 79 | - logs:CreateLogGroup 80 | - logs:CreateLogStream 81 | - logs:PutLogEvents 82 | Resource: 83 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${Project}* 84 | - PolicyName: CodeBuildAccess 85 | PolicyDocument: 86 | Statement: 87 | - Effect: Allow 88 | Action: 89 | - codebuild:BatchGetProjects 90 | - codebuild:BatchGetBuilds 91 | - codebuild:StartBuild 92 | Resource: 93 | - !Sub ${CodeBuild.Arn} 94 | - PolicyName: EventsAccess 95 | PolicyDocument: 96 | Statement: 97 | - Effect: Allow 98 | Action: 99 | - events:DeleteRule 100 | - events:PutRule 101 | - events:PutTargets 102 | - events:RemoveTargets 103 | Resource: 104 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/Setup* 105 | - PolicyName: LambdaAccess 106 | PolicyDocument: 107 | Statement: 108 | - Effect: Allow 109 | Action: 110 | - lambda:AddPermission 111 | - lambda:RemovePermission 112 | Resource: 113 | - !Sub arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:${Project}* 114 | 115 | CodeBuildRole: 116 | Type: AWS::IAM::Role 117 | Properties: 118 | AssumeRolePolicyDocument: 119 | Version: 2012-10-17 120 | Statement: 121 | - Action: 122 | - sts:AssumeRole 123 | Effect: Allow 124 | Principal: 125 | Service: 126 | - codebuild.amazonaws.com 127 | Path: / 128 | Policies: 129 | - PolicyName: 
CloudFormationAccess 130 | PolicyDocument: 131 | Statement: 132 | - Action: 133 | - cloudformation:CreateStack 134 | - cloudformation:DescribeStacks 135 | - cloudformation:DescribeStackResource 136 | - cloudformation:UpdateStack 137 | - cloudformation:DeleteStack 138 | - cloudformation:UpdateTerminationProtection 139 | Effect: Allow 140 | Resource: !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Project}* 141 | - PolicyName: LogsAccess 142 | PolicyDocument: 143 | Statement: 144 | - Effect: Allow 145 | Action: 146 | - logs:CreateLogGroup 147 | - logs:CreateLogStream 148 | - logs:PutLogEvents 149 | Resource: 150 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/codebuild/${Project}* 151 | - PolicyName: IAMAccess 152 | PolicyDocument: 153 | Statement: 154 | - Effect: Allow 155 | Action: 156 | - iam:CreateRole 157 | - iam:DeleteRole 158 | - iam:PutRolePolicy 159 | - iam:DeleteRolePolicy 160 | - iam:AttachRolePolicy 161 | - iam:DetachRolePolicy 162 | - iam:UpdateAssumeRolePolicy 163 | - iam:PassRole 164 | - iam:GetRole 165 | - iam:GetInstanceProfile 166 | - iam:CreateInstanceProfile 167 | - iam:DeleteInstanceProfile 168 | - iam:AddRoleToInstanceProfile 169 | - iam:RemoveRoleFromInstanceProfile 170 | Resource: 171 | - !Sub arn:aws:iam::${AWS::AccountId}:role/${Project}* 172 | - !Sub arn:aws:iam::${AWS::AccountId}:instance-profile/${Project}* 173 | - PolicyName: CodeBuildAccess 174 | PolicyDocument: 175 | Statement: 176 | - Effect: Allow 177 | Action: 178 | - codebuild:CreateProject 179 | - codebuild:UpdateProject 180 | - codebuild:ListProjects 181 | - codebuild:BatchGetProjects 182 | - codebuild:DeleteProject 183 | Resource: 184 | - !Sub arn:aws:codebuild:${AWS::Region}:${AWS::AccountId}:project/${Project}* 185 | - PolicyName: CodePipelineAccess 186 | PolicyDocument: 187 | Statement: 188 | - Effect: Allow 189 | Action: 190 | - codepipeline:CreatePipeline 191 | - codepipeline:GetPipeline 192 | - codepipeline:UpdatePipeline 193 | - codepipeline:DeletePipeline 194 | - codepipeline:GetPipelineState 195 | - codepipeline:ListPipelineExecutions 196 | Resource: 197 | - !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${Project}* 198 | - PolicyName: CodeCommitAccess 199 | PolicyDocument: 200 | Statement: 201 | - Effect: Allow 202 | Action: 203 | - codecommit:CreateBranch 204 | - codecommit:CreateRepository 205 | - codecommit:GetRepository 206 | - codecommit:DeleteRepository 207 | - codecommit:CreateCommit 208 | - codecommit:GitPush 209 | - codecommit:GitPull 210 | - codecommit:DeleteBranch 211 | Resource: 212 | - !Sub arn:aws:codecommit:${AWS::Region}:${AWS::AccountId}:${Project}* 213 | - Effect: Allow 214 | Action: 215 | - codecommit:ListRepositories 216 | Resource: '*' 217 | - PolicyName: EventsAccess 218 | PolicyDocument: 219 | Statement: 220 | - Effect: Allow 221 | Action: 222 | - events:DescribeRule 223 | - events:PutRule 224 | - events:DeleteRule 225 | - events:PutTargets 226 | - events:RemoveTargets 227 | Resource: 228 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/* 229 | - PolicyName: GlueAccess 230 | PolicyDocument: 231 | Statement: 232 | - Effect: Allow 233 | Action: 234 | - glue:StartJob 235 | - glue:GetJob 236 | Resource: '*' 237 | - PolicyName: S3Access 238 | PolicyDocument: 239 | Statement: 240 | - Effect: Allow 241 | Action: 242 | - s3:GetObject 243 | Resource: 244 | !Join 245 | - '' 246 | - - 'arn:aws:s3:::' 247 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 248 | - '/*' 249 | - Effect: Allow 250 | Action: 251 | - 
s3:GetObject 252 | Resource: 253 | !Join 254 | - '' 255 | - - 'arn:aws:s3:::' 256 | - !Join 257 | - '-' 258 | - - !FindInMap ["SourceCode", "General", "S3Bucket"] 259 | - Ref: "AWS::Region" 260 | - '/' 261 | - !FindInMap ["SourceCode", "General", "KeyPrefix"] 262 | - '/*' 263 | - Effect: Allow 264 | Action: 265 | - s3:ListBucket 266 | Resource: 267 | !Join 268 | - '' 269 | - - 'arn:aws:s3:::' 270 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 271 | 272 | - Effect: Allow 273 | Action: 274 | - s3:PutObjectAcl 275 | - s3:GetObject 276 | - s3:PutObject 277 | - s3:DeleteObject 278 | - s3:ListBucket 279 | - s3:CreateBucket 280 | - s3:DeleteBucket 281 | - s3:PutEncryptionConfiguration 282 | - s3:PutBucketPublicAccessBlock 283 | - s3:PutBucketLogging 284 | - s3:PutBucketAcl 285 | Resource: 286 | - arn:aws:s3:::*pipe* 287 | - arn:aws:s3:::*pipe*/* 288 | - Effect: Allow 289 | Action: 290 | - s3:ListBucket 291 | Resource: 292 | !Join 293 | - '' 294 | - - 'arn:aws:s3:::' 295 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 296 | - Effect: Allow 297 | Action: 298 | - s3:CreateBucket 299 | - s3:DeleteBucket 300 | - s3:ListBucket 301 | - s3:PutEncryptionConfiguration 302 | - s3:PutBucketPublicAccessBlock 303 | - s3:PutBucketLogging 304 | - s3:PutBucketAcl 305 | - s3:PutObject 306 | - s3:PutObjectAcl 307 | Resource: 308 | - arn:aws:s3:::*pipe* 309 | - arn:aws:s3:::*pipe*/* 310 | Metadata: 311 | cfn_nag: 312 | rules_to_suppress: 313 | - id: W11 314 | reason: Star required for codecommit:ListRepositories and Glue actions. 315 | 316 | CodeBuild: 317 | Type: AWS::CodeBuild::Project 318 | Properties: 319 | Name: !Sub ${Project}Setup 320 | Artifacts: 321 | Type: NO_ARTIFACTS 322 | Source: 323 | Type: NO_SOURCE 324 | BuildSpec: !Sub | 325 | version: 0.2 326 | phases: 327 | install: 328 | commands: 329 | - git config --global user.name automated_user 330 | - git config --global user.email automated_email 331 | - git config --global credential.helper '!aws codecommit credential-helper $@' 332 | - git config --global credential.UseHttpPath true 333 | - aws s3 cp s3://$ARTIFACT_BUCKET/$ARTIFACT_KEY_PREFIX/Solution.zip . 334 | - unzip Solution.zip 335 | - ./$SOLUTION_ACTION.sh 336 | Environment: 337 | ComputeType: BUILD_GENERAL1_SMALL 338 | EnvironmentVariables: 339 | - Name: SOLUTION_ACTION 340 | Value: setup 341 | - Name: PROJECT_NAME 342 | Value: !Ref Project 343 | - Name: ARTIFACT_BUCKET 344 | Value: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]] 345 | - Name: ARTIFACT_KEY_PREFIX 346 | Value: !FindInMap ["SourceCode", "General", "KeyPrefix"] 347 | Image: aws/codebuild/standard:3.0 348 | Type: LINUX_CONTAINER 349 | ServiceRole: !Sub ${CodeBuildRole} 350 | TimeoutInMinutes: 30 351 | Metadata: 352 | cfn_nag: 353 | rules_to_suppress: 354 | - id: W32 355 | reason: Customer can enable encryption if desired. 
356 | -------------------------------------------------------------------------------- /deployment/run-unit-tests.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # This assumes all of the OS-level configuration has been completed and git repo has already been cloned 4 | # 5 | # This script should be run from the repo's deployment directory 6 | # cd deployment 7 | # ./run-unit-tests.sh 8 | # 9 | 10 | # Get reference for all important folders 11 | template_dir="$PWD" 12 | source_dir="$template_dir/../source" 13 | 14 | echo "------------------------------------------------------------------------------" 15 | echo "[Init] Clean old dist and node_modules folders" 16 | echo "------------------------------------------------------------------------------" 17 | 18 | echo "------------------------------------------------------------------------------" 19 | echo "[Test] Services - Example Function" 20 | echo "------------------------------------------------------------------------------" 21 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/awscli_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | 3 | export AWS_DEFAULT_OUTPUT=text 4 | 5 | project_name=${PROJECT_NAME:-GenomicsLearning} 6 | 7 | resource_prefix=${project_name} 8 | 9 | resource_prefix_lowercase=$(echo ${resource_prefix} | tr '[:upper:]' '[:lower:]') 10 | 11 | process_data_job="${resource_prefix_lowercase}-create-trainingset" 12 | 13 | aws glue get-job --job-name ${process_data_job} 14 | 15 | printf "Test:Job exists\n" -------------------------------------------------------------------------------- /source/GenomicsLearningCode/code_cfn.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | 3 | Description: GenomicsLearningCode 4 | 5 | Parameters: 6 | ResourcePrefix: 7 | Type: String 8 | Default: GenomicsLearning 9 | ResourcePrefixLowercase: 10 | Type: String 11 | Default: genomicslearning 12 | ResourcesBucket: 13 | Type: String 14 | DataLakeBucket: 15 | Type: String 16 | 17 | Resources: 18 | 19 | JobRole: 20 | Type: AWS::IAM::Role 21 | Properties: 22 | AssumeRolePolicyDocument: 23 | Version: 2012-10-17 24 | Statement: 25 | - Effect: Allow 26 | Principal: 27 | Service: 28 | - glue.amazonaws.com 29 | Action: 30 | - sts:AssumeRole 31 | Path: / 32 | ManagedPolicyArns: 33 | - arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole 34 | Policies: 35 | - PolicyName: s3_access 36 | PolicyDocument: 37 | Version: 2012-10-17 38 | Statement: 39 | - Effect: Allow 40 | Action: 41 | - s3:GetObject 42 | - s3:ListBucket 43 | Resource: 44 | - !Sub arn:aws:s3:::${ResourcesBucket} 45 | - !Sub arn:aws:s3:::${ResourcesBucket}/* 46 | - Effect: Allow 47 | Action: 48 | - s3:PutObject 49 | - s3:GetObject 50 | - s3:ListBucket 51 | - s3:DeleteObject 52 | Resource: 53 | - !Sub arn:aws:s3:::${DataLakeBucket} 54 | - !Sub arn:aws:s3:::${DataLakeBucket}/* 55 | 56 | RunbookRole: 57 | Type: AWS::IAM::Role 58 | Properties: 59 | AssumeRolePolicyDocument: 60 | Version: 2012-10-17 61 | Statement: 62 | - Effect: Allow 63 | Principal: 64 | Service: 65 | - sagemaker.amazonaws.com 66 | Action: 67 | - sts:AssumeRole 68 | Path: / 69 | Policies: 70 | - PolicyName: logs_access 71 | PolicyDocument: 72 | Version: 2012-10-17 73 | Statement: 74 | - Effect: Allow 75 | Action: 76 | - logs:CreateLogStream 77 | - 
logs:DescribeLogStreams 78 | - logs:CreateLogGroup 79 | - logs:PutLogEvents 80 | - logs:GetLogEvents 81 | Resource: 82 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/sagemaker/* 83 | - PolicyName: s3_access 84 | PolicyDocument: 85 | Version: 2012-10-17 86 | Statement: 87 | - Effect: Allow 88 | Action: 89 | - s3:CreateBucket 90 | - s3:ListBucket 91 | Resource: 92 | - !Sub arn:aws:s3:::sagemaker-${AWS::Region}-${AWS::AccountId} 93 | - Effect: Allow 94 | Action: 95 | - iam:GetRole 96 | - sagemaker:DescribeNotebookInstance 97 | Resource: '*' 98 | - Effect: Allow 99 | Action: 100 | - s3:ListBucket 101 | - s3:GetBucketLocation 102 | Resource: 103 | - !Sub arn:aws:s3:::${DataLakeBucket} 104 | - !Sub arn:aws:s3:::${ResourcesBucket} 105 | - Effect: Allow 106 | Action: 107 | - s3:GetObject 108 | - s3:PutObject 109 | - s3:DeleteObject 110 | Resource: 111 | - !Sub arn:aws:s3:::${DataLakeBucket}/* 112 | - !Sub arn:aws:s3:::sagemaker-${AWS::Region}-${AWS::AccountId}/* 113 | - Effect: Allow 114 | Action: 115 | - s3:GetObject 116 | Resource: 117 | - !Sub arn:aws:s3:::${ResourcesBucket}/* 118 | - PolicyName: glue_access 119 | PolicyDocument: 120 | Version: 2012-10-17 121 | Statement: 122 | - Effect: Allow 123 | Action: 124 | - glue:StartJobRun 125 | - glue:StopJobRun 126 | Resource: 127 | - !Sub arn:aws:glue:${AWS::Region}:${AWS::AccountId}:job/${ResourcePrefix}* 128 | - PolicyName: cfn_access 129 | PolicyDocument: 130 | Version: 2012-10-17 131 | Statement: 132 | - Effect: Allow 133 | Action: 134 | - cloudformation:DescribeStacks 135 | Resource: 136 | - !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${ResourcePrefix}Pipe/* 137 | - !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${ResourcePrefix}* 138 | - PolicyName: sagemaker_access 139 | PolicyDocument: 140 | Version: 2012-10-17 141 | Statement: 142 | - Effect: Allow 143 | Action: 144 | - iam:CreateServiceLinkedRole 145 | Resource: 146 | - !Sub arn:aws:iam::*:role/aws-service-role/sagemaker.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_SageMakerEndpoint 147 | Condition: 148 | StringLike: 149 | iam:AWSServiceName: sagemaker.application-autoscaling.amazonaws.com 150 | - Effect: Allow 151 | Action: 152 | - iam:CreateServiceLinkedRole 153 | Resource: '*' 154 | Condition: 155 | StringEquals: 156 | iam:AWSServiceName: robomaker.amazonaws.com 157 | - Effect: Allow 158 | Action: 159 | - iam:PassRole 160 | Resource: 161 | - !Sub arn:aws:iam::*:role/* 162 | Condition: 163 | StringEquals: 164 | iam:PassedToService: 165 | - sagemaker.amazonaws.com 166 | - glue.amazonaws.com 167 | - robomaker.amazonaws.com 168 | - states.amazonaws.com 169 | - Effect: Allow 170 | Action: 171 | - sagemaker:ListEndpoints 172 | Resource: '*' 173 | - Effect: Allow 174 | Action: 175 | - sagemaker:DescribeTrainingJob 176 | - sagemaker:DescribeTransformJob 177 | - sagemaker:CreateTrainingJob 178 | - sagemaker:CreateAutoMLJob 179 | - sagemaker:CreateTransformJob 180 | - sagemaker:StopTransformJob 181 | - sagemaker:CreateHyperParameterTuningJob 182 | - sagemaker:StopHyperParameterTuningJob 183 | - sagemaker:DescribeHyperParameterTuningJob 184 | - sagemaker:DescribeEndpoint 185 | - sagemaker:DescribeEndpointConfig 186 | - sagemaker:CreateEndpointConfig 187 | - sagemaker:CreateEndpoint 188 | - sagemaker:InvokeEndpoint 189 | - sagemaker:ListTrainingJobsForHyperParameterTuningJob 190 | - sagemaker:CreateModel 191 | - sagemaker:ListTags 192 | - logs:GetLogEvents 193 | - sagemaker:DeleteModel 194 | - 
sagemaker:StopAutoMLJob 195 | - sagemaker:ListAutoMLJob 196 | - sagemaker:DescribeAutoMLJob 197 | Resource: 198 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:training-job* 199 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:automl-job* 200 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:transform-job* 201 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:model* 202 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:endpoint* 203 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:hyper-parameter-tuning-job* 204 | - Effect: Allow 205 | Action: 206 | - iam:ListRoles 207 | Resource: '*' 208 | Metadata: 209 | cfn_nag: 210 | rules_to_suppress: 211 | - id: W11 212 | reason: GetRole, DescribeSageMakerRole, ListRoles and CreateServiceLinkedRole require star. 213 | 214 | ProcessClinvarDataJob: 215 | Type: AWS::Glue::Job 216 | Properties: 217 | Command: 218 | Name: pythonshell 219 | PythonVersion: '3' 220 | ScriptLocation: !Sub s3://${ResourcesBucket}/scripts/process_clinvar.py 221 | DefaultArguments: 222 | --job-bookmark-option: job-bookmark-disable 223 | --input_bucket: !Sub ${DataLakeBucket} 224 | --clinvar_input_key: !Sub annotation/clinvar/clinvar.vcf.gz 225 | --clinvar_annotated_input_key: !Sub annotation/clinvar/clinvar.annotated.vcf.gz 226 | --output_bucket: !Sub ${DataLakeBucket} 227 | --output_key: !Sub annotation/clinvar/conflicting/clinvar_conflicting.csv 228 | MaxCapacity: 1 229 | ExecutionProperty: 230 | MaxConcurrentRuns: 2 231 | MaxRetries: 0 232 | Name: !Sub ${ResourcePrefixLowercase}-create-trainingset 233 | Role: !Ref JobRole 234 | 235 | RunbookLifecycle: 236 | Type: AWS::SageMaker::NotebookInstanceLifecycleConfig 237 | Properties: 238 | NotebookInstanceLifecycleConfigName: !Sub ${ResourcePrefix}Runbook 239 | OnStart: 240 | - Content: 241 | Fn::Base64: !Sub | 242 | #!/bin/bash 243 | cd /home/ec2-user/SageMaker 244 | set -e 245 | pip install awscli 246 | # download notebooks from S3 247 | aws s3 cp s3://${ResourcesBucket}/notebooks/variant_classifier-autopilot.ipynb ./ --acl public-read-write 248 | chmod 666 variant_classifier-autopilot.ipynb 249 | aws s3 cp s3://${ResourcesBucket}/notebooks/variant_predictor.ipynb ./ --acl public-read-write 250 | chmod 666 variant_predictor.ipynb 251 | echo "export RESOURCE_PREFIX='${ResourcePrefix}'" > /home/ec2-user/anaconda3/envs/tensorflow_p36/etc/conda/activate.d/env_vars.sh 252 | 253 | Runbook: 254 | Type: AWS::SageMaker::NotebookInstance 255 | Properties: 256 | NotebookInstanceName: !Sub ${ResourcePrefix}Runbook 257 | InstanceType: ml.t2.medium 258 | LifecycleConfigName: !GetAtt RunbookLifecycle.NotebookInstanceLifecycleConfigName 259 | RoleArn: !GetAtt RunbookRole.Arn -------------------------------------------------------------------------------- /source/GenomicsLearningCode/copyresources_buildspec.yml: -------------------------------------------------------------------------------- 1 | version: 0.2 2 | phases: 3 | install: 4 | runtime-versions: 5 | python: 3.8 6 | commands: 7 | - apt-get update -y 8 | build: 9 | commands: 10 | - aws s3 sync ./resources s3://${RESOURCES_BUCKET} --size-only 11 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/resources/notebooks/variant_classifier-autopilot.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Create Data Exploration and Candidate 
Notebooks with SageMaker Autopilot"
8 |    ]
9 |   },
10 |   {
11 |    "cell_type": "markdown",
12 |    "metadata": {},
13 |    "source": [
14 |     "### Introduction\n",
15 |     "Amazon SageMaker Autopilot is an automated machine learning (commonly referred to as AutoML) solution for tabular datasets. You can use SageMaker Autopilot in different ways: on autopilot (hence the name) or with human guidance, without code through SageMaker Studio, or using the AWS SDKs.\n",
16 |     "\n",
17 |     "### Problem Definition\n",
18 |     "Reference: https://www.kaggle.com/kevinarvai/clinvar-conflicting\n",
19 |     "\n",
20 |     "[ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/) is a public resource containing annotations about human genetic variants. These variants are (usually manually) classified by clinical laboratories on a categorical spectrum ranging from benign, likely benign, uncertain significance, likely pathogenic, and pathogenic. Variants that have conflicting classifications (from laboratory to laboratory) can cause confusion when clinicians or researchers try to interpret whether the variant has an impact on the disease of a given patient.\n",
21 |     "The objective is to predict whether a ClinVar variant will have conflicting classifications. This is presented here as a binary classification problem, where each record in the dataset is a genetic variant.\n",
22 |     "\n",
23 |     "### Acknowledgements\n",
24 |     "Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Jang W, Karapetyan K, Katz K, Liu C, Maddipatla Z, Malheiro A, McDaniel K, Ovetsky M, Riley G, Zhou G, Holmes JB, Kattman BL, Maglott DR. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018 Jan 4. PubMed PMID: 29165669."
25 |    ]
26 |   },
27 |   {
28 |    "cell_type": "markdown",
29 |    "metadata": {},
30 |    "source": [
31 |     "### Setup\n",
32 |     "\n",
33 |     "Let's start by specifying:\n",
34 |     "\n",
35 |     "The region name and SageMaker session; the S3 bucket and prefix that you want to use for training and model data (these should be within the same region as the notebook instance, training, and hosting); and the IAM role ARN used to give training and hosting access to your data."
36 |    ]
37 |   },
38 |   {
39 |    "cell_type": "code",
40 |    "execution_count": null,
41 |    "metadata": {},
42 |    "outputs": [],
43 |    "source": [
44 |     "import sagemaker\n",
45 |     "import boto3\n",
46 |     "import os, jmespath\n",
47 |     "from sagemaker import get_execution_role\n",
48 |     "import pandas as pd\n",
49 |     "from time import gmtime, strftime, sleep\n",
50 |     "\n",
51 |     "region = boto3.Session().region_name\n",
52 |     "\n",
53 |     "session = sagemaker.Session()\n",
54 |     "bucket = session.default_bucket()\n",
55 |     "\n",
56 |     "prefix = 'sagemaker/autopilot-vc'\n",
57 |     "\n",
58 |     "role = get_execution_role()\n",
59 |     "\n",
60 |     "sm = boto3.Session().client(service_name='sagemaker',region_name=region)"
61 |    ]
62 |   },
63 |   {
64 |    "cell_type": "markdown",
65 |    "metadata": {},
66 |    "source": [
67 |     "### Get datalake bucket"
68 |    ]
69 |   },
70 |   {
71 |    "cell_type": "code",
72 |    "execution_count": null,
73 |    "metadata": {},
74 |    "outputs": [],
75 |    "source": [
76 |     "cfn = boto3.client('cloudformation')\n",
77 |     "\n",
78 |     "project_name = os.environ.get('RESOURCE_PREFIX')\n",
79 |     "resources = cfn.describe_stacks(StackName='{0}-Pipeline'.format(project_name))\n",
80 |     "query = 'Stacks[].Outputs[?OutputKey==`DataLakeBucket`].OutputValue'\n",
81 |     "data_lake_bucket = jmespath.search(query, resources)[0][0]\n",
82 |     "print(data_lake_bucket)"
83 |    ]
84 |   },
85 |   {
86 |    "cell_type": "markdown",
87 |    "metadata": {},
88 |    "source": [
89 |     "### Dataset\n",
90 |     "Let's load the raw data into a dataframe. The raw data is stored in S3 in the file clinvar_conflicting.csv. This file is downloaded from the following location: https://github.com/arvkevi/clinvar-kaggle/blob/master/clinvar_conflicting.csv"
91 |    ]
92 |   },
93 |   {
94 |    "cell_type": "code",
95 |    "execution_count": null,
96 |    "metadata": {},
97 |    "outputs": [],
98 |    "source": [
99 |     "# Load the raw data into a dataframe from S3\n",
100 |     "raw_data=pd.read_csv(\"s3://{0}/annotation/clinvar/conflicting/clinvar_conflicting.csv\".format(data_lake_bucket))\n",
101 |     "\n",
102 |     "# Take 80% of the data for training\n",
103 |     "train_data = raw_data.sample(frac=0.8,random_state=200)\n",
104 |     "\n",
105 |     "# Take the remaining 20% for testing\n",
106 |     "test_data = raw_data.drop(train_data.index)\n",
107 |     "\n",
108 |     "# Save the train and test data as CSV files and upload them to S3\n",
109 |     "train_file = 'train_data.csv';\n",
110 |     "train_data.to_csv(train_file, index=False, header=True)\n",
111 |     "train_data_s3_path = session.upload_data(path=train_file, key_prefix=prefix + \"/train\")\n",
112 |     "print('Train data uploaded to: ' + train_data_s3_path)\n",
113 |     "\n",
114 |     "test_file = 'test_data.csv';\n",
115 |     "test_data.to_csv(test_file, index=False, header=True)\n",
116 |     "test_data_s3_path = session.upload_data(path=test_file, key_prefix=prefix + \"/test\")\n",
117 |     "print('Test data uploaded to: ' + test_data_s3_path)\n",
118 |     "\n",
119 |     "train_data.head()\n"
120 |    ]
121 |   },
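  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: check the class balance\n",
    "Because this is a binary classification problem, it is worth a quick look at how balanced the `CLASS` label is before launching Autopilot. The next cell is an added sketch (it is not part of the original solution) and assumes the `raw_data` and `train_data` dataframes created in the previous cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Added sketch: inspect the distribution of the binary target.\n",
    "# Assumes raw_data and train_data were created in the cell above.\n",
    "print(raw_data['CLASS'].value_counts(normalize=True))\n",
    "print(train_data['CLASS'].value_counts(normalize=True))"
   ]
  },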
122 |   {
123 |    "cell_type": "markdown",
124 |    "metadata": {},
125 |    "source": [
126 |     "### Setting up the SageMaker Autopilot Job\n",
127 |     "After uploading the dataset to Amazon S3, you can invoke Autopilot to find the best ML pipeline to train a model on this dataset.\n",
128 |     "\n",
129 |     "The required inputs for invoking an Autopilot job are:\n",
130 |     "\n",
131 |     "* Amazon S3 location for input dataset and for all output artifacts\n",
132 |     "* Name of the column of the dataset you want to predict (y in this case)\n",
133 |     "* An IAM role\n",
134 |     "\n"
135 |    ]
136 |   },
137 |   {
"cell_type": "code", 139 | "execution_count": null, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "input_data_config = [{\n", 144 | " 'DataSource': {\n", 145 | " 'S3DataSource': {\n", 146 | " 'S3DataType': 'S3Prefix',\n", 147 | " 'S3Uri': 's3://{}/{}/train'.format(bucket,prefix)\n", 148 | " }\n", 149 | " },\n", 150 | " 'TargetAttributeName': 'CLASS'\n", 151 | " }\n", 152 | " ]\n", 153 | "\n", 154 | "output_data_config = {\n", 155 | " 'S3OutputPath': 's3://{}/{}/output'.format(bucket,prefix)\n", 156 | " }" 157 | ] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "metadata": {}, 162 | "source": [ 163 | "You can also specify the type of problem you want to solve with your dataset (Regression, MulticlassClassification, BinaryClassification). In case you are not sure, SageMaker Autopilot will infer the problem type based on statistics of the target column (the column you want to predict).\n", 164 | "\n", 165 | "You have the option to limit the running time of a SageMaker Autopilot job by providing either the maximum number of pipeline evaluations or candidates (one pipeline evaluation is called a Candidate because it generates a candidate model) or providing the total time allocated for the overall Autopilot job. Under default settings, this job takes about four hours to run. This varies between runs because of the nature of the exploratory process Autopilot uses to find optimal training parameters.\n", 166 | "For our model, we are going to just generate the Candidate Notebooks and explore it ourselves instead of running the complete default experiment. This is done by setting the flag \"GenerateCandidateDefinitionsOnly=True\"\n", 167 | "\n", 168 | "### Launching the SageMaker Autopilot Job\n", 169 | "You can now launch the Autopilot job by calling the create_auto_ml_job API.\n", 170 | "\n", 171 | "**NOTE: The name of the Autopilot job is important because it is used to create the names for all the resources created by Sagemaker like the model name and the endpoint name.**" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [ 180 | "timestamp_suffix = strftime('%d-%H-%M-%S', gmtime())\n", 181 | "\n", 182 | "auto_ml_job_name = 'automl-vc-' + timestamp_suffix\n", 183 | "print('AutoMLJobName: ' + auto_ml_job_name)\n", 184 | "\n", 185 | "sm.create_auto_ml_job(AutoMLJobName=auto_ml_job_name,\n", 186 | " InputDataConfig=input_data_config,\n", 187 | " OutputDataConfig=output_data_config,\n", 188 | " GenerateCandidateDefinitionsOnly=True,\n", 189 | " RoleArn=role)\n", 190 | "\n", 191 | "\n", 192 | "\n", 193 | "print ('JobStatus - Secondary Status')\n", 194 | "print('------------------------------')\n", 195 | "\n", 196 | "\n", 197 | "describe_response = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)\n", 198 | "print (describe_response['AutoMLJobStatus'] + \" - \" + describe_response['AutoMLJobSecondaryStatus'])\n", 199 | "job_run_status = describe_response['AutoMLJobStatus']\n", 200 | " \n" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": {}, 206 | "source": [ 207 | "We will now wait for Sagemaker autopilot to generate the candidate notebooks." 
208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": null, 213 | "metadata": {}, 214 | "outputs": [], 215 | "source": [ 216 | "while job_run_status not in ('Failed', 'Completed', 'Stopped'):\n", 217 | "    describe_response = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)\n", 218 | "    job_run_status = describe_response['AutoMLJobStatus']\n", 219 | "    \n", 220 | "    print(describe_response['AutoMLJobStatus'] + \" - \" + describe_response['AutoMLJobSecondaryStatus'])\n", 221 | "    sleep(30)\n", 222 | "\n", 223 | "\n", 224 | "\n", 225 | "candidate_nb = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['AutoMLJobArtifacts']['CandidateDefinitionNotebookLocation']\n", 226 | "\n", 227 | "data_nb = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['AutoMLJobArtifacts']['DataExplorationNotebookLocation']\n", 228 | "\n", 229 | "print(\"Data Exploration Notebook: \" + data_nb)\n", 230 | "print(\"------------------------------------------------------------------------------------------------\")\n", 231 | "print(\"Candidate Generation Notebook: \" + candidate_nb)\n" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": {}, 237 | "source": [ 238 | "### Downloading the Autopilot candidate notebooks\n", 239 | "Now that SageMaker Autopilot has analyzed our data and created the candidate notebooks, let's download and explore them." 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "!aws s3 cp $data_nb .\n", 249 | "!aws s3 cp $candidate_nb ." 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "metadata": {}, 255 | "source": [ 256 | "### Analyzing the candidate notebooks\n", 257 | "Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-notebook-output.html\n", 258 | "\n", 259 | "During the analysis phase of the AutoML job, two notebooks are created that describe the plan that Autopilot follows to generate candidate models. A candidate model consists of a (pipeline, algorithm) pair. First, there’s a data exploration notebook that describes what Autopilot learned about the data that you provided. Second, there’s a candidate generation notebook, which uses the information about the data to generate candidates.\n", 260 | "\n", 261 | "You can run both notebooks in SageMaker or locally if you have installed the SageMaker Python SDK. You can share the notebooks just like any other SageMaker Studio notebook. The notebooks are created for you to conduct experiments. For example, you could edit the following items in the notebooks (a generic sketch of these knobs appears after this list, under Next Steps):\n", 262 | "\n", 263 | "* the preprocessors used on the data\n", 264 | "\n", 265 | "* the number of hyperparameter optimization (HPO) runs and their parallelism\n", 266 | "\n", 267 | "* the algorithms to try\n", 268 | "\n", 269 | "* the instance types used for the HPO jobs\n", 270 | "\n", 271 | "* the hyperparameter ranges\n", 272 | "\n", 273 | "Modifying the candidate generation notebook is encouraged as a learning exercise: it lets you see how the decisions made during the machine learning process impact your results." 274 | ] 275 | }, 276 | { 277 | "cell_type": "markdown", 278 | "metadata": {}, 279 | "source": [ 280 | "### Next Steps\n", 281 | "You can now switch over to the two notebooks. Feel free to change parameters and modify them as needed for your final ML model deployment.
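As promised above, here is a generic sketch of the kind of HPO knobs the candidate generation notebook exposes, expressed in plain SageMaker Python SDK (v1) terms rather than the notebook's own generated code. The algorithm image, objective metric, ranges, instance type, and job counts are all illustrative assumptions, not the values Autopilot actually chose.

```python
# Generic sketch (not the literal generated-notebook code) of tunable HPO knobs.
from sagemaker.amazon.amazon_estimator import get_image_uri
from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

xgb = Estimator(get_image_uri(region, 'xgboost', '0.90-1'),  # an algorithm to try
                role,
                train_instance_count=1,
                train_instance_type='ml.m5.xlarge',          # instance type for the HPO jobs
                output_path='s3://{}/{}/hpo-output'.format(bucket, prefix),
                sagemaker_session=session)
xgb.set_hyperparameters(objective='binary:logistic', num_round=100)

tuner = HyperparameterTuner(
    xgb,
    objective_metric_name='validation:auc',
    hyperparameter_ranges={'max_depth': IntegerParameter(3, 10),    # hyperparameter ranges
                           'eta': ContinuousParameter(0.01, 0.3)},
    max_jobs=20,            # number of HPO runs
    max_parallel_jobs=4)    # their parallelism
# tuner.fit(...) is omitted: the built-in XGBoost container expects headerless,
# label-first CSVs, so treat this purely as an illustration of the knobs involved.
```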
At the end of the candidate notebook, you will have a hosted model on SageMaker with an endpoint. We have provided a notebook \"variant_predictor.ipynb\" that runs predictions on the model using the test data we saved earlier. So, to summarize the next steps:\n", 282 | "* Explore and run the SageMakerAutopilotDataExplorationNotebook.ipynb notebook.\n", 283 | "* Explore and run the SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebook.\n", 284 | "* Explore and run the variant_predictor.ipynb notebook." 285 | ] 286 | } 287 | ], 288 | "metadata": { 289 | "kernelspec": { 290 | "display_name": "conda_tensorflow_p36", 291 | "language": "python", 292 | "name": "conda_tensorflow_p36" 293 | }, 294 | "language_info": { 295 | "codemirror_mode": { 296 | "name": "ipython", 297 | "version": 3 298 | }, 299 | "file_extension": ".py", 300 | "mimetype": "text/x-python", 301 | "name": "python", 302 | "nbconvert_exporter": "python", 303 | "pygments_lexer": "ipython3", 304 | "version": "3.6.6" 305 | } 306 | }, 307 | "nbformat": 4, 308 | "nbformat_minor": 2 309 | } 310 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/resources/notebooks/variant_predictor.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Generate predictions and evaluate the model" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "### Introduction\n", 15 | "In this notebook, we will run predictions on the model that we trained and deployed in the previous steps. If you recall, the model is hosted on a SageMaker real-time prediction endpoint. We will invoke that endpoint to generate the binary labels (1, 0) on a few rows that we have in our test file (test_data.csv). We will then evaluate the results against the ground truth and see how the model performs." 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "### Prerequisites\n", 23 | "Before proceeding, make sure you have run the following notebooks in order without any errors:\n", 24 | "1. variant_classifier-autopilot.ipynb\n", 25 | "2. SageMakerAutopilotDataExplorationNotebook.ipynb\n", 26 | "3. SageMakerAutopilotCandidateDefinitionNotebook.ipynb\n", 27 | "\n", 28 | "If not, please go back to the notebooks and run them before proceeding." 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": {}, 34 | "source": [ 35 | "### Setup\n", 36 | "Let's start by importing the libraries that we will need for executing this notebook." 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "import pandas as pd\n", 46 | "import sagemaker\n", 47 | "from sagemaker.predictor import RealTimePredictor\n", 48 | "from sagemaker.content_types import CONTENT_TYPE_CSV\n", 49 | "import boto3\n", 50 | "from sklearn import metrics\n", 51 | "import numpy as np\n", 52 | "import seaborn as sns\n", 53 | "import matplotlib.pyplot as plt" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": [ 60 | "### Get the endpoint name\n", 61 | "To generate predictions on test data, we need to get the endpoint name of the model that we deployed at the end of the SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebook.
To do this, we find the endpoint among the list of endpoints that starts with the string \"AutoML-automl-vc\". This is the default naming format used in the variant_classifier-autopilot.ipynb and SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebooks.\n", 62 | "\n", 63 | "**NOTE:** If you changed the naming convention and/or have multiple endpoints beginning with the string \"AutoML-automl-vc\", the endpoint retrieved may not be the correct one. You can verify by logging into the AWS console, navigating to SageMaker, and selecting \"Endpoints\" from the left-hand menu. There you will see all the endpoints that have been created in your account. Select the one that you created as part of the SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebook. If the correct endpoint is not selected, you can overwrite the variable \"endpoint_name\" with the correct endpoint name. Make sure the correct endpoint is selected before proceeding." 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "sm = boto3.client('sagemaker')\n", 73 | "endpoints = sm.list_endpoints()['Endpoints']\n", 74 | "for val in endpoints:\n", 75 | "    ep = val.get(\"EndpointName\")\n", 76 | "    if ep.startswith('AutoML-automl-vc'):\n", 77 | "        endpoint_name = ep\n", 78 | "        print('Model endpoint: ' + endpoint_name)\n", 79 | "        print('Make sure this is the correct model endpoint before proceeding')\n", 80 | "        break\n", 81 | "else:\n    print('No endpoint found. Make sure you have completed the steps mentioned in the prerequisites above.')" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "### Data Preprocessing\n", 89 | "We will now read the file \"test_data.csv\" into a dataframe and randomly sample 1000 records from it." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "test_file = pd.read_csv('test_data.csv')\n", 99 | "test_rows = test_file.sample(1000)\n", 100 | "test_rows.head()" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "As you can see, the test rows look exactly like the rows in the training dataset, as expected. We will now separate out our target variable \"CLASS\" from the test data and store it in a new dataframe, \"actual\"." 108 | ] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "execution_count": null, 113 | "metadata": {}, 114 | "outputs": [], 115 | "source": [ 116 | "test_rows_notarget = test_rows.drop(['CLASS'], axis=1)\n", 117 | "actual = test_rows['CLASS'].to_frame(name=\"actual\")\n", 118 | "actual.reset_index(drop=True, inplace=True)" 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "metadata": {}, 124 | "source": [ 125 | "### Generate Predictions\n", 126 | "Next, we will invoke the endpoint of our model with the test rows and generate a prediction for each row. We will then store the results of the prediction in a new dataframe called \"predicted\".
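One practical note before invoking the endpoint: a SageMaker real-time endpoint accepts request payloads of only a few megabytes, so sending all 1,000 sampled rows in a single CSV body works here only because the rows are small. If you test with a larger sample, a chunked invocation along the following lines is a reasonable fallback. This is a sketch that assumes the endpoint returns one label per input line; the batch size is an arbitrary choice.

```python
# Sketch: invoke the endpoint in small CSV batches to stay well under the
# real-time payload limit. Assumes one prediction line per input row.
import boto3
import pandas as pd

runtime = boto3.client('sagemaker-runtime')

def predict_in_batches(df, endpoint, batch_size=100):
    labels = []
    for start in range(0, len(df), batch_size):
        payload = df.iloc[start:start + batch_size].to_csv(sep=',', header=False, index=False)
        resp = runtime.invoke_endpoint(EndpointName=endpoint,
                                       ContentType='text/csv',
                                       Accept='text/csv',
                                       Body=payload)
        labels.extend(resp['Body'].read().decode('utf-8').split())
    return pd.Series(labels, name='predicted').astype(int)

# predicted = predict_in_batches(test_rows_notarget, endpoint_name).to_frame()
```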
127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": null, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "sm_session = sagemaker.Session()\n", 136 | "variant_predictor = RealTimePredictor(endpoint=endpoint_name, sagemaker_session=sm_session, content_type=CONTENT_TYPE_CSV,\n", 137 | "                                      accept=CONTENT_TYPE_CSV)" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "predicted_str = variant_predictor.predict(test_rows_notarget.to_csv(sep=',', header=False, index=False)).decode('utf-8')\n", 147 | "predicted = pd.Series(predicted_str.split(), name='predicted').to_frame().astype(int)" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "Finally, we combine the \"actual\" and \"predicted\" values into a single dataframe called \"results\"." 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": null, 160 | "metadata": {}, 161 | "outputs": [], 162 | "source": [ 163 | "results = pd.concat([actual, predicted], axis=1)\n", 164 | "results.head()" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "### Model Evaluation\n", 172 | "We will now generate some evaluation metrics for our binary classification model. We will start with a [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix) and follow that up with a [Receiver Operating Characteristic (ROC) curve](https://en.wikipedia.org/wiki/Receiver_operating_characteristic). Note that because the endpoint returns hard 0/1 labels rather than class probabilities, the ROC curve here is traced from a single operating point." 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "cf_matrix = metrics.confusion_matrix(results['actual'], results['predicted'])\n", 182 | "group_names = ['True Neg','False Pos','False Neg','True Pos']\n", 183 | "group_counts = [\"{0:0.0f}\".format(value) for value in\n", 184 | "                cf_matrix.flatten()]\n", 185 | "group_percentages = [\"{0:.2%}\".format(value) for value in\n", 186 | "                     cf_matrix.flatten() / np.sum(cf_matrix)]\n", 187 | "labels = [f\"{v1}\\n{v2}\\n{v3}\" for v1, v2, v3 in\n", 188 | "          zip(group_names, group_counts, group_percentages)]\n", 189 | "labels = np.asarray(labels).reshape(2, 2)\n", 190 | "sns.heatmap(cf_matrix, annot=labels, fmt='', cmap='Blues');" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "fpr, tpr, thresholds = metrics.roc_curve(results['actual'], results['predicted'])" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "roc_auc = metrics.auc(fpr, tpr)\n", 209 | "accuracy = metrics.accuracy_score(results['actual'], results['predicted'])\n", 210 | "precision = metrics.precision_score(results['actual'], results['predicted'])\n", 211 | "recall = metrics.recall_score(results['actual'], results['predicted'])\n", 212 | "f1score = metrics.f1_score(results['actual'], results['predicted'])\n", 213 | "plt.figure()\n", 214 | "lw = 2\n", 215 | "plt.plot(fpr, tpr, color='darkorange',\n", 216 | "         lw=lw, label='ROC curve (area = %0.2f)' % roc_auc)\n", 217 | "plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')\n", 218 | "plt.xlim([0.0, 1.0])\n", 219 | "plt.ylim([0.0, 1.05])\n", 220 | "plt.xlabel('False Positive Rate')\n", 221 | "plt.ylabel('True Positive Rate')\n", 222 | "plt.title('Receiver operating characteristic (ROC)
curve')\n", 223 | "plt.legend(loc=\"lower right\")\n", 224 | "plt.text(1.1,0.75,s='Accuracy: '+str(round(accuracy,2))+'\\nPrecision: '+str(round(precision,2))+\n", 225 | "'\\nRecall: '+str(round(recall,2))+'\\nF1 Score: '+str(round(f1score,2)),bbox=dict(boxstyle=\"square\",\n", 226 | " ec=(1., 0.5, 0.5),\n", 227 | " fc=(1., 0.8, 0.8),\n", 228 | " ))\n", 229 | "\n", 230 | "plt.show()" 231 | ] 232 | } 233 | ], 234 | "metadata": { 235 | "kernelspec": { 236 | "display_name": "conda_python3", 237 | "language": "python", 238 | "name": "conda_python3" 239 | }, 240 | "language_info": { 241 | "codemirror_mode": { 242 | "name": "ipython", 243 | "version": 3 244 | }, 245 | "file_extension": ".py", 246 | "mimetype": "text/x-python", 247 | "name": "python", 248 | "nbconvert_exporter": "python", 249 | "pygments_lexer": "ipython3", 250 | "version": "3.6.5" 251 | } 252 | }, 253 | "nbformat": 4, 254 | "nbformat_minor": 4 255 | } 256 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/resources/scripts/process_clinvar.py: -------------------------------------------------------------------------------- 1 | ############################################################################### 2 | # Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved. # 3 | # # 4 | # Licensed under the Apache License, Version 2.0 (the "License"). # 5 | # You may not use this file except in compliance with the License. 6 | # A copy of the License is located at # 7 | # # 8 | # http://www.apache.org/licenses/LICENSE-2.0 # 9 | # # 10 | # or in the "license" file accompanying this file. This file is distributed # 11 | # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express # 12 | # or implied. See the License for the specific language governing permissions# 13 | # and limitations under the License. 
# 14 | ############################################################################### 15 | 16 | import gzip, re 17 | import pandas as pd 18 | import csv 19 | import sys 20 | from awsglue.utils import getResolvedOptions 21 | import boto3 22 | 23 | s3 = boto3.client('s3') 24 | s3_resource = boto3.resource('s3') 25 | 26 | args = getResolvedOptions(sys.argv, 27 |                           ['input_bucket', 'clinvar_input_key', 'clinvar_annotated_input_key', 'output_bucket', 28 |                            'output_key']) 29 | 30 | 31 | def download_to_local(filename): 32 |     new_filename = filename.split('/')[-1] 33 |     s3_resource.meta.client.download_file(args['input_bucket'], filename, '/tmp/' + new_filename) 34 |     return new_filename 35 | 36 | 37 | def list_to_dict(l): 38 |     """Convert a list of 'KEY=value' strings (a split VCF INFO field) into a dict.""" 39 |     return {k: v for k, v in (x.split("=") for x in l)} 40 | 41 | 42 | fieldnames = [ 43 |     "CHROM", 44 |     "POS", 45 |     "REF", 46 |     "ALT", 47 |     "AF_ESP", 48 |     "AF_EXAC", 49 |     "AF_TGP", 50 |     "CLNDISDB", 51 |     "CLNDISDBINCL", 52 |     "CLNDN", 53 |     "CLNDNINCL", 54 |     "CLNHGVS", 55 |     "CLNSIGINCL", 56 |     "CLNVC", 57 |     "CLNVI", 58 |     "MC", 59 |     "ORIGIN", 60 |     "SSR", 61 |     "CLASS", 62 |     "Allele", 63 |     "Consequence", 64 |     "IMPACT", 65 |     "SYMBOL", 66 |     "Feature_type", 67 |     "Feature", 68 |     "BIOTYPE", 69 |     "EXON", 70 |     "INTRON", 71 |     "cDNA_position", 72 |     "CDS_position", 73 |     "Protein_position", 74 |     "Amino_acids", 75 |     "Codons", 76 |     "DISTANCE", 77 |     "STRAND", 78 |     "BAM_EDIT", 79 |     "SIFT", 80 |     "PolyPhen", 81 |     "MOTIF_NAME", 82 |     "MOTIF_POS", 83 |     "HIGH_INF_POS", 84 |     "MOTIF_SCORE_CHANGE", 85 |     "LoFtool", 86 |     "CADD_PHRED", 87 |     "CADD_RAW", 88 |     "BLOSUM62", 89 | ] 90 | 91 | obj = s3.get_object(Bucket=args['input_bucket'], Key=args['clinvar_input_key']) 92 | cv_columns = {} 93 | with gzip.GzipFile(fileobj=obj['Body'], mode='rb') as f: 94 |     for metaline in f: 95 |         if metaline.startswith(b'##INFO'): 96 |             colname = re.search(rb'ID=(\w+),', metaline.strip(b'#\n')) 97 |             coldesc = re.search(rb'.*Description=(.*)>', metaline.strip(b'#\n')) 98 |             cv_columns[colname.group(1)] = coldesc.group(1).strip(b'"') 99 | 100 | # read clinvar vcf 101 | obj = s3.get_object(Bucket=args['input_bucket'], Key=args['clinvar_input_key']) 102 | with gzip.GzipFile(fileobj=obj['Body'], mode='rb') as f: 103 |     cv_df = pd.read_csv(f, sep='\t', comment='#', header=None, usecols=[0, 1, 2, 3, 4, 7], dtype={0: object}) 104 | 105 | # convert dictionaries to columns 106 | cv_df = pd.concat( 107 |     [ 108 |         cv_df.drop([7], axis=1), 109 |         cv_df[7].str.split(";").apply(list_to_dict).apply(pd.Series), 110 |     ], 111 |     axis=1, 112 | ) 113 | # rename columns 114 | cv_df.rename(columns={0: "CHROM", 1: "POS", 2: "ID", 3: "REF", 4: "ALT"}, inplace=True) 115 | 116 | # drop columns we know we won't need (they are re-populated later from the VEP-annotated VCF) 117 | cv_df = cv_df.drop(columns=["CHROM", "POS", "REF", "ALT"]) 118 | 119 | # assign classes 120 | cv_df["CLASS"] = 0 121 | cv_df.loc[cv_df["CLNSIGCONF"].notnull(), "CLASS"] = 1 122 | 123 | # convert NaN to 0 where allele frequencies are null 124 | cv_df[["AF_ESP", "AF_EXAC", "AF_TGP"]] = cv_df[["AF_ESP", "AF_EXAC", "AF_TGP"]].fillna( 125 |     0 126 | ) 127 | 128 | # select variants that have been submitted by multiple organizations.
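# Note on the filter below: "criteria_provided,_multiple_submitters,_no_conflicts"
# records are the consistently interpreted CLASS 0 examples, while
# "criteria_provided,_conflicting_interpretations" records carry a populated
# CLNSIGCONF value and were therefore labeled CLASS 1 above, so the filter keeps
# the two classes on a comparable evidence footing.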
129 | cv_df = cv_df.loc[ 130 |     cv_df["CLNREVSTAT"].isin( 131 |         [ 132 |             "criteria_provided,_multiple_submitters,_no_conflicts", 133 |             "criteria_provided,_conflicting_interpretations", 134 |         ] 135 |     ) 136 | ] 137 | 138 | # reduce the size of the dataset by dropping identifier columns we don't need 139 | cv_df.drop(columns=["ALLELEID", "RS", "DBVARID"], inplace=True) 140 | # drop columns that would reveal the class 141 | cv_df.drop(columns=["CLNSIG", "CLNSIGCONF", "CLNREVSTAT"], inplace=True) 142 | # drop these redundant columns 143 | cv_df.drop(columns=["CLNVCSO", "GENEINFO"], inplace=True) 144 | 145 | # dictionary to map ID to clinvar annotations 146 | clinvar_annotations = cv_df.set_index("ID")[ 147 |     [col for col in cv_df.columns if col in fieldnames] 148 | ].to_dict(orient="index") 149 | 150 | # open the output file 151 | outfile = "/tmp/clinvar_conflicting.csv" 152 | with open(outfile, "w") as fout: 153 |     dw = csv.DictWriter( 154 |         fout, delimiter=",", fieldnames=fieldnames, extrasaction="ignore" 155 |     ) 156 |     dw.writeheader() 157 |     # read the VEP-annotated vcf file line-by-line 158 |     filename = download_to_local(args['clinvar_annotated_input_key']) 159 |     filename = "/tmp/" + filename 160 |     with gzip.GzipFile(filename, mode='rb') as f: 161 |         for line in f: 162 |             line = line.decode("utf-8") 163 |             if line.startswith("##INFO=<ID=CSQ"): 164 |                 m = re.search('Format: (.*)">', line) 165 |                 cols = m.group(1).split("|") 166 |                 continue 167 | 168 |             if line.startswith("#"): 169 |                 continue 170 |             record = line.split("\t") 171 |             ( 172 |                 chromosome, 173 |                 position, 174 |                 clinvar_id, 175 |                 reference_base, 176 |                 alternate_base, 177 |                 qual, 178 |                 filter_, 179 |                 info, 180 |             ) = record 181 |             info_field = info.strip("\n").split(";") 182 | 183 |             # to look up in clinvar_annotations 184 |             clinvar_id = int(clinvar_id) 185 | 186 |             # only keep the variants that have been evaluated by multiple submitters 187 |             if clinvar_id in clinvar_annotations: 188 |                 # initialize a dictionary to hold all the VEP annotation data 189 |                 annotation_data = {column: None for column in cols} 190 |                 annotation_data.update(clinvar_annotations[clinvar_id]) 191 |                 # fields directly from the vcf 192 |                 annotation_data["CHROM"] = str(chromosome) 193 |                 annotation_data["POS"] = position 194 |                 annotation_data["REF"] = reference_base 195 |                 annotation_data["ALT"] = alternate_base 196 | 197 |                 for annotations in info_field: 198 |                     column, value = annotations.split("=") 199 | 200 |                     if column == "CSQ": 201 |                         for csq_column, csq_value in zip(cols, value.split("|")): 202 |                             annotation_data[csq_column] = csq_value 203 |                         continue 204 | 205 |                     annotation_data[column] = value 206 |                 dw.writerow(annotation_data) 207 | 208 | s3_resource.meta.client.upload_file(outfile, args['output_bucket'], args['output_key']) -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/INSTALLER: -------------------------------------------------------------------------------- 1 | pip 2 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document.
12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 
135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 
176 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/METADATA: -------------------------------------------------------------------------------- 1 | Metadata-Version: 2.1 2 | Name: crhelper 3 | Version: 2.0.6 4 | Summary: crhelper simplifies authoring CloudFormation Custom Resources 5 | Home-page: https://github.com/aws-cloudformation/custom-resource-helper 6 | Author: Jay McConnell 7 | Author-email: jmmccon@amazon.com 8 | License: Apache2 9 | Platform: UNKNOWN 10 | Classifier: Programming Language :: Python :: 3.6 11 | Classifier: Programming Language :: Python :: 3.7 12 | Classifier: License :: OSI Approved :: Apache Software License 13 | Classifier: Operating System :: OS Independent 14 | Description-Content-Type: text/markdown 15 | 16 | ## Custom Resource Helper 17 | 18 | Simplify best practice Custom Resource creation, sending responses to CloudFormation and providing exception, timeout 19 | trapping, and detailed configurable logging. 20 | 21 | [![PyPI Version](https://img.shields.io/pypi/v/crhelper.svg)](https://pypi.org/project/crhelper/) 22 | ![Python Versions](https://img.shields.io/pypi/pyversions/crhelper.svg) 23 | [![Build Status](https://travis-ci.com/aws-cloudformation/custom-resource-helper.svg?branch=master)](https://travis-ci.com/aws-cloudformation/custom-resource-helper) 24 | [![Test Coverage](https://codecov.io/gh/aws-cloudformation/custom-resource-helper/branch/master/graph/badge.svg)](https://codecov.io/gh/aws-cloudformation/custom-resource-helper) 25 | 26 | ## Features 27 | 28 | * Dead simple to use, reduces the complexity of writing a CloudFormation custom resource 29 | * Guarantees that CloudFormation will get a response even if an exception is raised 30 | * Returns meaningful errors to CloudFormation Stack events in the case of a failure 31 | * Polling enables run times longer than the Lambda 15-minute limit 32 | * JSON logging that includes request IDs, stack IDs, and request type to assist in tracing logs relevant to a 33 | particular CloudFormation event 34 | * Catches function timeouts and sends CloudFormation a failure response 35 | * Static typing (mypy) compatible 36 | 37 | ## Installation 38 | 39 | Install into the root folder of your lambda function: 40 | 41 | ```bash 42 | cd my-lambda-function/ 43 | pip install crhelper -t . 44 | ``` 45 | 46 | ## Example Usage 47 | 48 | [This blog](https://aws.amazon.com/blogs/infrastructure-and-automation/aws-cloudformation-custom-resource-creation-with-python-aws-lambda-and-crhelper/) covers usage in more detail. 49 | 50 | ```python 51 | from __future__ import print_function 52 | from crhelper import CfnResource 53 | import logging 54 | 55 | logger = logging.getLogger(__name__) 56 | # Initialise the helper, all inputs are optional, this example shows the defaults 57 | helper = CfnResource(json_logging=False, log_level='DEBUG', boto_level='CRITICAL', sleep_on_delete=120) 58 | 59 | try: 60 |     ## Init code goes here 61 |     pass 62 | except Exception as e: 63 |     helper.init_failure(e) 64 | 65 | 66 | @helper.create 67 | def create(event, context): 68 |     logger.info("Got Create") 69 |     # Optionally return an ID that will be used for the resource PhysicalResourceId, 70 |     # if None is returned an ID will be generated. 
If a poll_create function is defined 71 | # return value is placed into the poll event as event['CrHelperData']['PhysicalResourceId'] 72 | # 73 | # To add response data update the helper.Data dict 74 | # If poll is enabled data is placed into poll event as event['CrHelperData'] 75 | helper.Data.update({"test": "testdata"}) 76 | 77 | # To return an error to cloudformation you raise an exception: 78 | if not helper.Data.get("test"): 79 | raise ValueError("this error will show in the cloudformation events log and console.") 80 | 81 | return "MyResourceId" 82 | 83 | 84 | @helper.update 85 | def update(event, context): 86 | logger.info("Got Update") 87 | # If the update resulted in a new resource being created, return an id for the new resource. 88 | # CloudFormation will send a delete event with the old id when stack update completes 89 | 90 | 91 | @helper.delete 92 | def delete(event, context): 93 | logger.info("Got Delete") 94 | # Delete never returns anything. Should not fail if the underlying resources are already deleted. 95 | # Desired state. 96 | 97 | 98 | @helper.poll_create 99 | def poll_create(event, context): 100 | logger.info("Got create poll") 101 | # Return a resource id or True to indicate that creation is complete. if True is returned an id 102 | # will be generated 103 | return True 104 | 105 | 106 | def handler(event, context): 107 | helper(event, context) 108 | ``` 109 | 110 | ### Polling 111 | 112 | If you need longer than the max runtime of 15 minutes, you can enable polling by adding additional decorators for 113 | `poll_create`, `poll_update` or `poll_delete`. When a poll function is defined for `create`/`update`/`delete` the 114 | function will not send a response to CloudFormation and instead a CloudWatch Events schedule will be created to 115 | re-invoke the lambda function every 2 minutes. When the function is invoked the matching `@helper.poll_` function will 116 | be called, logic to check for completion should go here, if the function returns `None` then the schedule will run again 117 | in 2 minutes. Once complete either return a PhysicalResourceID or `True` to have one generated. The schedule will be 118 | deleted and a response sent back to CloudFormation. If you use polling the following additional IAM policy must be 119 | attached to the function's IAM role: 120 | 121 | ```yaml 122 | { 123 | "Version": "2012-10-17", 124 | "Statement": [ 125 | { 126 | "Effect": "Allow", 127 | "Action": [ 128 | "lambda:AddPermission", 129 | "lambda:RemovePermission", 130 | "events:PutRule", 131 | "events:DeleteRule", 132 | "events:PutTargets", 133 | "events:RemoveTargets" 134 | ], 135 | "Resource": "*" 136 | } 137 | ] 138 | } 139 | ``` 140 | 141 | ## Credits 142 | 143 | Decorator implementation inspired by https://github.com/ryansb/cfn-wrapper-python 144 | 145 | Log implementation inspired by https://gitlab.com/hadrien/aws_lambda_logging 146 | 147 | ## License 148 | 149 | This library is licensed under the Apache 2.0 License. 150 | 151 | 152 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/NOTICE: -------------------------------------------------------------------------------- 1 | Custom Resource Helper 2 | Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
3 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/RECORD: -------------------------------------------------------------------------------- 1 | crhelper-2.0.6.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4 2 | crhelper-2.0.6.dist-info/LICENSE,sha256=CeipvOyAZxBGUsFoaFqwkx54aPnIKEtm9a5u2uXxEws,10142 3 | crhelper-2.0.6.dist-info/METADATA,sha256=0FEfmNkHpgUGUHmR-GGoiZwcGJsEYmJE92mkBI_tQ1Q,5537 4 | crhelper-2.0.6.dist-info/NOTICE,sha256=gDru0mjdrGkrCJfnHTVboKMdS7U85Ha8bV_PQTCckfM,96 5 | crhelper-2.0.6.dist-info/RECORD,, 6 | crhelper-2.0.6.dist-info/WHEEL,sha256=g4nMs7d-Xl9-xC9XovUrsDHGXt-FT0E17Yqo92DEfvY,92 7 | crhelper-2.0.6.dist-info/top_level.txt,sha256=pe_5uNErAyss8aUfseYKAjd3a1-LXM6bPjnkun7vbso,15 8 | crhelper/__init__.py,sha256=VSvHU2MKgP96DHSDXR1OYxnbC8j7yfuVhZubBLU7Pns,66 9 | crhelper/__pycache__/__init__.cpython-38.pyc,, 10 | crhelper/__pycache__/log_helper.cpython-38.pyc,, 11 | crhelper/__pycache__/resource_helper.cpython-38.pyc,, 12 | crhelper/__pycache__/utils.cpython-38.pyc,, 13 | crhelper/log_helper.py,sha256=18n4WKlGgxXL_iiYPqE8dWv9TW4sPZc4Ae3px5dbHmY,2665 14 | crhelper/resource_helper.py,sha256=jlFCL0YMi1lEN9kOqhRtKkMcDovoJJpwq1oTk3W5hX0,12637 15 | crhelper/utils.py,sha256=HX_ZnUy3DP81L5ofOVshhWK9NwYnZ9dzIWUPnOfFm5w,1384 16 | tests/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0 17 | tests/__pycache__/__init__.cpython-38.pyc,, 18 | tests/__pycache__/test_log_helper.cpython-38.pyc,, 19 | tests/__pycache__/test_resource_helper.cpython-38.pyc,, 20 | tests/__pycache__/test_utils.cpython-38.pyc,, 21 | tests/test_log_helper.py,sha256=T25g-RnRYrwp05v__25thYiodWIIDtoSXDFAqe9Z7rQ,3256 22 | tests/test_resource_helper.py,sha256=5BzbcWX49kSZN0GveRpG8Bt3PHAYUGubJMOmbAigFP0,14462 23 | tests/test_utils.py,sha256=HbLMvoXfYbF952AMM-ey8RNasbYHFqfX17rqajluOKM,1407 24 | tests/unit/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0 25 | tests/unit/__pycache__/__init__.cpython-38.pyc,, 26 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/WHEEL: -------------------------------------------------------------------------------- 1 | Wheel-Version: 1.0 2 | Generator: bdist_wheel (0.34.2) 3 | Root-Is-Purelib: true 4 | Tag: py3-none-any 5 | 6 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/top_level.txt: -------------------------------------------------------------------------------- 1 | crhelper 2 | tests 3 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__init__.py: -------------------------------------------------------------------------------- 1 | from crhelper.resource_helper import CfnResource, SUCCESS, FAILED 2 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- 
/source/GenomicsLearningCode/setup/crhelper/__pycache__/log_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/log_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__pycache__/resource_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/resource_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__pycache__/utils.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/utils.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/log_helper.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import json 3 | import logging 4 | 5 | 6 | def _json_formatter(obj): 7 |     """Formatter for unserialisable values.""" 8 |     return str(obj) 9 | 10 | 11 | class JsonFormatter(logging.Formatter): 12 |     """AWS Lambda Logging formatter. 13 | 14 |     Formats the log message as a JSON encoded string. If the message is a 15 |     dict it will be used directly. If the message can be parsed as JSON, then 16 |     the parsed value is used in the output record. 17 |     """ 18 | 19 |     def __init__(self, **kwargs): 20 |         super(JsonFormatter, self).__init__() 21 |         self.format_dict = { 22 |             'timestamp': '%(asctime)s', 23 |             'level': '%(levelname)s', 24 |             'location': '%(name)s.%(funcName)s:%(lineno)d', 25 |         } 26 |         self.format_dict.update(kwargs) 27 |         self.default_json_formatter = kwargs.pop( 28 |             'json_default', _json_formatter) 29 | 30 |     def format(self, record): 31 |         record_dict = record.__dict__.copy() 32 |         record_dict['asctime'] = self.formatTime(record) 33 | 34 |         log_dict = { 35 |             k: v % record_dict 36 |             for k, v in self.format_dict.items() 37 |             if v 38 |         } 39 | 40 |         if isinstance(record_dict['msg'], dict): 41 |             log_dict['message'] = record_dict['msg'] 42 |         else: 43 |             log_dict['message'] = record.getMessage() 44 | 45 |         # Attempt to decode the message as JSON, if so, merge it with the 46 |         # overall message for clarity.
47 | try: 48 | log_dict['message'] = json.loads(log_dict['message']) 49 | except (TypeError, ValueError): 50 | pass 51 | 52 | if record.exc_info: 53 | # Cache the traceback text to avoid converting it multiple times 54 | # (it's constant anyway) 55 | # from logging.Formatter:format 56 | if not record.exc_text: 57 | record.exc_text = self.formatException(record.exc_info) 58 | 59 | if record.exc_text: 60 | log_dict['exception'] = record.exc_text 61 | 62 | json_record = json.dumps(log_dict, default=self.default_json_formatter) 63 | 64 | if hasattr(json_record, 'decode'): # pragma: no cover 65 | json_record = json_record.decode('utf-8') 66 | 67 | return json_record 68 | 69 | 70 | def setup(level='DEBUG', formatter_cls=JsonFormatter, boto_level=None, **kwargs): 71 | if formatter_cls: 72 | for handler in logging.root.handlers: 73 | handler.setFormatter(formatter_cls(**kwargs)) 74 | 75 | logging.root.setLevel(level) 76 | 77 | if not boto_level: 78 | boto_level = level 79 | 80 | logging.getLogger('boto').setLevel(boto_level) 81 | logging.getLogger('boto3').setLevel(boto_level) 82 | logging.getLogger('botocore').setLevel(boto_level) 83 | logging.getLogger('urllib3').setLevel(boto_level) 84 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/resource_helper.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | TODO: 4 | * Async mode – take a wait condition handle as an input, increases max timeout to 12 hours 5 | * Idempotency – If a duplicate request comes in (say there was a network error in signaling back to cfn) the subsequent 6 | request should return the already created response, will need a persistent store of some kind... 
7 | * Functional tests 8 | """ 9 | 10 | from __future__ import print_function 11 | import threading 12 | from crhelper.utils import _send_response 13 | from crhelper import log_helper 14 | import logging 15 | import random 16 | import boto3 17 | import string 18 | import json 19 | import os 20 | from time import sleep 21 | 22 | logger = logging.getLogger(__name__) 23 | 24 | SUCCESS = 'SUCCESS' 25 | FAILED = 'FAILED' 26 | 27 | 28 | class CfnResource(object): 29 | 30 | def __init__(self, json_logging=False, log_level='DEBUG', boto_level='ERROR', polling_interval=2, sleep_on_delete=120): 31 | self._sleep_on_delete= sleep_on_delete 32 | self._create_func = None 33 | self._update_func = None 34 | self._delete_func = None 35 | self._poll_create_func = None 36 | self._poll_update_func = None 37 | self._poll_delete_func = None 38 | self._timer = None 39 | self._init_failed = None 40 | self._json_logging = json_logging 41 | self._log_level = log_level 42 | self._boto_level = boto_level 43 | self._send_response = False 44 | self._polling_interval = polling_interval 45 | self.Status = "" 46 | self.Reason = "" 47 | self.PhysicalResourceId = "" 48 | self.StackId = "" 49 | self.RequestId = "" 50 | self.LogicalResourceId = "" 51 | self.Data = {} 52 | self._event = {} 53 | self._context = None 54 | self._response_url = "" 55 | self._sam_local = os.getenv('AWS_SAM_LOCAL') 56 | self._region = os.getenv('AWS_REGION') 57 | try: 58 | if not self._sam_local: 59 | self._lambda_client = boto3.client('lambda', region_name=self._region) 60 | self._events_client = boto3.client('events', region_name=self._region) 61 | self._logs_client = boto3.client('logs', region_name=self._region) 62 | if json_logging: 63 | log_helper.setup(log_level, boto_level=boto_level, RequestType='ContainerInit') 64 | else: 65 | log_helper.setup(log_level, formatter_cls=None, boto_level=boto_level) 66 | except Exception as e: 67 | logger.error(e, exc_info=True) 68 | self.init_failure(e) 69 | 70 | def __call__(self, event, context): 71 | try: 72 | self._log_setup(event, context) 73 | logger.debug(event) 74 | if not self._crhelper_init(event, context): 75 | return 76 | # Check for polling functions 77 | if self._poll_enabled() and self._sam_local: 78 | logger.info("Skipping poller functionality, as this is a local invocation") 79 | elif self._poll_enabled(): 80 | self._polling_init(event) 81 | # If polling is not enabled, then we should respond 82 | else: 83 | logger.debug("enabling send_response") 84 | self._send_response = True 85 | logger.debug("_send_response: %s" % self._send_response) 86 | if self._send_response: 87 | if self.RequestType == 'Delete': 88 | self._wait_for_cwlogs() 89 | self._cfn_response(event) 90 | except Exception as e: 91 | logger.error(e, exc_info=True) 92 | self._send(FAILED, str(e)) 93 | finally: 94 | if self._timer: 95 | self._timer.cancel() 96 | 97 | def _wait_for_cwlogs(self, sleep=sleep): 98 | time_left = int(self._context.get_remaining_time_in_millis() / 1000) - 15 99 | sleep_time = 0 100 | 101 | if time_left > self._sleep_on_delete: 102 | sleep_time = self._sleep_on_delete 103 | 104 | if sleep_time > 1: 105 | sleep(sleep_time) 106 | 107 | def _log_setup(self, event, context): 108 | if self._json_logging: 109 | log_helper.setup(self._log_level, boto_level=self._boto_level, RequestType=event['RequestType'], 110 | StackId=event['StackId'], RequestId=event['RequestId'], 111 | LogicalResourceId=event['LogicalResourceId'], aws_request_id=context.aws_request_id) 112 | else: 113 | log_helper.setup(self._log_level, 
boto_level=self._boto_level, formatter_cls=None) 114 | 115 | def _crhelper_init(self, event, context): 116 | self._send_response = False 117 | self.Status = SUCCESS 118 | self.Reason = "" 119 | self.PhysicalResourceId = "" 120 | self.StackId = event["StackId"] 121 | self.RequestId = event["RequestId"] 122 | self.LogicalResourceId = event["LogicalResourceId"] 123 | self.Data = {} 124 | if "CrHelperData" in event.keys(): 125 | self.Data = event["CrHelperData"] 126 | self.RequestType = event["RequestType"] 127 | self._event = event 128 | self._context = context 129 | self._response_url = event['ResponseURL'] 130 | if self._timer: 131 | self._timer.cancel() 132 | if self._init_failed: 133 | self._send(FAILED, str(self._init_failed)) 134 | return False 135 | self._set_timeout() 136 | self._wrap_function(self._get_func()) 137 | return True 138 | 139 | def _polling_init(self, event): 140 | # Setup polling on initial request 141 | logger.debug("pid1: %s" % self.PhysicalResourceId) 142 | if 'CrHelperPoll' not in event.keys() and self.Status != FAILED: 143 | logger.info("Setting up polling") 144 | self.Data["PhysicalResourceId"] = self.PhysicalResourceId 145 | self._setup_polling() 146 | self.PhysicalResourceId = None 147 | logger.debug("pid2: %s" % self.PhysicalResourceId) 148 | # if physical id is set, or there was a failure then we're done 149 | logger.debug("pid3: %s" % self.PhysicalResourceId) 150 | if self.PhysicalResourceId or self.Status == FAILED: 151 | logger.info("Polling complete, removing cwe schedule") 152 | self._remove_polling() 153 | self._send_response = True 154 | 155 | def generate_physical_id(self, event): 156 | return '_'.join([ 157 | event['StackId'].split('/')[1], 158 | event['LogicalResourceId'], 159 | self._rand_string(8) 160 | ]) 161 | 162 | def _cfn_response(self, event): 163 | # Use existing PhysicalResourceId if it's in the event and no ID was set 164 | if not self.PhysicalResourceId and "PhysicalResourceId" in event.keys(): 165 | logger.info("PhysicalResourceId present in event, Using that for response") 166 | self.PhysicalResourceId = event['PhysicalResourceId'] 167 | # Generate a physical id if none is provided 168 | elif not self.PhysicalResourceId or self.PhysicalResourceId is True: 169 | logger.info("No physical resource id returned, generating one...") 170 | self.PhysicalResourceId = self.generate_physical_id(event) 171 | self._send() 172 | 173 | def _poll_enabled(self): 174 | return getattr(self, "_poll_{}_func".format(self._event['RequestType'].lower())) 175 | 176 | def create(self, func): 177 | self._create_func = func 178 | return func 179 | 180 | def update(self, func): 181 | self._update_func = func 182 | return func 183 | 184 | def delete(self, func): 185 | self._delete_func = func 186 | return func 187 | 188 | def poll_create(self, func): 189 | self._poll_create_func = func 190 | return func 191 | 192 | def poll_update(self, func): 193 | self._poll_update_func = func 194 | return func 195 | 196 | def poll_delete(self, func): 197 | self._poll_delete_func = func 198 | return func 199 | 200 | def _wrap_function(self, func): 201 | try: 202 | self.PhysicalResourceId = func(self._event, self._context) if func else '' 203 | except Exception as e: 204 | logger.error(str(e), exc_info=True) 205 | self.Reason = str(e) 206 | self.Status = FAILED 207 | 208 | def _timeout(self): 209 | logger.error("Execution is about to time out, sending failure message") 210 | self._send(FAILED, "Execution timed out") 211 | 212 | def _set_timeout(self): 213 | self._timer = 
threading.Timer((self._context.get_remaining_time_in_millis() / 1000.00) - 0.5, 214 | self._timeout) 215 | self._timer.start() 216 | 217 | def _get_func(self): 218 | request_type = "_{}_func" 219 | if "CrHelperPoll" in self._event.keys(): 220 | request_type = "_poll" + request_type 221 | return getattr(self, request_type.format(self._event['RequestType'].lower())) 222 | 223 | def _send(self, status=None, reason="", send_response=_send_response): 224 | if len(str(str(self.Reason))) > 256: 225 | self.Reason = "ERROR: (truncated) " + str(self.Reason)[len(str(self.Reason)) - 240:] 226 | if len(str(reason)) > 256: 227 | reason = "ERROR: (truncated) " + str(reason)[len(str(reason)) - 240:] 228 | response_body = { 229 | 'Status': self.Status, 230 | 'PhysicalResourceId': str(self.PhysicalResourceId), 231 | 'StackId': self.StackId, 232 | 'RequestId': self.RequestId, 233 | 'LogicalResourceId': self.LogicalResourceId, 234 | 'Reason': str(self.Reason), 235 | 'Data': self.Data, 236 | } 237 | if status: 238 | response_body.update({'Status': status, 'Reason': reason}) 239 | send_response(self._response_url, response_body) 240 | 241 | def init_failure(self, error): 242 | self._init_failed = error 243 | logger.error(str(error), exc_info=True) 244 | 245 | def _cleanup_response(self): 246 | for k in ["CrHelperPoll", "CrHelperPermission", "CrHelperRule"]: 247 | if k in self.Data.keys(): 248 | del self.Data[k] 249 | 250 | @staticmethod 251 | def _rand_string(l): 252 | return ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(l)) 253 | 254 | def _add_permission(self, rule_arn): 255 | sid = self._event['LogicalResourceId'] + self._rand_string(8) 256 | self._lambda_client.add_permission( 257 | FunctionName=self._context.function_name, 258 | StatementId=sid, 259 | Action='lambda:InvokeFunction', 260 | Principal='events.amazonaws.com', 261 | SourceArn=rule_arn 262 | ) 263 | return sid 264 | 265 | def _put_rule(self): 266 | response = self._events_client.put_rule( 267 | Name=self._event['LogicalResourceId'] + self._rand_string(8), 268 | ScheduleExpression='rate({} minutes)'.format(self._polling_interval), 269 | State='ENABLED', 270 | ) 271 | return response["RuleArn"] 272 | 273 | def _put_targets(self, func_name): 274 | region = self._event['CrHelperRule'].split(":")[3] 275 | account_id = self._event['CrHelperRule'].split(":")[4] 276 | partition = self._event['CrHelperRule'].split(":")[1] 277 | rule_name = self._event['CrHelperRule'].split("/")[1] 278 | logger.debug(self._event) 279 | self._events_client.put_targets( 280 | Rule=rule_name, 281 | Targets=[ 282 | { 283 | 'Id': '1', 284 | 'Arn': 'arn:%s:lambda:%s:%s:function:%s' % (partition, region, account_id, func_name), 285 | 'Input': json.dumps(self._event) 286 | } 287 | ] 288 | ) 289 | 290 | def _remove_targets(self, rule_arn): 291 | self._events_client.remove_targets( 292 | Rule=rule_arn.split("/")[1], 293 | Ids=['1'] 294 | ) 295 | 296 | def _remove_permission(self, sid): 297 | self._lambda_client.remove_permission( 298 | FunctionName=self._context.function_name, 299 | StatementId=sid 300 | ) 301 | 302 | def _delete_rule(self, rule_arn): 303 | self._events_client.delete_rule( 304 | Name=rule_arn.split("/")[1] 305 | ) 306 | 307 | def _setup_polling(self): 308 | self._event['CrHelperData'] = self.Data 309 | self._event['CrHelperPoll'] = True 310 | self._event['CrHelperRule'] = self._put_rule() 311 | self._event['CrHelperPermission'] = self._add_permission(self._event['CrHelperRule']) 312 | self._put_targets(self._context.function_name) 313 
| 314 | def _remove_polling(self): 315 | if 'CrHelperData' in self._event.keys(): 316 | self._event.pop('CrHelperData') 317 | if "PhysicalResourceId" in self.Data.keys(): 318 | self.Data.pop("PhysicalResourceId") 319 | if 'CrHelperRule' in self._event.keys(): 320 | self._remove_targets(self._event['CrHelperRule']) 321 | else: 322 | logger.error("Cannot remove CloudWatch events target, Rule arn not available in event") 323 | if 'CrHelperPermission' in self._event.keys(): 324 | self._remove_permission(self._event['CrHelperPermission']) 325 | else: 326 | logger.error("Cannot remove lambda events permission, permission id not available in event") 327 | if 'CrHelperRule' in self._event.keys(): 328 | self._delete_rule(self._event['CrHelperRule']) 329 | else: 330 | logger.error("Cannot remove CloudWatch events rule, Rule arn not available in event") 331 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import json 3 | import logging 4 | import time 5 | from urllib.parse import urlsplit, urlunsplit 6 | from http.client import HTTPSConnection 7 | 8 | logger = logging.getLogger(__name__) 9 | 10 | 11 | def _send_response(response_url, response_body): 12 | try: 13 | json_response_body = json.dumps(response_body) 14 | except Exception as e: 15 | msg = "Failed to convert response to json: {}".format(str(e)) 16 | logger.error(msg, exc_info=True) 17 | response_body = {'Status': 'FAILED', 'Data': {}, 'Reason': msg} 18 | json_response_body = json.dumps(response_body) 19 | logger.debug("CFN response URL: {}".format(response_url)) 20 | logger.debug(json_response_body) 21 | headers = {'content-type': '', 'content-length': str(len(json_response_body))} 22 | split_url = urlsplit(response_url) 23 | host = split_url.netloc 24 | url = urlunsplit(("", "", *split_url[2:])) 25 | while True: 26 | try: 27 | connection = HTTPSConnection(host) 28 | connection.request(method="PUT", url=url, body=json_response_body, headers=headers) 29 | response = connection.getresponse() 30 | logger.info("CloudFormation returned status code: {}".format(response.reason)) 31 | break 32 | except Exception as e: 33 | logger.error("Unexpected failure sending response to CloudFormation {}".format(e), exc_info=True) 34 | time.sleep(5) 35 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/lambda.py: -------------------------------------------------------------------------------- 1 | # /********************************************************************************************************************* 2 | # * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. * 3 | # * * 4 | # * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance * 5 | # * with the License. A copy of the License is located at * 6 | # * * 7 | # * http://www.apache.org/licenses/LICENSE-2.0 * 8 | # * * 9 | # * or in the 'license' file accompanying this file. This file is distributed on an 'AS IS' BASIS, WITHOUT WARRANTIES * 10 | # * OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions * 11 | # * and limitations under the License.
* 12 | # *********************************************************************************************************************/ 13 | 14 | from __future__ import print_function 15 | from crhelper import CfnResource 16 | import logging 17 | import boto3 18 | import time 19 | 20 | logger = logging.getLogger(__name__) 21 | # Initialise the helper, all inputs are optional, this example shows the defaults 22 | helper = CfnResource(json_logging=False, log_level='DEBUG', boto_level='CRITICAL') 23 | 24 | try: 25 | codebuild = boto3.client('codebuild') 26 | # pass 27 | except Exception as e: 28 | helper.init_failure(e) 29 | 30 | 31 | @helper.create 32 | def create(event, context): 33 | logger.info("Got Create") 34 | start_build_job(event, context) 35 | 36 | 37 | @helper.update 38 | def update(event, context): 39 | logger.info("Got Update") 40 | start_build_job(event, context) 41 | 42 | 43 | @helper.delete 44 | def delete(event, context): 45 | logger.info("Got Delete") 46 | start_build_job(event, context, action='teardown') 47 | # Delete never returns anything. Should not fail if the underlying resources are already deleted. Desired state. 48 | 49 | 50 | @helper.poll_create 51 | def poll_create(event, context): 52 | logger.info("Got Create poll") 53 | return check_build_job_status(event, context) 54 | 55 | 56 | @helper.poll_update 57 | def poll_update(event, context): 58 | logger.info("Got Update poll") 59 | return check_build_job_status(event, context) 60 | 61 | 62 | @helper.poll_delete 63 | def poll_delete(event, context): 64 | logger.info("Got Delete poll") 65 | return check_build_job_status(event, context) 66 | 67 | 68 | def handler(event, context): 69 | helper(event, context) 70 | 71 | 72 | def start_build_job(event, context, action='setup'): 73 | response = codebuild.start_build( 74 | projectName=event['ResourceProperties']['CodeBuildProjectName'], 75 | environmentVariablesOverride=[{ 76 | 'name': 'SOLUTION_ACTION', 77 | 'value': action, 78 | 'type': 'PLAINTEXT' 79 | }] 80 | ) 81 | logger.info(response) 82 | 83 | helper.Data.update({"JobID": response['build']['id']}) 84 | 85 | 86 | def check_build_job_status(event, context): 87 | code_build_project_name = event['ResourceProperties']['CodeBuildProjectName'] 88 | 89 | if not helper.Data.get("JobID"): 90 | raise ValueError("Job ID missing in the polling event.") 91 | 92 | job_id = helper.Data.get("JobID") 93 | 94 | # 'SUCCEEDED' | 'FAILED' | 'FAULT' | 'TIMED_OUT' | 'IN_PROGRESS' | 'STOPPED' 95 | response = codebuild.batch_get_builds(ids=[job_id]) 96 | build_status = response['builds'][0]['buildStatus'] 97 | 98 | if build_status == 'IN_PROGRESS': 99 | logger.info(build_status) 100 | return None 101 | else: 102 | if build_status == 'SUCCEEDED': 103 | logger.info(build_status) 104 | return True 105 | else: 106 | msg = "Code Build job '{0}' in project '{1}' exited with a build status of '{2}'. Please check the code build job output log for more information." 
\ 107 | .format(job_id, code_build_project_name, build_status) 108 | logger.info(msg) 109 | raise ValueError(msg) 110 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/requirements.txt: -------------------------------------------------------------------------------- 1 | crhelper 2 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__init__.py -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/test_log_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/test_log_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/test_resource_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/test_resource_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/test_utils.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/test_utils.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/test_log_helper.py: -------------------------------------------------------------------------------- 1 | from crhelper.log_helper import * 2 | import unittest 3 | import logging 4 | 5 | 6 | class TestLogHelper(unittest.TestCase): 7 | 8 | def test_logging_no_formatting(self): 9 | logger = logging.getLogger('1') 10 | handler = logging.StreamHandler() 11 | logger.addHandler(handler) 12 | orig_formatters = [] 13 | for c in range(len(logging.root.handlers)): 14 | orig_formatters.append(logging.root.handlers[c].formatter) 15 | setup(level='DEBUG', formatter_cls=None, boto_level='CRITICAL') 16 | new_formatters = [] 17 | for c in range(len(logging.root.handlers)): 18 | new_formatters.append(logging.root.handlers[c].formatter) 19 | 
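# With formatter_cls=None, setup() must leave the formatters already attached
# to the root handlers untouched: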
self.assertEqual(orig_formatters, new_formatters) 20 | 21 | def test_logging_boto_explicit(self): 22 | logger = logging.getLogger('2') 23 | handler = logging.StreamHandler() 24 | logger.addHandler(handler) 25 | setup(level='DEBUG', formatter_cls=None, boto_level='CRITICAL') 26 | for t in ['boto', 'boto3', 'botocore', 'urllib3']: 27 | b_logger = logging.getLogger(t) 28 | self.assertEqual(b_logger.level, 50) 29 | 30 | def test_logging_json(self): 31 | logger = logging.getLogger('3') 32 | handler = logging.StreamHandler() 33 | logger.addHandler(handler) 34 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 35 | for handler in logging.root.handlers: 36 | self.assertEqual(JsonFormatter, type(handler.formatter)) 37 | 38 | def test_logging_boto_implicit(self): 39 | logger = logging.getLogger('4') 40 | handler = logging.StreamHandler() 41 | logger.addHandler(handler) 42 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 43 | for t in ['boto', 'boto3', 'botocore', 'urllib3']: 44 | b_logger = logging.getLogger(t) 45 | self.assertEqual(b_logger.level, 10) 46 | 47 | def test_logging_json_keys(self): 48 | with self.assertLogs() as ctx: 49 | logger = logging.getLogger() 50 | handler = logging.StreamHandler() 51 | logger.addHandler(handler) 52 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 53 | logger.info("test") 54 | logs = json.loads(ctx.output[0]) 55 | self.assertEqual(["timestamp", "level", "location", "RequestType", "message"], list(logs.keys())) 56 | 57 | def test_logging_json_parse_message(self): 58 | with self.assertLogs() as ctx: 59 | logger = logging.getLogger() 60 | handler = logging.StreamHandler() 61 | logger.addHandler(handler) 62 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 63 | logger.info("{}") 64 | logs = json.loads(ctx.output[0]) 65 | self.assertEqual({}, logs["message"]) 66 | 67 | def test_logging_json_exception(self): 68 | with self.assertLogs() as ctx: 69 | logger = logging.getLogger() 70 | handler = logging.StreamHandler() 71 | logger.addHandler(handler) 72 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 73 | try: 74 | 1 + 't' 75 | except Exception as e: 76 | logger.info("[]", exc_info=True) 77 | logs = json.loads(ctx.output[0]) 78 | self.assertIn("exception", logs.keys()) 79 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/test_resource_helper.py: -------------------------------------------------------------------------------- 1 | import os 2 | import crhelper 3 | import unittest 4 | from unittest.mock import call, patch, Mock 5 | import threading 6 | 7 | test_events = { 8 | "Create": { 9 | "RequestType": "Create", 10 | "RequestId": "test-event-id", 11 | "StackId": "arn/test-stack-id/guid", 12 | "LogicalResourceId": "TestResourceId", 13 | "ResponseURL": "response_url" 14 | }, 15 | "Update": { 16 | "RequestType": "Update", 17 | "RequestId": "test-event-id", 18 | "StackId": "test-stack-id", 19 | "LogicalResourceId": "TestResourceId", 20 | "PhysicalResourceId": "test-pid", 21 | "ResponseURL": "response_url" 22 | }, 23 | "Delete": { 24 | "RequestType": "Delete", 25 | "RequestId": "test-event-id", 26 | "StackId": "test-stack-id", 27 | "LogicalResourceId": "TestResourceId", 28 | "PhysicalResourceId": "test-pid", 29 | "ResponseURL": "response_url" 30 | } 31 | } 32 | 33 | 34 | class MockContext(object): 35 | 36 | function_name = "test-function" 37 | 
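# MockContext fakes the slice of the Lambda context object that crhelper
# touches: function_name plus get_remaining_time_in_millis(). ms_remaining is
# kept as a class attribute so individual tests can dial the remaining time
# up or down.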
ms_remaining = 9000 38 | 39 | @staticmethod 40 | def get_remaining_time_in_millis(): 41 | return MockContext.ms_remaining 42 | 43 | 44 | class TestCfnResource(unittest.TestCase): 45 | def setUp(self): 46 | os.environ['AWS_REGION'] = 'us-east-1' 47 | 48 | def tearDown(self): 49 | os.environ.pop('AWS_REGION', None) 50 | 51 | @patch('crhelper.log_helper.setup', return_value=None) 52 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 53 | def test_init(self, mock_method): 54 | crhelper.resource_helper.CfnResource() 55 | mock_method.assert_called_once_with('DEBUG', boto_level='ERROR', formatter_cls=None) 56 | 57 | crhelper.resource_helper.CfnResource(json_logging=True) 58 | mock_method.assert_called_with('DEBUG', boto_level='ERROR', RequestType='ContainerInit') 59 | 60 | @patch('crhelper.log_helper.setup', return_value=None) 61 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 62 | def test_init_failure(self, mock_method): 63 | mock_method.side_effect = Exception("test") 64 | c = crhelper.resource_helper.CfnResource(json_logging=True) 65 | self.assertTrue(c._init_failed) 66 | 67 | @patch('crhelper.log_helper.setup', Mock()) 68 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 69 | @patch('crhelper.resource_helper.CfnResource._polling_init', Mock()) 70 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 71 | @patch('crhelper.resource_helper.CfnResource._send') 72 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 73 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 74 | def test_init_failure_call(self, mock_send): 75 | c = crhelper.resource_helper.CfnResource() 76 | c.init_failure(Exception('TestException')) 77 | 78 | event = test_events["Create"] 79 | c.__call__(event, MockContext) 80 | 81 | self.assertEqual([call('FAILED', 'TestException')], mock_send.call_args_list) 82 | 83 | @patch('crhelper.log_helper.setup', Mock()) 84 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 85 | @patch('crhelper.resource_helper.CfnResource._polling_init', Mock()) 86 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 87 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 88 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 89 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 90 | @patch('crhelper.resource_helper.CfnResource._cfn_response', return_value=None) 91 | def test_call(self, cfn_response_mock): 92 | c = crhelper.resource_helper.CfnResource() 93 | event = test_events["Create"] 94 | c.__call__(event, MockContext) 95 | self.assertTrue(c._send_response) 96 | cfn_response_mock.assert_called_once_with(event) 97 | 98 | c._sam_local = True 99 | c._poll_enabled = Mock(return_value=True) 100 | c._polling_init = Mock() 101 | c.__call__(event, MockContext) 102 | c._polling_init.assert_not_called() 103 | self.assertEqual(1, len(cfn_response_mock.call_args_list)) 104 | 105 | c._sam_local = False 106 | c._send_response = False 107 | c.__call__(event, MockContext) 108 | c._polling_init.assert_called() 109 | self.assertEqual(1, len(cfn_response_mock.call_args_list)) 110 | 111 | event = test_events["Delete"] 112 | c._wait_for_cwlogs = Mock() 113 | c._poll_enabled = Mock(return_value=False) 114 | c.__call__(event, MockContext) 115 | c._wait_for_cwlogs.assert_called() 116 | 117 | c._send = Mock() 118 | cfn_response_mock.side_effect = Exception("test") 119 | 
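# When _cfn_response raises, __call__ must trap the exception and still report
# FAILED (with the error text) back to CloudFormation: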
c.__call__(event, MockContext) 120 | c._send.assert_called_with('FAILED', "test") 121 | 122 | @patch('crhelper.log_helper.setup', Mock()) 123 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 124 | @patch('crhelper.resource_helper.CfnResource._polling_init', Mock()) 125 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 126 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 127 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 128 | @patch('crhelper.resource_helper.CfnResource._cfn_response', Mock(return_value=None)) 129 | def test_wait_for_cwlogs(self): 130 | 131 | c = crhelper.resource_helper.CfnResource() 132 | c._context = MockContext 133 | s = Mock() 134 | c._wait_for_cwlogs(sleep=s) 135 | s.assert_not_called() 136 | MockContext.ms_remaining = 140000 137 | c._wait_for_cwlogs(sleep=s) 138 | s.assert_called_once() 139 | 140 | @patch('crhelper.log_helper.setup', Mock()) 141 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 142 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 143 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 144 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 145 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 146 | @patch('crhelper.resource_helper.CfnResource._cfn_response', Mock()) 147 | def test_polling_init(self): 148 | c = crhelper.resource_helper.CfnResource() 149 | event = test_events['Create'] 150 | c._setup_polling = Mock() 151 | c._remove_polling = Mock() 152 | c._polling_init(event) 153 | c._setup_polling.assert_called_once() 154 | c._remove_polling.assert_not_called() 155 | self.assertEqual(c.PhysicalResourceId, None) 156 | 157 | c.Status = 'FAILED' 158 | c._setup_polling.assert_called_once() 159 | c._setup_polling.assert_called_once() 160 | 161 | c = crhelper.resource_helper.CfnResource() 162 | event = test_events['Create'] 163 | c._setup_polling = Mock() 164 | c._remove_polling = Mock() 165 | event['CrHelperPoll'] = "Some stuff" 166 | c.PhysicalResourceId = None 167 | c._polling_init(event) 168 | c._remove_polling.assert_not_called() 169 | c._setup_polling.assert_not_called() 170 | 171 | c.Status = 'FAILED' 172 | c._polling_init(event) 173 | c._remove_polling.assert_called_once() 174 | c._setup_polling.assert_not_called() 175 | 176 | c.Status = '' 177 | c.PhysicalResourceId = "some-id" 178 | c._remove_polling.assert_called() 179 | c._setup_polling.assert_not_called() 180 | 181 | @patch('crhelper.log_helper.setup', Mock()) 182 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 183 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 184 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 185 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 186 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 187 | def test_cfn_response(self): 188 | c = crhelper.resource_helper.CfnResource() 189 | event = test_events['Create'] 190 | c._send = Mock() 191 | 192 | orig_pid = c.PhysicalResourceId 193 | self.assertEqual(orig_pid, '') 194 | c._cfn_response(event) 195 | c._send.assert_called_once() 196 | print("RID: [%s]" % [c.PhysicalResourceId]) 197 | self.assertEqual(True, c.PhysicalResourceId.startswith('test-stack-id_TestResourceId_')) 198 | 199 | c._send = Mock() 200 | c.PhysicalResourceId = 'testpid' 201 | c._cfn_response(event) 202 | 
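# A caller-supplied string PhysicalResourceId must be passed through unchanged: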
c._send.assert_called_once() 203 | self.assertEqual('testpid', c.PhysicalResourceId) 204 | 205 | c._send = Mock() 206 | c.PhysicalResourceId = True 207 | c._cfn_response(event) 208 | c._send.assert_called_once() 209 | self.assertEqual(True, c.PhysicalResourceId.startswith('test-stack-id_TestResourceId_')) 210 | 211 | c._send = Mock() 212 | c.PhysicalResourceId = '' 213 | event['PhysicalResourceId'] = 'pid-from-event' 214 | c._cfn_response(event) 215 | c._send.assert_called_once() 216 | self.assertEqual('pid-from-event', c.PhysicalResourceId) 217 | 218 | @patch('crhelper.log_helper.setup', Mock()) 219 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 220 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 221 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 222 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 223 | def test_wrap_function(self): 224 | c = crhelper.resource_helper.CfnResource() 225 | 226 | def func(e, c): 227 | return 'testpid' 228 | 229 | c._wrap_function(func) 230 | self.assertEqual('testpid', c.PhysicalResourceId) 231 | self.assertNotEqual('FAILED', c.Status) 232 | 233 | def func(e, c): 234 | raise Exception('test exception') 235 | 236 | c._wrap_function(func) 237 | self.assertEqual('FAILED', c.Status) 238 | self.assertEqual('test exception', c.Reason) 239 | 240 | @patch('crhelper.log_helper.setup', Mock()) 241 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 242 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 243 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 244 | def test_send(self): 245 | c = crhelper.resource_helper.CfnResource() 246 | s = Mock() 247 | c._send(send_response=s) 248 | s.assert_called_once() 249 | 250 | @patch('crhelper.log_helper.setup', Mock()) 251 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 252 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 253 | @patch('crhelper.resource_helper.CfnResource._send', return_value=None) 254 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 255 | def test_timeout(self, s): 256 | c = crhelper.resource_helper.CfnResource() 257 | c._timeout() 258 | s.assert_called_with('FAILED', "Execution timed out") 259 | 260 | @patch('crhelper.log_helper.setup', Mock()) 261 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 262 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 263 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 264 | def test_set_timeout(self): 265 | c = crhelper.resource_helper.CfnResource() 266 | c._context = MockContext() 267 | def func(): 268 | return None 269 | 270 | c._set_timeout() 271 | t = threading.Timer(1000, func) 272 | self.assertEqual(type(t), type(c._timer)) 273 | t.cancel() 274 | c._timer.cancel() 275 | 276 | @patch('crhelper.log_helper.setup', Mock()) 277 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 278 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 279 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 280 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 281 | def test_cleanup_response(self): 282 | c = crhelper.resource_helper.CfnResource() 283 | c.Data = {"CrHelperPoll": 1, "CrHelperPermission": 2, "CrHelperRule": 3} 284 | c._cleanup_response() 285 | 
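# Every CrHelper* bookkeeping key must be scrubbed from Data before the
# response is handed back to CloudFormation: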
self.assertEqual({}, c.Data) 286 | 287 | @patch('crhelper.log_helper.setup', Mock()) 288 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 289 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 290 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 291 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 292 | def test_remove_polling(self): 293 | c = crhelper.resource_helper.CfnResource() 294 | c._context = MockContext() 295 | 296 | c._events_client.remove_targets = Mock() 297 | c._events_client.delete_rule = Mock() 298 | c._lambda_client.remove_permission = Mock() 299 | 300 | with self.assertRaises(Exception) as e: 301 | c._remove_polling() 302 | 303 | self.assertEqual("failed to cleanup CloudWatch event polling", str(e)) 304 | c._events_client.remove_targets.assert_not_called() 305 | c._events_client.delete_rule.assert_not_called() 306 | c._lambda_client.remove_permission.assert_not_called() 307 | 308 | c._event["CrHelperRule"] = "1/2" 309 | c._event["CrHelperPermission"] = "1/2" 310 | c._remove_polling() 311 | c._events_client.remove_targets.assert_called() 312 | c._events_client.delete_rule.assert_called() 313 | c._lambda_client.remove_permission.assert_called() 314 | 315 | @patch('crhelper.log_helper.setup', Mock()) 316 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 317 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 318 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 319 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 320 | def test_setup_polling(self): 321 | c = crhelper.resource_helper.CfnResource() 322 | c._context = MockContext() 323 | c._event = test_events["Update"] 324 | c._lambda_client.add_permission = Mock() 325 | c._events_client.put_rule = Mock(return_value={"RuleArn": "arn:aws:lambda:blah:blah:function:blah/blah"}) 326 | c._events_client.put_targets = Mock() 327 | c._setup_polling() 328 | c._events_client.put_targets.assert_called() 329 | c._events_client.put_rule.assert_called() 330 | c._lambda_client.add_permission.assert_called() 331 | 332 | @patch('crhelper.log_helper.setup', Mock()) 333 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 334 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 335 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 336 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 337 | def test_wrappers(self): 338 | c = crhelper.resource_helper.CfnResource() 339 | 340 | def func(): 341 | pass 342 | 343 | for f in ["create", "update", "delete", "poll_create", "poll_update", "poll_delete"]: 344 | self.assertEqual(None, getattr(c, "_%s_func" % f)) 345 | getattr(c, f)(func) 346 | self.assertEqual(func, getattr(c, "_%s_func" % f)) 347 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/test_utils.py: -------------------------------------------------------------------------------- 1 | import json 2 | from unittest.mock import patch, Mock 3 | from crhelper import utils 4 | import unittest 5 | 6 | 7 | class TestLogHelper(unittest.TestCase): 8 | TEST_URL = "https://test_url/this/is/the/url?query=123#aaa" 9 | 10 | @patch('crhelper.utils.HTTPSConnection', autospec=True) 11 | def test_send_succeeded_response(self, https_connection_mock): 12 | utils._send_response(self.TEST_URL, {}) 13 | 
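# The response must be PUT straight to the pre-signed URL: host taken from the
# URL's netloc, an intentionally empty content-type, and an exact
# content-length: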
https_connection_mock.assert_called_once_with("test_url") 14 | https_connection_mock.return_value.request.assert_called_once_with( 15 | body='{}', 16 | headers={"content-type": "", "content-length": "2"}, 17 | method="PUT", 18 | url="/this/is/the/url?query=123#aaa", 19 | ) 20 | 21 | @patch('crhelper.utils.HTTPSConnection', autospec=True) 22 | def test_send_failed_response(self, https_connection_mock): 23 | utils._send_response(self.TEST_URL, Mock()) 24 | https_connection_mock.assert_called_once_with("test_url") 25 | response = json.loads(https_connection_mock.return_value.request.call_args[1]["body"]) 26 | expected_body = '{"Status": "FAILED", "Data": {}, "Reason": "' + response["Reason"] + '"}' 27 | https_connection_mock.return_value.request.assert_called_once_with( 28 | body=expected_body, 29 | headers={"content-type": "", "content-length": str(len(expected_body))}, 30 | method="PUT", 31 | url="/this/is/the/url?query=123#aaa", 32 | ) 33 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/unit/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/unit/__init__.py -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/unit/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/unit/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningPipe/pipe_cfn.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | 3 | Description: GenomicsLearningPipe 4 | 5 | Parameters: 6 | ResourcePrefix: 7 | Type: String 8 | Default: GenomicsLearning 9 | ResourcePrefixLowercase: 10 | Type: String 11 | Default: genomicslearning 12 | 13 | Resources: 14 | 15 | SourceEvent: 16 | Type: AWS::Events::Rule 17 | DependsOn: 18 | - CodePipeline 19 | - SourceEventRole 20 | Properties: 21 | Description: Rule for Amazon CloudWatch Events to detect changes to the source 22 | repository and trigger pipeline execution 23 | EventPattern: 24 | detail: 25 | event: 26 | - referenceCreated 27 | - referenceUpdated 28 | referenceName: 29 | - master 30 | referenceType: 31 | - branch 32 | detail-type: 33 | - CodeCommit Repository State Change 34 | resources: 35 | - !Sub ${Repo.Arn} 36 | source: 37 | - aws.codecommit 38 | Name: !Sub ${Repo}-Pipeline-Trigger 39 | State: ENABLED 40 | Targets: 41 | - Arn: !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${CodePipeline} 42 | Id: ProjectPipelineTarget 43 | RoleArn: !Sub ${SourceEventRole.Arn} 44 | 45 | Repo: 46 | DeletionPolicy: Retain 47 | Type: AWS::CodeCommit::Repository 48 | Properties: 49 | RepositoryName: !Sub ${ResourcePrefix} 50 | RepositoryDescription: !Sub ${ResourcePrefix} 51 | 52 | CodeBuildCopyResourcesProject: 53 | Type: AWS::CodeBuild::Project 54 | DependsOn: 55 | - BuildBucket 56 | - CodeBuildRole 57 | - ResourcesBucket 58 | Properties: 59 | Name: !Sub ${ResourcePrefix}CopyResources 60 | 
Description: !Sub ${ResourcePrefix}CopyResources 61 | Artifacts: 62 | Type: CODEPIPELINE 63 | Environment: 64 | Type: LINUX_CONTAINER 65 | ComputeType: BUILD_GENERAL1_SMALL 66 | Image: aws/codebuild/standard:3.0 67 | EnvironmentVariables: 68 | - Name: RESOURCES_BUCKET 69 | Value: !Sub ${ResourcesBucket} 70 | ServiceRole: !Sub ${CodeBuildRole.Arn} 71 | Source: 72 | Type: CODEPIPELINE 73 | BuildSpec: copyresources_buildspec.yml 74 | Metadata: 75 | cfn_nag: 76 | rules_to_suppress: 77 | - id: W32 78 | reason: Artifact outputs are encrypted by default. 79 | 80 | CodePipeline: 81 | Type: AWS::CodePipeline::Pipeline 82 | DependsOn: 83 | - CodeBuildCopyResourcesProject 84 | - CodePipelineRole 85 | - Repo 86 | Properties: 87 | ArtifactStore: 88 | Location: !Ref BuildBucket 89 | Type: S3 90 | Name: !Sub ${ResourcePrefix}CodePipeline 91 | RoleArn: !GetAtt CodePipelineRole.Arn 92 | Stages: 93 | - Name: Source 94 | Actions: 95 | - Name: CodeCommitRepo 96 | ActionTypeId: 97 | Category: Source 98 | Owner: AWS 99 | Provider: CodeCommit 100 | Version: 1 101 | Configuration: 102 | BranchName: master 103 | RepositoryName: !Sub ${ResourcePrefix} 104 | PollForSourceChanges: false 105 | OutputArtifacts: 106 | - Name: SourceStageOutput 107 | - Name: Build 108 | Actions: 109 | - Name: CopyResources 110 | ActionTypeId: 111 | Category: Build 112 | Owner: AWS 113 | Provider: CodeBuild 114 | Version: 1 115 | Configuration: 116 | ProjectName: !Sub ${ResourcePrefix}CopyResources 117 | InputArtifacts: 118 | - Name: SourceStageOutput 119 | - Name: CreateStack 120 | Actions: 121 | - Name: CreateStack 122 | ActionTypeId: 123 | Category: Deploy 124 | Owner: AWS 125 | Provider: CloudFormation 126 | Version: 1 127 | Configuration: 128 | StackName: !Sub ${ResourcePrefix} 129 | ActionMode: CREATE_UPDATE 130 | Capabilities: CAPABILITY_NAMED_IAM 131 | RoleArn: !Sub ${CloudFormationRole.Arn} 132 | TemplatePath: !Sub SourceStageOutput::code_cfn.yml 133 | ParameterOverrides: !Sub | 134 | { 135 | "ResourcePrefix" : "${ResourcePrefix}", 136 | "ResourcePrefixLowercase" : "${ResourcePrefixLowercase}", 137 | "ResourcesBucket" : "${ResourcesBucket}", 138 | "DataLakeBucket": "${DataLakeBucket}" 139 | } 140 | InputArtifacts: 141 | - Name: SourceStageOutput 142 | OutputArtifacts: [] 143 | 144 | CloudFormationRole: 145 | Type: AWS::IAM::Role 146 | Properties: 147 | Path: / 148 | AssumeRolePolicyDocument: 149 | Version: 2012-10-17 150 | Statement: 151 | - Effect: Allow 152 | Action: 153 | - sts:AssumeRole 154 | Principal: 155 | Service: 156 | - cloudformation.amazonaws.com 157 | Policies: 158 | - PolicyName: CloudFormationRolePolicy 159 | PolicyDocument: 160 | Version: 2012-10-17 161 | Statement: 162 | - Effect: Allow 163 | Action: 164 | - iam:GetRolePolicy 165 | Resource: '*' 166 | - Effect: Allow 167 | Action: 168 | - iam:CreateRole 169 | - iam:DeleteRole 170 | - iam:PutRolePolicy 171 | - iam:GetRolePolicy 172 | - iam:DeleteRolePolicy 173 | - iam:AttachRolePolicy 174 | - iam:DetachRolePolicy 175 | - iam:UpdateAssumeRolePolicy 176 | - iam:PassRole 177 | - iam:GetRole 178 | Resource: 179 | - !Sub arn:aws:iam::${AWS::AccountId}:role/${ResourcePrefix}* 180 | - Effect: Allow 181 | Action: 182 | - glue:CreateJob 183 | - glue:UpdateJob 184 | - glue:DeleteJob 185 | - glue:GetJob 186 | Resource: '*' 187 | - Effect: Allow 188 | Action: 189 | - s3:CreateBucket 190 | - s3:DeleteBucket 191 | - s3:GetObject 192 | Resource: 193 | - !Sub ${BuildBucket.Arn} 194 | - !Sub ${BuildBucket.Arn}/* 195 | - Effect: Allow 196 | Action: 197 | - 
sagemaker:CreateNotebookInstanceLifecycleConfig 198 | - sagemaker:DescribeNotebookInstanceLifecycleConfig 199 | - sagemaker:UpdateNotebookInstanceLifecycleConfig 200 | - sagemaker:DeleteNotebookInstanceLifecycleConfig 201 | - sagemaker:CreateNotebookInstance 202 | - sagemaker:UpdateNotebookInstance 203 | - sagemaker:StartNotebookInstance 204 | - sagemaker:DescribeNotebookInstance 205 | - sagemaker:DeleteNotebookInstance 206 | - sagemaker:StopNotebookInstance 207 | Resource: 208 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:notebook-instance-lifecycle-config/${ResourcePrefixLowercase}* 209 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:notebook-instance/${ResourcePrefixLowercase}* 210 | Metadata: 211 | cfn_nag: 212 | rules_to_suppress: 213 | - id: W11 214 | reason: AWS Glue requires * resources for the specified actions. Same for get role policy. 215 | 216 | CodeBuildRole: 217 | Type: AWS::IAM::Role 218 | DependsOn: ResourcesBucket 219 | Properties: 220 | AssumeRolePolicyDocument: 221 | Version: 2012-10-17 222 | Statement: 223 | - Action: 224 | - sts:AssumeRole 225 | Effect: Allow 226 | Principal: 227 | Service: 228 | - codebuild.amazonaws.com 229 | Path: / 230 | Policies: 231 | - PolicyName: CodeBuildAccess 232 | PolicyDocument: 233 | Version: 2012-10-17 234 | Statement: 235 | - Effect: Allow 236 | Resource: 237 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/codebuild/${ResourcePrefix}* 238 | Action: 239 | - logs:CreateLogGroup 240 | - logs:CreateLogStream 241 | - logs:PutLogEvents 242 | - Effect: Allow 243 | Action: 244 | - s3:GetObject 245 | - s3:GetObjectVersion 246 | - s3:PutObject 247 | Resource: !Sub ${BuildBucket.Arn}/* 248 | - Effect: Allow 249 | Action: 250 | - s3:ListBucket 251 | Resource: 252 | - !Sub ${ResourcesBucket.Arn} 253 | - !Sub ${DataLakeBucket.Arn} 254 | - Effect: Allow 255 | Action: 256 | - s3:PutObject 257 | - s3:PutObjectAcl 258 | Resource: 259 | - !Sub ${ResourcesBucket.Arn} 260 | - !Sub ${ResourcesBucket.Arn}/* 261 | - !Sub ${DataLakeBucket.Arn} 262 | - !Sub ${DataLakeBucket.Arn}/* 263 | 264 | CodePipelineRole: 265 | Type: AWS::IAM::Role 266 | Properties: 267 | AssumeRolePolicyDocument: 268 | Version: 2012-10-17 269 | Statement: 270 | - Action: 271 | - sts:AssumeRole 272 | Effect: Allow 273 | Principal: 274 | Service: 275 | - codepipeline.amazonaws.com 276 | Path: / 277 | Policies: 278 | - PolicyName: CloudFormationAccess 279 | PolicyDocument: 280 | Version: 2012-10-17 281 | Statement: 282 | - Action: 283 | - cloudformation:CreateStack 284 | - cloudformation:DescribeStacks 285 | - cloudformation:UpdateStack 286 | Effect: Allow 287 | Resource: !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${ResourcePrefix}/* 288 | - PolicyName: S3Access 289 | PolicyDocument: 290 | Version: 2012-10-17 291 | Statement: 292 | - Effect: Allow 293 | Action: 294 | - s3:GetObject 295 | - s3:GetObjectVersion 296 | - s3:GetBucketVersioning 297 | - s3:DeleteObject 298 | - s3:PutObject 299 | Resource: 300 | - !Sub ${BuildBucket.Arn} 301 | - !Sub ${BuildBucket.Arn}/* 302 | - PolicyName: CodeBuildAccess 303 | PolicyDocument: 304 | Version: 2012-10-17 305 | Statement: 306 | - Action: 307 | - codebuild:StartBuild 308 | - codebuild:BatchGetBuilds 309 | Effect: Allow 310 | Resource: 311 | - !GetAtt CodeBuildCopyResourcesProject.Arn 312 | - PolicyName: IamAccess 313 | PolicyDocument: 314 | Version: 2012-10-17 315 | Statement: 316 | - Action: 317 | - iam:PassRole 318 | Effect: Allow 319 | Resource: !GetAtt CodeBuildRole.Arn 320 |
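# IamAccessCF below is deliberately separate from IamAccess above: the first
# PassRole hands CodeBuildRole to the build action, while this one hands the
# CloudFormation service role to the CreateStack deploy action.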
- PolicyName: IamAccessCF 321 | PolicyDocument: 322 | Version: 2012-10-17 323 | Statement: 324 | - Action: 325 | - iam:PassRole 326 | Effect: Allow 327 | Resource: !Sub ${CloudFormationRole.Arn} 328 | - PolicyName: CodeCommitAccess 329 | PolicyDocument: 330 | Version: 2012-10-17 331 | Statement: 332 | - Effect: Allow 333 | Action: 334 | - codecommit:UploadArchive 335 | - codecommit:GetBranch 336 | - codecommit:GetCommit 337 | - codecommit:GetUploadArchiveStatus 338 | Resource: !GetAtt Repo.Arn 339 | 340 | SourceEventRole: 341 | Type: AWS::IAM::Role 342 | DependsOn: CodePipeline 343 | Description: IAM role to allow Amazon CloudWatch Events to trigger AWS CodePipeline 344 | execution 345 | Properties: 346 | AssumeRolePolicyDocument: 347 | Statement: 348 | - Action: sts:AssumeRole 349 | Effect: Allow 350 | Principal: 351 | Service: 352 | - events.amazonaws.com 353 | Sid: 1 354 | Policies: 355 | - PolicyName: CloudWatchEventPolicy 356 | PolicyDocument: 357 | Statement: 358 | - Action: 359 | - codepipeline:StartPipelineExecution 360 | Effect: Allow 361 | Resource: 362 | - !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${CodePipeline}* 363 | 364 | BuildBucket: 365 | Type: AWS::S3::Bucket 366 | Properties: 367 | PublicAccessBlockConfiguration: 368 | BlockPublicAcls: True 369 | BlockPublicPolicy: True 370 | IgnorePublicAcls: True 371 | RestrictPublicBuckets: True 372 | LoggingConfiguration: 373 | DestinationBucketName: !Ref LogsBucket 374 | LogFilePrefix: templates_logs/ 375 | BucketEncryption: 376 | ServerSideEncryptionConfiguration: 377 | - ServerSideEncryptionByDefault: 378 | SSEAlgorithm: AES256 379 | Metadata: 380 | cfn_nag: 381 | rules_to_suppress: 382 | - id: W51 383 | reason: Bucket policy is not needed. 384 | 385 | DataLakeBucket: 386 | Type: AWS::S3::Bucket 387 | Properties: 388 | PublicAccessBlockConfiguration: 389 | BlockPublicAcls: True 390 | BlockPublicPolicy: True 391 | IgnorePublicAcls: True 392 | RestrictPublicBuckets: True 393 | LoggingConfiguration: 394 | DestinationBucketName: !Ref LogsBucket 395 | LogFilePrefix: templates_logs/ 396 | BucketEncryption: 397 | ServerSideEncryptionConfiguration: 398 | - ServerSideEncryptionByDefault: 399 | SSEAlgorithm: AES256 400 | Metadata: 401 | cfn_nag: 402 | rules_to_suppress: 403 | - id: W51 404 | reason: Bucket policy is not needed. 405 | 406 | ResourcesBucket: 407 | Type: AWS::S3::Bucket 408 | Properties: 409 | PublicAccessBlockConfiguration: 410 | BlockPublicAcls: True 411 | BlockPublicPolicy: True 412 | IgnorePublicAcls: True 413 | RestrictPublicBuckets: True 414 | LoggingConfiguration: 415 | DestinationBucketName: !Ref LogsBucket 416 | LogFilePrefix: templates_logs/ 417 | BucketEncryption: 418 | ServerSideEncryptionConfiguration: 419 | - ServerSideEncryptionByDefault: 420 | SSEAlgorithm: AES256 421 | Metadata: 422 | cfn_nag: 423 | rules_to_suppress: 424 | - id: W51 425 | reason: Bucket policy is not needed. 426 | 427 | LogsBucket: 428 | DeletionPolicy: Retain 429 | Type: AWS::S3::Bucket 430 | Properties: 431 | PublicAccessBlockConfiguration: 432 | BlockPublicAcls: True 433 | BlockPublicPolicy: True 434 | IgnorePublicAcls: True 435 | RestrictPublicBuckets: True 436 | AccessControl: LogDeliveryWrite 437 | BucketEncryption: 438 | ServerSideEncryptionConfiguration: 439 | - ServerSideEncryptionByDefault: 440 | SSEAlgorithm: AES256 441 | Metadata: 442 | cfn_nag: 443 | rules_to_suppress: 444 | - id: W35 445 | reason: This is the pipeline and solution log bucket and does not require access logging to be configured. 
446 | - id: W51 447 | reason: Bucket policy is not needed. 448 | 449 | Outputs: 450 | LogsBucket: 451 | Value: !Ref LogsBucket 452 | BuildBucket: 453 | Value: !Ref BuildBucket 454 | RepoName: 455 | Description: RepoName 456 | Value: !Sub ${Repo.Name} 457 | RepoHttpUrl: 458 | Description: RepoHttpUrl 459 | Value: !Sub ${Repo.CloneUrlHttp} 460 | ResourcesBucket: 461 | Value: !Ref ResourcesBucket 462 | DataLakeBucket: 463 | Value: !Ref DataLakeBucket 464 | Export: 465 | Name: !Sub ${AWS::StackName}-DataLakeBucket 466 | DataLakeBucketArn: 467 | Value: !GetAtt DataLakeBucket.Arn 468 | Export: 469 | Name: !Sub ${AWS::StackName}-DataLakeBucketArn 470 | 471 | # aws cloudformation update-stack --stack-name ${PROJECT_NAME:-GenomicsLearning}-Pipeline --template-body file://pipe_cfn.yml --capabilities CAPABILITY_NAMED_IAM --output text --parameters ParameterKey=ResourcePrefix,ParameterValue=${PROJECT_NAME:-GenomicsLearning} ParameterKey=ResourcePrefixLowercase,ParameterValue=$(echo ${PROJECT_NAME:-GenomicsLearning} | tr '[:upper:]' '[:lower:]'); aws cloudformation wait stack-update-complete --stack-name ${PROJECT_NAME:-GenomicsLearning}-Pipeline 472 | -------------------------------------------------------------------------------- /source/GenomicsLearningZone/zone_cfn.yml: -------------------------------------------------------------------------------- 1 | --- 2 | AWSTemplateFormatVersion: 2010-09-09 3 | Description: GenomicsLearningZone 4 | 5 | # CodeCommit 6 | # Repo 7 | 8 | Parameters: 9 | ResourcePrefix: 10 | Type: String 11 | Default: GenomicsLearning 12 | ResourcePrefixLowercase: 13 | Type: String 14 | Default: genomicslearning 15 | 16 | Resources: 17 | # CodeCommit 18 | Repo: 19 | DeletionPolicy: Retain 20 | Type: AWS::CodeCommit::Repository 21 | Properties: 22 | RepositoryName: !Sub ${ResourcePrefix}-Pipe 23 | RepositoryDescription: !Sub ${ResourcePrefix}-Pipe 24 | Outputs: 25 | RepoName: 26 | Description: RepoName 27 | Value: !Sub ${Repo.Name} 28 | RepoHttpUrl: 29 | Description: RepoHttpUrl 30 | Value: !Sub ${Repo.CloneUrlHttp} 31 | 32 | # aws cloudformation update-stack --stack-name GenomicsLearningZone --template-body file://template_cfn.yml --capabilities CAPABILITY_IAM --output text; aws cloudformation wait stack-update-complete --stack-name GenomicsLearningZone 33 | 34 | # aws cloudformation create-stack --stack-name GenomicsLearningZone --template-body file://template_cfn.yml --capabilities CAPABILITY_IAM --enable-termination-protection --output text; aws cloudformation wait stack-create-complete --stack-name GenomicsLearningZone; aws cloudformation describe-stacks --stack-name GenomicsLearningZone --query 'Stacks[].Outputs[?OutputKey==`RepoCloneCommand`].OutputValue' --output text 35 | 36 | -------------------------------------------------------------------------------- /source/setup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | 3 | export AWS_DEFAULT_OUTPUT=text 4 | 5 | create_stack() { 6 | local stack_name=${1} 7 | local template_name=${2} 8 | local ResourcePrefix=${3} 9 | 10 | local ResourcePrefix_lowercase=$(echo ${ResourcePrefix} | tr '[:upper:]' '[:lower:]') 11 | 12 | aws cloudformation create-stack --stack-name ${stack_name} --template-body file://${template_name} --parameters ParameterKey=ResourcePrefix,ParameterValue=${ResourcePrefix} ParameterKey=ResourcePrefixLowercase,ParameterValue=${ResourcePrefix_lowercase} --capabilities CAPABILITY_NAMED_IAM --no-enable-termination-protection; aws cloudformation wait 
stack-create-complete --stack-name ${stack_name} 13 | } 14 | 15 | clone_and_commit() { 16 | local stack_name=${1} 17 | 18 | local repo_http_url=$(aws cloudformation describe-stacks --stack-name ${stack_name} --query 'Stacks[].Outputs[?OutputKey==`RepoHttpUrl`].OutputValue') 19 | 20 | git init .; git remote add origin ${repo_http_url} 21 | 22 | git add *; git commit -m "first commit"; git push --set-upstream origin master 23 | } 24 | 25 | wait_for_pipeline() { 26 | local pipeline_name=${1} 27 | local commit_id=${2} 28 | 29 | local message="Max attempts reached. Pipeline execution failed for commit: ${commit_id}" 30 | for i in {1..60}; do 31 | 32 | stage_status=$(aws codepipeline list-pipeline-executions --pipeline-name ${pipeline_name} --query 'pipelineExecutionSummaries[?sourceRevisions[0].revisionId==`'${commit_id}'`].status') 33 | 34 | if [ "${stage_status}" == "InProgress" ] || [ -z "${stage_status}" ]; then 35 | printf '.' 36 | sleep 30 37 | elif [ "${stage_status}" == "Succeeded" ]; then 38 | message="Pipeline execution succeeded for commit: ${commit_id}" 39 | break 40 | elif [ "${stage_status}" == "Failed" ]; then 41 | message="Pipeline execution Failed for commit: ${commit_id}" 42 | break 43 | fi 44 | 45 | done 46 | printf "\n${message}\n" 47 | } 48 | 49 | copy_test_data() { 50 | local artifact_bucket=${1} 51 | local artifact_key_prefix=${2} 52 | local pipe_stackname=${3} 53 | 54 | local data_lake_bucket=$(aws cloudformation describe-stacks --stack-name ${pipe_stackname} --query 'Stacks[].Outputs[?OutputKey==`DataLakeBucket`].OutputValue' --output text) 55 | 56 | aws s3 cp s3://${artifact_bucket}/${artifact_key_prefix}/annotation/clinvar/clinvar.vcf.gz s3://${data_lake_bucket}/annotation/clinvar/clinvar.vcf.gz 57 | aws s3 cp s3://${artifact_bucket}/${artifact_key_prefix}/annotation/clinvar/clinvar.annotated.vcf.gz s3://${data_lake_bucket}/annotation/clinvar/clinvar.annotated.vcf.gz 58 | aws s3 cp s3://${artifact_bucket}/${artifact_key_prefix}/annotation/clinvar/clinvar_conflicting.csv s3://${data_lake_bucket}/annotation/clinvar/conflicting/clinvar_conflicting.csv 59 | } 60 | 61 | setup() { 62 | 63 | local resource_prefix=$1 64 | local artifact_bucket=$2 65 | local artifact_key_prefix=$3 66 | 67 | local dir_prefix="GenomicsLearning" 68 | 69 | local zone_dir="${dir_prefix}Zone" 70 | local pipe_dir="${dir_prefix}Pipe" 71 | local code_dir="${dir_prefix}Code" 72 | 73 | local zone_stackname=${resource_prefix}-LandingZone 74 | local pipe_stackname=${resource_prefix}-Pipeline 75 | 76 | # Create stacks 77 | create_stack "${zone_stackname}" "${zone_dir}/zone_cfn.yml" "${resource_prefix}" 78 | create_stack "${pipe_stackname}" "${pipe_dir}/pipe_cfn.yml" "${resource_prefix}" 79 | 80 | # Clone and commit resources 81 | cd "${pipe_dir}"; clone_and_commit "${zone_stackname}"; cd .. 82 | cd "${code_dir}"; clone_and_commit "${pipe_stackname}"; 83 | 84 | # Get the last commit id 85 | commit_id=$(git log -1 --pretty=format:%H) 86 | cd .. 
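  # Note: wait_for_pipeline matches this exact commit id against
  # sourceRevisions[0].revisionId, so the wait below tracks the pipeline
  # execution triggered by this push rather than whichever run is newest.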
87 | 88 | # Get pipeline name 89 | pipeline_name=$(aws cloudformation describe-stack-resource --stack-name ${pipe_stackname} --logical-resource-id CodePipeline --query 'StackResourceDetail.PhysicalResourceId') 90 | 91 | # Wait for pipeline execution using commit id 92 | wait_for_pipeline "${pipeline_name}" "${commit_id}" 93 | 94 | # Copy Test Data 95 | copy_test_data "${artifact_bucket}" "${artifact_key_prefix}" "${pipe_stackname}" 96 | 97 | # Run Test 98 | "${code_dir}/awscli_test.sh" "${resource_prefix}" 99 | } 100 | 101 | project_name=${PROJECT_NAME:-GenomicsLearning} 102 | 103 | setup "$project_name" "${ARTIFACT_BUCKET}" "${ARTIFACT_KEY_PREFIX}" 104 | -------------------------------------------------------------------------------- /source/setup_cfn.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | 3 | Description: | 4 | (SO0078) - The Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker solution creates a scalable environment 5 | in AWS to develop machine learning models using genomics data, generate predictions, and evaluate model performance. 6 | This solution demonstrates how to 1) automate the preparation of a genomics machine learning training dataset, 7 | 2) develop genomics machine learning model training and deployment pipelines and, 8 | 3) generate predictions and evaluate model performance using test data. 9 | 10 | Mappings: 11 | Send: 12 | AnonymousUsage: 13 | Data: Yes 14 | SourceCode: 15 | General: 16 | S3Bucket: '%%BUCKET_NAME%%' 17 | KeyPrefix: '%%SOLUTION_NAME%%/%%VERSION%%' 18 | 19 | Parameters: 20 | Project: 21 | Type: String 22 | Description: > 23 | The project name for this solution. The project name will be used to prefix resources created by this solution. Project names should be unique to a project. 24 | AllowedPattern: "[a-zA-Z0-9-]{3,24}" 25 | ConstraintDescription: > 26 | Project name should be unique, 3-24 characters in length, and only have alphanumeric characters and hyphens ([a-zA-Z0-9-]{3,32}). 27 | Default: GenomicsLearning 28 | 29 | Resources: 30 | Setup: 31 | Type: Custom::Setup 32 | DependsOn: 33 | - CodeBuild 34 | Version: 1.0 35 | Properties: 36 | ServiceToken: !Sub ${SetupLambda.Arn} 37 | CodeBuildProjectName: !Sub ${CodeBuild} 38 | 39 | SetupLambda: 40 | Type: AWS::Lambda::Function 41 | DependsOn: 42 | - SetupLambdaRole 43 | Properties: 44 | Handler: lambda.handler 45 | Runtime: python3.8 46 | FunctionName: !Sub ${Project}Setup 47 | Code: 48 | S3Bucket: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]] 49 | S3Key: !Join ["", [!FindInMap ["SourceCode", "General", "KeyPrefix"], "/SolutionSetup.zip"]] 50 | Role: !Sub ${SetupLambdaRole.Arn} 51 | Timeout: 600 52 | Metadata: 53 | cfn_nag: 54 | rules_to_suppress: 55 | - id: W58 56 | reason: Bug in CfnNag. Lambda functions require permission to write CloudWatch Logs. 
Looking for PutLogEvent instead of PutLogEvents 57 | 58 | SetupLambdaRole: 59 | Type: AWS::IAM::Role 60 | DependsOn: 61 | - CodeBuild 62 | Properties: 63 | AssumeRolePolicyDocument: 64 | Version: 2012-10-17 65 | Statement: 66 | - Action: 67 | - sts:AssumeRole 68 | Effect: Allow 69 | Principal: 70 | Service: 71 | - lambda.amazonaws.com 72 | Path: / 73 | Policies: 74 | - PolicyName: LogsAccess 75 | PolicyDocument: 76 | Statement: 77 | - Effect: Allow 78 | Action: 79 | - logs:CreateLogGroup 80 | - logs:CreateLogStream 81 | - logs:PutLogEvents 82 | Resource: 83 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${Project}* 84 | - PolicyName: CodeBuildAccess 85 | PolicyDocument: 86 | Statement: 87 | - Effect: Allow 88 | Action: 89 | - codebuild:BatchGetProjects 90 | - codebuild:BatchGetBuilds 91 | - codebuild:StartBuild 92 | Resource: 93 | - !Sub ${CodeBuild.Arn} 94 | - PolicyName: EventsAccess 95 | PolicyDocument: 96 | Statement: 97 | - Effect: Allow 98 | Action: 99 | - events:DeleteRule 100 | - events:PutRule 101 | - events:PutTargets 102 | - events:RemoveTargets 103 | Resource: 104 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/Setup* 105 | - PolicyName: LambdaAccess 106 | PolicyDocument: 107 | Statement: 108 | - Effect: Allow 109 | Action: 110 | - lambda:AddPermission 111 | - lambda:RemovePermission 112 | Resource: 113 | - !Sub arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:${Project}* 114 | 115 | CodeBuildRole: 116 | Type: AWS::IAM::Role 117 | Properties: 118 | AssumeRolePolicyDocument: 119 | Version: 2012-10-17 120 | Statement: 121 | - Action: 122 | - sts:AssumeRole 123 | Effect: Allow 124 | Principal: 125 | Service: 126 | - codebuild.amazonaws.com 127 | Path: / 128 | Policies: 129 | - PolicyName: CloudFormationAccess 130 | PolicyDocument: 131 | Statement: 132 | - Action: 133 | - cloudformation:CreateStack 134 | - cloudformation:DescribeStacks 135 | - cloudformation:DescribeStackResource 136 | - cloudformation:UpdateStack 137 | - cloudformation:DeleteStack 138 | - cloudformation:UpdateTerminationProtection 139 | Effect: Allow 140 | Resource: !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Project}* 141 | - PolicyName: LogsAccess 142 | PolicyDocument: 143 | Statement: 144 | - Effect: Allow 145 | Action: 146 | - logs:CreateLogGroup 147 | - logs:CreateLogStream 148 | - logs:PutLogEvents 149 | Resource: 150 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/codebuild/${Project}* 151 | - PolicyName: IAMAccess 152 | PolicyDocument: 153 | Statement: 154 | - Effect: Allow 155 | Action: 156 | - iam:CreateRole 157 | - iam:DeleteRole 158 | - iam:PutRolePolicy 159 | - iam:DeleteRolePolicy 160 | - iam:AttachRolePolicy 161 | - iam:DetachRolePolicy 162 | - iam:UpdateAssumeRolePolicy 163 | - iam:PassRole 164 | - iam:GetRole 165 | - iam:GetInstanceProfile 166 | - iam:CreateInstanceProfile 167 | - iam:DeleteInstanceProfile 168 | - iam:AddRoleToInstanceProfile 169 | - iam:RemoveRoleFromInstanceProfile 170 | Resource: 171 | - !Sub arn:aws:iam::${AWS::AccountId}:role/${Project}* 172 | - !Sub arn:aws:iam::${AWS::AccountId}:instance-profile/${Project}* 173 | - PolicyName: CodeBuildAccess 174 | PolicyDocument: 175 | Statement: 176 | - Effect: Allow 177 | Action: 178 | - codebuild:CreateProject 179 | - codebuild:UpdateProject 180 | - codebuild:ListProjects 181 | - codebuild:BatchGetProjects 182 | - codebuild:DeleteProject 183 | Resource: 184 | - !Sub arn:aws:codebuild:${AWS::Region}:${AWS::AccountId}:project/${Project}* 185 
| - PolicyName: CodePipelineAccess 186 | PolicyDocument: 187 | Statement: 188 | - Effect: Allow 189 | Action: 190 | - codepipeline:CreatePipeline 191 | - codepipeline:GetPipeline 192 | - codepipeline:UpdatePipeline 193 | - codepipeline:DeletePipeline 194 | - codepipeline:GetPipelineState 195 | - codepipeline:ListPipelineExecutions 196 | Resource: 197 | - !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${Project}* 198 | - PolicyName: CodeCommitAccess 199 | PolicyDocument: 200 | Statement: 201 | - Effect: Allow 202 | Action: 203 | - codecommit:CreateBranch 204 | - codecommit:CreateRepository 205 | - codecommit:GetRepository 206 | - codecommit:DeleteRepository 207 | - codecommit:CreateCommit 208 | - codecommit:GitPush 209 | - codecommit:GitPull 210 | - codecommit:DeleteBranch 211 | Resource: 212 | - !Sub arn:aws:codecommit:${AWS::Region}:${AWS::AccountId}:${Project}* 213 | - Effect: Allow 214 | Action: 215 | - codecommit:ListRepositories 216 | Resource: '*' 217 | - PolicyName: EventsAccess 218 | PolicyDocument: 219 | Statement: 220 | - Effect: Allow 221 | Action: 222 | - events:DescribeRule 223 | - events:PutRule 224 | - events:DeleteRule 225 | - events:PutTargets 226 | - events:RemoveTargets 227 | Resource: 228 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/* 229 | - PolicyName: GlueAccess 230 | PolicyDocument: 231 | Statement: 232 | - Effect: Allow 233 | Action: 234 | - glue:StartJob 235 | - glue:GetJob 236 | Resource: '*' 237 | - PolicyName: S3Access 238 | PolicyDocument: 239 | Statement: 240 | - Effect: Allow 241 | Action: 242 | - s3:GetObject 243 | Resource: 244 | !Join 245 | - '' 246 | - - 'arn:aws:s3:::' 247 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 248 | - '/*' 249 | - Effect: Allow 250 | Action: 251 | - s3:GetObject 252 | Resource: 253 | !Join 254 | - '' 255 | - - 'arn:aws:s3:::' 256 | - !Join 257 | - '-' 258 | - - !FindInMap ["SourceCode", "General", "S3Bucket"] 259 | - Ref: "AWS::Region" 260 | - '/' 261 | - !FindInMap ["SourceCode", "General", "KeyPrefix"] 262 | - '/*' 263 | - Effect: Allow 264 | Action: 265 | - s3:ListBucket 266 | Resource: 267 | !Join 268 | - '' 269 | - - 'arn:aws:s3:::' 270 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 271 | 272 | - Effect: Allow 273 | Action: 274 | - s3:PutObjectAcl 275 | - s3:GetObject 276 | - s3:PutObject 277 | - s3:DeleteObject 278 | - s3:ListBucket 279 | - s3:CreateBucket 280 | - s3:DeleteBucket 281 | - s3:PutEncryptionConfiguration 282 | - s3:PutBucketPublicAccessBlock 283 | - s3:PutBucketLogging 284 | - s3:PutBucketAcl 285 | Resource: 286 | - arn:aws:s3:::*pipe* 287 | - arn:aws:s3:::*pipe*/* 288 | - Effect: Allow 289 | Action: 290 | - s3:ListBucket 291 | Resource: 292 | !Join 293 | - '' 294 | - - 'arn:aws:s3:::' 295 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 296 | - Effect: Allow 297 | Action: 298 | - s3:CreateBucket 299 | - s3:DeleteBucket 300 | - s3:ListBucket 301 | - s3:PutEncryptionConfiguration 302 | - s3:PutBucketPublicAccessBlock 303 | - s3:PutBucketLogging 304 | - s3:PutBucketAcl 305 | - s3:PutObject 306 | - s3:PutObjectAcl 307 | Resource: 308 | - arn:aws:s3:::*pipe* 309 | - arn:aws:s3:::*pipe*/* 310 | Metadata: 311 | cfn_nag: 312 | rules_to_suppress: 313 | - id: W11 314 | reason: Star required for codecommit:ListRepositories and Glue actions. 
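# CodeBuild below is the worker that the Setup custom resource drives:
# lambda.py starts a build with SOLUTION_ACTION overridden to setup (on stack
# create and update) or teardown (on stack delete), and the inline buildspec
# downloads Solution.zip and runs the matching ./setup.sh or ./teardown.sh.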

  CodeBuild:
    Type: AWS::CodeBuild::Project
    Properties:
      Name: !Sub ${Project}Setup
      Artifacts:
        Type: NO_ARTIFACTS
      Source:
        Type: NO_SOURCE
        BuildSpec: !Sub |
          version: 0.2
          phases:
            install:
              commands:
                - git config --global user.name automated_user
                - git config --global user.email automated_email
                - git config --global credential.helper '!aws codecommit credential-helper $@'
                - git config --global credential.UseHttpPath true
                - aws s3 cp s3://$ARTIFACT_BUCKET/$ARTIFACT_KEY_PREFIX/Solution.zip .
                - unzip Solution.zip
                - ./$SOLUTION_ACTION.sh
      Environment:
        ComputeType: BUILD_GENERAL1_SMALL
        EnvironmentVariables:
          - Name: SOLUTION_ACTION
            Value: setup
          - Name: PROJECT_NAME
            Value: !Ref Project
          - Name: ARTIFACT_BUCKET
            Value: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]]
          - Name: ARTIFACT_KEY_PREFIX
            Value: !FindInMap ["SourceCode", "General", "KeyPrefix"]
        Image: aws/codebuild/standard:3.0
        Type: LINUX_CONTAINER
      ServiceRole: !Sub ${CodeBuildRole}
      TimeoutInMinutes: 30
    Metadata:
      cfn_nag:
        rules_to_suppress:
          - id: W32
            reason: Customer can enable encryption if desired.
--------------------------------------------------------------------------------
/source/teardown.sh:
--------------------------------------------------------------------------------
#!/bin/bash -e

export AWS_DEFAULT_OUTPUT=text

export RESOURCE_PREFIX=${PROJECT_NAME:-GenomicsLearning}
export RESOURCE_PREFIX_LOWERCASE=$(echo ${RESOURCE_PREFIX} | tr '[:upper:]' '[:lower:]')

export ZONE_STACKNAME=${RESOURCE_PREFIX}-LandingZone
export PIPE_STACKNAME=${RESOURCE_PREFIX}-Pipeline
export CODE_STACKNAME=${RESOURCE_PREFIX}

export REPOSITORY_NAME=${RESOURCE_PREFIX_LOWERCASE}

# Clear Buckets

BUILD_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`BuildBucket`].OutputValue'); echo ${BUILD_BUCKET}
RESOURCES_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`ResourcesBucket`].OutputValue'); echo ${RESOURCES_BUCKET}
DATALAKE_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`DataLakeBucket`].OutputValue'); echo ${DATALAKE_BUCKET}
LOGS_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`LogsBucket`].OutputValue'); echo ${LOGS_BUCKET}

aws s3 rm --recursive s3://${BUILD_BUCKET}/
aws s3 rm --recursive s3://${RESOURCES_BUCKET}/
aws s3 rm --recursive s3://${DATALAKE_BUCKET}/
aws s3 rm --recursive s3://${LOGS_BUCKET}/

# Disable Termination Protection on Stacks

aws cloudformation update-termination-protection --no-enable-termination-protection --stack-name ${PIPE_STACKNAME}
aws cloudformation update-termination-protection --no-enable-termination-protection --stack-name ${ZONE_STACKNAME}

# Get Repo Names from Stacks

PIPE_REPO=$(aws cloudformation describe-stacks --stack-name ${ZONE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`RepoName`].OutputValue'); echo ${PIPE_REPO}
CODE_REPO=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`RepoName`].OutputValue'); echo ${CODE_REPO}

# Delete Stacks

aws cloudformation delete-stack --stack-name ${CODE_STACKNAME}; aws cloudformation wait stack-delete-complete --stack-name ${CODE_STACKNAME}
aws cloudformation delete-stack --stack-name ${PIPE_STACKNAME}; aws cloudformation wait stack-delete-complete --stack-name ${PIPE_STACKNAME}
aws cloudformation delete-stack --stack-name ${ZONE_STACKNAME}; aws cloudformation wait stack-delete-complete --stack-name ${ZONE_STACKNAME}

# Delete Repos

aws codecommit delete-repository --repository-name ${PIPE_REPO}
aws codecommit delete-repository --repository-name ${CODE_REPO}

# Cleanup Local Git Repo

find . \( -name ".git" -o -name ".gitignore" -o -name ".gitmodules" -o -name ".gitattributes" \) -exec rm -rf -- {} +
--------------------------------------------------------------------------------
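
A minimal usage sketch for teardown.sh (assuming an AWS CLI profile with
credentials and a default region for the account where the solution was
deployed; PROJECT_NAME is optional and defaults to GenomicsLearning):

    # destructive: empties the solution buckets, then deletes stacks and repos
    cd source
    PROJECT_NAME=GenomicsLearning ./teardown.sh

    # optional pre-check: list the pipeline stack outputs the script will read
    aws cloudformation describe-stacks \
        --stack-name GenomicsLearning-Pipeline \
        --query 'Stacks[].Outputs' --output table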