├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.txt
├── NOTICE.txt
├── README.md
├── deployment
│   ├── build-s3-dist.sh
│   ├── genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template
│   └── run-unit-tests.sh
└── source
    ├── GenomicsLearningCode
    │   ├── awscli_test.sh
    │   ├── code_cfn.yml
    │   ├── copyresources_buildspec.yml
    │   ├── resources
    │   │   ├── notebooks
    │   │   │   ├── variant_classifier-autopilot.ipynb
    │   │   │   └── variant_predictor.ipynb
    │   │   └── scripts
    │   │       └── process_clinvar.py
    │   └── setup
    │       ├── crhelper-2.0.6.dist-info
    │       │   ├── INSTALLER
    │       │   ├── LICENSE
    │       │   ├── METADATA
    │       │   ├── NOTICE
    │       │   ├── RECORD
    │       │   ├── WHEEL
    │       │   └── top_level.txt
    │       ├── crhelper
    │       │   ├── __init__.py
    │       │   ├── __pycache__
    │       │   │   ├── __init__.cpython-38.pyc
    │       │   │   ├── log_helper.cpython-38.pyc
    │       │   │   ├── resource_helper.cpython-38.pyc
    │       │   │   └── utils.cpython-38.pyc
    │       │   ├── log_helper.py
    │       │   ├── resource_helper.py
    │       │   └── utils.py
    │       ├── lambda.py
    │       ├── requirements.txt
    │       └── tests
    │           ├── __init__.py
    │           ├── __pycache__
    │           │   ├── __init__.cpython-38.pyc
    │           │   ├── test_log_helper.cpython-38.pyc
    │           │   ├── test_resource_helper.cpython-38.pyc
    │           │   └── test_utils.cpython-38.pyc
    │           ├── test_log_helper.py
    │           ├── test_resource_helper.py
    │           ├── test_utils.py
    │           └── unit
    │               ├── __init__.py
    │               └── __pycache__
    │                   └── __init__.cpython-38.pyc
    ├── GenomicsLearningPipe
    │   └── pipe_cfn.yml
    ├── GenomicsLearningZone
    │   └── zone_cfn.yml
    ├── setup.sh
    ├── setup_cfn.yml
    └── teardown.sh

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | # Changelog
2 | All notable changes to this project will be documented in this file.
3 |
4 | The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
5 | and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6 |
7 | ## [1.0.0] - 2020-08-03
8 | ### Added
9 | - Initial public release
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing Guidelines
2 |
3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
4 | documentation, we greatly value feedback and contributions from our community.
5 |
6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
7 | information to effectively respond to your bug report or contribution.
8 |
9 |
10 | ## Reporting Bugs/Feature Requests
11 |
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 |
14 | When filing an issue, please check [existing open](https://github.com/awslabs/genomics-learning/issues), or [recently closed](https://github.com/awslabs/genomics-learning/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aclosed%20), issues to make sure somebody else hasn't already
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 |
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 |
22 |
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 |
26 | 1. You are working against the latest source on the *master* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 |
30 | To send us a pull request, please:
31 |
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 |
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 |
42 |
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. As our projects use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/awslabs/genomics-learning/labels/help%20wanted) issues is a great place to start.
45 |
46 |
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 |
52 |
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 |
56 |
57 | ## Licensing
58 |
59 | See the [LICENSE](https://github.com/awslabs/genomics-learning/blob/master/LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 |
61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
62 |
--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
1 |
2 |                                  Apache License
3 |                            Version 2.0, January 2004
4 |                         http://www.apache.org/licenses/
5 |
6 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 |
8 |    1. Definitions.
9 |
10 |       "License" shall mean the terms and conditions for use, reproduction,
11 |       and distribution as defined by Sections 1 through 9 of this document.
12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 
135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. -------------------------------------------------------------------------------- /NOTICE.txt: -------------------------------------------------------------------------------- 1 | Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker 2 | Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 3 | Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except 4 | in compliance with the License. A copy of the License is located at http://www.apache.org/licenses/ 5 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, 6 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the 7 | specific language governing permissions and limitations under the License. 
8 | 9 | ********************** 10 | THIRD PARTY COMPONENTS 11 | ********************** 12 | This software includes third party software subject to the following copyrights: 13 | 14 | AWS SDK under the Apache License Version 2.0 15 | AWS Custom Resource Helper under the Apache License Version 2.0 16 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Deprecation Notice 2 | 3 | This AWS Solution has been archived and is no longer maintained by AWS. A new version of the solution is here: [Guidance for Multi-Omics and Multi-Modal Data Integration and Analysis on AWS](https://aws.amazon.com/solutions/guidance/multi-omics-and-multi-modal-data-integration-and-analysis/). To discover other solutions, please visit the [AWS Solutions Library](https://aws.amazon.com/solutions/). 4 | 5 | # Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker 6 | 7 | 8 | 9 | The Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker solution creates a scalable environment in AWS to develop machine learning models using genomics data, generate predictions, and evaluate model performance. This solution demonstrates how to 1) automate the preparation of a genomics machine learning training dataset, 2) develop genomics machine learning model training and deployment pipelines and, 3) generate predictions and evaluate model performance using test data. 10 | 11 | ## Standard deployment 12 | 13 | To deploy this solution in your account use the "Launch in the AWS Console" button found on the [solution landing page](https://aws.amazon.com/solutions/implementations/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/?did=sl_card&trk=sl_card). 14 | 15 | We recommend deploying the solution this way for most use cases. 16 | 17 | ## Customized deployment 18 | 19 | A fully customized solution can be deployed for the following use cases: 20 | 21 | * Modifying or adding additional resources deployed during installation 22 | * Modifying the "Landing Zone" of the solution - e.g. adding additional artifacts or customizing the "Pipe" CodePipeline 23 | 24 | Fully customized solutions need to be self-hosted in your own AWS account, and you will be responsible for any costs incurred in doing so. 25 | 26 | To deploy and self-host a fully customized solution use the instructions below. 27 | 28 | _Note_: All commands assume a `bash` shell. 29 | 30 | ### Customize 31 | 32 | Clone the repository, and make desired changes 33 | 34 | #### File Structure 35 | 36 | ``` 37 | . 
38 | ├── CHANGELOG.md
39 | ├── CODE_OF_CONDUCT.md
40 | ├── CONTRIBUTING.md
41 | ├── LICENSE.txt
42 | ├── NOTICE.txt
43 | ├── README.md
44 | ├── buildspec.yml
45 | ├── deploy.sh
46 | ├── deployment
47 | │   ├── build-open-source-dist.sh
48 | │   ├── build-s3-dist.sh
49 | │   └── run-unit-tests.sh
50 | └── source
51 |     ├── GenomicsLearningCode
52 |     │   ├── awscli_test.sh
53 |     │   ├── code_cfn.yml
54 |     │   ├── copyresources_buildspec.yml
55 |     │   ├── resources
56 |     │   │   ├── notebooks
57 |     │   │   │   ├── variant_classifier-autopilot.ipynb
58 |     │   │   │   └── variant_predictor.ipynb
59 |     │   │   └── scripts
60 |     │   │       └── process_clinvar.py
61 |     │   └── setup
62 |     │       ├── lambda.py
63 |     │       └── requirements.txt
64 |     ├── GenomicsLearningPipe
65 |     │   └── pipe_cfn.yml
66 |     ├── GenomicsLearningZone
67 |     │   └── zone_cfn.yml
68 |     ├── setup.sh
69 |     ├── setup_cfn.yml
70 |     └── teardown.sh
71 |
72 | ```
73 |
74 | | Path | Description |
75 | | :- | :- |
76 | | deployment | Scripts for building and deploying a customized distributable |
77 | | deployment/build-s3-dist.sh | Shell script for packaging distribution assets |
78 | | deployment/run-unit-tests.sh | Shell script for executing unit tests |
79 | | source | Source code for the solution |
80 | | source/setup_cfn.yml | CloudFormation template used to install the solution |
81 | | source/GenomicsLearningZone/ | Source code for the solution landing zone - location for common assets and artifacts used by the solution |
82 | | source/GenomicsLearningPipe/ | Source code for the solution deployment pipeline - the CI/CD pipeline that builds and deploys the solution codebase |
83 | | source/GenomicsLearningCode/ | Source code for the solution codebase - source code for the training job and ML notebooks |
84 |
85 | ### Run unit tests
86 |
87 | ```bash
88 | cd ./deployment
89 | chmod +x ./run-unit-tests.sh
90 | ./run-unit-tests.sh
91 | ```
92 |
93 | ### Build and deploy
94 |
95 | #### Create deployment buckets
96 |
97 | The solution requires two buckets for deployment (a scripted sketch for creating them follows the list):
98 |
99 | 1. `<bucket-name>` for the solution's primary CloudFormation template
100 | 2. `<bucket-name>-<region>` for additional artifacts and assets that the solution requires - these are stored regionally to reduce latency during installation and avoid inter-regional transfer costs
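If you prefer to script bucket creation, a minimal `boto3` sketch is shown below. The bucket name and region are illustrative placeholders, not values the solution defines:

```python
# Sketch only: replace the placeholder bucket name and region with your own values.
import boto3

bucket_name = "my-solution-dist"   # placeholder for <bucket-name>
region = "us-west-2"               # placeholder for your deployment region

s3 = boto3.client("s3", region_name=region)

for name in (bucket_name, f"{bucket_name}-{region}"):
    if region == "us-east-1":
        # us-east-1 rejects an explicit LocationConstraint
        s3.create_bucket(Bucket=name)
    else:
        s3.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    print(f"created s3://{name}")
```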
101 |
102 | #### Configure and build the distributable
103 |
104 | ```bash
105 | export DIST_OUTPUT_BUCKET=<bucket-name>
106 | export SOLUTION_NAME=<solution-name>
107 | export VERSION=<version>
108 |
109 | chmod +x ./build-s3-dist.sh
110 | ./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION
111 | ```
112 |
113 | #### Deploy the distributable
114 |
115 | _Note:_ you must have the AWS Command Line Interface (CLI) installed for this step. Learn more about the AWS CLI [here](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html).
116 |
117 | ```bash
118 | cd ./deployment
119 |
120 | # deploy global assets
121 | # this only needs to be done once
122 | aws s3 cp \
123 |   ./global-s3-assets/ s3://<bucket-name>/$SOLUTION_NAME/$VERSION \
124 |   --recursive \
125 |   --acl bucket-owner-full-control
126 |
127 | # deploy regional assets
128 | # repeat this step for as many regions as needed
129 | aws s3 cp \
130 |   ./regional-s3-assets/ s3://<bucket-name>-<region>/$SOLUTION_NAME/$VERSION \
131 |   --recursive \
132 |   --acl bucket-owner-full-control
133 | ```
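After the copies complete, an optional `boto3` listing sketch can confirm the assets landed under the expected prefix; the bucket and prefix below are placeholders:

```python
# Optional verification sketch - substitute your real bucket name, solution name, and version.
import boto3

bucket = "my-solution-dist-us-west-2"   # placeholder for <bucket-name>-<region>
prefix = "my-solution-name/v1.0.0"      # placeholder for $SOLUTION_NAME/$VERSION

s3 = boto3.client("s3")
response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```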
134 |
135 | ### Install the customized solution
136 |
137 | The link to the primary CloudFormation template will look something like:
138 |
139 | ```text
140 | https://<bucket-name>.s3-<region>.amazonaws.com/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template
141 | ```
142 |
143 | Use this link to install the customized solution into your AWS account in a specific region using the [AWS CloudFormation Console](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/create/template).
144 |
145 | ---
146 |
147 | This solution collects anonymous operational metrics to help AWS improve the
148 | quality of features of the solution. For more information, including how to disable
149 | this capability, please see the [implementation guide](https://docs.aws.amazon.com/solutions/latest/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/appendix-f.html).
150 |
151 | ---
152 |
153 | Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
154 |
155 | Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
156 |
157 | http://www.apache.org/licenses/
158 |
159 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and limitations under the License.
--------------------------------------------------------------------------------
/deployment/build-s3-dist.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #
3 | # This assumes all of the OS-level configuration has been completed and git repo has already been cloned
4 | #
5 | # This script should be run from the repo's deployment directory
6 | # cd deployment
7 | # ./build-s3-dist.sh source-bucket-base-name solution-name version-code
8 | #
9 | # Parameters:
10 | #  - source-bucket-base-name: Name for the S3 bucket location where the template will source the Lambda
11 | #    code from. The template will append '-[region_name]' to this bucket name.
12 | #    For example: ./build-s3-dist.sh solutions my-solution v1.0.0
13 | #    The template will then expect the source code to be located in the solutions-[region_name] bucket
14 | #
15 | #  - solution-name: name of the solution for consistency
16 | #
17 | #  - version-code: version of the package
18 |
19 | # Check to see if input has been provided:
20 | if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
21 |     echo "Please provide the base source bucket name, trademark approved solution name and version where the lambda code will eventually reside."
22 |     echo "For example: ./build-s3-dist.sh solutions trademarked-solution-name v1.0.0"
23 |     exit 1
24 | fi
25 |
26 | # Get reference for all important folders
27 | template_dir="$PWD"
28 | template_dist_dir="$template_dir/global-s3-assets"
29 | build_dist_dir="$template_dir/regional-s3-assets"
30 | source_dir="$template_dir/../source"
31 |
32 | cp $source_dir/setup_cfn.yml $template_dir/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template
33 |
34 | echo "------------------------------------------------------------------------------"
35 | echo "[Init] Clean old dist"
36 | echo "------------------------------------------------------------------------------"
37 | echo "rm -rf $template_dist_dir"
38 | rm -rf $template_dist_dir
39 | echo "mkdir -p $template_dist_dir"
40 | mkdir -p $template_dist_dir
41 | echo "rm -rf $build_dist_dir"
42 | rm -rf $build_dist_dir
43 | echo "mkdir -p $build_dist_dir"
44 | mkdir -p $build_dist_dir
45 |
46 | echo "------------------------------------------------------------------------------"
47 | echo "[Packing] Templates"
48 | echo "------------------------------------------------------------------------------"
49 | echo "cp $template_dir/*.template $template_dist_dir/"
50 | cp $template_dir/*.template $template_dist_dir/
51 | echo "copy yaml templates and rename"
52 | cp $template_dir/*.yaml $template_dist_dir/
53 | cd $template_dist_dir
54 | # Rename all *.yaml to *.template
55 | for f in *.yaml; do
56 |     mv -- "$f" "${f%.yaml}.template"
57 | done
58 |
59 | cd ..
60 | echo "Updating code source bucket in template with $1"
61 | replace="s/%%BUCKET_NAME%%/$1/g"
62 | echo "sed -i '' -e $replace $template_dist_dir/*.template"
63 | sed -i '' -e $replace $template_dist_dir/*.template
64 | replace="s/%%SOLUTION_NAME%%/$2/g"
65 | echo "sed -i '' -e $replace $template_dist_dir/*.template"
66 | sed -i '' -e $replace $template_dist_dir/*.template
67 | replace="s/%%VERSION%%/$3/g"
68 | echo "sed -i '' -e $replace $template_dist_dir/*.template"
69 | sed -i '' -e $replace $template_dist_dir/*.template
70 |
71 | mkdir $build_dist_dir/annotation
72 | mkdir $build_dist_dir/annotation/clinvar/
73 |
74 | echo "------------------------------------------------------------------------------"
75 | echo "[Rebuild] Solution"
76 | echo "------------------------------------------------------------------------------"
77 |
78 | cd $source_dir
79 |
80 | bundle_dir="$source_dir/../bundle"
81 | mkdir -p $bundle_dir
82 |
83 | # create the lambda function deployment package for the solution setup
84 | cd $source_dir/GenomicsLearningCode/setup
85 | pip install -t . crhelper
86 | zip -r $bundle_dir/SolutionSetup.zip .
87 |
88 | # package the solution
89 | cd $source_dir
90 | zip -r $bundle_dir/Solution.zip .
91 |
92 | cd $bundle_dir
93 | cp Solution.zip $template_dist_dir/
94 | cp SolutionSetup.zip $template_dist_dir/
95 | cp Solution.zip $build_dist_dir/
96 | cp SolutionSetup.zip $build_dist_dir/
97 |
--------------------------------------------------------------------------------
/deployment/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker.template:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: 2010-09-09
2 |
3 | Description: |
4 |   (SO0078) - The Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker solution creates a scalable environment
5 |   in AWS to develop machine learning models using genomics data, generate predictions, and evaluate model performance.
6 | This solution demonstrates how to 1) automate the preparation of a genomics machine learning training dataset, 7 | 2) develop genomics machine learning model training and deployment pipelines and, 8 | 3) generate predictions and evaluate model performance using test data. 9 | 10 | Mappings: 11 | Send: 12 | AnonymousUsage: 13 | Data: Yes 14 | SourceCode: 15 | General: 16 | S3Bucket: '%%BUCKET_NAME%%' 17 | KeyPrefix: '%%SOLUTION_NAME%%/%%VERSION%%' 18 | 19 | Parameters: 20 | Project: 21 | Type: String 22 | Description: > 23 | The project name for this solution. The project name will be used to prefix resources created by this solution. Project names should be unique to a project. 24 | AllowedPattern: "[a-zA-Z0-9-]{3,24}" 25 | ConstraintDescription: > 26 | Project name should be unique, 3-24 characters in length, and only have alphanumeric characters and hyphens ([a-zA-Z0-9-]{3,32}). 27 | Default: GenomicsLearning 28 | 29 | Resources: 30 | Setup: 31 | Type: Custom::Setup 32 | DependsOn: 33 | - CodeBuild 34 | Version: 1.0 35 | Properties: 36 | ServiceToken: !Sub ${SetupLambda.Arn} 37 | CodeBuildProjectName: !Sub ${CodeBuild} 38 | 39 | SetupLambda: 40 | Type: AWS::Lambda::Function 41 | DependsOn: 42 | - SetupLambdaRole 43 | Properties: 44 | Handler: lambda.handler 45 | Runtime: python3.8 46 | FunctionName: !Sub ${Project}Setup 47 | Code: 48 | S3Bucket: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]] 49 | S3Key: !Join ["", [!FindInMap ["SourceCode", "General", "KeyPrefix"], "/SolutionSetup.zip"]] 50 | Role: !Sub ${SetupLambdaRole.Arn} 51 | Timeout: 600 52 | Metadata: 53 | cfn_nag: 54 | rules_to_suppress: 55 | - id: W58 56 | reason: Bug in CfnNag. Lambda functions require permission to write CloudWatch Logs. Looking for PutLogEvent instead of PutLogEvents 57 | 58 | SetupLambdaRole: 59 | Type: AWS::IAM::Role 60 | DependsOn: 61 | - CodeBuild 62 | Properties: 63 | AssumeRolePolicyDocument: 64 | Version: 2012-10-17 65 | Statement: 66 | - Action: 67 | - sts:AssumeRole 68 | Effect: Allow 69 | Principal: 70 | Service: 71 | - lambda.amazonaws.com 72 | Path: / 73 | Policies: 74 | - PolicyName: LogsAccess 75 | PolicyDocument: 76 | Statement: 77 | - Effect: Allow 78 | Action: 79 | - logs:CreateLogGroup 80 | - logs:CreateLogStream 81 | - logs:PutLogEvents 82 | Resource: 83 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${Project}* 84 | - PolicyName: CodeBuildAccess 85 | PolicyDocument: 86 | Statement: 87 | - Effect: Allow 88 | Action: 89 | - codebuild:BatchGetProjects 90 | - codebuild:BatchGetBuilds 91 | - codebuild:StartBuild 92 | Resource: 93 | - !Sub ${CodeBuild.Arn} 94 | - PolicyName: EventsAccess 95 | PolicyDocument: 96 | Statement: 97 | - Effect: Allow 98 | Action: 99 | - events:DeleteRule 100 | - events:PutRule 101 | - events:PutTargets 102 | - events:RemoveTargets 103 | Resource: 104 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/Setup* 105 | - PolicyName: LambdaAccess 106 | PolicyDocument: 107 | Statement: 108 | - Effect: Allow 109 | Action: 110 | - lambda:AddPermission 111 | - lambda:RemovePermission 112 | Resource: 113 | - !Sub arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:${Project}* 114 | 115 | CodeBuildRole: 116 | Type: AWS::IAM::Role 117 | Properties: 118 | AssumeRolePolicyDocument: 119 | Version: 2012-10-17 120 | Statement: 121 | - Action: 122 | - sts:AssumeRole 123 | Effect: Allow 124 | Principal: 125 | Service: 126 | - codebuild.amazonaws.com 127 | Path: / 128 | Policies: 129 | - PolicyName: 
CloudFormationAccess 130 | PolicyDocument: 131 | Statement: 132 | - Action: 133 | - cloudformation:CreateStack 134 | - cloudformation:DescribeStacks 135 | - cloudformation:DescribeStackResource 136 | - cloudformation:UpdateStack 137 | - cloudformation:DeleteStack 138 | - cloudformation:UpdateTerminationProtection 139 | Effect: Allow 140 | Resource: !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Project}* 141 | - PolicyName: LogsAccess 142 | PolicyDocument: 143 | Statement: 144 | - Effect: Allow 145 | Action: 146 | - logs:CreateLogGroup 147 | - logs:CreateLogStream 148 | - logs:PutLogEvents 149 | Resource: 150 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/codebuild/${Project}* 151 | - PolicyName: IAMAccess 152 | PolicyDocument: 153 | Statement: 154 | - Effect: Allow 155 | Action: 156 | - iam:CreateRole 157 | - iam:DeleteRole 158 | - iam:PutRolePolicy 159 | - iam:DeleteRolePolicy 160 | - iam:AttachRolePolicy 161 | - iam:DetachRolePolicy 162 | - iam:UpdateAssumeRolePolicy 163 | - iam:PassRole 164 | - iam:GetRole 165 | - iam:GetInstanceProfile 166 | - iam:CreateInstanceProfile 167 | - iam:DeleteInstanceProfile 168 | - iam:AddRoleToInstanceProfile 169 | - iam:RemoveRoleFromInstanceProfile 170 | Resource: 171 | - !Sub arn:aws:iam::${AWS::AccountId}:role/${Project}* 172 | - !Sub arn:aws:iam::${AWS::AccountId}:instance-profile/${Project}* 173 | - PolicyName: CodeBuildAccess 174 | PolicyDocument: 175 | Statement: 176 | - Effect: Allow 177 | Action: 178 | - codebuild:CreateProject 179 | - codebuild:UpdateProject 180 | - codebuild:ListProjects 181 | - codebuild:BatchGetProjects 182 | - codebuild:DeleteProject 183 | Resource: 184 | - !Sub arn:aws:codebuild:${AWS::Region}:${AWS::AccountId}:project/${Project}* 185 | - PolicyName: CodePipelineAccess 186 | PolicyDocument: 187 | Statement: 188 | - Effect: Allow 189 | Action: 190 | - codepipeline:CreatePipeline 191 | - codepipeline:GetPipeline 192 | - codepipeline:UpdatePipeline 193 | - codepipeline:DeletePipeline 194 | - codepipeline:GetPipelineState 195 | - codepipeline:ListPipelineExecutions 196 | Resource: 197 | - !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${Project}* 198 | - PolicyName: CodeCommitAccess 199 | PolicyDocument: 200 | Statement: 201 | - Effect: Allow 202 | Action: 203 | - codecommit:CreateBranch 204 | - codecommit:CreateRepository 205 | - codecommit:GetRepository 206 | - codecommit:DeleteRepository 207 | - codecommit:CreateCommit 208 | - codecommit:GitPush 209 | - codecommit:GitPull 210 | - codecommit:DeleteBranch 211 | Resource: 212 | - !Sub arn:aws:codecommit:${AWS::Region}:${AWS::AccountId}:${Project}* 213 | - Effect: Allow 214 | Action: 215 | - codecommit:ListRepositories 216 | Resource: '*' 217 | - PolicyName: EventsAccess 218 | PolicyDocument: 219 | Statement: 220 | - Effect: Allow 221 | Action: 222 | - events:DescribeRule 223 | - events:PutRule 224 | - events:DeleteRule 225 | - events:PutTargets 226 | - events:RemoveTargets 227 | Resource: 228 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/* 229 | - PolicyName: GlueAccess 230 | PolicyDocument: 231 | Statement: 232 | - Effect: Allow 233 | Action: 234 | - glue:StartJob 235 | - glue:GetJob 236 | Resource: '*' 237 | - PolicyName: S3Access 238 | PolicyDocument: 239 | Statement: 240 | - Effect: Allow 241 | Action: 242 | - s3:GetObject 243 | Resource: 244 | !Join 245 | - '' 246 | - - 'arn:aws:s3:::' 247 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 248 | - '/*' 249 | - Effect: Allow 250 | Action: 251 | - 
s3:GetObject 252 | Resource: 253 | !Join 254 | - '' 255 | - - 'arn:aws:s3:::' 256 | - !Join 257 | - '-' 258 | - - !FindInMap ["SourceCode", "General", "S3Bucket"] 259 | - Ref: "AWS::Region" 260 | - '/' 261 | - !FindInMap ["SourceCode", "General", "KeyPrefix"] 262 | - '/*' 263 | - Effect: Allow 264 | Action: 265 | - s3:ListBucket 266 | Resource: 267 | !Join 268 | - '' 269 | - - 'arn:aws:s3:::' 270 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 271 | 272 | - Effect: Allow 273 | Action: 274 | - s3:PutObjectAcl 275 | - s3:GetObject 276 | - s3:PutObject 277 | - s3:DeleteObject 278 | - s3:ListBucket 279 | - s3:CreateBucket 280 | - s3:DeleteBucket 281 | - s3:PutEncryptionConfiguration 282 | - s3:PutBucketPublicAccessBlock 283 | - s3:PutBucketLogging 284 | - s3:PutBucketAcl 285 | Resource: 286 | - arn:aws:s3:::*pipe* 287 | - arn:aws:s3:::*pipe*/* 288 | - Effect: Allow 289 | Action: 290 | - s3:ListBucket 291 | Resource: 292 | !Join 293 | - '' 294 | - - 'arn:aws:s3:::' 295 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 296 | - Effect: Allow 297 | Action: 298 | - s3:CreateBucket 299 | - s3:DeleteBucket 300 | - s3:ListBucket 301 | - s3:PutEncryptionConfiguration 302 | - s3:PutBucketPublicAccessBlock 303 | - s3:PutBucketLogging 304 | - s3:PutBucketAcl 305 | - s3:PutObject 306 | - s3:PutObjectAcl 307 | Resource: 308 | - arn:aws:s3:::*pipe* 309 | - arn:aws:s3:::*pipe*/* 310 | Metadata: 311 | cfn_nag: 312 | rules_to_suppress: 313 | - id: W11 314 | reason: Star required for codecommit:ListRepositories and Glue actions. 315 | 316 | CodeBuild: 317 | Type: AWS::CodeBuild::Project 318 | Properties: 319 | Name: !Sub ${Project}Setup 320 | Artifacts: 321 | Type: NO_ARTIFACTS 322 | Source: 323 | Type: NO_SOURCE 324 | BuildSpec: !Sub | 325 | version: 0.2 326 | phases: 327 | install: 328 | commands: 329 | - git config --global user.name automated_user 330 | - git config --global user.email automated_email 331 | - git config --global credential.helper '!aws codecommit credential-helper $@' 332 | - git config --global credential.UseHttpPath true 333 | - aws s3 cp s3://$ARTIFACT_BUCKET/$ARTIFACT_KEY_PREFIX/Solution.zip . 334 | - unzip Solution.zip 335 | - ./$SOLUTION_ACTION.sh 336 | Environment: 337 | ComputeType: BUILD_GENERAL1_SMALL 338 | EnvironmentVariables: 339 | - Name: SOLUTION_ACTION 340 | Value: setup 341 | - Name: PROJECT_NAME 342 | Value: !Ref Project 343 | - Name: ARTIFACT_BUCKET 344 | Value: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]] 345 | - Name: ARTIFACT_KEY_PREFIX 346 | Value: !FindInMap ["SourceCode", "General", "KeyPrefix"] 347 | Image: aws/codebuild/standard:3.0 348 | Type: LINUX_CONTAINER 349 | ServiceRole: !Sub ${CodeBuildRole} 350 | TimeoutInMinutes: 30 351 | Metadata: 352 | cfn_nag: 353 | rules_to_suppress: 354 | - id: W32 355 | reason: Customer can enable encryption if desired. 
356 | -------------------------------------------------------------------------------- /deployment/run-unit-tests.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # This assumes all of the OS-level configuration has been completed and git repo has already been cloned 4 | # 5 | # This script should be run from the repo's deployment directory 6 | # cd deployment 7 | # ./run-unit-tests.sh 8 | # 9 | 10 | # Get reference for all important folders 11 | template_dir="$PWD" 12 | source_dir="$template_dir/../source" 13 | 14 | echo "------------------------------------------------------------------------------" 15 | echo "[Init] Clean old dist and node_modules folders" 16 | echo "------------------------------------------------------------------------------" 17 | 18 | echo "------------------------------------------------------------------------------" 19 | echo "[Test] Services - Example Function" 20 | echo "------------------------------------------------------------------------------" 21 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/awscli_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | 3 | export AWS_DEFAULT_OUTPUT=text 4 | 5 | project_name=${PROJECT_NAME:-GenomicsLearning} 6 | 7 | resource_prefix=${project_name} 8 | 9 | resource_prefix_lowercase=$(echo ${resource_prefix} | tr '[:upper:]' '[:lower:]') 10 | 11 | process_data_job="${resource_prefix_lowercase}-create-trainingset" 12 | 13 | aws glue get-job --job-name ${process_data_job} 14 | 15 | printf "Test:Job exists\n" -------------------------------------------------------------------------------- /source/GenomicsLearningCode/code_cfn.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | 3 | Description: GenomicsLearningCode 4 | 5 | Parameters: 6 | ResourcePrefix: 7 | Type: String 8 | Default: GenomicsLearning 9 | ResourcePrefixLowercase: 10 | Type: String 11 | Default: genomicslearning 12 | ResourcesBucket: 13 | Type: String 14 | DataLakeBucket: 15 | Type: String 16 | 17 | Resources: 18 | 19 | JobRole: 20 | Type: AWS::IAM::Role 21 | Properties: 22 | AssumeRolePolicyDocument: 23 | Version: 2012-10-17 24 | Statement: 25 | - Effect: Allow 26 | Principal: 27 | Service: 28 | - glue.amazonaws.com 29 | Action: 30 | - sts:AssumeRole 31 | Path: / 32 | ManagedPolicyArns: 33 | - arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole 34 | Policies: 35 | - PolicyName: s3_access 36 | PolicyDocument: 37 | Version: 2012-10-17 38 | Statement: 39 | - Effect: Allow 40 | Action: 41 | - s3:GetObject 42 | - s3:ListBucket 43 | Resource: 44 | - !Sub arn:aws:s3:::${ResourcesBucket} 45 | - !Sub arn:aws:s3:::${ResourcesBucket}/* 46 | - Effect: Allow 47 | Action: 48 | - s3:PutObject 49 | - s3:GetObject 50 | - s3:ListBucket 51 | - s3:DeleteObject 52 | Resource: 53 | - !Sub arn:aws:s3:::${DataLakeBucket} 54 | - !Sub arn:aws:s3:::${DataLakeBucket}/* 55 | 56 | RunbookRole: 57 | Type: AWS::IAM::Role 58 | Properties: 59 | AssumeRolePolicyDocument: 60 | Version: 2012-10-17 61 | Statement: 62 | - Effect: Allow 63 | Principal: 64 | Service: 65 | - sagemaker.amazonaws.com 66 | Action: 67 | - sts:AssumeRole 68 | Path: / 69 | Policies: 70 | - PolicyName: logs_access 71 | PolicyDocument: 72 | Version: 2012-10-17 73 | Statement: 74 | - Effect: Allow 75 | Action: 76 | - logs:CreateLogStream 77 | - 
logs:DescribeLogStreams 78 | - logs:CreateLogGroup 79 | - logs:PutLogEvents 80 | - logs:GetLogEvents 81 | Resource: 82 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/sagemaker/* 83 | - PolicyName: s3_access 84 | PolicyDocument: 85 | Version: 2012-10-17 86 | Statement: 87 | - Effect: Allow 88 | Action: 89 | - s3:CreateBucket 90 | - s3:ListBucket 91 | Resource: 92 | - !Sub arn:aws:s3:::sagemaker-${AWS::Region}-${AWS::AccountId} 93 | - Effect: Allow 94 | Action: 95 | - iam:GetRole 96 | - sagemaker:DescribeNotebookInstance 97 | Resource: '*' 98 | - Effect: Allow 99 | Action: 100 | - s3:ListBucket 101 | - s3:GetBucketLocation 102 | Resource: 103 | - !Sub arn:aws:s3:::${DataLakeBucket} 104 | - !Sub arn:aws:s3:::${ResourcesBucket} 105 | - Effect: Allow 106 | Action: 107 | - s3:GetObject 108 | - s3:PutObject 109 | - s3:DeleteObject 110 | Resource: 111 | - !Sub arn:aws:s3:::${DataLakeBucket}/* 112 | - !Sub arn:aws:s3:::sagemaker-${AWS::Region}-${AWS::AccountId}/* 113 | - Effect: Allow 114 | Action: 115 | - s3:GetObject 116 | Resource: 117 | - !Sub arn:aws:s3:::${ResourcesBucket}/* 118 | - PolicyName: glue_access 119 | PolicyDocument: 120 | Version: 2012-10-17 121 | Statement: 122 | - Effect: Allow 123 | Action: 124 | - glue:StartJobRun 125 | - glue:StopJobRun 126 | Resource: 127 | - !Sub arn:aws:glue:${AWS::Region}:${AWS::AccountId}:job/${ResourcePrefix}* 128 | - PolicyName: cfn_access 129 | PolicyDocument: 130 | Version: 2012-10-17 131 | Statement: 132 | - Effect: Allow 133 | Action: 134 | - cloudformation:DescribeStacks 135 | Resource: 136 | - !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${ResourcePrefix}Pipe/* 137 | - !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${ResourcePrefix}* 138 | - PolicyName: sagemaker_access 139 | PolicyDocument: 140 | Version: 2012-10-17 141 | Statement: 142 | - Effect: Allow 143 | Action: 144 | - iam:CreateServiceLinkedRole 145 | Resource: 146 | - !Sub arn:aws:iam::*:role/aws-service-role/sagemaker.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_SageMakerEndpoint 147 | Condition: 148 | StringLike: 149 | iam:AWSServiceName: sagemaker.application-autoscaling.amazonaws.com 150 | - Effect: Allow 151 | Action: 152 | - iam:CreateServiceLinkedRole 153 | Resource: '*' 154 | Condition: 155 | StringEquals: 156 | iam:AWSServiceName: robomaker.amazonaws.com 157 | - Effect: Allow 158 | Action: 159 | - iam:PassRole 160 | Resource: 161 | - !Sub arn:aws:iam::*:role/* 162 | Condition: 163 | StringEquals: 164 | iam:PassedToService: 165 | - sagemaker.amazonaws.com 166 | - glue.amazonaws.com 167 | - robomaker.amazonaws.com 168 | - states.amazonaws.com 169 | - Effect: Allow 170 | Action: 171 | - sagemaker:ListEndpoints 172 | Resource: '*' 173 | - Effect: Allow 174 | Action: 175 | - sagemaker:DescribeTrainingJob 176 | - sagemaker:DescribeTransformJob 177 | - sagemaker:CreateTrainingJob 178 | - sagemaker:CreateAutoMLJob 179 | - sagemaker:CreateTransformJob 180 | - sagemaker:StopTransformJob 181 | - sagemaker:CreateHyperParameterTuningJob 182 | - sagemaker:StopHyperParameterTuningJob 183 | - sagemaker:DescribeHyperParameterTuningJob 184 | - sagemaker:DescribeEndpoint 185 | - sagemaker:DescribeEndpointConfig 186 | - sagemaker:CreateEndpointConfig 187 | - sagemaker:CreateEndpoint 188 | - sagemaker:InvokeEndpoint 189 | - sagemaker:ListTrainingJobsForHyperParameterTuningJob 190 | - sagemaker:CreateModel 191 | - sagemaker:ListTags 192 | - logs:GetLogEvents 193 | - sagemaker:DeleteModel 194 | - 
sagemaker:StopAutoMLJob 195 | - sagemaker:ListAutoMLJob 196 | - sagemaker:DescribeAutoMLJob 197 | Resource: 198 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:training-job* 199 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:automl-job* 200 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:transform-job* 201 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:model* 202 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:endpoint* 203 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:hyper-parameter-tuning-job* 204 | - Effect: Allow 205 | Action: 206 | - iam:ListRoles 207 | Resource: '*' 208 | Metadata: 209 | cfn_nag: 210 | rules_to_suppress: 211 | - id: W11 212 | reason: GetRole, DescribeSageMakerRole, ListRoles and CreateServiceLinkedRole require star. 213 | 214 | ProcessClinvarDataJob: 215 | Type: AWS::Glue::Job 216 | Properties: 217 | Command: 218 | Name: pythonshell 219 | PythonVersion: '3' 220 | ScriptLocation: !Sub s3://${ResourcesBucket}/scripts/process_clinvar.py 221 | DefaultArguments: 222 | --job-bookmark-option: job-bookmark-disable 223 | --input_bucket: !Sub ${DataLakeBucket} 224 | --clinvar_input_key: !Sub annotation/clinvar/clinvar.vcf.gz 225 | --clinvar_annotated_input_key: !Sub annotation/clinvar/clinvar.annotated.vcf.gz 226 | --output_bucket: !Sub ${DataLakeBucket} 227 | --output_key: !Sub annotation/clinvar/conflicting/clinvar_conflicting.csv 228 | MaxCapacity: 1 229 | ExecutionProperty: 230 | MaxConcurrentRuns: 2 231 | MaxRetries: 0 232 | Name: !Sub ${ResourcePrefixLowercase}-create-trainingset 233 | Role: !Ref JobRole 234 | 235 | RunbookLifecycle: 236 | Type: AWS::SageMaker::NotebookInstanceLifecycleConfig 237 | Properties: 238 | NotebookInstanceLifecycleConfigName: !Sub ${ResourcePrefix}Runbook 239 | OnStart: 240 | - Content: 241 | Fn::Base64: !Sub | 242 | #!/bin/bash 243 | cd /home/ec2-user/SageMaker 244 | set -e 245 | pip install awscli 246 | # download notebooks from S3 247 | aws s3 cp s3://${ResourcesBucket}/notebooks/variant_classifier-autopilot.ipynb ./ --acl public-read-write 248 | chmod 666 variant_classifier-autopilot.ipynb 249 | aws s3 cp s3://${ResourcesBucket}/notebooks/variant_predictor.ipynb ./ --acl public-read-write 250 | chmod 666 variant_predictor.ipynb 251 | echo "export RESOURCE_PREFIX='${ResourcePrefix}'" > /home/ec2-user/anaconda3/envs/tensorflow_p36/etc/conda/activate.d/env_vars.sh 252 | 253 | Runbook: 254 | Type: AWS::SageMaker::NotebookInstance 255 | Properties: 256 | NotebookInstanceName: !Sub ${ResourcePrefix}Runbook 257 | InstanceType: ml.t2.medium 258 | LifecycleConfigName: !GetAtt RunbookLifecycle.NotebookInstanceLifecycleConfigName 259 | RoleArn: !GetAtt RunbookRole.Arn -------------------------------------------------------------------------------- /source/GenomicsLearningCode/copyresources_buildspec.yml: -------------------------------------------------------------------------------- 1 | version: 0.2 2 | phases: 3 | install: 4 | runtime-versions: 5 | python: 3.8 6 | commands: 7 | - apt-get update -y 8 | build: 9 | commands: 10 | - aws s3 sync ./resources s3://${RESOURCES_BUCKET} --size-only 11 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/resources/notebooks/variant_classifier-autopilot.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Create Data Exploration and Candidate 
Notebooks with SageMaker Autopilot"
8 |    ]
9 |   },
10 |   {
11 |    "cell_type": "markdown",
12 |    "metadata": {},
13 |    "source": [
14 |     "### Introduction\n",
15 |     "Amazon SageMaker Autopilot is an automated machine learning (commonly referred to as AutoML) solution for tabular datasets. You can use SageMaker Autopilot in different ways: on autopilot (hence the name) or with human guidance, without code through SageMaker Studio, or using the AWS SDKs.\n",
16 |     "\n",
17 |     "### Problem Definition\n",
18 |     "Reference: https://www.kaggle.com/kevinarvai/clinvar-conflicting\n",
19 |     "\n",
20 |     "[ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/) is a public resource containing annotations about human genetic variants. These variants are (usually manually) classified by clinical laboratories on a categorical spectrum ranging from benign, likely benign, uncertain significance, likely pathogenic, and pathogenic. Variants that have conflicting classifications (from laboratory to laboratory) can cause confusion when clinicians or researchers try to interpret whether the variant has an impact on the disease of a given patient.\n",
21 |     "The objective is to predict whether a ClinVar variant will have conflicting classifications. This is presented here as a binary classification problem, where each record in the dataset is a genetic variant.\n",
22 |     "\n",
23 |     "### Acknowledgements\n",
24 |     "Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Jang W, Karapetyan K, Katz K, Liu C, Maddipatla Z, Malheiro A, McDaniel K, Ovetsky M, Riley G, Zhou G, Holmes JB, Kattman BL, Maglott DR. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018 Jan 4. PubMed PMID: 29165669."
25 |    ]
26 |   },
27 |   {
28 |    "cell_type": "markdown",
29 |    "metadata": {},
30 |    "source": [
31 |     "### Setup\n",
32 |     "\n",
33 |     "Let's start by specifying:\n",
34 |     "\n",
35 |     "The region name and SageMaker session; the S3 bucket and prefix that you want to use for training and model data (these should be within the same region as the notebook instance, training, and hosting); and the IAM role ARN used to give training and hosting access to your data."
36 |    ]
37 |   },
38 |   {
39 |    "cell_type": "code",
40 |    "execution_count": null,
41 |    "metadata": {},
42 |    "outputs": [],
43 |    "source": [
44 |     "import sagemaker\n",
45 |     "import boto3\n",
46 |     "import os, jmespath\n",
47 |     "from sagemaker import get_execution_role\n",
48 |     "import pandas as pd\n",
49 |     "from time import gmtime, strftime, sleep\n",
50 |     "\n",
51 |     "region = boto3.Session().region_name\n",
52 |     "\n",
53 |     "session = sagemaker.Session()\n",
54 |     "bucket = session.default_bucket()\n",
55 |     "\n",
56 |     "prefix = 'sagemaker/autopilot-vc'\n",
57 |     "\n",
58 |     "role = get_execution_role()\n",
59 |     "\n",
60 |     "sm = boto3.Session().client(service_name='sagemaker',region_name=region)"
61 |    ]
62 |   },
63 |   {
64 |    "cell_type": "markdown",
65 |    "metadata": {},
66 |    "source": [
67 |     "### Get datalake bucket"
68 |    ]
69 |   },
70 |   {
71 |    "cell_type": "code",
72 |    "execution_count": null,
73 |    "metadata": {},
74 |    "outputs": [],
75 |    "source": [
76 |     "cfn = boto3.client('cloudformation')\n",
77 |     "\n",
78 |     "project_name = os.environ.get('RESOURCE_PREFIX')\n",
79 |     "resources = cfn.describe_stacks(StackName='{0}-Pipeline'.format(project_name))\n",
80 |     "query = 'Stacks[].Outputs[?OutputKey==`DataLakeBucket`].OutputValue'\n",
81 |     "data_lake_bucket = jmespath.search(query, resources)[0][0]\n",
82 |     "print(data_lake_bucket)"
83 |    ]
84 |   },
85 |   {
86 |    "cell_type": "markdown",
87 |    "metadata": {},
88 |    "source": [
89 |     "### Dataset\n",
90 |     "Let's load the raw data into a dataframe. The raw data is stored in S3 in the file clinvar_conflicting.csv. This file is downloaded from the following location: https://github.com/arvkevi/clinvar-kaggle/blob/master/clinvar_conflicting.csv"
91 |    ]
92 |   },
93 |   {
94 |    "cell_type": "code",
95 |    "execution_count": null,
96 |    "metadata": {},
97 |    "outputs": [],
98 |    "source": [
99 |     "# Load the raw data into a dataframe from S3\n",
100 |     "raw_data=pd.read_csv(\"s3://{0}/annotation/clinvar/conflicting/clinvar_conflicting.csv\".format(data_lake_bucket))\n",
101 |     "\n",
102 |     "# Take 80% of the data for training\n",
103 |     "train_data = raw_data.sample(frac=0.8,random_state=200)\n",
104 |     "\n",
105 |     "# Take the remaining 20% for testing\n",
106 |     "test_data = raw_data.drop(train_data.index)\n",
107 |     "\n",
108 |     "# Save the train and test data as CSV files and upload them to S3\n",
109 |     "train_file = 'train_data.csv';\n",
110 |     "train_data.to_csv(train_file, index=False, header=True)\n",
111 |     "train_data_s3_path = session.upload_data(path=train_file, key_prefix=prefix + \"/train\")\n",
112 |     "print('Train data uploaded to: ' + train_data_s3_path)\n",
113 |     "\n",
114 |     "test_file = 'test_data.csv';\n",
115 |     "test_data.to_csv(test_file, index=False, header=True)\n",
116 |     "test_data_s3_path = session.upload_data(path=test_file, key_prefix=prefix + \"/test\")\n",
117 |     "print('Test data uploaded to: ' + test_data_s3_path)\n",
118 |     "\n",
119 |     "train_data.head()\n"
120 |    ]
121 |   },
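  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: check the class balance\n",
    "Because this is a binary classification problem, it is worth a quick look at how balanced the `CLASS` label is before launching Autopilot. The next cell is an added sketch (it is not part of the original solution) and assumes the `raw_data` and `train_data` dataframes created in the previous cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Added sketch: inspect the distribution of the binary target.\n",
    "# Assumes raw_data and train_data were created in the cell above.\n",
    "print(raw_data['CLASS'].value_counts(normalize=True))\n",
    "print(train_data['CLASS'].value_counts(normalize=True))"
   ]
  },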
122 |   {
123 |    "cell_type": "markdown",
124 |    "metadata": {},
125 |    "source": [
126 |     "### Setting up the SageMaker Autopilot Job\n",
127 |     "After uploading the dataset to Amazon S3, you can invoke Autopilot to find the best ML pipeline to train a model on this dataset.\n",
128 |     "\n",
129 |     "The required inputs for invoking an Autopilot job are:\n",
130 |     "\n",
131 |     "* Amazon S3 location for input dataset and for all output artifacts\n",
132 |     "* Name of the column of the dataset you want to predict (y in this case)\n",
133 |     "* An IAM role\n",
134 |     "\n"
135 |    ]
136 |   },
137 |   {
"cell_type": "code", 139 | "execution_count": null, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "input_data_config = [{\n", 144 | " 'DataSource': {\n", 145 | " 'S3DataSource': {\n", 146 | " 'S3DataType': 'S3Prefix',\n", 147 | " 'S3Uri': 's3://{}/{}/train'.format(bucket,prefix)\n", 148 | " }\n", 149 | " },\n", 150 | " 'TargetAttributeName': 'CLASS'\n", 151 | " }\n", 152 | " ]\n", 153 | "\n", 154 | "output_data_config = {\n", 155 | " 'S3OutputPath': 's3://{}/{}/output'.format(bucket,prefix)\n", 156 | " }" 157 | ] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "metadata": {}, 162 | "source": [ 163 | "You can also specify the type of problem you want to solve with your dataset (Regression, MulticlassClassification, BinaryClassification). In case you are not sure, SageMaker Autopilot will infer the problem type based on statistics of the target column (the column you want to predict).\n", 164 | "\n", 165 | "You have the option to limit the running time of a SageMaker Autopilot job by providing either the maximum number of pipeline evaluations or candidates (one pipeline evaluation is called a Candidate because it generates a candidate model) or providing the total time allocated for the overall Autopilot job. Under default settings, this job takes about four hours to run. This varies between runs because of the nature of the exploratory process Autopilot uses to find optimal training parameters.\n", 166 | "For our model, we are going to just generate the Candidate Notebooks and explore it ourselves instead of running the complete default experiment. This is done by setting the flag \"GenerateCandidateDefinitionsOnly=True\"\n", 167 | "\n", 168 | "### Launching the SageMaker Autopilot Job\n", 169 | "You can now launch the Autopilot job by calling the create_auto_ml_job API.\n", 170 | "\n", 171 | "**NOTE: The name of the Autopilot job is important because it is used to create the names for all the resources created by Sagemaker like the model name and the endpoint name.**" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [ 180 | "timestamp_suffix = strftime('%d-%H-%M-%S', gmtime())\n", 181 | "\n", 182 | "auto_ml_job_name = 'automl-vc-' + timestamp_suffix\n", 183 | "print('AutoMLJobName: ' + auto_ml_job_name)\n", 184 | "\n", 185 | "sm.create_auto_ml_job(AutoMLJobName=auto_ml_job_name,\n", 186 | " InputDataConfig=input_data_config,\n", 187 | " OutputDataConfig=output_data_config,\n", 188 | " GenerateCandidateDefinitionsOnly=True,\n", 189 | " RoleArn=role)\n", 190 | "\n", 191 | "\n", 192 | "\n", 193 | "print ('JobStatus - Secondary Status')\n", 194 | "print('------------------------------')\n", 195 | "\n", 196 | "\n", 197 | "describe_response = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)\n", 198 | "print (describe_response['AutoMLJobStatus'] + \" - \" + describe_response['AutoMLJobSecondaryStatus'])\n", 199 | "job_run_status = describe_response['AutoMLJobStatus']\n", 200 | " \n" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": {}, 206 | "source": [ 207 | "We will now wait for Sagemaker autopilot to generate the candidate notebooks." 
208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": null, 213 | "metadata": {}, 214 | "outputs": [], 215 | "source": [ 216 | "while job_run_status not in ('Failed', 'Completed', 'Stopped'):\n", 217 | "    describe_response = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)\n", 218 | "    job_run_status = describe_response['AutoMLJobStatus']\n", 219 | "    \n", 220 | "    print(describe_response['AutoMLJobStatus'] + \" - \" + describe_response['AutoMLJobSecondaryStatus'])\n", 221 | "    sleep(30)\n", 222 | "\n", 223 | "\n", 224 | "\n", 225 | "candidate_nb = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['AutoMLJobArtifacts']['CandidateDefinitionNotebookLocation']\n", 226 | "\n", 227 | "data_nb = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['AutoMLJobArtifacts']['DataExplorationNotebookLocation']\n", 228 | "\n", 229 | "print(\"Data Exploration Notebook: \" + data_nb)\n", 230 | "print(\"------------------------------------------------------------------------------------------------\")\n", 231 | "print(\"Candidate Generation Notebook: \" + candidate_nb)\n" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": {}, 237 | "source": [ 238 | "### Downloading the Autopilot candidate notebooks\n", 239 | "Now that SageMaker Autopilot has analyzed our data and created the candidate notebooks, let's download and explore them." 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "!aws s3 cp $data_nb .\n", 249 | "!aws s3 cp $candidate_nb ." 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "metadata": {}, 255 | "source": [ 256 | "### Analyzing the candidate notebooks\n", 257 | "Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-notebook-output.html\n", 258 | "\n", 259 | "During the analysis phase of the AutoML job, two notebooks are created that describe the plan that Autopilot follows to generate candidate models. A candidate model consists of a (pipeline, algorithm) pair. First, there’s a data exploration notebook that describes what Autopilot learned about the data that you provided. Second, there’s a candidate generation notebook, which uses the information about the data to generate candidates.\n", 260 | "\n", 261 | "You can run both notebooks in SageMaker or locally if you have installed the SageMaker Python SDK. You can share the notebooks just like any other SageMaker Studio notebook. The notebooks are created for you to conduct experiments. For example, you could edit the following items in the notebooks (a generic sketch of these knobs appears after this list, under Next Steps):\n", 262 | "\n", 263 | "* the preprocessors used on the data\n", 264 | "\n", 265 | "* the number of hyperparameter optimization (HPO) runs and their parallelism\n", 266 | "\n", 267 | "* the algorithms to try\n", 268 | "\n", 269 | "* the instance types used for the HPO jobs\n", 270 | "\n", 271 | "* the hyperparameter ranges\n", 272 | "\n", 273 | "Modifying the candidate generation notebook is encouraged as a learning exercise: it lets you see how the decisions made during the machine learning process impact your results." 274 | ] 275 | }, 276 | { 277 | "cell_type": "markdown", 278 | "metadata": {}, 279 | "source": [ 280 | "### Next Steps\n", 281 | "You can now switch over to the two notebooks. Feel free to change parameters and modify them as needed for your final ML model deployment.
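As promised above, here is a generic sketch of the kind of HPO knobs the candidate generation notebook exposes, expressed in plain SageMaker Python SDK (v1) terms rather than the notebook's own generated code. The algorithm image, objective metric, ranges, instance type, and job counts are all illustrative assumptions, not the values Autopilot actually chose.

```python
# Generic sketch (not the literal generated-notebook code) of tunable HPO knobs.
from sagemaker.amazon.amazon_estimator import get_image_uri
from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

xgb = Estimator(get_image_uri(region, 'xgboost', '0.90-1'),  # an algorithm to try
                role,
                train_instance_count=1,
                train_instance_type='ml.m5.xlarge',          # instance type for the HPO jobs
                output_path='s3://{}/{}/hpo-output'.format(bucket, prefix),
                sagemaker_session=session)
xgb.set_hyperparameters(objective='binary:logistic', num_round=100)

tuner = HyperparameterTuner(
    xgb,
    objective_metric_name='validation:auc',
    hyperparameter_ranges={'max_depth': IntegerParameter(3, 10),    # hyperparameter ranges
                           'eta': ContinuousParameter(0.01, 0.3)},
    max_jobs=20,            # number of HPO runs
    max_parallel_jobs=4)    # their parallelism
# tuner.fit(...) is omitted: the built-in XGBoost container expects headerless,
# label-first CSVs, so treat this purely as an illustration of the knobs involved.
```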
At the end of the candidate notebook, you will have a hosted model on SageMaker with an endpoint. We have provided a notebook \"variant_predictor.ipynb\" that runs predictions on the model using the test data we saved earlier. So, to summarize the next steps:\n", 282 | "* Explore and run the SageMakerAutopilotDataExplorationNotebook.ipynb notebook.\n", 283 | "* Explore and run the SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebook.\n", 284 | "* Explore and run the variant_predictor.ipynb notebook." 285 | ] 286 | } 287 | ], 288 | "metadata": { 289 | "kernelspec": { 290 | "display_name": "conda_tensorflow_p36", 291 | "language": "python", 292 | "name": "conda_tensorflow_p36" 293 | }, 294 | "language_info": { 295 | "codemirror_mode": { 296 | "name": "ipython", 297 | "version": 3 298 | }, 299 | "file_extension": ".py", 300 | "mimetype": "text/x-python", 301 | "name": "python", 302 | "nbconvert_exporter": "python", 303 | "pygments_lexer": "ipython3", 304 | "version": "3.6.6" 305 | } 306 | }, 307 | "nbformat": 4, 308 | "nbformat_minor": 2 309 | } 310 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/resources/notebooks/variant_predictor.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Generate predictions and evaluate the model" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "### Introduction\n", 15 | "In this notebook, we will run predictions on the model that we trained and deployed in the previous steps. If you recall, the model is hosted on a SageMaker real-time prediction endpoint. We will invoke that endpoint to generate the binary labels (1, 0) on a few rows that we have in our test file (test_data.csv). We will then evaluate the results against the ground truth and see how the model performs." 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "### Prerequisites\n", 23 | "Before proceeding, make sure you have run the following notebooks in order without any errors:\n", 24 | "1. variant_classifier-autopilot.ipynb\n", 25 | "2. SageMakerAutopilotDataExplorationNotebook.ipynb\n", 26 | "3. SageMakerAutopilotCandidateDefinitionNotebook.ipynb\n", 27 | "\n", 28 | "If not, please go back to the notebooks and run them before proceeding." 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": {}, 34 | "source": [ 35 | "### Setup\n", 36 | "Let's start by importing the libraries that we will need for executing this notebook." 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "import pandas as pd\n", 46 | "import sagemaker\n", 47 | "from sagemaker.predictor import RealTimePredictor\n", 48 | "from sagemaker.content_types import CONTENT_TYPE_CSV\n", 49 | "import boto3\n", 50 | "from sklearn import metrics\n", 51 | "import numpy as np\n", 52 | "import seaborn as sns\n", 53 | "import matplotlib.pyplot as plt" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": [ 60 | "### Get the endpoint name\n", 61 | "To generate predictions on test data, we need to get the endpoint name of the model that we deployed at the end of the SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebook.
To do this, we find the endpoint among the list of endpoints that starts with the string \"AutoML-automl-vc\". This is the default naming format used in the variant_classifier-autopilot.ipynb and SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebooks.\n", 62 | "\n", 63 | "**NOTE:** If you changed the naming convention and/or have multiple endpoints beginning with the string \"AutoML-automl-vc\", the endpoint retrieved may not be the correct one. You can verify by logging into the AWS console, navigating to SageMaker, and selecting \"Endpoints\" from the left-hand menu. There you will see all the endpoints that have been created in your account. Select the one that you created as part of the SageMakerAutopilotCandidateDefinitionNotebook.ipynb notebook. If the correct endpoint is not selected, you can overwrite the variable \"endpoint_name\" with the correct endpoint name. Make sure the correct endpoint is selected before proceeding." 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "sm = boto3.client('sagemaker')\n", 73 | "endpoints = sm.list_endpoints()['Endpoints']\n", 74 | "for val in endpoints:\n", 75 | "    ep = val.get(\"EndpointName\")\n", 76 | "    if ep.startswith('AutoML-automl-vc'):\n", 77 | "        endpoint_name = ep\n", 78 | "        print('Model endpoint: ' + endpoint_name)\n", 79 | "        print('Make sure this is the correct model endpoint before proceeding')\n", 80 | "        break\n", 81 | "else:\n    print('No endpoint found. Make sure you have completed the steps mentioned in the prerequisites above.')" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "### Data Preprocessing\n", 89 | "We will now read the file \"test_data.csv\" into a dataframe and randomly sample 1000 records from it." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "test_file = pd.read_csv('test_data.csv')\n", 99 | "test_rows = test_file.sample(1000)\n", 100 | "test_rows.head()" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "As you can see, the test rows look exactly like the rows in the training dataset, as expected. We will now separate out our target variable \"CLASS\" from the test data and store it in a new dataframe, \"actual\"." 108 | ] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "execution_count": null, 113 | "metadata": {}, 114 | "outputs": [], 115 | "source": [ 116 | "test_rows_notarget = test_rows.drop(['CLASS'], axis=1)\n", 117 | "actual = test_rows['CLASS'].to_frame(name=\"actual\")\n", 118 | "actual.reset_index(drop=True, inplace=True)" 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "metadata": {}, 124 | "source": [ 125 | "### Generate Predictions\n", 126 | "Next, we will invoke the endpoint of our model with the test rows and generate a prediction for each row. We will then store the results of the prediction in a new dataframe called \"predicted\".
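One practical note before invoking the endpoint: a SageMaker real-time endpoint accepts request payloads of only a few megabytes, so sending all 1,000 sampled rows in a single CSV body works here only because the rows are small. If you test with a larger sample, a chunked invocation along the following lines is a reasonable fallback. This is a sketch that assumes the endpoint returns one label per input line; the batch size is an arbitrary choice.

```python
# Sketch: invoke the endpoint in small CSV batches to stay well under the
# real-time payload limit. Assumes one prediction line per input row.
import boto3
import pandas as pd

runtime = boto3.client('sagemaker-runtime')

def predict_in_batches(df, endpoint, batch_size=100):
    labels = []
    for start in range(0, len(df), batch_size):
        payload = df.iloc[start:start + batch_size].to_csv(sep=',', header=False, index=False)
        resp = runtime.invoke_endpoint(EndpointName=endpoint,
                                       ContentType='text/csv',
                                       Accept='text/csv',
                                       Body=payload)
        labels.extend(resp['Body'].read().decode('utf-8').split())
    return pd.Series(labels, name='predicted').astype(int)

# predicted = predict_in_batches(test_rows_notarget, endpoint_name).to_frame()
```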
127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": null, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "sm_session = sagemaker.Session()\n", 136 | "variant_predictor = RealTimePredictor(endpoint=endpoint_name, sagemaker_session=sm_session, content_type=CONTENT_TYPE_CSV,\n", 137 | "                                      accept=CONTENT_TYPE_CSV)" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "predicted_str = variant_predictor.predict(test_rows_notarget.to_csv(sep=',', header=False, index=False)).decode('utf-8')\n", 147 | "predicted = pd.Series(predicted_str.split(), name='predicted').to_frame().astype(int)" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "Finally, we combine the \"actual\" and \"predicted\" values into a single dataframe called \"results\"." 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": null, 160 | "metadata": {}, 161 | "outputs": [], 162 | "source": [ 163 | "results = pd.concat([actual, predicted], axis=1)\n", 164 | "results.head()" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "### Model Evaluation\n", 172 | "We will now generate some evaluation metrics for our binary classification model. We will start with a [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix) and follow that up with a [Receiver Operating Characteristic (ROC) curve](https://en.wikipedia.org/wiki/Receiver_operating_characteristic). Note that because the endpoint returns hard 0/1 labels rather than class probabilities, the ROC curve here is traced from a single operating point." 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "cf_matrix = metrics.confusion_matrix(results['actual'], results['predicted'])\n", 182 | "group_names = ['True Neg','False Pos','False Neg','True Pos']\n", 183 | "group_counts = [\"{0:0.0f}\".format(value) for value in\n", 184 | "                cf_matrix.flatten()]\n", 185 | "group_percentages = [\"{0:.2%}\".format(value) for value in\n", 186 | "                     cf_matrix.flatten() / np.sum(cf_matrix)]\n", 187 | "labels = [f\"{v1}\\n{v2}\\n{v3}\" for v1, v2, v3 in\n", 188 | "          zip(group_names, group_counts, group_percentages)]\n", 189 | "labels = np.asarray(labels).reshape(2, 2)\n", 190 | "sns.heatmap(cf_matrix, annot=labels, fmt='', cmap='Blues');" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "fpr, tpr, thresholds = metrics.roc_curve(results['actual'], results['predicted'])" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "roc_auc = metrics.auc(fpr, tpr)\n", 209 | "accuracy = metrics.accuracy_score(results['actual'], results['predicted'])\n", 210 | "precision = metrics.precision_score(results['actual'], results['predicted'])\n", 211 | "recall = metrics.recall_score(results['actual'], results['predicted'])\n", 212 | "f1score = metrics.f1_score(results['actual'], results['predicted'])\n", 213 | "plt.figure()\n", 214 | "lw = 2\n", 215 | "plt.plot(fpr, tpr, color='darkorange',\n", 216 | "         lw=lw, label='ROC curve (area = %0.2f)' % roc_auc)\n", 217 | "plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')\n", 218 | "plt.xlim([0.0, 1.0])\n", 219 | "plt.ylim([0.0, 1.05])\n", 220 | "plt.xlabel('False Positive Rate')\n", 221 | "plt.ylabel('True Positive Rate')\n", 222 | "plt.title('Receiver operating characteristic (ROC)
curve')\n", 223 | "plt.legend(loc=\"lower right\")\n", 224 | "plt.text(1.1,0.75,s='Accuracy: '+str(round(accuracy,2))+'\\nPrecision: '+str(round(precision,2))+\n", 225 | "'\\nRecall: '+str(round(recall,2))+'\\nF1 Score: '+str(round(f1score,2)),bbox=dict(boxstyle=\"square\",\n", 226 | " ec=(1., 0.5, 0.5),\n", 227 | " fc=(1., 0.8, 0.8),\n", 228 | " ))\n", 229 | "\n", 230 | "plt.show()" 231 | ] 232 | } 233 | ], 234 | "metadata": { 235 | "kernelspec": { 236 | "display_name": "conda_python3", 237 | "language": "python", 238 | "name": "conda_python3" 239 | }, 240 | "language_info": { 241 | "codemirror_mode": { 242 | "name": "ipython", 243 | "version": 3 244 | }, 245 | "file_extension": ".py", 246 | "mimetype": "text/x-python", 247 | "name": "python", 248 | "nbconvert_exporter": "python", 249 | "pygments_lexer": "ipython3", 250 | "version": "3.6.5" 251 | } 252 | }, 253 | "nbformat": 4, 254 | "nbformat_minor": 4 255 | } 256 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/resources/scripts/process_clinvar.py: -------------------------------------------------------------------------------- 1 | ############################################################################### 2 | # Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved. # 3 | # # 4 | # Licensed under the Apache License, Version 2.0 (the "License"). # 5 | # You may not use this file except in compliance with the License. 6 | # A copy of the License is located at # 7 | # # 8 | # http://www.apache.org/licenses/LICENSE-2.0 # 9 | # # 10 | # or in the "license" file accompanying this file. This file is distributed # 11 | # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express # 12 | # or implied. See the License for the specific language governing permissions# 13 | # and limitations under the License. 
# 14 | ############################################################################### 15 | 16 | import gzip, re 17 | import pandas as pd 18 | import csv 19 | import sys 20 | from awsglue.utils import getResolvedOptions 21 | import boto3 22 | 23 | s3 = boto3.client('s3') 24 | s3_resource = boto3.resource('s3') 25 | 26 | args = getResolvedOptions(sys.argv, 27 |                           ['input_bucket', 'clinvar_input_key', 'clinvar_annotated_input_key', 'output_bucket', 28 |                            'output_key']) 29 | 30 | 31 | def download_to_local(filename): 32 |     new_filename = filename.split('/')[-1] 33 |     s3_resource.meta.client.download_file(args['input_bucket'], filename, '/tmp/' + new_filename) 34 |     return new_filename 35 | 36 | 37 | def list_to_dict(l): 38 |     """Convert a list of 'KEY=value' strings (a split VCF INFO field) into a dict.""" 39 |     return {k: v for k, v in (x.split("=") for x in l)} 40 | 41 | 42 | fieldnames = [ 43 |     "CHROM", 44 |     "POS", 45 |     "REF", 46 |     "ALT", 47 |     "AF_ESP", 48 |     "AF_EXAC", 49 |     "AF_TGP", 50 |     "CLNDISDB", 51 |     "CLNDISDBINCL", 52 |     "CLNDN", 53 |     "CLNDNINCL", 54 |     "CLNHGVS", 55 |     "CLNSIGINCL", 56 |     "CLNVC", 57 |     "CLNVI", 58 |     "MC", 59 |     "ORIGIN", 60 |     "SSR", 61 |     "CLASS", 62 |     "Allele", 63 |     "Consequence", 64 |     "IMPACT", 65 |     "SYMBOL", 66 |     "Feature_type", 67 |     "Feature", 68 |     "BIOTYPE", 69 |     "EXON", 70 |     "INTRON", 71 |     "cDNA_position", 72 |     "CDS_position", 73 |     "Protein_position", 74 |     "Amino_acids", 75 |     "Codons", 76 |     "DISTANCE", 77 |     "STRAND", 78 |     "BAM_EDIT", 79 |     "SIFT", 80 |     "PolyPhen", 81 |     "MOTIF_NAME", 82 |     "MOTIF_POS", 83 |     "HIGH_INF_POS", 84 |     "MOTIF_SCORE_CHANGE", 85 |     "LoFtool", 86 |     "CADD_PHRED", 87 |     "CADD_RAW", 88 |     "BLOSUM62", 89 | ] 90 | 91 | obj = s3.get_object(Bucket=args['input_bucket'], Key=args['clinvar_input_key']) 92 | cv_columns = {} 93 | with gzip.GzipFile(fileobj=obj['Body'], mode='rb') as f: 94 |     for metaline in f: 95 |         if metaline.startswith(b'##INFO'): 96 |             colname = re.search(rb'ID=(\w+),', metaline.strip(b'#\n')) 97 |             coldesc = re.search(rb'.*Description=(.*)>', metaline.strip(b'#\n')) 98 |             cv_columns[colname.group(1)] = coldesc.group(1).strip(b'"') 99 | 100 | # read clinvar vcf 101 | obj = s3.get_object(Bucket=args['input_bucket'], Key=args['clinvar_input_key']) 102 | with gzip.GzipFile(fileobj=obj['Body'], mode='rb') as f: 103 |     cv_df = pd.read_csv(f, sep='\t', comment='#', header=None, usecols=[0, 1, 2, 3, 4, 7], dtype={0: object}) 104 | 105 | # convert dictionaries to columns 106 | cv_df = pd.concat( 107 |     [ 108 |         cv_df.drop([7], axis=1), 109 |         cv_df[7].str.split(";").apply(list_to_dict).apply(pd.Series), 110 |     ], 111 |     axis=1, 112 | ) 113 | # rename columns 114 | cv_df.rename(columns={0: "CHROM", 1: "POS", 2: "ID", 3: "REF", 4: "ALT"}, inplace=True) 115 | 116 | # drop columns we know we won't need (they are re-populated later from the VEP-annotated VCF) 117 | cv_df = cv_df.drop(columns=["CHROM", "POS", "REF", "ALT"]) 118 | 119 | # assign classes 120 | cv_df["CLASS"] = 0 121 | cv_df.loc[cv_df["CLNSIGCONF"].notnull(), "CLASS"] = 1 122 | 123 | # convert NaN to 0 where allele frequencies are null 124 | cv_df[["AF_ESP", "AF_EXAC", "AF_TGP"]] = cv_df[["AF_ESP", "AF_EXAC", "AF_TGP"]].fillna( 125 |     0 126 | ) 127 | 128 | # select variants that have been submitted by multiple organizations.
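# Note on the filter below: "criteria_provided,_multiple_submitters,_no_conflicts"
# records are the consistently interpreted CLASS 0 examples, while
# "criteria_provided,_conflicting_interpretations" records carry a populated
# CLNSIGCONF value and were therefore labeled CLASS 1 above, so the filter keeps
# the two classes on a comparable evidence footing.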
129 | cv_df = cv_df.loc[ 130 |     cv_df["CLNREVSTAT"].isin( 131 |         [ 132 |             "criteria_provided,_multiple_submitters,_no_conflicts", 133 |             "criteria_provided,_conflicting_interpretations", 134 |         ] 135 |     ) 136 | ] 137 | 138 | # reduce the size of the dataset by dropping identifier columns we don't need 139 | cv_df.drop(columns=["ALLELEID", "RS", "DBVARID"], inplace=True) 140 | # drop columns that would reveal the class 141 | cv_df.drop(columns=["CLNSIG", "CLNSIGCONF", "CLNREVSTAT"], inplace=True) 142 | # drop these redundant columns 143 | cv_df.drop(columns=["CLNVCSO", "GENEINFO"], inplace=True) 144 | 145 | # dictionary to map ID to clinvar annotations 146 | clinvar_annotations = cv_df.set_index("ID")[ 147 |     [col for col in cv_df.columns if col in fieldnames] 148 | ].to_dict(orient="index") 149 | 150 | # open the output file 151 | outfile = "/tmp/clinvar_conflicting.csv" 152 | with open(outfile, "w") as fout: 153 |     dw = csv.DictWriter( 154 |         fout, delimiter=",", fieldnames=fieldnames, extrasaction="ignore" 155 |     ) 156 |     dw.writeheader() 157 |     # read the VEP-annotated vcf file line-by-line 158 |     filename = download_to_local(args['clinvar_annotated_input_key']) 159 |     filename = "/tmp/" + filename 160 |     with gzip.GzipFile(filename, mode='rb') as f: 161 |         for line in f: 162 |             line = line.decode("utf-8") 163 |             if line.startswith("##INFO=<ID=CSQ"): 164 |                 m = re.search('Format: (.*)">', line) 165 |                 cols = m.group(1).split("|") 166 |                 continue 167 | 168 |             if line.startswith("#"): 169 |                 continue 170 |             record = line.split("\t") 171 |             ( 172 |                 chromosome, 173 |                 position, 174 |                 clinvar_id, 175 |                 reference_base, 176 |                 alternate_base, 177 |                 qual, 178 |                 filter_, 179 |                 info, 180 |             ) = record 181 |             info_field = info.strip("\n").split(";") 182 | 183 |             # to look up in clinvar_annotations 184 |             clinvar_id = int(clinvar_id) 185 | 186 |             # only keep the variants that have been evaluated by multiple submitters 187 |             if clinvar_id in clinvar_annotations: 188 |                 # initialize a dictionary to hold all the VEP annotation data 189 |                 annotation_data = {column: None for column in cols} 190 |                 annotation_data.update(clinvar_annotations[clinvar_id]) 191 |                 # fields directly from the vcf 192 |                 annotation_data["CHROM"] = str(chromosome) 193 |                 annotation_data["POS"] = position 194 |                 annotation_data["REF"] = reference_base 195 |                 annotation_data["ALT"] = alternate_base 196 | 197 |                 for annotations in info_field: 198 |                     column, value = annotations.split("=") 199 | 200 |                     if column == "CSQ": 201 |                         for csq_column, csq_value in zip(cols, value.split("|")): 202 |                             annotation_data[csq_column] = csq_value 203 |                         continue 204 | 205 |                     annotation_data[column] = value 206 |                 dw.writerow(annotation_data) 207 | 208 | s3_resource.meta.client.upload_file(outfile, args['output_bucket'], args['output_key']) -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/INSTALLER: -------------------------------------------------------------------------------- 1 | pip 2 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document.
12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 
135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 
176 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/METADATA: -------------------------------------------------------------------------------- 1 | Metadata-Version: 2.1 2 | Name: crhelper 3 | Version: 2.0.6 4 | Summary: crhelper simplifies authoring CloudFormation Custom Resources 5 | Home-page: https://github.com/aws-cloudformation/custom-resource-helper 6 | Author: Jay McConnell 7 | Author-email: jmmccon@amazon.com 8 | License: Apache2 9 | Platform: UNKNOWN 10 | Classifier: Programming Language :: Python :: 3.6 11 | Classifier: Programming Language :: Python :: 3.7 12 | Classifier: License :: OSI Approved :: Apache Software License 13 | Classifier: Operating System :: OS Independent 14 | Description-Content-Type: text/markdown 15 | 16 | ## Custom Resource Helper 17 | 18 | Simplify best practice Custom Resource creation, sending responses to CloudFormation and providing exception, timeout 19 | trapping, and detailed configurable logging. 20 | 21 | [![PyPI Version](https://img.shields.io/pypi/v/crhelper.svg)](https://pypi.org/project/crhelper/) 22 | ![Python Versions](https://img.shields.io/pypi/pyversions/crhelper.svg) 23 | [![Build Status](https://travis-ci.com/aws-cloudformation/custom-resource-helper.svg?branch=master)](https://travis-ci.com/aws-cloudformation/custom-resource-helper) 24 | [![Test Coverage](https://codecov.io/gh/aws-cloudformation/custom-resource-helper/branch/master/graph/badge.svg)](https://codecov.io/gh/aws-cloudformation/custom-resource-helper) 25 | 26 | ## Features 27 | 28 | * Dead simple to use, reduces the complexity of writing a CloudFormation custom resource 29 | * Guarantees that CloudFormation will get a response even if an exception is raised 30 | * Returns meaningful errors to CloudFormation Stack events in the case of a failure 31 | * Polling enables run times longer than the Lambda 15-minute limit 32 | * JSON logging that includes request IDs, stack IDs, and request type to assist in tracing logs relevant to a 33 | particular CloudFormation event 34 | * Catches function timeouts and sends CloudFormation a failure response 35 | * Static typing (mypy) compatible 36 | 37 | ## Installation 38 | 39 | Install into the root folder of your lambda function: 40 | 41 | ```bash 42 | cd my-lambda-function/ 43 | pip install crhelper -t . 44 | ``` 45 | 46 | ## Example Usage 47 | 48 | [This blog](https://aws.amazon.com/blogs/infrastructure-and-automation/aws-cloudformation-custom-resource-creation-with-python-aws-lambda-and-crhelper/) covers usage in more detail. 49 | 50 | ```python 51 | from __future__ import print_function 52 | from crhelper import CfnResource 53 | import logging 54 | 55 | logger = logging.getLogger(__name__) 56 | # Initialise the helper, all inputs are optional, this example shows the defaults 57 | helper = CfnResource(json_logging=False, log_level='DEBUG', boto_level='CRITICAL', sleep_on_delete=120) 58 | 59 | try: 60 |     ## Init code goes here 61 |     pass 62 | except Exception as e: 63 |     helper.init_failure(e) 64 | 65 | 66 | @helper.create 67 | def create(event, context): 68 |     logger.info("Got Create") 69 |     # Optionally return an ID that will be used for the resource PhysicalResourceId, 70 |     # if None is returned an ID will be generated. 
If a poll_create function is defined 71 | # return value is placed into the poll event as event['CrHelperData']['PhysicalResourceId'] 72 | # 73 | # To add response data update the helper.Data dict 74 | # If poll is enabled data is placed into poll event as event['CrHelperData'] 75 | helper.Data.update({"test": "testdata"}) 76 | 77 | # To return an error to cloudformation you raise an exception: 78 | if not helper.Data.get("test"): 79 | raise ValueError("this error will show in the cloudformation events log and console.") 80 | 81 | return "MyResourceId" 82 | 83 | 84 | @helper.update 85 | def update(event, context): 86 | logger.info("Got Update") 87 | # If the update resulted in a new resource being created, return an id for the new resource. 88 | # CloudFormation will send a delete event with the old id when stack update completes 89 | 90 | 91 | @helper.delete 92 | def delete(event, context): 93 | logger.info("Got Delete") 94 | # Delete never returns anything. Should not fail if the underlying resources are already deleted. 95 | # Desired state. 96 | 97 | 98 | @helper.poll_create 99 | def poll_create(event, context): 100 | logger.info("Got create poll") 101 | # Return a resource id or True to indicate that creation is complete. if True is returned an id 102 | # will be generated 103 | return True 104 | 105 | 106 | def handler(event, context): 107 | helper(event, context) 108 | ``` 109 | 110 | ### Polling 111 | 112 | If you need longer than the max runtime of 15 minutes, you can enable polling by adding additional decorators for 113 | `poll_create`, `poll_update` or `poll_delete`. When a poll function is defined for `create`/`update`/`delete` the 114 | function will not send a response to CloudFormation and instead a CloudWatch Events schedule will be created to 115 | re-invoke the lambda function every 2 minutes. When the function is invoked the matching `@helper.poll_` function will 116 | be called, logic to check for completion should go here, if the function returns `None` then the schedule will run again 117 | in 2 minutes. Once complete either return a PhysicalResourceID or `True` to have one generated. The schedule will be 118 | deleted and a response sent back to CloudFormation. If you use polling the following additional IAM policy must be 119 | attached to the function's IAM role: 120 | 121 | ```yaml 122 | { 123 | "Version": "2012-10-17", 124 | "Statement": [ 125 | { 126 | "Effect": "Allow", 127 | "Action": [ 128 | "lambda:AddPermission", 129 | "lambda:RemovePermission", 130 | "events:PutRule", 131 | "events:DeleteRule", 132 | "events:PutTargets", 133 | "events:RemoveTargets" 134 | ], 135 | "Resource": "*" 136 | } 137 | ] 138 | } 139 | ``` 140 | 141 | ## Credits 142 | 143 | Decorator implementation inspired by https://github.com/ryansb/cfn-wrapper-python 144 | 145 | Log implementation inspired by https://gitlab.com/hadrien/aws_lambda_logging 146 | 147 | ## License 148 | 149 | This library is licensed under the Apache 2.0 License. 150 | 151 | 152 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/NOTICE: -------------------------------------------------------------------------------- 1 | Custom Resource Helper 2 | Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
3 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/RECORD: -------------------------------------------------------------------------------- 1 | crhelper-2.0.6.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4 2 | crhelper-2.0.6.dist-info/LICENSE,sha256=CeipvOyAZxBGUsFoaFqwkx54aPnIKEtm9a5u2uXxEws,10142 3 | crhelper-2.0.6.dist-info/METADATA,sha256=0FEfmNkHpgUGUHmR-GGoiZwcGJsEYmJE92mkBI_tQ1Q,5537 4 | crhelper-2.0.6.dist-info/NOTICE,sha256=gDru0mjdrGkrCJfnHTVboKMdS7U85Ha8bV_PQTCckfM,96 5 | crhelper-2.0.6.dist-info/RECORD,, 6 | crhelper-2.0.6.dist-info/WHEEL,sha256=g4nMs7d-Xl9-xC9XovUrsDHGXt-FT0E17Yqo92DEfvY,92 7 | crhelper-2.0.6.dist-info/top_level.txt,sha256=pe_5uNErAyss8aUfseYKAjd3a1-LXM6bPjnkun7vbso,15 8 | crhelper/__init__.py,sha256=VSvHU2MKgP96DHSDXR1OYxnbC8j7yfuVhZubBLU7Pns,66 9 | crhelper/__pycache__/__init__.cpython-38.pyc,, 10 | crhelper/__pycache__/log_helper.cpython-38.pyc,, 11 | crhelper/__pycache__/resource_helper.cpython-38.pyc,, 12 | crhelper/__pycache__/utils.cpython-38.pyc,, 13 | crhelper/log_helper.py,sha256=18n4WKlGgxXL_iiYPqE8dWv9TW4sPZc4Ae3px5dbHmY,2665 14 | crhelper/resource_helper.py,sha256=jlFCL0YMi1lEN9kOqhRtKkMcDovoJJpwq1oTk3W5hX0,12637 15 | crhelper/utils.py,sha256=HX_ZnUy3DP81L5ofOVshhWK9NwYnZ9dzIWUPnOfFm5w,1384 16 | tests/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0 17 | tests/__pycache__/__init__.cpython-38.pyc,, 18 | tests/__pycache__/test_log_helper.cpython-38.pyc,, 19 | tests/__pycache__/test_resource_helper.cpython-38.pyc,, 20 | tests/__pycache__/test_utils.cpython-38.pyc,, 21 | tests/test_log_helper.py,sha256=T25g-RnRYrwp05v__25thYiodWIIDtoSXDFAqe9Z7rQ,3256 22 | tests/test_resource_helper.py,sha256=5BzbcWX49kSZN0GveRpG8Bt3PHAYUGubJMOmbAigFP0,14462 23 | tests/test_utils.py,sha256=HbLMvoXfYbF952AMM-ey8RNasbYHFqfX17rqajluOKM,1407 24 | tests/unit/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0 25 | tests/unit/__pycache__/__init__.cpython-38.pyc,, 26 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/WHEEL: -------------------------------------------------------------------------------- 1 | Wheel-Version: 1.0 2 | Generator: bdist_wheel (0.34.2) 3 | Root-Is-Purelib: true 4 | Tag: py3-none-any 5 | 6 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper-2.0.6.dist-info/top_level.txt: -------------------------------------------------------------------------------- 1 | crhelper 2 | tests 3 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__init__.py: -------------------------------------------------------------------------------- 1 | from crhelper.resource_helper import CfnResource, SUCCESS, FAILED 2 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- 
/source/GenomicsLearningCode/setup/crhelper/__pycache__/log_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/log_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__pycache__/resource_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/resource_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/__pycache__/utils.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/crhelper/__pycache__/utils.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/log_helper.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import json 3 | import logging 4 | 5 | 6 | def _json_formatter(obj): 7 |     """Formatter for unserialisable values.""" 8 |     return str(obj) 9 | 10 | 11 | class JsonFormatter(logging.Formatter): 12 |     """AWS Lambda Logging formatter. 13 | 14 |     Formats the log message as a JSON encoded string. If the message is a 15 |     dict it will be used directly. If the message can be parsed as JSON, then 16 |     the parsed value is used in the output record. 17 |     """ 18 | 19 |     def __init__(self, **kwargs): 20 |         super(JsonFormatter, self).__init__() 21 |         self.format_dict = { 22 |             'timestamp': '%(asctime)s', 23 |             'level': '%(levelname)s', 24 |             'location': '%(name)s.%(funcName)s:%(lineno)d', 25 |         } 26 |         self.format_dict.update(kwargs) 27 |         self.default_json_formatter = kwargs.pop( 28 |             'json_default', _json_formatter) 29 | 30 |     def format(self, record): 31 |         record_dict = record.__dict__.copy() 32 |         record_dict['asctime'] = self.formatTime(record) 33 | 34 |         log_dict = { 35 |             k: v % record_dict 36 |             for k, v in self.format_dict.items() 37 |             if v 38 |         } 39 | 40 |         if isinstance(record_dict['msg'], dict): 41 |             log_dict['message'] = record_dict['msg'] 42 |         else: 43 |             log_dict['message'] = record.getMessage() 44 | 45 |         # Attempt to decode the message as JSON, if so, merge it with the 46 |         # overall message for clarity.
47 | try: 48 | log_dict['message'] = json.loads(log_dict['message']) 49 | except (TypeError, ValueError): 50 | pass 51 | 52 | if record.exc_info: 53 | # Cache the traceback text to avoid converting it multiple times 54 | # (it's constant anyway) 55 | # from logging.Formatter:format 56 | if not record.exc_text: 57 | record.exc_text = self.formatException(record.exc_info) 58 | 59 | if record.exc_text: 60 | log_dict['exception'] = record.exc_text 61 | 62 | json_record = json.dumps(log_dict, default=self.default_json_formatter) 63 | 64 | if hasattr(json_record, 'decode'): # pragma: no cover 65 | json_record = json_record.decode('utf-8') 66 | 67 | return json_record 68 | 69 | 70 | def setup(level='DEBUG', formatter_cls=JsonFormatter, boto_level=None, **kwargs): 71 | if formatter_cls: 72 | for handler in logging.root.handlers: 73 | handler.setFormatter(formatter_cls(**kwargs)) 74 | 75 | logging.root.setLevel(level) 76 | 77 | if not boto_level: 78 | boto_level = level 79 | 80 | logging.getLogger('boto').setLevel(boto_level) 81 | logging.getLogger('boto3').setLevel(boto_level) 82 | logging.getLogger('botocore').setLevel(boto_level) 83 | logging.getLogger('urllib3').setLevel(boto_level) 84 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/resource_helper.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | TODO: 4 | * Async mode – take a wait condition handle as an input, increases max timeout to 12 hours 5 | * Idempotency – If a duplicate request comes in (say there was a network error in signaling back to cfn) the subsequent 6 | request should return the already created response, will need a persistent store of some kind... 
7 | * Functional tests 8 | """ 9 | 10 | from __future__ import print_function 11 | import threading 12 | from crhelper.utils import _send_response 13 | from crhelper import log_helper 14 | import logging 15 | import random 16 | import boto3 17 | import string 18 | import json 19 | import os 20 | from time import sleep 21 | 22 | logger = logging.getLogger(__name__) 23 | 24 | SUCCESS = 'SUCCESS' 25 | FAILED = 'FAILED' 26 | 27 | 28 | class CfnResource(object): 29 | 30 | def __init__(self, json_logging=False, log_level='DEBUG', boto_level='ERROR', polling_interval=2, sleep_on_delete=120): 31 | self._sleep_on_delete= sleep_on_delete 32 | self._create_func = None 33 | self._update_func = None 34 | self._delete_func = None 35 | self._poll_create_func = None 36 | self._poll_update_func = None 37 | self._poll_delete_func = None 38 | self._timer = None 39 | self._init_failed = None 40 | self._json_logging = json_logging 41 | self._log_level = log_level 42 | self._boto_level = boto_level 43 | self._send_response = False 44 | self._polling_interval = polling_interval 45 | self.Status = "" 46 | self.Reason = "" 47 | self.PhysicalResourceId = "" 48 | self.StackId = "" 49 | self.RequestId = "" 50 | self.LogicalResourceId = "" 51 | self.Data = {} 52 | self._event = {} 53 | self._context = None 54 | self._response_url = "" 55 | self._sam_local = os.getenv('AWS_SAM_LOCAL') 56 | self._region = os.getenv('AWS_REGION') 57 | try: 58 | if not self._sam_local: 59 | self._lambda_client = boto3.client('lambda', region_name=self._region) 60 | self._events_client = boto3.client('events', region_name=self._region) 61 | self._logs_client = boto3.client('logs', region_name=self._region) 62 | if json_logging: 63 | log_helper.setup(log_level, boto_level=boto_level, RequestType='ContainerInit') 64 | else: 65 | log_helper.setup(log_level, formatter_cls=None, boto_level=boto_level) 66 | except Exception as e: 67 | logger.error(e, exc_info=True) 68 | self.init_failure(e) 69 | 70 | def __call__(self, event, context): 71 | try: 72 | self._log_setup(event, context) 73 | logger.debug(event) 74 | if not self._crhelper_init(event, context): 75 | return 76 | # Check for polling functions 77 | if self._poll_enabled() and self._sam_local: 78 | logger.info("Skipping poller functionality, as this is a local invocation") 79 | elif self._poll_enabled(): 80 | self._polling_init(event) 81 | # If polling is not enabled, then we should respond 82 | else: 83 | logger.debug("enabling send_response") 84 | self._send_response = True 85 | logger.debug("_send_response: %s" % self._send_response) 86 | if self._send_response: 87 | if self.RequestType == 'Delete': 88 | self._wait_for_cwlogs() 89 | self._cfn_response(event) 90 | except Exception as e: 91 | logger.error(e, exc_info=True) 92 | self._send(FAILED, str(e)) 93 | finally: 94 | if self._timer: 95 | self._timer.cancel() 96 | 97 | def _wait_for_cwlogs(self, sleep=sleep): 98 | time_left = int(self._context.get_remaining_time_in_millis() / 1000) - 15 99 | sleep_time = 0 100 | 101 | if time_left > self._sleep_on_delete: 102 | sleep_time = self._sleep_on_delete 103 | 104 | if sleep_time > 1: 105 | sleep(sleep_time) 106 | 107 | def _log_setup(self, event, context): 108 | if self._json_logging: 109 | log_helper.setup(self._log_level, boto_level=self._boto_level, RequestType=event['RequestType'], 110 | StackId=event['StackId'], RequestId=event['RequestId'], 111 | LogicalResourceId=event['LogicalResourceId'], aws_request_id=context.aws_request_id) 112 | else: 113 | log_helper.setup(self._log_level, 
boto_level=self._boto_level, formatter_cls=None) 114 | 115 | def _crhelper_init(self, event, context): 116 | self._send_response = False 117 | self.Status = SUCCESS 118 | self.Reason = "" 119 | self.PhysicalResourceId = "" 120 | self.StackId = event["StackId"] 121 | self.RequestId = event["RequestId"] 122 | self.LogicalResourceId = event["LogicalResourceId"] 123 | self.Data = {} 124 | if "CrHelperData" in event.keys(): 125 | self.Data = event["CrHelperData"] 126 | self.RequestType = event["RequestType"] 127 | self._event = event 128 | self._context = context 129 | self._response_url = event['ResponseURL'] 130 | if self._timer: 131 | self._timer.cancel() 132 | if self._init_failed: 133 | self._send(FAILED, str(self._init_failed)) 134 | return False 135 | self._set_timeout() 136 | self._wrap_function(self._get_func()) 137 | return True 138 | 139 | def _polling_init(self, event): 140 | # Setup polling on initial request 141 | logger.debug("pid1: %s" % self.PhysicalResourceId) 142 | if 'CrHelperPoll' not in event.keys() and self.Status != FAILED: 143 | logger.info("Setting up polling") 144 | self.Data["PhysicalResourceId"] = self.PhysicalResourceId 145 | self._setup_polling() 146 | self.PhysicalResourceId = None 147 | logger.debug("pid2: %s" % self.PhysicalResourceId) 148 | # if physical id is set, or there was a failure then we're done 149 | logger.debug("pid3: %s" % self.PhysicalResourceId) 150 | if self.PhysicalResourceId or self.Status == FAILED: 151 | logger.info("Polling complete, removing cwe schedule") 152 | self._remove_polling() 153 | self._send_response = True 154 | 155 | def generate_physical_id(self, event): 156 | return '_'.join([ 157 | event['StackId'].split('/')[1], 158 | event['LogicalResourceId'], 159 | self._rand_string(8) 160 | ]) 161 | 162 | def _cfn_response(self, event): 163 | # Use existing PhysicalResourceId if it's in the event and no ID was set 164 | if not self.PhysicalResourceId and "PhysicalResourceId" in event.keys(): 165 | logger.info("PhysicalResourceId present in event, Using that for response") 166 | self.PhysicalResourceId = event['PhysicalResourceId'] 167 | # Generate a physical id if none is provided 168 | elif not self.PhysicalResourceId or self.PhysicalResourceId is True: 169 | logger.info("No physical resource id returned, generating one...") 170 | self.PhysicalResourceId = self.generate_physical_id(event) 171 | self._send() 172 | 173 | def _poll_enabled(self): 174 | return getattr(self, "_poll_{}_func".format(self._event['RequestType'].lower())) 175 | 176 | def create(self, func): 177 | self._create_func = func 178 | return func 179 | 180 | def update(self, func): 181 | self._update_func = func 182 | return func 183 | 184 | def delete(self, func): 185 | self._delete_func = func 186 | return func 187 | 188 | def poll_create(self, func): 189 | self._poll_create_func = func 190 | return func 191 | 192 | def poll_update(self, func): 193 | self._poll_update_func = func 194 | return func 195 | 196 | def poll_delete(self, func): 197 | self._poll_delete_func = func 198 | return func 199 | 200 | def _wrap_function(self, func): 201 | try: 202 | self.PhysicalResourceId = func(self._event, self._context) if func else '' 203 | except Exception as e: 204 | logger.error(str(e), exc_info=True) 205 | self.Reason = str(e) 206 | self.Status = FAILED 207 | 208 | def _timeout(self): 209 | logger.error("Execution is about to time out, sending failure message") 210 | self._send(FAILED, "Execution timed out") 211 | 212 | def _set_timeout(self): 213 | self._timer = 
threading.Timer((self._context.get_remaining_time_in_millis() / 1000.00) - 0.5, 214 | self._timeout) 215 | self._timer.start() 216 | 217 | def _get_func(self): 218 | request_type = "_{}_func" 219 | if "CrHelperPoll" in self._event.keys(): 220 | request_type = "_poll" + request_type 221 | return getattr(self, request_type.format(self._event['RequestType'].lower())) 222 | 223 | def _send(self, status=None, reason="", send_response=_send_response): 224 | if len(str(str(self.Reason))) > 256: 225 | self.Reason = "ERROR: (truncated) " + str(self.Reason)[len(str(self.Reason)) - 240:] 226 | if len(str(reason)) > 256: 227 | reason = "ERROR: (truncated) " + str(reason)[len(str(reason)) - 240:] 228 | response_body = { 229 | 'Status': self.Status, 230 | 'PhysicalResourceId': str(self.PhysicalResourceId), 231 | 'StackId': self.StackId, 232 | 'RequestId': self.RequestId, 233 | 'LogicalResourceId': self.LogicalResourceId, 234 | 'Reason': str(self.Reason), 235 | 'Data': self.Data, 236 | } 237 | if status: 238 | response_body.update({'Status': status, 'Reason': reason}) 239 | send_response(self._response_url, response_body) 240 | 241 | def init_failure(self, error): 242 | self._init_failed = error 243 | logger.error(str(error), exc_info=True) 244 | 245 | def _cleanup_response(self): 246 | for k in ["CrHelperPoll", "CrHelperPermission", "CrHelperRule"]: 247 | if k in self.Data.keys(): 248 | del self.Data[k] 249 | 250 | @staticmethod 251 | def _rand_string(l): 252 | return ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(l)) 253 | 254 | def _add_permission(self, rule_arn): 255 | sid = self._event['LogicalResourceId'] + self._rand_string(8) 256 | self._lambda_client.add_permission( 257 | FunctionName=self._context.function_name, 258 | StatementId=sid, 259 | Action='lambda:InvokeFunction', 260 | Principal='events.amazonaws.com', 261 | SourceArn=rule_arn 262 | ) 263 | return sid 264 | 265 | def _put_rule(self): 266 | response = self._events_client.put_rule( 267 | Name=self._event['LogicalResourceId'] + self._rand_string(8), 268 | ScheduleExpression='rate({} minutes)'.format(self._polling_interval), 269 | State='ENABLED', 270 | ) 271 | return response["RuleArn"] 272 | 273 | def _put_targets(self, func_name): 274 | region = self._event['CrHelperRule'].split(":")[3] 275 | account_id = self._event['CrHelperRule'].split(":")[4] 276 | partition = self._event['CrHelperRule'].split(":")[1] 277 | rule_name = self._event['CrHelperRule'].split("/")[1] 278 | logger.debug(self._event) 279 | self._events_client.put_targets( 280 | Rule=rule_name, 281 | Targets=[ 282 | { 283 | 'Id': '1', 284 | 'Arn': 'arn:%s:lambda:%s:%s:function:%s' % (partition, region, account_id, func_name), 285 | 'Input': json.dumps(self._event) 286 | } 287 | ] 288 | ) 289 | 290 | def _remove_targets(self, rule_arn): 291 | self._events_client.remove_targets( 292 | Rule=rule_arn.split("/")[1], 293 | Ids=['1'] 294 | ) 295 | 296 | def _remove_permission(self, sid): 297 | self._lambda_client.remove_permission( 298 | FunctionName=self._context.function_name, 299 | StatementId=sid 300 | ) 301 | 302 | def _delete_rule(self, rule_arn): 303 | self._events_client.delete_rule( 304 | Name=rule_arn.split("/")[1] 305 | ) 306 | 307 | def _setup_polling(self): 308 | self._event['CrHelperData'] = self.Data 309 | self._event['CrHelperPoll'] = True 310 | self._event['CrHelperRule'] = self._put_rule() 311 | self._event['CrHelperPermission'] = self._add_permission(self._event['CrHelperRule']) 312 | self._put_targets(self._context.function_name) 313 
| 314 | def _remove_polling(self): 315 | if 'CrHelperData' in self._event.keys(): 316 | self._event.pop('CrHelperData') 317 | if "PhysicalResourceId" in self.Data.keys(): 318 | self.Data.pop("PhysicalResourceId") 319 | if 'CrHelperRule' in self._event.keys(): 320 | self._remove_targets(self._event['CrHelperRule']) 321 | else: 322 | logger.error("Cannot remove CloudWatch events target, Rule arn not available in event") 323 | if 'CrHelperPermission' in self._event.keys(): 324 | self._remove_permission(self._event['CrHelperPermission']) 325 | else: 326 | logger.error("Cannot remove lambda events permission, permission id not available in event") 327 | if 'CrHelperRule' in self._event.keys(): 328 | self._delete_rule(self._event['CrHelperRule']) 329 | else: 330 | logger.error("Cannot remove CloudWatch events rule, Rule arn not available in event") 331 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/crhelper/utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import json 3 | import logging 4 | import time 5 | from urllib.parse import urlsplit, urlunsplit 6 | from http.client import HTTPSConnection 7 | 8 | logger = logging.getLogger(__name__) 9 | 10 | 11 | def _send_response(response_url, response_body): 12 | try: 13 | json_response_body = json.dumps(response_body) 14 | except Exception as e: 15 | msg = "Failed to convert response to json: {}".format(str(e)) 16 | logger.error(msg, exc_info=True) 17 | response_body = {'Status': 'FAILED', 'Data': {}, 'Reason': msg} 18 | json_response_body = json.dumps(response_body) 19 | logger.debug("CFN response URL: {}".format(response_url)) 20 | logger.debug(json_response_body) 21 | headers = {'content-type': '', 'content-length': str(len(json_response_body))} 22 | split_url = urlsplit(response_url) 23 | host = split_url.netloc 24 | url = urlunsplit(("", "", *split_url[2:])) 25 | while True: 26 | try: 27 | connection = HTTPSConnection(host) 28 | connection.request(method="PUT", url=url, body=json_response_body, headers=headers) 29 | response = connection.getresponse() 30 | logger.info("CloudFormation returned status code: {}".format(response.reason)) 31 | break 32 | except Exception as e: 33 | logger.error("Unexpected failure sending response to CloudFormation {}".format(e), exc_info=True) 34 | time.sleep(5) 35 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/lambda.py: -------------------------------------------------------------------------------- 1 | # /********************************************************************************************************************* 2 | # * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. * 3 | # * * 4 | # * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance * 5 | # * with the License. A copy of the License is located at * 6 | # * * 7 | # * http://www.apache.org/licenses/LICENSE-2.0 * 8 | # * * 9 | # * or in the 'license' file accompanying this file. This file is distributed on an 'AS IS' BASIS, WITHOUT WARRANTIES * 10 | # * OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions * 11 | # * and limitations under the License.
* 12 | # *********************************************************************************************************************/ 13 | 14 | from __future__ import print_function 15 | from crhelper import CfnResource 16 | import logging 17 | import boto3 18 | import time 19 | 20 | logger = logging.getLogger(__name__) 21 | # Initialise the helper, all inputs are optional, this example shows the defaults 22 | helper = CfnResource(json_logging=False, log_level='DEBUG', boto_level='CRITICAL') 23 | 24 | try: 25 | codebuild = boto3.client('codebuild') 26 | # pass 27 | except Exception as e: 28 | helper.init_failure(e) 29 | 30 | 31 | @helper.create 32 | def create(event, context): 33 | logger.info("Got Create") 34 | start_build_job(event, context) 35 | 36 | 37 | @helper.update 38 | def update(event, context): 39 | logger.info("Got Update") 40 | start_build_job(event, context) 41 | 42 | 43 | @helper.delete 44 | def delete(event, context): 45 | logger.info("Got Delete") 46 | start_build_job(event, context, action='teardown') 47 | # Delete never returns anything. Should not fail if the underlying resources are already deleted. Desired state. 48 | 49 | 50 | @helper.poll_create 51 | def poll_create(event, context): 52 | logger.info("Got Create poll") 53 | return check_build_job_status(event, context) 54 | 55 | 56 | @helper.poll_update 57 | def poll_update(event, context): 58 | logger.info("Got Update poll") 59 | return check_build_job_status(event, context) 60 | 61 | 62 | @helper.poll_delete 63 | def poll_delete(event, context): 64 | logger.info("Got Delete poll") 65 | return check_build_job_status(event, context) 66 | 67 | 68 | def handler(event, context): 69 | helper(event, context) 70 | 71 | 72 | def start_build_job(event, context, action='setup'): 73 | response = codebuild.start_build( 74 | projectName=event['ResourceProperties']['CodeBuildProjectName'], 75 | environmentVariablesOverride=[{ 76 | 'name': 'SOLUTION_ACTION', 77 | 'value': action, 78 | 'type': 'PLAINTEXT' 79 | }] 80 | ) 81 | logger.info(response) 82 | 83 | helper.Data.update({"JobID": response['build']['id']}) 84 | 85 | 86 | def check_build_job_status(event, context): 87 | code_build_project_name = event['ResourceProperties']['CodeBuildProjectName'] 88 | 89 | if not helper.Data.get("JobID"): 90 | raise ValueError("Job ID missing in the polling event.") 91 | 92 | job_id = helper.Data.get("JobID") 93 | 94 | # 'SUCCEEDED' | 'FAILED' | 'FAULT' | 'TIMED_OUT' | 'IN_PROGRESS' | 'STOPPED' 95 | response = codebuild.batch_get_builds(ids=[job_id]) 96 | build_status = response['builds'][0]['buildStatus'] 97 | 98 | if build_status == 'IN_PROGRESS': 99 | logger.info(build_status) 100 | return None 101 | else: 102 | if build_status == 'SUCCEEDED': 103 | logger.info(build_status) 104 | return True 105 | else: 106 | msg = "Code Build job '{0}' in project '{1}' exited with a build status of '{2}'. Please check the code build job output log for more information." 
\ 107 | .format(job_id, code_build_project_name, build_status) 108 | logger.info(msg) 109 | raise ValueError(msg) 110 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/requirements.txt: -------------------------------------------------------------------------------- 1 | crhelper 2 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__init__.py -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/test_log_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/test_log_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/test_resource_helper.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/test_resource_helper.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/__pycache__/test_utils.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/__pycache__/test_utils.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/test_log_helper.py: -------------------------------------------------------------------------------- 1 | from crhelper.log_helper import * 2 | import unittest 3 | import logging 4 | 5 | 6 | class TestLogHelper(unittest.TestCase): 7 | 8 | def test_logging_no_formatting(self): 9 | logger = logging.getLogger('1') 10 | handler = logging.StreamHandler() 11 | logger.addHandler(handler) 12 | orig_formatters = [] 13 | for c in range(len(logging.root.handlers)): 14 | orig_formatters.append(logging.root.handlers[c].formatter) 15 | setup(level='DEBUG', formatter_cls=None, boto_level='CRITICAL') 16 | new_formatters = [] 17 | for c in range(len(logging.root.handlers)): 18 | new_formatters.append(logging.root.handlers[c].formatter) 19 | 
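# With formatter_cls=None, setup() must leave the formatters already attached
# to the root handlers untouched: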
self.assertEqual(orig_formatters, new_formatters) 20 | 21 | def test_logging_boto_explicit(self): 22 | logger = logging.getLogger('2') 23 | handler = logging.StreamHandler() 24 | logger.addHandler(handler) 25 | setup(level='DEBUG', formatter_cls=None, boto_level='CRITICAL') 26 | for t in ['boto', 'boto3', 'botocore', 'urllib3']: 27 | b_logger = logging.getLogger(t) 28 | self.assertEqual(b_logger.level, 50) 29 | 30 | def test_logging_json(self): 31 | logger = logging.getLogger('3') 32 | handler = logging.StreamHandler() 33 | logger.addHandler(handler) 34 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 35 | for handler in logging.root.handlers: 36 | self.assertEqual(JsonFormatter, type(handler.formatter)) 37 | 38 | def test_logging_boto_implicit(self): 39 | logger = logging.getLogger('4') 40 | handler = logging.StreamHandler() 41 | logger.addHandler(handler) 42 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 43 | for t in ['boto', 'boto3', 'botocore', 'urllib3']: 44 | b_logger = logging.getLogger(t) 45 | self.assertEqual(b_logger.level, 10) 46 | 47 | def test_logging_json_keys(self): 48 | with self.assertLogs() as ctx: 49 | logger = logging.getLogger() 50 | handler = logging.StreamHandler() 51 | logger.addHandler(handler) 52 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 53 | logger.info("test") 54 | logs = json.loads(ctx.output[0]) 55 | self.assertEqual(["timestamp", "level", "location", "RequestType", "message"], list(logs.keys())) 56 | 57 | def test_logging_json_parse_message(self): 58 | with self.assertLogs() as ctx: 59 | logger = logging.getLogger() 60 | handler = logging.StreamHandler() 61 | logger.addHandler(handler) 62 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 63 | logger.info("{}") 64 | logs = json.loads(ctx.output[0]) 65 | self.assertEqual({}, logs["message"]) 66 | 67 | def test_logging_json_exception(self): 68 | with self.assertLogs() as ctx: 69 | logger = logging.getLogger() 70 | handler = logging.StreamHandler() 71 | logger.addHandler(handler) 72 | setup(level='DEBUG', formatter_cls=JsonFormatter, RequestType='ContainerInit') 73 | try: 74 | 1 + 't' 75 | except Exception as e: 76 | logger.info("[]", exc_info=True) 77 | logs = json.loads(ctx.output[0]) 78 | self.assertIn("exception", logs.keys()) 79 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/test_resource_helper.py: -------------------------------------------------------------------------------- 1 | import os 2 | import crhelper 3 | import unittest 4 | from unittest.mock import call, patch, Mock 5 | import threading 6 | 7 | test_events = { 8 | "Create": { 9 | "RequestType": "Create", 10 | "RequestId": "test-event-id", 11 | "StackId": "arn/test-stack-id/guid", 12 | "LogicalResourceId": "TestResourceId", 13 | "ResponseURL": "response_url" 14 | }, 15 | "Update": { 16 | "RequestType": "Update", 17 | "RequestId": "test-event-id", 18 | "StackId": "test-stack-id", 19 | "LogicalResourceId": "TestResourceId", 20 | "PhysicalResourceId": "test-pid", 21 | "ResponseURL": "response_url" 22 | }, 23 | "Delete": { 24 | "RequestType": "Delete", 25 | "RequestId": "test-event-id", 26 | "StackId": "test-stack-id", 27 | "LogicalResourceId": "TestResourceId", 28 | "PhysicalResourceId": "test-pid", 29 | "ResponseURL": "response_url" 30 | } 31 | } 32 | 33 | 34 | class MockContext(object): 35 | 36 | function_name = "test-function" 37 | 
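# MockContext fakes the slice of the Lambda context object that crhelper
# touches: function_name plus get_remaining_time_in_millis(). ms_remaining is
# kept as a class attribute so individual tests can dial the remaining time
# up or down.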
ms_remaining = 9000 38 | 39 | @staticmethod 40 | def get_remaining_time_in_millis(): 41 | return MockContext.ms_remaining 42 | 43 | 44 | class TestCfnResource(unittest.TestCase): 45 | def setUp(self): 46 | os.environ['AWS_REGION'] = 'us-east-1' 47 | 48 | def tearDown(self): 49 | os.environ.pop('AWS_REGION', None) 50 | 51 | @patch('crhelper.log_helper.setup', return_value=None) 52 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 53 | def test_init(self, mock_method): 54 | crhelper.resource_helper.CfnResource() 55 | mock_method.assert_called_once_with('DEBUG', boto_level='ERROR', formatter_cls=None) 56 | 57 | crhelper.resource_helper.CfnResource(json_logging=True) 58 | mock_method.assert_called_with('DEBUG', boto_level='ERROR', RequestType='ContainerInit') 59 | 60 | @patch('crhelper.log_helper.setup', return_value=None) 61 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 62 | def test_init_failure(self, mock_method): 63 | mock_method.side_effect = Exception("test") 64 | c = crhelper.resource_helper.CfnResource(json_logging=True) 65 | self.assertTrue(c._init_failed) 66 | 67 | @patch('crhelper.log_helper.setup', Mock()) 68 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 69 | @patch('crhelper.resource_helper.CfnResource._polling_init', Mock()) 70 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 71 | @patch('crhelper.resource_helper.CfnResource._send') 72 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 73 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 74 | def test_init_failure_call(self, mock_send): 75 | c = crhelper.resource_helper.CfnResource() 76 | c.init_failure(Exception('TestException')) 77 | 78 | event = test_events["Create"] 79 | c.__call__(event, MockContext) 80 | 81 | self.assertEqual([call('FAILED', 'TestException')], mock_send.call_args_list) 82 | 83 | @patch('crhelper.log_helper.setup', Mock()) 84 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 85 | @patch('crhelper.resource_helper.CfnResource._polling_init', Mock()) 86 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 87 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 88 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 89 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 90 | @patch('crhelper.resource_helper.CfnResource._cfn_response', return_value=None) 91 | def test_call(self, cfn_response_mock): 92 | c = crhelper.resource_helper.CfnResource() 93 | event = test_events["Create"] 94 | c.__call__(event, MockContext) 95 | self.assertTrue(c._send_response) 96 | cfn_response_mock.assert_called_once_with(event) 97 | 98 | c._sam_local = True 99 | c._poll_enabled = Mock(return_value=True) 100 | c._polling_init = Mock() 101 | c.__call__(event, MockContext) 102 | c._polling_init.assert_not_called() 103 | self.assertEqual(1, len(cfn_response_mock.call_args_list)) 104 | 105 | c._sam_local = False 106 | c._send_response = False 107 | c.__call__(event, MockContext) 108 | c._polling_init.assert_called() 109 | self.assertEqual(1, len(cfn_response_mock.call_args_list)) 110 | 111 | event = test_events["Delete"] 112 | c._wait_for_cwlogs = Mock() 113 | c._poll_enabled = Mock(return_value=False) 114 | c.__call__(event, MockContext) 115 | c._wait_for_cwlogs.assert_called() 116 | 117 | c._send = Mock() 118 | cfn_response_mock.side_effect = Exception("test") 119 | 
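# When _cfn_response raises, __call__ must trap the exception and still report
# FAILED (with the error text) back to CloudFormation: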
c.__call__(event, MockContext) 120 | c._send.assert_called_with('FAILED', "test") 121 | 122 | @patch('crhelper.log_helper.setup', Mock()) 123 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 124 | @patch('crhelper.resource_helper.CfnResource._polling_init', Mock()) 125 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 126 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 127 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 128 | @patch('crhelper.resource_helper.CfnResource._cfn_response', Mock(return_value=None)) 129 | def test_wait_for_cwlogs(self): 130 | 131 | c = crhelper.resource_helper.CfnResource() 132 | c._context = MockContext 133 | s = Mock() 134 | c._wait_for_cwlogs(sleep=s) 135 | s.assert_not_called() 136 | MockContext.ms_remaining = 140000 137 | c._wait_for_cwlogs(sleep=s) 138 | s.assert_called_once() 139 | 140 | @patch('crhelper.log_helper.setup', Mock()) 141 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 142 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 143 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 144 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 145 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 146 | @patch('crhelper.resource_helper.CfnResource._cfn_response', Mock()) 147 | def test_polling_init(self): 148 | c = crhelper.resource_helper.CfnResource() 149 | event = test_events['Create'] 150 | c._setup_polling = Mock() 151 | c._remove_polling = Mock() 152 | c._polling_init(event) 153 | c._setup_polling.assert_called_once() 154 | c._remove_polling.assert_not_called() 155 | self.assertEqual(c.PhysicalResourceId, None) 156 | 157 | c.Status = 'FAILED' 158 | c._setup_polling.assert_called_once() 159 | c._setup_polling.assert_called_once() 160 | 161 | c = crhelper.resource_helper.CfnResource() 162 | event = test_events['Create'] 163 | c._setup_polling = Mock() 164 | c._remove_polling = Mock() 165 | event['CrHelperPoll'] = "Some stuff" 166 | c.PhysicalResourceId = None 167 | c._polling_init(event) 168 | c._remove_polling.assert_not_called() 169 | c._setup_polling.assert_not_called() 170 | 171 | c.Status = 'FAILED' 172 | c._polling_init(event) 173 | c._remove_polling.assert_called_once() 174 | c._setup_polling.assert_not_called() 175 | 176 | c.Status = '' 177 | c.PhysicalResourceId = "some-id" 178 | c._remove_polling.assert_called() 179 | c._setup_polling.assert_not_called() 180 | 181 | @patch('crhelper.log_helper.setup', Mock()) 182 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 183 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 184 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 185 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 186 | @patch('crhelper.resource_helper.CfnResource._wrap_function', Mock()) 187 | def test_cfn_response(self): 188 | c = crhelper.resource_helper.CfnResource() 189 | event = test_events['Create'] 190 | c._send = Mock() 191 | 192 | orig_pid = c.PhysicalResourceId 193 | self.assertEqual(orig_pid, '') 194 | c._cfn_response(event) 195 | c._send.assert_called_once() 196 | print("RID: [%s]" % [c.PhysicalResourceId]) 197 | self.assertEqual(True, c.PhysicalResourceId.startswith('test-stack-id_TestResourceId_')) 198 | 199 | c._send = Mock() 200 | c.PhysicalResourceId = 'testpid' 201 | c._cfn_response(event) 202 | 
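# A caller-supplied string PhysicalResourceId must be passed through unchanged: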
c._send.assert_called_once() 203 | self.assertEqual('testpid', c.PhysicalResourceId) 204 | 205 | c._send = Mock() 206 | c.PhysicalResourceId = True 207 | c._cfn_response(event) 208 | c._send.assert_called_once() 209 | self.assertEqual(True, c.PhysicalResourceId.startswith('test-stack-id_TestResourceId_')) 210 | 211 | c._send = Mock() 212 | c.PhysicalResourceId = '' 213 | event['PhysicalResourceId'] = 'pid-from-event' 214 | c._cfn_response(event) 215 | c._send.assert_called_once() 216 | self.assertEqual('pid-from-event', c.PhysicalResourceId) 217 | 218 | @patch('crhelper.log_helper.setup', Mock()) 219 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 220 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 221 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 222 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 223 | def test_wrap_function(self): 224 | c = crhelper.resource_helper.CfnResource() 225 | 226 | def func(e, c): 227 | return 'testpid' 228 | 229 | c._wrap_function(func) 230 | self.assertEqual('testpid', c.PhysicalResourceId) 231 | self.assertNotEqual('FAILED', c.Status) 232 | 233 | def func(e, c): 234 | raise Exception('test exception') 235 | 236 | c._wrap_function(func) 237 | self.assertEqual('FAILED', c.Status) 238 | self.assertEqual('test exception', c.Reason) 239 | 240 | @patch('crhelper.log_helper.setup', Mock()) 241 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 242 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 243 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 244 | def test_send(self): 245 | c = crhelper.resource_helper.CfnResource() 246 | s = Mock() 247 | c._send(send_response=s) 248 | s.assert_called_once() 249 | 250 | @patch('crhelper.log_helper.setup', Mock()) 251 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 252 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 253 | @patch('crhelper.resource_helper.CfnResource._send', return_value=None) 254 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 255 | def test_timeout(self, s): 256 | c = crhelper.resource_helper.CfnResource() 257 | c._timeout() 258 | s.assert_called_with('FAILED', "Execution timed out") 259 | 260 | @patch('crhelper.log_helper.setup', Mock()) 261 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 262 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 263 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 264 | def test_set_timeout(self): 265 | c = crhelper.resource_helper.CfnResource() 266 | c._context = MockContext() 267 | def func(): 268 | return None 269 | 270 | c._set_timeout() 271 | t = threading.Timer(1000, func) 272 | self.assertEqual(type(t), type(c._timer)) 273 | t.cancel() 274 | c._timer.cancel() 275 | 276 | @patch('crhelper.log_helper.setup', Mock()) 277 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 278 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 279 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 280 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 281 | def test_cleanup_response(self): 282 | c = crhelper.resource_helper.CfnResource() 283 | c.Data = {"CrHelperPoll": 1, "CrHelperPermission": 2, "CrHelperRule": 3} 284 | c._cleanup_response() 285 | 
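# Every CrHelper* bookkeeping key must be scrubbed from Data before the
# response is handed back to CloudFormation: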
self.assertEqual({}, c.Data) 286 | 287 | @patch('crhelper.log_helper.setup', Mock()) 288 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 289 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 290 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 291 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 292 | def test_remove_polling(self): 293 | c = crhelper.resource_helper.CfnResource() 294 | c._context = MockContext() 295 | 296 | c._events_client.remove_targets = Mock() 297 | c._events_client.delete_rule = Mock() 298 | c._lambda_client.remove_permission = Mock() 299 | 300 | with self.assertRaises(Exception) as e: 301 | c._remove_polling() 302 | 303 | self.assertEqual("failed to cleanup CloudWatch event polling", str(e)) 304 | c._events_client.remove_targets.assert_not_called() 305 | c._events_client.delete_rule.assert_not_called() 306 | c._lambda_client.remove_permission.assert_not_called() 307 | 308 | c._event["CrHelperRule"] = "1/2" 309 | c._event["CrHelperPermission"] = "1/2" 310 | c._remove_polling() 311 | c._events_client.remove_targets.assert_called() 312 | c._events_client.delete_rule.assert_called() 313 | c._lambda_client.remove_permission.assert_called() 314 | 315 | @patch('crhelper.log_helper.setup', Mock()) 316 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 317 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 318 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 319 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 320 | def test_setup_polling(self): 321 | c = crhelper.resource_helper.CfnResource() 322 | c._context = MockContext() 323 | c._event = test_events["Update"] 324 | c._lambda_client.add_permission = Mock() 325 | c._events_client.put_rule = Mock(return_value={"RuleArn": "arn:aws:lambda:blah:blah:function:blah/blah"}) 326 | c._events_client.put_targets = Mock() 327 | c._setup_polling() 328 | c._events_client.put_targets.assert_called() 329 | c._events_client.put_rule.assert_called() 330 | c._lambda_client.add_permission.assert_called() 331 | 332 | @patch('crhelper.log_helper.setup', Mock()) 333 | @patch('crhelper.resource_helper.CfnResource._poll_enabled', Mock(return_value=False)) 334 | @patch('crhelper.resource_helper.CfnResource._wait_for_cwlogs', Mock()) 335 | @patch('crhelper.resource_helper.CfnResource._send', Mock()) 336 | @patch('crhelper.resource_helper.CfnResource._set_timeout', Mock()) 337 | def test_wrappers(self): 338 | c = crhelper.resource_helper.CfnResource() 339 | 340 | def func(): 341 | pass 342 | 343 | for f in ["create", "update", "delete", "poll_create", "poll_update", "poll_delete"]: 344 | self.assertEqual(None, getattr(c, "_%s_func" % f)) 345 | getattr(c, f)(func) 346 | self.assertEqual(func, getattr(c, "_%s_func" % f)) 347 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/test_utils.py: -------------------------------------------------------------------------------- 1 | import json 2 | from unittest.mock import patch, Mock 3 | from crhelper import utils 4 | import unittest 5 | 6 | 7 | class TestLogHelper(unittest.TestCase): 8 | TEST_URL = "https://test_url/this/is/the/url?query=123#aaa" 9 | 10 | @patch('crhelper.utils.HTTPSConnection', autospec=True) 11 | def test_send_succeeded_response(self, https_connection_mock): 12 | utils._send_response(self.TEST_URL, {}) 13 | 
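# The response must be PUT straight to the pre-signed URL: host taken from the
# URL's netloc, an intentionally empty content-type, and an exact
# content-length: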
https_connection_mock.assert_called_once_with("test_url") 14 | https_connection_mock.return_value.request.assert_called_once_with( 15 | body='{}', 16 | headers={"content-type": "", "content-length": "2"}, 17 | method="PUT", 18 | url="/this/is/the/url?query=123#aaa", 19 | ) 20 | 21 | @patch('crhelper.utils.HTTPSConnection', autospec=True) 22 | def test_send_failed_response(self, https_connection_mock): 23 | utils._send_response(self.TEST_URL, Mock()) 24 | https_connection_mock.assert_called_once_with("test_url") 25 | response = json.loads(https_connection_mock.return_value.request.call_args[1]["body"]) 26 | expected_body = '{"Status": "FAILED", "Data": {}, "Reason": "' + response["Reason"] + '"}' 27 | https_connection_mock.return_value.request.assert_called_once_with( 28 | body=expected_body, 29 | headers={"content-type": "", "content-length": str(len(expected_body))}, 30 | method="PUT", 31 | url="/this/is/the/url?query=123#aaa", 32 | ) 33 | -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/unit/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/unit/__init__.py -------------------------------------------------------------------------------- /source/GenomicsLearningCode/setup/tests/unit/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/awslabs/genomics-tertiary-analysis-and-machine-learning-using-amazon-sagemaker/95145950e705a6b8ec3b1eb449d9eee95938f37d/source/GenomicsLearningCode/setup/tests/unit/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /source/GenomicsLearningPipe/pipe_cfn.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | 3 | Description: GenomicsLearningPipe 4 | 5 | Parameters: 6 | ResourcePrefix: 7 | Type: String 8 | Default: GenomicsLearning 9 | ResourcePrefixLowercase: 10 | Type: String 11 | Default: genomicslearning 12 | 13 | Resources: 14 | 15 | SourceEvent: 16 | Type: AWS::Events::Rule 17 | DependsOn: 18 | - CodePipeline 19 | - SourceEventRole 20 | Properties: 21 | Description: Rule for Amazon CloudWatch Events to detect changes to the source 22 | repository and trigger pipeline execution 23 | EventPattern: 24 | detail: 25 | event: 26 | - referenceCreated 27 | - referenceUpdated 28 | referenceName: 29 | - master 30 | referenceType: 31 | - branch 32 | detail-type: 33 | - CodeCommit Repository State Change 34 | resources: 35 | - !Sub ${Repo.Arn} 36 | source: 37 | - aws.codecommit 38 | Name: !Sub ${Repo}-Pipeline-Trigger 39 | State: ENABLED 40 | Targets: 41 | - Arn: !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${CodePipeline} 42 | Id: ProjectPipelineTarget 43 | RoleArn: !Sub ${SourceEventRole.Arn} 44 | 45 | Repo: 46 | DeletionPolicy: Retain 47 | Type: AWS::CodeCommit::Repository 48 | Properties: 49 | RepositoryName: !Sub ${ResourcePrefix} 50 | RepositoryDescription: !Sub ${ResourcePrefix} 51 | 52 | CodeBuildCopyResourcesProject: 53 | Type: AWS::CodeBuild::Project 54 | DependsOn: 55 | - BuildBucket 56 | - CodeBuildRole 57 | - ResourcesBucket 58 | Properties: 59 | Name: !Sub ${ResourcePrefix}CopyResources 60 | 
Description: !Sub ${ResourcePrefix}CopyResources 61 | Artifacts: 62 | Type: CODEPIPELINE 63 | Environment: 64 | Type: LINUX_CONTAINER 65 | ComputeType: BUILD_GENERAL1_SMALL 66 | Image: aws/codebuild/standard:3.0 67 | EnvironmentVariables: 68 | - Name: RESOURCES_BUCKET 69 | Value: !Sub ${ResourcesBucket} 70 | ServiceRole: !Sub ${CodeBuildRole.Arn} 71 | Source: 72 | Type: CODEPIPELINE 73 | BuildSpec: copyresources_buildspec.yml 74 | Metadata: 75 | cfn_nag: 76 | rules_to_suppress: 77 | - id: W32 78 | reason: Artifact outputs are encrypted by default. 79 | 80 | CodePipeline: 81 | Type: AWS::CodePipeline::Pipeline 82 | DependsOn: 83 | - CodeBuildCopyResourcesProject 84 | - CodePipelineRole 85 | - Repo 86 | Properties: 87 | ArtifactStore: 88 | Location: !Ref BuildBucket 89 | Type: S3 90 | Name: !Sub ${ResourcePrefix}CodePipeline 91 | RoleArn: !GetAtt CodePipelineRole.Arn 92 | Stages: 93 | - Name: Source 94 | Actions: 95 | - Name: CodeCommitRepo 96 | ActionTypeId: 97 | Category: Source 98 | Owner: AWS 99 | Provider: CodeCommit 100 | Version: 1 101 | Configuration: 102 | BranchName: master 103 | RepositoryName: !Sub ${ResourcePrefix} 104 | PollForSourceChanges: false 105 | OutputArtifacts: 106 | - Name: SourceStageOutput 107 | - Name: Build 108 | Actions: 109 | - Name: CopyResources 110 | ActionTypeId: 111 | Category: Build 112 | Owner: AWS 113 | Provider: CodeBuild 114 | Version: 1 115 | Configuration: 116 | ProjectName: !Sub ${ResourcePrefix}CopyResources 117 | InputArtifacts: 118 | - Name: SourceStageOutput 119 | - Name: CreateStack 120 | Actions: 121 | - Name: CreateStack 122 | ActionTypeId: 123 | Category: Deploy 124 | Owner: AWS 125 | Provider: CloudFormation 126 | Version: 1 127 | Configuration: 128 | StackName: !Sub ${ResourcePrefix} 129 | ActionMode: CREATE_UPDATE 130 | Capabilities: CAPABILITY_NAMED_IAM 131 | RoleArn: !Sub ${CloudFormationRole.Arn} 132 | TemplatePath: !Sub SourceStageOutput::code_cfn.yml 133 | ParameterOverrides: !Sub | 134 | { 135 | "ResourcePrefix" : "${ResourcePrefix}", 136 | "ResourcePrefixLowercase" : "${ResourcePrefixLowercase}", 137 | "ResourcesBucket" : "${ResourcesBucket}", 138 | "DataLakeBucket": "${DataLakeBucket}" 139 | } 140 | InputArtifacts: 141 | - Name: SourceStageOutput 142 | OutputArtifacts: [] 143 | 144 | CloudFormationRole: 145 | Type: AWS::IAM::Role 146 | Properties: 147 | Path: / 148 | AssumeRolePolicyDocument: 149 | Version: 2012-10-17 150 | Statement: 151 | - Effect: Allow 152 | Action: 153 | - sts:AssumeRole 154 | Principal: 155 | Service: 156 | - cloudformation.amazonaws.com 157 | Policies: 158 | - PolicyName: CloudFormationRolePolicy 159 | PolicyDocument: 160 | Version: 2012-10-17 161 | Statement: 162 | - Effect: Allow 163 | Action: 164 | - iam:GetRolePolicy 165 | Resource: '*' 166 | - Effect: Allow 167 | Action: 168 | - iam:CreateRole 169 | - iam:DeleteRole 170 | - iam:PutRolePolicy 171 | - iam:GetRolePolicy 172 | - iam:DeleteRolePolicy 173 | - iam:AttachRolePolicy 174 | - iam:DetachRolePolicy 175 | - iam:UpdateAssumeRolePolicy 176 | - iam:PassRole 177 | - iam:GetRole 178 | Resource: 179 | - !Sub arn:aws:iam::${AWS::AccountId}:role/${ResourcePrefix}* 180 | - Effect: Allow 181 | Action: 182 | - glue:CreateJob 183 | - glue:UpdateJob 184 | - glue:DeleteJob 185 | - glue:GetJob 186 | Resource: '*' 187 | - Effect: Allow 188 | Action: 189 | - s3:CreateBucket 190 | - s3:DeleteBucket 191 | - s3:GetObject 192 | Resource: 193 | - !Sub ${BuildBucket.Arn} 194 | - !Sub ${BuildBucket.Arn}/* 195 | - Effect: Allow 196 | Action: 197 | - 
sagemaker:CreateNotebookInstanceLifecycleConfig 198 | - sagemaker:DescribeNotebookInstanceLifecycleConfig 199 | - sagemaker:UpdateNotebookInstanceLifecycleConfig 200 | - sagemaker:DeleteNotebookInstanceLifecycleConfig 201 | - sagemaker:CreateNotebookInstance 202 | - sagemaker:UpdateNotebookInstance 203 | - sagemaker:StartNotebookInstance 204 | - sagemaker:DescribeNotebookInstance 205 | - sagemaker:DeleteNotebookInstance 206 | - sagemaker:StopNotebookInstance 207 | Resource: 208 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:notebook-instance-lifecycle-config/${ResourcePrefixLowercase}* 209 | - !Sub arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:notebook-instance/${ResourcePrefixLowercase}* 210 | Metadata: 211 | cfn_nag: 212 | rules_to_suppress: 213 | - id: W11 214 | reason: AWS Glue requires * resources for the specified actions. Same for get role policy. 215 | 216 | CodeBuildRole: 217 | Type: AWS::IAM::Role 218 | DependsOn: ResourcesBucket 219 | Properties: 220 | AssumeRolePolicyDocument: 221 | Version: 2012-10-17 222 | Statement: 223 | - Action: 224 | - sts:AssumeRole 225 | Effect: Allow 226 | Principal: 227 | Service: 228 | - codebuild.amazonaws.com 229 | Path: / 230 | Policies: 231 | - PolicyName: CodeBuildAccess 232 | PolicyDocument: 233 | Version: 2012-10-17 234 | Statement: 235 | - Effect: Allow 236 | Resource: 237 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/codebuild/${ResourcePrefix}* 238 | Action: 239 | - logs:CreateLogGroup 240 | - logs:CreateLogStream 241 | - logs:PutLogEvents 242 | - Effect: Allow 243 | Action: 244 | - s3:GetObject 245 | - s3:GetObjectVersion 246 | - s3:PutObject 247 | Resource: !Sub ${BuildBucket.Arn}/* 248 | - Effect: Allow 249 | Action: 250 | - s3:ListBucket 251 | Resource: 252 | - !Sub ${ResourcesBucket.Arn} 253 | - !Sub ${DataLakeBucket.Arn} 254 | - Effect: Allow 255 | Action: 256 | - s3:PutObject 257 | - s3:PutObjectAcl 258 | Resource: 259 | - !Sub ${ResourcesBucket.Arn} 260 | - !Sub ${ResourcesBucket.Arn}/* 261 | - !Sub ${DataLakeBucket.Arn} 262 | - !Sub ${DataLakeBucket.Arn}/* 263 | 264 | CodePipelineRole: 265 | Type: AWS::IAM::Role 266 | Properties: 267 | AssumeRolePolicyDocument: 268 | Version: 2012-10-17 269 | Statement: 270 | - Action: 271 | - sts:AssumeRole 272 | Effect: Allow 273 | Principal: 274 | Service: 275 | - codepipeline.amazonaws.com 276 | Path: / 277 | Policies: 278 | - PolicyName: CloudFormationAccess 279 | PolicyDocument: 280 | Version: 2012-10-17 281 | Statement: 282 | - Action: 283 | - cloudformation:CreateStack 284 | - cloudformation:DescribeStacks 285 | - cloudformation:UpdateStack 286 | Effect: Allow 287 | Resource: !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${ResourcePrefix}/* 288 | - PolicyName: S3Access 289 | PolicyDocument: 290 | Version: 2012-10-17 291 | Statement: 292 | - Effect: Allow 293 | Action: 294 | - s3:GetObject 295 | - s3:GetObjectVersion 296 | - s3:GetBucketVersioning 297 | - s3:DeleteObject 298 | - s3:PutObject 299 | Resource: 300 | - !Sub ${BuildBucket.Arn} 301 | - !Sub ${BuildBucket.Arn}/* 302 | - PolicyName: CodeBuildAccess 303 | PolicyDocument: 304 | Version: 2012-10-17 305 | Statement: 306 | - Action: 307 | - codebuild:StartBuild 308 | - codebuild:BatchGetBuilds 309 | Effect: Allow 310 | Resource: 311 | - !GetAtt CodeBuildCopyResourcesProject.Arn 312 | - PolicyName: IamAccess 313 | PolicyDocument: 314 | Version: 2012-10-17 315 | Statement: 316 | - Action: 317 | - iam:PassRole 318 | Effect: Allow 319 | Resource: !GetAtt CodeBuildRole.Arn 320 |
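# IamAccessCF below is deliberately separate from IamAccess above: the first
# PassRole hands CodeBuildRole to the build action, while this one hands the
# CloudFormation service role to the CreateStack deploy action.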
- PolicyName: IamAccessCF 321 | PolicyDocument: 322 | Version: 2012-10-17 323 | Statement: 324 | - Action: 325 | - iam:PassRole 326 | Effect: Allow 327 | Resource: !Sub ${CloudFormationRole.Arn} 328 | - PolicyName: CodeCommitAccess 329 | PolicyDocument: 330 | Version: 2012-10-17 331 | Statement: 332 | - Effect: Allow 333 | Action: 334 | - codecommit:UploadArchive 335 | - codecommit:GetBranch 336 | - codecommit:GetCommit 337 | - codecommit:GetUploadArchiveStatus 338 | Resource: !GetAtt Repo.Arn 339 | 340 | SourceEventRole: 341 | Type: AWS::IAM::Role 342 | DependsOn: CodePipeline 343 | Description: IAM role to allow Amazon CloudWatch Events to trigger AWS CodePipeline 344 | execution 345 | Properties: 346 | AssumeRolePolicyDocument: 347 | Statement: 348 | - Action: sts:AssumeRole 349 | Effect: Allow 350 | Principal: 351 | Service: 352 | - events.amazonaws.com 353 | Sid: 1 354 | Policies: 355 | - PolicyName: CloudWatchEventPolicy 356 | PolicyDocument: 357 | Statement: 358 | - Action: 359 | - codepipeline:StartPipelineExecution 360 | Effect: Allow 361 | Resource: 362 | - !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${CodePipeline}* 363 | 364 | BuildBucket: 365 | Type: AWS::S3::Bucket 366 | Properties: 367 | PublicAccessBlockConfiguration: 368 | BlockPublicAcls: True 369 | BlockPublicPolicy: True 370 | IgnorePublicAcls: True 371 | RestrictPublicBuckets: True 372 | LoggingConfiguration: 373 | DestinationBucketName: !Ref LogsBucket 374 | LogFilePrefix: templates_logs/ 375 | BucketEncryption: 376 | ServerSideEncryptionConfiguration: 377 | - ServerSideEncryptionByDefault: 378 | SSEAlgorithm: AES256 379 | Metadata: 380 | cfn_nag: 381 | rules_to_suppress: 382 | - id: W51 383 | reason: Bucket policy is not needed. 384 | 385 | DataLakeBucket: 386 | Type: AWS::S3::Bucket 387 | Properties: 388 | PublicAccessBlockConfiguration: 389 | BlockPublicAcls: True 390 | BlockPublicPolicy: True 391 | IgnorePublicAcls: True 392 | RestrictPublicBuckets: True 393 | LoggingConfiguration: 394 | DestinationBucketName: !Ref LogsBucket 395 | LogFilePrefix: templates_logs/ 396 | BucketEncryption: 397 | ServerSideEncryptionConfiguration: 398 | - ServerSideEncryptionByDefault: 399 | SSEAlgorithm: AES256 400 | Metadata: 401 | cfn_nag: 402 | rules_to_suppress: 403 | - id: W51 404 | reason: Bucket policy is not needed. 405 | 406 | ResourcesBucket: 407 | Type: AWS::S3::Bucket 408 | Properties: 409 | PublicAccessBlockConfiguration: 410 | BlockPublicAcls: True 411 | BlockPublicPolicy: True 412 | IgnorePublicAcls: True 413 | RestrictPublicBuckets: True 414 | LoggingConfiguration: 415 | DestinationBucketName: !Ref LogsBucket 416 | LogFilePrefix: templates_logs/ 417 | BucketEncryption: 418 | ServerSideEncryptionConfiguration: 419 | - ServerSideEncryptionByDefault: 420 | SSEAlgorithm: AES256 421 | Metadata: 422 | cfn_nag: 423 | rules_to_suppress: 424 | - id: W51 425 | reason: Bucket policy is not needed. 426 | 427 | LogsBucket: 428 | DeletionPolicy: Retain 429 | Type: AWS::S3::Bucket 430 | Properties: 431 | PublicAccessBlockConfiguration: 432 | BlockPublicAcls: True 433 | BlockPublicPolicy: True 434 | IgnorePublicAcls: True 435 | RestrictPublicBuckets: True 436 | AccessControl: LogDeliveryWrite 437 | BucketEncryption: 438 | ServerSideEncryptionConfiguration: 439 | - ServerSideEncryptionByDefault: 440 | SSEAlgorithm: AES256 441 | Metadata: 442 | cfn_nag: 443 | rules_to_suppress: 444 | - id: W35 445 | reason: This is the pipeline and solution log bucket and does not require access logging to be configured. 
446 | - id: W51 447 | reason: Bucket policy is not needed. 448 | 449 | Outputs: 450 | LogsBucket: 451 | Value: !Ref LogsBucket 452 | BuildBucket: 453 | Value: !Ref BuildBucket 454 | RepoName: 455 | Description: RepoName 456 | Value: !Sub ${Repo.Name} 457 | RepoHttpUrl: 458 | Description: RepoHttpUrl 459 | Value: !Sub ${Repo.CloneUrlHttp} 460 | ResourcesBucket: 461 | Value: !Ref ResourcesBucket 462 | DataLakeBucket: 463 | Value: !Ref DataLakeBucket 464 | Export: 465 | Name: !Sub ${AWS::StackName}-DataLakeBucket 466 | DataLakeBucketArn: 467 | Value: !GetAtt DataLakeBucket.Arn 468 | Export: 469 | Name: !Sub ${AWS::StackName}-DataLakeBucketArn 470 | 471 | # aws cloudformation update-stack --stack-name ${PROJECT_NAME:-GenomicsLearning}-Pipeline --template-body file://pipe_cfn.yml --capabilities CAPABILITY_NAMED_IAM --output text --parameters ParameterKey=ResourcePrefix,ParameterValue=${PROJECT_NAME:-GenomicsLearning} ParameterKey=ResourcePrefixLowercase,ParameterValue=$(echo ${PROJECT_NAME:-GenomicsLearning} | tr '[:upper:]' '[:lower:]'); aws cloudformation wait stack-update-complete --stack-name ${PROJECT_NAME:-GenomicsLearning}-Pipeline 472 | -------------------------------------------------------------------------------- /source/GenomicsLearningZone/zone_cfn.yml: -------------------------------------------------------------------------------- 1 | --- 2 | AWSTemplateFormatVersion: 2010-09-09 3 | Description: GenomicsLearningZone 4 | 5 | # CodeCommit 6 | # Repo 7 | 8 | Parameters: 9 | ResourcePrefix: 10 | Type: String 11 | Default: GenomicsLearning 12 | ResourcePrefixLowercase: 13 | Type: String 14 | Default: genomicslearning 15 | 16 | Resources: 17 | # CodeCommit 18 | Repo: 19 | DeletionPolicy: Retain 20 | Type: AWS::CodeCommit::Repository 21 | Properties: 22 | RepositoryName: !Sub ${ResourcePrefix}-Pipe 23 | RepositoryDescription: !Sub ${ResourcePrefix}-Pipe 24 | Outputs: 25 | RepoName: 26 | Description: RepoName 27 | Value: !Sub ${Repo.Name} 28 | RepoHttpUrl: 29 | Description: RepoHttpUrl 30 | Value: !Sub ${Repo.CloneUrlHttp} 31 | 32 | # aws cloudformation update-stack --stack-name GenomicsLearningZone --template-body file://template_cfn.yml --capabilities CAPABILITY_IAM --output text; aws cloudformation wait stack-update-complete --stack-name GenomicsLearningZone 33 | 34 | # aws cloudformation create-stack --stack-name GenomicsLearningZone --template-body file://template_cfn.yml --capabilities CAPABILITY_IAM --enable-termination-protection --output text; aws cloudformation wait stack-create-complete --stack-name GenomicsLearningZone; aws cloudformation describe-stacks --stack-name GenomicsLearningZone --query 'Stacks[].Outputs[?OutputKey==`RepoCloneCommand`].OutputValue' --output text 35 | 36 | -------------------------------------------------------------------------------- /source/setup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | 3 | export AWS_DEFAULT_OUTPUT=text 4 | 5 | create_stack() { 6 | local stack_name=${1} 7 | local template_name=${2} 8 | local ResourcePrefix=${3} 9 | 10 | local ResourcePrefix_lowercase=$(echo ${ResourcePrefix} | tr '[:upper:]' '[:lower:]') 11 | 12 | aws cloudformation create-stack --stack-name ${stack_name} --template-body file://${template_name} --parameters ParameterKey=ResourcePrefix,ParameterValue=${ResourcePrefix} ParameterKey=ResourcePrefixLowercase,ParameterValue=${ResourcePrefix_lowercase} --capabilities CAPABILITY_NAMED_IAM --no-enable-termination-protection; aws cloudformation wait 
stack-create-complete --stack-name ${stack_name} 13 | } 14 | 15 | clone_and_commit() { 16 | local stack_name=${1} 17 | 18 | local repo_http_url=$(aws cloudformation describe-stacks --stack-name ${stack_name} --query 'Stacks[].Outputs[?OutputKey==`RepoHttpUrl`].OutputValue') 19 | 20 | git init .; git remote add origin ${repo_http_url} 21 | 22 | git add *; git commit -m "first commit"; git push --set-upstream origin master 23 | } 24 | 25 | wait_for_pipeline() { 26 | local pipeline_name=${1} 27 | local commit_id=${2} 28 | 29 | local message="Max attempts reached. Pipeline execution failed for commit: ${commit_id}" 30 | for i in {1..60}; do 31 | 32 | stage_status=$(aws codepipeline list-pipeline-executions --pipeline-name ${pipeline_name} --query 'pipelineExecutionSummaries[?sourceRevisions[0].revisionId==`'${commit_id}'`].status') 33 | 34 | if [ "${stage_status}" == "InProgress" ] || [ -z "${stage_status}" ]; then 35 | printf '.' 36 | sleep 30 37 | elif [ "${stage_status}" == "Succeeded" ]; then 38 | message="Pipeline execution succeeded for commit: ${commit_id}" 39 | break 40 | elif [ "${stage_status}" == "Failed" ]; then 41 | message="Pipeline execution Failed for commit: ${commit_id}" 42 | break 43 | fi 44 | 45 | done 46 | printf "\n${message}\n" 47 | } 48 | 49 | copy_test_data() { 50 | local artifact_bucket=${1} 51 | local artifact_key_prefix=${2} 52 | local pipe_stackname=${3} 53 | 54 | local data_lake_bucket=$(aws cloudformation describe-stacks --stack-name ${pipe_stackname} --query 'Stacks[].Outputs[?OutputKey==`DataLakeBucket`].OutputValue' --output text) 55 | 56 | aws s3 cp s3://${artifact_bucket}/${artifact_key_prefix}/annotation/clinvar/clinvar.vcf.gz s3://${data_lake_bucket}/annotation/clinvar/clinvar.vcf.gz 57 | aws s3 cp s3://${artifact_bucket}/${artifact_key_prefix}/annotation/clinvar/clinvar.annotated.vcf.gz s3://${data_lake_bucket}/annotation/clinvar/clinvar.annotated.vcf.gz 58 | aws s3 cp s3://${artifact_bucket}/${artifact_key_prefix}/annotation/clinvar/clinvar_conflicting.csv s3://${data_lake_bucket}/annotation/clinvar/conflicting/clinvar_conflicting.csv 59 | } 60 | 61 | setup() { 62 | 63 | local resource_prefix=$1 64 | local artifact_bucket=$2 65 | local artifact_key_prefix=$3 66 | 67 | local dir_prefix="GenomicsLearning" 68 | 69 | local zone_dir="${dir_prefix}Zone" 70 | local pipe_dir="${dir_prefix}Pipe" 71 | local code_dir="${dir_prefix}Code" 72 | 73 | local zone_stackname=${resource_prefix}-LandingZone 74 | local pipe_stackname=${resource_prefix}-Pipeline 75 | 76 | # Create stacks 77 | create_stack "${zone_stackname}" "${zone_dir}/zone_cfn.yml" "${resource_prefix}" 78 | create_stack "${pipe_stackname}" "${pipe_dir}/pipe_cfn.yml" "${resource_prefix}" 79 | 80 | # Clone and commit resources 81 | cd "${pipe_dir}"; clone_and_commit "${zone_stackname}"; cd .. 82 | cd "${code_dir}"; clone_and_commit "${pipe_stackname}"; 83 | 84 | # Get the last commit id 85 | commit_id=$(git log -1 --pretty=format:%H) 86 | cd .. 
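  # Note: wait_for_pipeline matches this exact commit id against
  # sourceRevisions[0].revisionId, so the wait below tracks the pipeline
  # execution triggered by this push rather than whichever run is newest.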
87 | 88 | # Get pipeline name 89 | pipeline_name=$(aws cloudformation describe-stack-resource --stack-name ${pipe_stackname} --logical-resource-id CodePipeline --query 'StackResourceDetail.PhysicalResourceId') 90 | 91 | # Wait for pipeline execution using commit id 92 | wait_for_pipeline "${pipeline_name}" "${commit_id}" 93 | 94 | # Copy Test Data 95 | copy_test_data "${artifact_bucket}" "${artifact_key_prefix}" "${pipe_stackname}" 96 | 97 | # Run Test 98 | "${code_dir}/awscli_test.sh" "${resource_prefix}" 99 | } 100 | 101 | project_name=${PROJECT_NAME:-GenomicsLearning} 102 | 103 | setup "$project_name" "${ARTIFACT_BUCKET}" "${ARTIFACT_KEY_PREFIX}" 104 | -------------------------------------------------------------------------------- /source/setup_cfn.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | 3 | Description: | 4 | (SO0078) - The Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker solution creates a scalable environment 5 | in AWS to develop machine learning models using genomics data, generate predictions, and evaluate model performance. 6 | This solution demonstrates how to 1) automate the preparation of a genomics machine learning training dataset, 7 | 2) develop genomics machine learning model training and deployment pipelines and, 8 | 3) generate predictions and evaluate model performance using test data. 9 | 10 | Mappings: 11 | Send: 12 | AnonymousUsage: 13 | Data: Yes 14 | SourceCode: 15 | General: 16 | S3Bucket: '%%BUCKET_NAME%%' 17 | KeyPrefix: '%%SOLUTION_NAME%%/%%VERSION%%' 18 | 19 | Parameters: 20 | Project: 21 | Type: String 22 | Description: > 23 | The project name for this solution. The project name will be used to prefix resources created by this solution. Project names should be unique to a project. 24 | AllowedPattern: "[a-zA-Z0-9-]{3,24}" 25 | ConstraintDescription: > 26 | Project name should be unique, 3-24 characters in length, and only have alphanumeric characters and hyphens ([a-zA-Z0-9-]{3,32}). 27 | Default: GenomicsLearning 28 | 29 | Resources: 30 | Setup: 31 | Type: Custom::Setup 32 | DependsOn: 33 | - CodeBuild 34 | Version: 1.0 35 | Properties: 36 | ServiceToken: !Sub ${SetupLambda.Arn} 37 | CodeBuildProjectName: !Sub ${CodeBuild} 38 | 39 | SetupLambda: 40 | Type: AWS::Lambda::Function 41 | DependsOn: 42 | - SetupLambdaRole 43 | Properties: 44 | Handler: lambda.handler 45 | Runtime: python3.8 46 | FunctionName: !Sub ${Project}Setup 47 | Code: 48 | S3Bucket: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]] 49 | S3Key: !Join ["", [!FindInMap ["SourceCode", "General", "KeyPrefix"], "/SolutionSetup.zip"]] 50 | Role: !Sub ${SetupLambdaRole.Arn} 51 | Timeout: 600 52 | Metadata: 53 | cfn_nag: 54 | rules_to_suppress: 55 | - id: W58 56 | reason: Bug in CfnNag. Lambda functions require permission to write CloudWatch Logs. 
Looking for PutLogEvent instead of PutLogEvents 57 | 58 | SetupLambdaRole: 59 | Type: AWS::IAM::Role 60 | DependsOn: 61 | - CodeBuild 62 | Properties: 63 | AssumeRolePolicyDocument: 64 | Version: 2012-10-17 65 | Statement: 66 | - Action: 67 | - sts:AssumeRole 68 | Effect: Allow 69 | Principal: 70 | Service: 71 | - lambda.amazonaws.com 72 | Path: / 73 | Policies: 74 | - PolicyName: LogsAccess 75 | PolicyDocument: 76 | Statement: 77 | - Effect: Allow 78 | Action: 79 | - logs:CreateLogGroup 80 | - logs:CreateLogStream 81 | - logs:PutLogEvents 82 | Resource: 83 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${Project}* 84 | - PolicyName: CodeBuildAccess 85 | PolicyDocument: 86 | Statement: 87 | - Effect: Allow 88 | Action: 89 | - codebuild:BatchGetProjects 90 | - codebuild:BatchGetBuilds 91 | - codebuild:StartBuild 92 | Resource: 93 | - !Sub ${CodeBuild.Arn} 94 | - PolicyName: EventsAccess 95 | PolicyDocument: 96 | Statement: 97 | - Effect: Allow 98 | Action: 99 | - events:DeleteRule 100 | - events:PutRule 101 | - events:PutTargets 102 | - events:RemoveTargets 103 | Resource: 104 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/Setup* 105 | - PolicyName: LambdaAccess 106 | PolicyDocument: 107 | Statement: 108 | - Effect: Allow 109 | Action: 110 | - lambda:AddPermission 111 | - lambda:RemovePermission 112 | Resource: 113 | - !Sub arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:${Project}* 114 | 115 | CodeBuildRole: 116 | Type: AWS::IAM::Role 117 | Properties: 118 | AssumeRolePolicyDocument: 119 | Version: 2012-10-17 120 | Statement: 121 | - Action: 122 | - sts:AssumeRole 123 | Effect: Allow 124 | Principal: 125 | Service: 126 | - codebuild.amazonaws.com 127 | Path: / 128 | Policies: 129 | - PolicyName: CloudFormationAccess 130 | PolicyDocument: 131 | Statement: 132 | - Action: 133 | - cloudformation:CreateStack 134 | - cloudformation:DescribeStacks 135 | - cloudformation:DescribeStackResource 136 | - cloudformation:UpdateStack 137 | - cloudformation:DeleteStack 138 | - cloudformation:UpdateTerminationProtection 139 | Effect: Allow 140 | Resource: !Sub arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Project}* 141 | - PolicyName: LogsAccess 142 | PolicyDocument: 143 | Statement: 144 | - Effect: Allow 145 | Action: 146 | - logs:CreateLogGroup 147 | - logs:CreateLogStream 148 | - logs:PutLogEvents 149 | Resource: 150 | - !Sub arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/codebuild/${Project}* 151 | - PolicyName: IAMAccess 152 | PolicyDocument: 153 | Statement: 154 | - Effect: Allow 155 | Action: 156 | - iam:CreateRole 157 | - iam:DeleteRole 158 | - iam:PutRolePolicy 159 | - iam:DeleteRolePolicy 160 | - iam:AttachRolePolicy 161 | - iam:DetachRolePolicy 162 | - iam:UpdateAssumeRolePolicy 163 | - iam:PassRole 164 | - iam:GetRole 165 | - iam:GetInstanceProfile 166 | - iam:CreateInstanceProfile 167 | - iam:DeleteInstanceProfile 168 | - iam:AddRoleToInstanceProfile 169 | - iam:RemoveRoleFromInstanceProfile 170 | Resource: 171 | - !Sub arn:aws:iam::${AWS::AccountId}:role/${Project}* 172 | - !Sub arn:aws:iam::${AWS::AccountId}:instance-profile/${Project}* 173 | - PolicyName: CodeBuildAccess 174 | PolicyDocument: 175 | Statement: 176 | - Effect: Allow 177 | Action: 178 | - codebuild:CreateProject 179 | - codebuild:UpdateProject 180 | - codebuild:ListProjects 181 | - codebuild:BatchGetProjects 182 | - codebuild:DeleteProject 183 | Resource: 184 | - !Sub arn:aws:codebuild:${AWS::Region}:${AWS::AccountId}:project/${Project}* 185 
| - PolicyName: CodePipelineAccess 186 | PolicyDocument: 187 | Statement: 188 | - Effect: Allow 189 | Action: 190 | - codepipeline:CreatePipeline 191 | - codepipeline:GetPipeline 192 | - codepipeline:UpdatePipeline 193 | - codepipeline:DeletePipeline 194 | - codepipeline:GetPipelineState 195 | - codepipeline:ListPipelineExecutions 196 | Resource: 197 | - !Sub arn:aws:codepipeline:${AWS::Region}:${AWS::AccountId}:${Project}* 198 | - PolicyName: CodeCommitAccess 199 | PolicyDocument: 200 | Statement: 201 | - Effect: Allow 202 | Action: 203 | - codecommit:CreateBranch 204 | - codecommit:CreateRepository 205 | - codecommit:GetRepository 206 | - codecommit:DeleteRepository 207 | - codecommit:CreateCommit 208 | - codecommit:GitPush 209 | - codecommit:GitPull 210 | - codecommit:DeleteBranch 211 | Resource: 212 | - !Sub arn:aws:codecommit:${AWS::Region}:${AWS::AccountId}:${Project}* 213 | - Effect: Allow 214 | Action: 215 | - codecommit:ListRepositories 216 | Resource: '*' 217 | - PolicyName: EventsAccess 218 | PolicyDocument: 219 | Statement: 220 | - Effect: Allow 221 | Action: 222 | - events:DescribeRule 223 | - events:PutRule 224 | - events:DeleteRule 225 | - events:PutTargets 226 | - events:RemoveTargets 227 | Resource: 228 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/* 229 | - PolicyName: GlueAccess 230 | PolicyDocument: 231 | Statement: 232 | - Effect: Allow 233 | Action: 234 | - glue:StartJob 235 | - glue:GetJob 236 | Resource: '*' 237 | - PolicyName: S3Access 238 | PolicyDocument: 239 | Statement: 240 | - Effect: Allow 241 | Action: 242 | - s3:GetObject 243 | Resource: 244 | !Join 245 | - '' 246 | - - 'arn:aws:s3:::' 247 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 248 | - '/*' 249 | - Effect: Allow 250 | Action: 251 | - s3:GetObject 252 | Resource: 253 | !Join 254 | - '' 255 | - - 'arn:aws:s3:::' 256 | - !Join 257 | - '-' 258 | - - !FindInMap ["SourceCode", "General", "S3Bucket"] 259 | - Ref: "AWS::Region" 260 | - '/' 261 | - !FindInMap ["SourceCode", "General", "KeyPrefix"] 262 | - '/*' 263 | - Effect: Allow 264 | Action: 265 | - s3:ListBucket 266 | Resource: 267 | !Join 268 | - '' 269 | - - 'arn:aws:s3:::' 270 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 271 | 272 | - Effect: Allow 273 | Action: 274 | - s3:PutObjectAcl 275 | - s3:GetObject 276 | - s3:PutObject 277 | - s3:DeleteObject 278 | - s3:ListBucket 279 | - s3:CreateBucket 280 | - s3:DeleteBucket 281 | - s3:PutEncryptionConfiguration 282 | - s3:PutBucketPublicAccessBlock 283 | - s3:PutBucketLogging 284 | - s3:PutBucketAcl 285 | Resource: 286 | - arn:aws:s3:::*pipe* 287 | - arn:aws:s3:::*pipe*/* 288 | - Effect: Allow 289 | Action: 290 | - s3:ListBucket 291 | Resource: 292 | !Join 293 | - '' 294 | - - 'arn:aws:s3:::' 295 | - !FindInMap ["SourceCode", "General", "S3Bucket"] 296 | - Effect: Allow 297 | Action: 298 | - s3:CreateBucket 299 | - s3:DeleteBucket 300 | - s3:ListBucket 301 | - s3:PutEncryptionConfiguration 302 | - s3:PutBucketPublicAccessBlock 303 | - s3:PutBucketLogging 304 | - s3:PutBucketAcl 305 | - s3:PutObject 306 | - s3:PutObjectAcl 307 | Resource: 308 | - arn:aws:s3:::*pipe* 309 | - arn:aws:s3:::*pipe*/* 310 | Metadata: 311 | cfn_nag: 312 | rules_to_suppress: 313 | - id: W11 314 | reason: Star required for codecommit:ListRepositories and Glue actions. 
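# CodeBuild below is the worker that the Setup custom resource drives:
# lambda.py starts a build with SOLUTION_ACTION overridden to setup (on stack
# create and update) or teardown (on stack delete), and the inline buildspec
# downloads Solution.zip and runs the matching ./setup.sh or ./teardown.sh.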

  CodeBuild:
    Type: AWS::CodeBuild::Project
    Properties:
      Name: !Sub ${Project}Setup
      Artifacts:
        Type: NO_ARTIFACTS
      Source:
        Type: NO_SOURCE
        BuildSpec: !Sub |
          version: 0.2
          phases:
            install:
              commands:
                - git config --global user.name automated_user
                - git config --global user.email automated_email
                - git config --global credential.helper '!aws codecommit credential-helper $@'
                - git config --global credential.UseHttpPath true
                - aws s3 cp s3://$ARTIFACT_BUCKET/$ARTIFACT_KEY_PREFIX/Solution.zip .
                - unzip Solution.zip
                - ./$SOLUTION_ACTION.sh
      Environment:
        ComputeType: BUILD_GENERAL1_SMALL
        EnvironmentVariables:
          - Name: SOLUTION_ACTION
            Value: setup
          - Name: PROJECT_NAME
            Value: !Ref Project
          - Name: ARTIFACT_BUCKET
            Value: !Join ["-", [!FindInMap ["SourceCode", "General", "S3Bucket"], Ref: "AWS::Region"]]
          - Name: ARTIFACT_KEY_PREFIX
            Value: !FindInMap ["SourceCode", "General", "KeyPrefix"]
        Image: aws/codebuild/standard:3.0
        Type: LINUX_CONTAINER
      ServiceRole: !Sub ${CodeBuildRole}
      TimeoutInMinutes: 30
    Metadata:
      cfn_nag:
        rules_to_suppress:
          - id: W32
            reason: Customer can enable encryption if desired.
--------------------------------------------------------------------------------
/source/teardown.sh:
--------------------------------------------------------------------------------
#!/bin/bash -e

export AWS_DEFAULT_OUTPUT=text

export RESOURCE_PREFIX=${PROJECT_NAME:-GenomicsLearning}
export RESOURCE_PREFIX_LOWERCASE=$(echo ${RESOURCE_PREFIX} | tr '[:upper:]' '[:lower:]')

export ZONE_STACKNAME=${RESOURCE_PREFIX}-LandingZone
export PIPE_STACKNAME=${RESOURCE_PREFIX}-Pipeline
export CODE_STACKNAME=${RESOURCE_PREFIX}

export REPOSITORY_NAME=${RESOURCE_PREFIX_LOWERCASE}

# Clear Buckets

BUILD_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`BuildBucket`].OutputValue'); echo ${BUILD_BUCKET}
RESOURCES_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`ResourcesBucket`].OutputValue'); echo ${RESOURCES_BUCKET}
DATALAKE_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`DataLakeBucket`].OutputValue'); echo ${DATALAKE_BUCKET}
LOGS_BUCKET=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`LogsBucket`].OutputValue'); echo ${LOGS_BUCKET}

aws s3 rm --recursive s3://${BUILD_BUCKET}/
aws s3 rm --recursive s3://${RESOURCES_BUCKET}/
aws s3 rm --recursive s3://${DATALAKE_BUCKET}/
aws s3 rm --recursive s3://${LOGS_BUCKET}/

# Disable Termination Protection on Stacks

aws cloudformation update-termination-protection --no-enable-termination-protection --stack-name ${PIPE_STACKNAME}
aws cloudformation update-termination-protection --no-enable-termination-protection --stack-name ${ZONE_STACKNAME}

# Get Repo Names from Stacks

PIPE_REPO=$(aws cloudformation describe-stacks --stack-name ${ZONE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`RepoName`].OutputValue'); echo ${PIPE_REPO}
CODE_REPO=$(aws cloudformation describe-stacks --stack-name ${PIPE_STACKNAME} --query 'Stacks[].Outputs[?OutputKey==`RepoName`].OutputValue'); echo ${CODE_REPO}

# Delete Stacks

aws cloudformation delete-stack --stack-name ${CODE_STACKNAME}; aws cloudformation wait stack-delete-complete --stack-name ${CODE_STACKNAME}
aws cloudformation delete-stack --stack-name ${PIPE_STACKNAME}; aws cloudformation wait stack-delete-complete --stack-name ${PIPE_STACKNAME}
aws cloudformation delete-stack --stack-name ${ZONE_STACKNAME}; aws cloudformation wait stack-delete-complete --stack-name ${ZONE_STACKNAME}

# Delete Repos

aws codecommit delete-repository --repository-name ${PIPE_REPO}
aws codecommit delete-repository --repository-name ${CODE_REPO}

# Cleanup Local Git Repo

find . \( -name ".git" -o -name ".gitignore" -o -name ".gitmodules" -o -name ".gitattributes" \) -exec rm -rf -- {} +
--------------------------------------------------------------------------------
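
A minimal usage sketch for teardown.sh (assuming an AWS CLI profile with
credentials and a default region for the account where the solution was
deployed; PROJECT_NAME is optional and defaults to GenomicsLearning):

    # destructive: empties the solution buckets, then deletes stacks and repos
    cd source
    PROJECT_NAME=GenomicsLearning ./teardown.sh

    # optional pre-check: list the pipeline stack outputs the script will read
    aws cloudformation describe-stacks \
        --stack-name GenomicsLearning-Pipeline \
        --query 'Stacks[].Outputs' --output table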