├── .github └── PULL_REQUEST_TEMPLATE.md ├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── shared └── images │ ├── AWS_EDA_Dark_2018-08-27.png │ ├── cloudformation-launch-stack.png │ └── deploy_to_aws.png └── workshops ├── eda-workshop-aws-batch └── README.md ├── eda-workshop-aws-step-functions └── README.md ├── eda-workshop-lsf ├── README.md ├── config │ ├── cli-deploy │ ├── lsf │ │ ├── awsprov_config.json │ │ ├── awsprov_templates.json │ │ ├── awsprov_templates_fleet.json │ │ ├── ec2-fleet-config.json │ │ ├── ec2fleet_template_example.json │ │ ├── hostProviders.json │ │ ├── lsb.modules │ │ ├── lsb.params │ │ ├── lsb.queues │ │ ├── lsf.shared │ │ ├── policy_config.json │ │ ├── spot_template_example.json │ │ └── user_data.sh │ └── mosquitto.conf ├── docs │ ├── demo-commands.md │ ├── deploy-environment.md │ ├── deploy-simple.md │ ├── images │ │ ├── aws-eda-workshop-full-workflow.png │ │ ├── eda-lsf-workshop-diagram-3.png │ │ ├── eda-lsf-workshop-simple-diagram.png │ │ └── eda-lsf-workshop-workflow.png │ ├── lsf-spot.md │ ├── run-simple.md │ └── run-workload.md ├── scripts │ ├── config-lsf.sh │ ├── delete.ebs.vols.py │ ├── install-lsf.sh │ └── nfs-bootstrap-master.sh └── templates │ ├── 00-eda-lsf-full-workshop-master.yaml │ ├── 01-network.yaml │ ├── 02-lsf-master.yaml │ ├── 03-dcv-login-server.yaml │ ├── eda-lsf-simple-workshop.yaml │ ├── efs-filesystem.yaml │ ├── fsxn-filesystem.yaml │ ├── license-server.yaml │ ├── nfs_server_instanceStore_zfs.yaml │ └── secrets.yaml ├── eda-workshop-parallel-cluster └── README.md └── eda-workshop-soca ├── scripts ├── create_alwayson_nodes.sh ├── idea_custom_ami_imagebuilder.sh ├── modify_group_name.ldif ├── modify_user_shell.sh └── soca_custom_ami.sh └── templates ├── nfs_server_ebs_zfs_soca.yaml └── nfs_server_instanceStore_zfs_soca.yaml /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | *Issue #, if available:* 2 | 3 | *Description of changes:* 4 | 5 | 6 | By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. 7 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | *.egg-info/ 24 | .installed.cfg 25 | *.egg 26 | MANIFEST 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | .pytest_cache/ 49 | 50 | # Translations 51 | *.mo 52 | *.pot 53 | 54 | # Django stuff: 55 | *.log 56 | local_settings.py 57 | db.sqlite3 58 | 59 | # Flask stuff: 60 | instance/ 61 | .webassets-cache 62 | 63 | # Scrapy stuff: 64 | .scrapy 65 | 66 | # Sphinx documentation 67 | docs/_build/ 68 | 69 | # PyBuilder 70 | target/ 71 | 72 | # Jupyter Notebook 73 | .ipynb_checkpoints 74 | 75 | # pyenv 76 | .python-version 77 | 78 | # celery beat schedule file 79 | celerybeat-schedule 80 | 81 | # SageMath parsed files 82 | *.sage.py 83 | 84 | # Environments 85 | .env 86 | .venv 87 | env/ 88 | venv/ 89 | ENV/ 90 | env.bak/ 91 | venv.bak/ 92 | 93 | # Spyder project settings 94 | .spyderproject 95 | .spyproject 96 | 97 | # Rope project settings 98 | .ropeproject 99 | 100 | # mkdocs documentation 101 | /site 102 | 103 | # mypy 104 | .mypy_cache/ 105 | 106 | # IDE and Editor artifacts # 107 | *.bbprojectd 108 | .idea 109 | *.iml 110 | 111 | # Temporary Files # 112 | tmp_* 113 | cfg.tmp.json 114 | 115 | # OS generated files # 116 | .DS_Store 117 | .DS_Store? 118 | 119 | # vscode config files 120 | .vscode 121 | 122 | cli-deploy 123 | 124 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check [existing open](https://github.com/aws-samples/aws-eda-workshops/issues), or [recently closed](https://github.com/aws-samples/aws-eda-workshops/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aclosed%20), issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. 
You are working against the latest source on the *master* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws-samples/aws-eda-workshops/labels/help%20wanted) issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](https://github.com/aws-samples/aws-eda-workshops/blob/master/LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | 61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes. 62 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of 4 | this software and associated documentation files (the "Software"), to deal in 5 | the Software without restriction, including without limitation the rights to 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 7 | the Software, and to permit persons to whom the Software is furnished to do so. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 10 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 11 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR 12 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 13 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 14 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 15 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # AWS EDA Workshops 4 | 5 | Samples and documentation for deploying EDA computing environments in AWS 6 | 7 | ## Overview 8 | 9 | These hands-on workshops are designed to demonstrate how the elasticity of the AWS Cloud can help you accelerate EDA design cycles and reduce time-to-market. 10 | 11 | ## Workshops 12 | 13 | - [**EDA Workshop with IBM Spectrum LSF**](workshops/eda-workshop-lsf) 14 | This workshop shows you how to deploy a fully functional compute cluster based on IBM Spectrum LSF. The environment includes all resources required to run an EDA verification workload on a sample design. Using standard LSF commands, you can quickly add compute capacity to satisfy verification workload demand. 15 | 16 | - **EDA Workshop with AWS ParallelCluster** 17 | This workshop shows you how to deploy an EDA computing cluster using AWS ParallelCluster. AWS ParallelCluster is an AWS-supported open-source cluster management tool that makes it easy for you to deploy and manage High Performance Computing (HPC) clusters in the AWS cloud. 18 | 19 | - **EDA Workshop with AWS Batch** 20 | This workshop demonstrates how to run EDA workloads using AWS Batch, a fully managed AWS workload and resource management service. In this workshop, you'll learn how to submit and manage EDA workloads with AWS Batch. 21 | 22 | - **EDA Workflows with AWS Step Functions** 23 | This workshop shows you how to build a cloud-native, visual EDA workflow using AWS Step Functions. 24 | 25 | ## License Summary 26 | 27 | This sample code is made available under the MIT-0 license. See the LICENSE file.
28 | -------------------------------------------------------------------------------- /shared/images/AWS_EDA_Dark_2018-08-27.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-eda-workshops/6c9a01305c54a3036924fc8d89e7197a2c97d001/shared/images/AWS_EDA_Dark_2018-08-27.png -------------------------------------------------------------------------------- /shared/images/cloudformation-launch-stack.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-eda-workshops/6c9a01305c54a3036924fc8d89e7197a2c97d001/shared/images/cloudformation-launch-stack.png -------------------------------------------------------------------------------- /shared/images/deploy_to_aws.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-eda-workshops/6c9a01305c54a3036924fc8d89e7197a2c97d001/shared/images/deploy_to_aws.png -------------------------------------------------------------------------------- /workshops/eda-workshop-aws-batch/README.md: -------------------------------------------------------------------------------- 1 | Coming soon -------------------------------------------------------------------------------- /workshops/eda-workshop-aws-step-functions/README.md: -------------------------------------------------------------------------------- 1 | Coming soon -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/README.md: -------------------------------------------------------------------------------- 1 | 2 | # EDA Workshop with IBM Spectrum LSF 3 | 4 | ## Overview 5 | 6 | The CloudFormation templates in this workshop deploy a fully functional IBM Spectrum LSF compute cluster with all resources and tools required to run an EDA verification workload on a sample design in the AWS Cloud. This workshop uses the IBM Spectrum LSF Resource Connector feature to dynamically provision AWS compute instances to satisfy workload demand in the LSF queues. 7 | 8 | ![workflow](docs/images/eda-lsf-workshop-workflow.png) 9 | 10 | ## Prerequisites 11 | 12 | The following is required to run this workshop: 13 | 14 | * An AWS account with administrative-level access 15 | * Licenses for IBM Spectrum LSF 10.1 16 | * An IBM Passport Advantage account for downloading the installation and full Linux distribution packages for IBM Spectrum LSF 10.1 Standard or Advanced Edition and a corresponding entitlement file. 17 | * An Amazon EC2 key pair 18 | * A free subscription to the [AWS FPGA Developer AMI](https://aws.amazon.com/marketplace/pp/B06VVYBLZZ). 19 | * A free subscription to the [Official CentOS 7 x86_64 HVM AMI](https://aws.amazon.com/marketplace/pp/B00O7WM7QW). 20 | 21 | ## Tutorials 22 | 23 | This workshop consists of two tutorials. You must complete the tutorials in sequence. 24 | 25 | 1. [**Deploy the environment**](docs/deploy-environment.md) In this module, you'll review the architecture and follow step-by-step instructions to deploy the environment using AWS CloudFormation. 26 | 27 | 1. [**Run EDA workload**](docs/run-workload.md) Finally, you'll submit logic simulations into the queue and watch the cluster grow and shrink as workload flows through the system. 28 | 29 | ## Costs 30 | 31 | You are responsible for the cost of the AWS services used while running the workshop deployment.
32 | The AWS CloudFormation templates for this workshop include configuration parameters that you can customize. Some of these settings, such as instance type, will affect the cost of deployment. For cost estimates, see the pricing pages for each AWS service you will be using. Prices are subject to change. 33 | 34 | > **Tip** 35 | After you deploy the workshop, we recommend that you enable the AWS Cost and Usage Report to track costs associated with the workshop. This report delivers billing metrics to an S3 bucket in your account. It provides cost estimates based on usage throughout each month, and finalizes the data at the end of the month. For more information about the report, see the AWS documentation. 36 | 37 | ### Clean up 38 | 39 | * Delete the parent stack 40 | * Delete orphaned EBS volumes. The FPGA AMI doesn't delete them on instance termination. 41 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/cli-deploy: -------------------------------------------------------------------------------- 1 | aws cloudformation deploy --stack-name=lsf-full --template-file ~/repos/aws-eda-workshops/workshops/eda-workshop-lsf/templates/00-eda-lsf-full-workshop-master.yaml --parameter-overrides $(cat ~/repos/parameters.ini) --capabilities CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND 2 | aws cloudformation deploy --stack-name=lsf-simple --template-file ~/repos/aws-eda-workshops/workshops/eda-workshop-lsf/templates/eda-lsf-simple-workshop.yaml --parameter-overrides $(cat ~/repos/parameters-simple.ini) --capabilities CAPABILITY_NAMED_IAM 3 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/awsprov_config.json: -------------------------------------------------------------------------------- 1 | { 2 | "LogLevel": "INFO", 3 | "AWS_REGION": "_CFN_AWS_REGION_", 4 | "AWS_SPOT_TERMINATE_ON_RECLAIM": "true" 5 | } 6 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/awsprov_templates.json: -------------------------------------------------------------------------------- 1 | { 2 | "templates": [ 3 | { 4 | "templateId": "fleet-template-1", 5 | "maxNumber": 1000, 6 | "priority": "121", 7 | "attributes": { 8 | "type": [ "String", "X86_64"], 9 | "ncores": [ "Numeric", "1"], 10 | "ncpus": [ "Numeric", "2"], 11 | "mem": [ "Numeric", "512"], 12 | "aws": [ "Boolean", "1"], 13 | "cpu_type": [ "String", "intel"] 14 | }, 15 | "onDemandTargetCapacityRatio": "0.5", 16 | "ec2FleetConfig": "ec2-fleet-config.json", 17 | "instanceTags": "Name=fleet-template-1;Cluster=%CFN_LSF_CLUSTER_NAME%;ec2FleetConfig=ec2-fleet-config.json", 18 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;cpu_type=intel" 19 | }, 20 | { 21 | "templateId": "m5-xlarge", 22 | "priority": 7, 23 | "maxNumber": 1000, 24 | "attributes": { 25 | "type": ["String", "X86_64"], 26 | "ncores": ["Numeric", "2"], 27 | "ncpus": ["Numeric", "2"], 28 | "mem": ["Numeric", "15000"], 29 | "instance_type": ["String", "m5_xlarge"], 30 | "aws": ["Boolean", "1"] 31 | }, 32 | "imageId": "%CFN_COMPUTE_AMI%", 33 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 34 | "vmType": "m5.xlarge", 35 | "keyName": "%CFN_ADMIN_KEYPAIR%", 36 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 37 | "placementGroupName": "", 38 | "tenancy": "default", 39 | "instanceProfile": 
"%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 40 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 41 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;instance_type=m5_xlarge" 42 | }, 43 | { 44 | "templateId": "m5-2xlarge", 45 | "priority": 6, 46 | "maxNumber": 1000, 47 | "attributes": { 48 | "type": ["String", "X86_64"], 49 | "ncores": ["Numeric", "4"], 50 | "ncpus": ["Numeric", "4"], 51 | "mem": ["Numeric", "31000"], 52 | "instance_type": ["String", "m5_2xlarge"], 53 | "aws": ["Boolean", "1"] 54 | }, 55 | "imageId": "%CFN_COMPUTE_AMI%", 56 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 57 | "vmType": "m5.2xlarge", 58 | "keyName": "%CFN_ADMIN_KEYPAIR%", 59 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 60 | "placementGroupName": "", 61 | "tenancy": "default", 62 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 63 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 64 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;instance_type=m5_2xlarge" 65 | }, 66 | { 67 | "templateId": "c5-2xlarge", 68 | "priority": 10, 69 | "maxNumber": 1000, 70 | "attributes": { 71 | "type": ["String", "X86_64"], 72 | "ncores": ["Numeric", "4"], 73 | "ncpus": ["Numeric", "4"], 74 | "mem": ["Numeric", "15000"], 75 | "instance_type": ["String", "c5_2xlarge"], 76 | "aws": ["Boolean", "1"] 77 | }, 78 | "imageId": "%CFN_COMPUTE_AMI%", 79 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 80 | "vmType": "c5.2xlarge", 81 | "keyName": "%CFN_ADMIN_KEYPAIR%", 82 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 83 | "placementGroupName": "", 84 | "tenancy": "default", 85 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 86 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 87 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;instance_type=c5_2xlarge" 88 | }, 89 | { 90 | "templateId": "z1d-2xlarge", 91 | "priority": 8, 92 | "maxNumber": 1000, 93 | "attributes": { 94 | "type": ["String", "X86_64"], 95 | "ncores": ["Numeric", "4"], 96 | "ncpus": ["Numeric", "4"], 97 | "mem": ["Numeric", "62000"], 98 | "instance_type": ["String", "z1d_2xlarge"], 99 | "ssd": ["Boolean", "1"], 100 | "aws": ["Boolean", "1"] 101 | }, 102 | "imageId": "%CFN_COMPUTE_AMI%", 103 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 104 | "vmType": "z1d.2xlarge", 105 | "keyName": "%CFN_ADMIN_KEYPAIR%", 106 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 107 | "placementGroupName": "", 108 | "tenancy": "default", 109 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 110 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 111 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;ssd=ssd;instance_type=z1d_2xlarge" 112 | }, 113 | { 114 | "templateId": "r5-12xlarge", 115 | "priority": 5, 116 | "maxNumber": 1000, 117 | "attributes": { 118 | "type": ["String", "X86_64"], 119 | "ncores": ["Numeric", "24"], 120 | "ncpus": ["Numeric", "24"], 121 | "mem": ["Numeric", "380000"], 122 | "instance_type": ["String", "r5_12xlarge"], 123 | "aws": ["Boolean", "1"] 124 | }, 
125 | "imageId": "%CFN_COMPUTE_AMI%", 126 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 127 | "vmType": "r5.12xlarge", 128 | "keyName": "%CFN_ADMIN_KEYPAIR%", 129 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 130 | "placementGroupName": "", 131 | "tenancy": "default", 132 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 133 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 134 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;instance_type=r5_12xlarge" 135 | }, 136 | { 137 | "templateId": "r5-24xlarge", 138 | "priority": 4, 139 | "maxNumber": 1000, 140 | "attributes": { 141 | "type": ["String", "X86_64"], 142 | "ncores": ["Numeric", "48"], 143 | "ncpus": ["Numeric", "48"], 144 | "mem": ["Numeric", "770000"], 145 | "instance_type": ["String", "r5_24xlarge"], 146 | "aws": ["Boolean", "1"] 147 | }, 148 | "imageId": "%CFN_COMPUTE_AMI%", 149 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 150 | "vmType": "r5.24xlarge", 151 | "keyName": "%CFN_ADMIN_KEYPAIR%", 152 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 153 | "placementGroupName": "", 154 | "tenancy": "default", 155 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 156 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 157 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;instance_type=r5_24xlarge" 158 | }, 159 | { 160 | "templateId": "spot-fleet-c5-2xl", 161 | "priority": 4, 162 | "maxNumber": 1000, 163 | "attributes": { 164 | "type": ["String", "X86_64"], 165 | "ncores": ["Numeric", "4"], 166 | "ncpus": ["Numeric", "4"], 167 | "mem": ["Numeric", "15000"], 168 | "instance_type": ["String", "spot"], 169 | "!spot": ["Boolean", "1"], 170 | "aws": ["Boolean", "1"] 171 | }, 172 | "imageId": "%CFN_COMPUTE_AMI%", 173 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 174 | "vmType": "c5.2xlarge,c5.4xlarge,c5.9xlarge,m5.2xlarge,m5.4xlarge,m5.8xlarge", 175 | "spotPrice": "25", 176 | "allocationStrategy": "lowestPrice", 177 | "fleetRole": "%CFN_LSF_COMPUTE_NODE_SPOT_FLEET_ROLE_ARN%", 178 | "keyName": "%CFN_ADMIN_KEYPAIR%", 179 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 180 | "placementGroupName": "", 181 | "tenancy": "default", 182 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 183 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 184 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;spot=!spot" 185 | }, 186 | { 187 | "templateId": "spot-fleet-amd-high-memory", 188 | "priority": 75, 189 | "maxNumber": 1000, 190 | "attributes": { 191 | "type": ["String", "X86_64"], 192 | "ncores": ["Numeric", "4"], 193 | "ncpus": ["Numeric", "4"], 194 | "mem": ["Numeric", "384000"], 195 | "instance_type": ["String", "spot"], 196 | "!spot": ["Boolean", "1"], 197 | "aws": ["Boolean", "1"], 198 | "cpu_type": ["String", "amd"] 199 | }, 200 | "imageId": "%CFN_COMPUTE_AMI%", 201 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 202 | "vmType": "r6a.16xlarge,r6a.24xlarge,r6a.32xlarge,r6a.48xlarge,r6a.metal,r5a.24xlarge,r5a.16xlarge,r5a.12xlarge", 203 | "spotPrice": "25", 204 | "allocationStrategy": "diversified", 205 | "fleetRole": "%CFN_LSF_COMPUTE_NODE_SPOT_FLEET_ROLE_ARN%", 206 | "keyName": 
"%CFN_ADMIN_KEYPAIR%", 207 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 208 | "placementGroupName": "", 209 | "tenancy": "default", 210 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 211 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 212 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;spot=!spot;cpu_type=amd" 213 | } 214 | ] 215 | } 216 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/awsprov_templates_fleet.json: -------------------------------------------------------------------------------- 1 | { 2 | "templates": [ 3 | { 4 | "templateId": "fleet-template-1", 5 | "maxNumber": 100, 6 | "attributes": { 7 | "type": [ 8 | "String", 9 | "X86_64" 10 | ], 11 | "ncores": [ 12 | "Numeric", 13 | "1" 14 | ], 15 | "ncpus": [ 16 | "Numeric", 17 | "2" 18 | ], 19 | "mem": [ 20 | "Numeric", 21 | "512" 22 | ], 23 | "aws": [ 24 | "Boolean", 25 | "1" 26 | ], 27 | "cpu_type": [ 28 | "String", 29 | "intel" 30 | ] 31 | }, 32 | "priority": "121", 33 | "onDemandTargetCapacityRatio": "0.5", 34 | "ec2FleetConfig": "ec2-fleet-config.json", 35 | "instanceTags": "Name=fleet-template-1;Cluster=%CFN_LSF_CLUSTER_NAME%;ec2FleetConfig=ec2-fleet-config.json", 36 | "userData": "FSXN_SVM_DNS_NAME=%CFN_FSXN_SVM_DNS_NAME%;NFS_MOUNT_POINT=%CFN_NFS_MOUNT_POINT%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;cpu_type=intel" 37 | } 38 | ] 39 | } -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/ec2-fleet-config.json: -------------------------------------------------------------------------------- 1 | { 2 | "LaunchTemplateConfigs": [ 3 | { 4 | "LaunchTemplateSpecification": { 5 | "LaunchTemplateId": "%CFN_LAUNCH_TEMPLATE_ID%", 6 | "Version": "1" 7 | }, 8 | "Overrides": [ 9 | { 10 | "InstanceType": "m7i.large", 11 | "SubnetId": "%CFN_COMPUTE_NODE_SUBNET%", 12 | "WeightedCapacity": 1, 13 | "Priority": 71 14 | }, 15 | { 16 | "InstanceType": "m7i.xlarge", 17 | "SubnetId": "%CFN_COMPUTE_NODE_SUBNET%", 18 | "WeightedCapacity": 2, 19 | "Priority": 72 20 | }, 21 | { 22 | "InstanceType": "m6i.large", 23 | "SubnetId": "%CFN_COMPUTE_NODE_SUBNET%", 24 | "WeightedCapacity": 1, 25 | "Priority": 61 26 | }, 27 | { 28 | "InstanceType": "m6i.xlarge", 29 | "SubnetId": "%CFN_COMPUTE_NODE_SUBNET%", 30 | "WeightedCapacity": 2, 31 | "Priority": 62 32 | }, 33 | { 34 | "InstanceType": "m5.large", 35 | "SubnetId": "%CFN_COMPUTE_NODE_SUBNET%", 36 | "WeightedCapacity": 1, 37 | "Priority": 51 38 | }, 39 | { 40 | "InstanceType": "m5.xlarge", 41 | "SubnetId": "%CFN_COMPUTE_NODE_SUBNET%", 42 | "WeightedCapacity": 2, 43 | "Priority": 52 44 | } 45 | ] 46 | } 47 | ], 48 | "TargetCapacitySpecification": { 49 | "TotalTargetCapacity": $LSF_TOTAL_TARGET_CAPACITY, 50 | "OnDemandTargetCapacity": $LSF_ONDEMAND_TARGET_CAPACITY, 51 | "SpotTargetCapacity": $LSF_SPOT_TARGET_CAPACITY, 52 | "DefaultTargetCapacityType": "on-demand" 53 | }, 54 | "SpotOptions": { 55 | "AllocationStrategy": "diversified", 56 | "InstanceInterruptionBehavior": "terminate" 57 | }, 58 | "Type": "instant" 59 | } -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/ec2fleet_template_example.json: -------------------------------------------------------------------------------- 1 | EC2 Fleet is a new 
AWS feature that extends the existing Spot Fleet, 2 | which gives you the ability to create fleets of EC2 instances 3 | composed of a combination of EC2 on-demand, reserved, and spot instances, 4 | by using a single API. 5 | 6 | Requirements: LSF 10.1 fix 601205 7 | https://community.ibm.com/community/user/businessanalytics/blogs/martin-gao/2022/08/12/optimizing-the-speed-of-deployment-on-cloud 8 | 9 | { 10 | "templateId": "ec2-fleet-c5-2xl", 11 | "priority": 4, 12 | "maxNumber": 1000, 13 | "attributes": { 14 | "type": ["String", "X86_64"], 15 | "ncores": ["Numeric", "4"], 16 | "ncpus": ["Numeric", "4"], 17 | "mem": ["Numeric", "15000"], 18 | "instance_type": ["String", "spot"], 19 | "!spot": ["Boolean", "1"], 20 | "aws": ["Boolean", "1"] 21 | }, 22 | "imageId": "%CFN_COMPUTE_AMI%", 23 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 24 | "vmType": "c5.2xlarge,c5.4xlarge,c5.9xlarge,m5.xlarge,m5.2xlarge,m5.4xlarge,m5.8xlarge", 25 | "ec2FleetConfig": "/path/to/ec2-fleet-conf.json", 26 | "onDemandTargetCapacityRatio": "1", 27 | "keyName": "%CFN_ADMIN_KEYPAIR%", 28 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 29 | "placementGroupName": "", 30 | "tenancy": "default", 31 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 32 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 33 | "userData": "EFS_FS_DNS_NAME=%CFN_EFS_FS_DNS_NAME%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;PROJ_DIR=%CFN_PROJ_DIR%;SCRATCH_DIR=%CFN_SCRATCH_DIR%;FS_MOUNT_POINT=%CFN_FS_MOUNT_POINT%;NFS_SERVER_EXPORT=%CFN_NFS_SERVER_EXPORT%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;spot=!spot" 34 | } 35 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/hostProviders.json: -------------------------------------------------------------------------------- 1 | { 2 | "providers":[ 3 | { 4 | "name": "aws", 5 | "type": "awsProv", 6 | "confPath": "resource_connector/aws", 7 | "scriptPath": "resource_connector/aws" 8 | } 9 | ] 10 | } 11 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/lsb.modules: -------------------------------------------------------------------------------- 1 | Begin PluginModule 2 | SCH_PLUGIN RB_PLUGIN SCH_DISABLE_PHASES 3 | schmod_default () () 4 | schmod_fcfs () () 5 | schmod_fairshare () () 6 | schmod_limit () () 7 | schmod_parallel () () 8 | schmod_reserve () () 9 | schmod_mc () () 10 | schmod_preemption () () 11 | schmod_advrsv () () 12 | schmod_ps () () 13 | schmod_affinity () () 14 | schmod_demand () () 15 | #schmod_datamgr () () 16 | End PluginModule 17 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/lsb.params: -------------------------------------------------------------------------------- 1 | Begin Parameters 2 | DEFAULT_QUEUE = verif 3 | 4 | # Amount of time in seconds used for calculating parameter values 5 | MBD_SLEEP_TIME = 10 6 | 7 | # sbatchd scheduling interval 8 | SBD_SLEEP_TIME = 7 9 | 10 | # Interval between job scheduling sessions 11 | JOB_SCHEDULING_INTERVAL=1 12 | 13 | # Interval for any host to accept a job 14 | JOB_ACCEPT_INTERVAL = 0 15 | 16 | # Absolute run time is used instead of normalized one 17 | ABS_RUNLIMIT=Y 18 | 19 | # LSF evaluates only the most recently submitted job name for dependency conditions 20 |
JOB_DEP_LAST_SUB=1 21 | 22 | # Concurrent queries mbatchd can handle 23 | MAX_CONCURRENT_QUERY=100 24 | 25 | # The maximum number of finished jobs whose events are to be stored in the lsb.events log file 26 | MAX_JOB_NUM=10000 27 | 28 | # Schedule parallel jobs based on slots 29 | PARALLEL_SCHED_BY_SLOT=y 30 | 31 | # Enhancement for query mbatchd new job information update 32 | NEWJOB_REFRESH=y 33 | 34 | # Enable relaxed job dispatch order in the cluster 35 | RELAX_JOB_DISPATCH_ORDER = Y 36 | 37 | # Enable generating combined event logs with the format of array index ranges for array jobs 38 | JOB_ARRAY_EVENTS_COMBINE = Y 39 | 40 | # Specifies the maximum number of new instances that can be requested through 41 | # the resource connector during demand evaluations. Use this parameter to control 42 | # the speed of cluster growth. Default 300. 43 | # Use with LSB_RC_UPDATE_INTERVAL in lsf.conf to control how frequently LSF starts 44 | # demand evaluation. Combined with this parameter, it acts as a cluster-wide 45 | # “step” to control the speed of cluster growth. 46 | #RC_MAX_REQUESTS=300 47 | 48 | End Parameters 49 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/lsb.queues: -------------------------------------------------------------------------------- 1 | Begin Queue 2 | QUEUE_NAME = verif 3 | DESCRIPTION = Verification workload 4 | PRIORITY = 15 5 | USERS = all 6 | RC_ACCOUNT = verif 7 | RC_HOSTS = aws 8 | #HOSTS = all 9 | RES_REQ = select[aws] 10 | End Queue 11 | 12 | Begin Queue 13 | QUEUE_NAME = sta 14 | DESCRIPTION = STA workload 15 | PRIORITY = 15 16 | USERS = all 17 | RC_ACCOUNT = sta 18 | RC_HOSTS = aws 19 | #HOSTS = all 20 | RES_REQ = select[aws] 21 | End Queue 22 | 23 | Begin Queue 24 | QUEUE_NAME = spot 25 | DESCRIPTION = spot workload 26 | PRIORITY = 15 27 | USERS = all 28 | RC_ACCOUNT = sta 29 | RC_HOSTS = aws 30 | #HOSTS = all 31 | RES_REQ = select[aws && spot] 32 | End Queue -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/lsf.shared: -------------------------------------------------------------------------------- 1 | Begin Cluster 2 | ClusterName 3 | _CFN_LSF_CLUSTER_NAME_ 4 | End Cluster 5 | 6 | Begin HostType 7 | TYPENAME 8 | DEFAULT 9 | LINUX64 10 | X86_64 11 | AARCH64 12 | LINUX_ARM64 13 | End HostType 14 | 15 | Begin HostModel 16 | MODELNAME CPUFACTOR ARCHITECTURE 17 | AWS_XEON_E5_2670v2 100 (x6_4988_IntelRXeonRCPUE52670v2250GHz) 18 | AWS_XEON_E5_2666v3 101 (x6_5800_IntelRXeonRCPUE52666v3290GHz) 19 | AWS_GRAVITON2 101 (AArch64Processorrev1aarch64) 20 | End HostModel 21 | 22 | Begin Resource 23 | RESOURCENAME TYPE INTERVAL INCREASING DESCRIPTION 24 | define_ncpus_procs Boolean () () (ncpus := procs) 25 | define_ncpus_cores Boolean () () (ncpus := cores) 26 | define_ncpus_threads Boolean () () (ncpus := threads) 27 | aws Boolean () () (AWS instance) 28 | cpu_type String () () (CPU vendor, intel, amd or graviton) 29 | highmem Boolean () () (High memory instance 384GB+) 30 | spot Boolean () () (AWS spot instance) 31 | on_demand Boolean () () (AWS on-demand instance) 32 | fsx Boolean () () (Mount Amazon FSx for Lustre on boot) 33 | ssd Boolean () () (Local SSD) 34 | rc_account String () () (Exclusive consumer tag for RC hosts) 35 | instance_type String () () (EC2 instance type) 36 | 37 | End Resource 38 | --------------------------------------------------------------------------------
/workshops/eda-workshop-lsf/config/lsf/policy_config.json: -------------------------------------------------------------------------------- 1 | { 2 | "Policies": 3 | [ 4 | { 5 | "Name": "default-instance-throttle", 6 | "Consumer": 7 | { 8 | "rcAccount": ["all"], 9 | "templateName": ["all"], 10 | "provider": ["aws"] 11 | }, 12 | "MaxNumber": "2000", 13 | "StepValue": "250:5" 14 | } 15 | ] 16 | } 17 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/spot_template_example.json: -------------------------------------------------------------------------------- 1 | { 2 | "templateId": "spot-fleet-c5-2xl", 3 | "priority": 4, 4 | "maxNumber": 1000, 5 | "attributes": { 6 | "type": ["String", "X86_64"], 7 | "ncores": ["Numeric", "4"], 8 | "ncpus": ["Numeric", "4"], 9 | "mem": ["Numeric", "15000"], 10 | "instance_type": ["String", "spot"], 11 | "!spot": ["Boolean", "1"], 12 | "aws": ["Boolean", "1"] 13 | }, 14 | "imageId": "%CFN_COMPUTE_AMI%", 15 | "subnetId": "%CFN_COMPUTE_NODE_SUBNET%", 16 | "vmType": "c5.2xlarge,c5.4xlarge,c5.9xlarge,m5.xlarge,m5.2xlarge,m5.4xlarge,m5.8xlarge", 17 | "spotPrice": "0.2", 18 | "allocationStrategy":"lowestPrice", 19 | "fleetRole": "%CFN_LSF_COMPUTE_NODE_FLEET_ROLE_ARN%", 20 | "keyName": "%CFN_ADMIN_KEYPAIR%", 21 | "securityGroupIds": ["%CFN_COMPUTE_SECURITY_GROUP_ID%"], 22 | "placementGroupName": "", 23 | "tenancy": "default", 24 | "instanceProfile": "%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%", 25 | "instanceTags": "Name=LSF Exec Host;Cluster=%CFN_LSF_CLUSTER_NAME%", 26 | "userData": "EFS_FS_DNS_NAME=%CFN_EFS_FS_DNS_NAME%;LSF_INSTALL_DIR=%CFN_LSF_INSTALL_DIR%;PROJ_DIR=%CFN_PROJ_DIR%;SCRATCH_DIR=%CFN_SCRATCH_DIR%;FS_MOUNT_POINT=%CFN_FS_MOUNT_POINT%;NFS_SERVER_EXPORT=%CFN_NFS_SERVER_EXPORT%;DCV_USER_NAME=%CFN_DCV_USER_NAME%;spot=!spot" 27 | } 28 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/lsf/user_data.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -x 3 | exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1 4 | 5 | echo "*** BEGIN LSF HOST BOOTSTRAP ***" 6 | 7 | env 8 | 9 | # Export user data, which is defined with the "UserData" attribute 10 | # in the template 11 | %EXPORT_USER_DATA% 12 | 13 | export PATH=/sbin:/usr/sbin:/usr/local/bin:/bin:/usr/bin 14 | export AWS_DEFAULT_REGION="$( curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed -e 's/[a-z]*$//' )" 15 | export EC2_INSTANCE_TYPE="$( curl -s http://169.254.169.254/latest/meta-data/instance-type | sed -e 's/\./_/' )" 16 | export EC2_INSTANCE_LIFE_CYCLE="$( curl -s http://169.254.169.254/latest/meta-data/instance-life-cycle | sed -e 's/\-/_/' )" 17 | export CPU_ARCHITECTURE="$(lscpu | awk '/Architecture/ {print toupper($2)}')" 18 | export LSF_ADMIN=lsfadmin 19 | 20 | # Add the LSF admin account 21 | useradd -m -u 1500 $LSF_ADMIN 22 | # Add DCV login user account 23 | useradd -m -u 1501 $DCV_USER_NAME 24 | 25 | # Install SSM so we can use SSM Session Manager and avoid ssh logins. 
26 | yum install -q -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm 27 | systemctl enable amazon-ssm-agent 28 | systemctl start amazon-ssm-agent 29 | 30 | # Disable Hyperthreading 31 | echo "Disabling Hyperthreading" 32 | for cpunum in $(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -s -d, -f2- | tr ',' '\n' | sort -un) 33 | do 34 | echo 0 > /sys/devices/system/cpu/cpu${cpunum}/online 35 | done 36 | 37 | # mount shared file systems 38 | if [[ "$FSXN_SVM_DNS_NAME" == *"efs"* ]]; then 39 | JUNCTION_PATH="/" 40 | NFS_OPTIONS="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport" 41 | # Check if the NFS_DNS_NAME contains "fsx" 42 | elif [[ "$FSXN_SVM_DNS_NAME" == *"fsx"* ]]; then 43 | JUNCTION_PATH="/vol1" 44 | NFS_OPTIONS="rsize=262144,wsize=262144,hard,vers=3,tcp,mountproto=tcp" 45 | else 46 | # Set a default value or handle other cases if needed 47 | JUNCTION_PATH="unknown" 48 | NFS_OPTIONS="unknown" 49 | fi 50 | 51 | # Print NFS mount 52 | echo "Mounting $FSXN_SVM_DNS_NAME:$JUNCTION_PATH" 53 | mkdir -p $NFS_MOUNT_POINT 54 | mount -t nfs -o $NFS_OPTIONS $FSXN_SVM_DNS_NAME:$JUNCTION_PATH $NFS_MOUNT_POINT 55 | 56 | ## Set up the LSF environment 57 | # Create LSF log and conf directories 58 | mkdir -p /var/log/lsf && chmod 777 /var/log/lsf 59 | mkdir -p /etc/lsf && chmod 777 /etc/lsf 60 | 61 | LSF_TOP=${LSF_INSTALL_DIR} 62 | source $LSF_TOP/conf/profile.lsf 63 | 64 | # Create local lsf.conf file and update LSF_LOCAL_RESOURCES 65 | # parameter to support dynamic resources 66 | cp $LSF_ENVDIR/lsf.conf /etc/lsf/lsf.conf 67 | chmod 444 /etc/lsf/lsf.conf 68 | export LSF_ENVDIR=/etc/lsf 69 | 70 | # Add instance_type resource 71 | sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resourcemap ${EC2_INSTANCE_TYPE}*instance_type]\"/" $LSF_ENVDIR/lsf.conf 72 | echo "Updated LSF_LOCAL_RESOURCES lsf.conf with [resourcemap ${EC2_INSTANCE_TYPE}*instance_type]" 73 | 74 | # Add cpu_type resource (if set) 75 | if [ -n "${cpu_type}" ]; then 76 | sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resourcemap ${cpu_type}*cpu_type]\"/" $LSF_ENVDIR/lsf.conf 77 | echo "Updated LSF_LOCAL_RESOURCES lsf.conf with [resourcemap ${cpu_type}*cpu_type]" 78 | fi 79 | 80 | if [ -n "${rc_account}" ]; then 81 | sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resourcemap ${rc_account}*rc_account]\"/" $LSF_ENVDIR/lsf.conf 82 | echo "Updated LSF_LOCAL_RESOURCES lsf.conf with [resourcemap ${rc_account}*rc_account]" 83 | fi 84 | 85 | # Add on_demand or spot attribute to resource map 86 | if [ -n "${EC2_INSTANCE_LIFE_CYCLE}" ]; then 87 | sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resource ${EC2_INSTANCE_LIFE_CYCLE}]\"/" $LSF_ENVDIR/lsf.conf 88 | echo "Updated LSF_LOCAL_RESOURCES lsf.conf with [resource ${EC2_INSTANCE_LIFE_CYCLE}]" 89 | fi 90 | 91 | # Add CPU Architecture to type map 92 | if [ -n "${CPU_ARCHITECTURE}" ]; then 93 | # Check if the pattern [type ...] 
is present in the lsf.conf file 94 | if grep -q "\[type [^]]*\]" $LSF_ENVDIR/lsf.conf; then 95 | sed -i "s/\[type [^]]*\]/[type $CPU_ARCHITECTURE]/" $LSF_ENVDIR/lsf.conf 96 | else 97 | sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [type ${CPU_ARCHITECTURE}]\"/" $LSF_ENVDIR/lsf.conf 98 | fi 99 | fi 100 | 101 | if [ -n "${ssd}" ]; then 102 | sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resource ${ssd}]\"/" $LSF_ENVDIR/lsf.conf 103 | echo "Updated LSF_LOCAL_RESOURCES lsf.conf with [resource ${ssd}]" 104 | fi 105 | 106 | # Start LSF Daemons 107 | lsadmin limstartup 108 | lsadmin resstartup 109 | sleep 2 110 | badmin hstartup 111 | 112 | echo "*** END LSF HOST BOOTSTRAP ***" 113 | 114 | 115 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/config/mosquitto.conf: -------------------------------------------------------------------------------- 1 | log_dest file /var/log/lsf/mosquitto.log 2 | log_type all -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/demo-commands.md: -------------------------------------------------------------------------------- 1 | cd /ec2-nfs/proj 2 | git clone https://github.com/morrmt/aws-fpga-sa-demo.git 3 | 4 | cd /ec2-nfs/proj/aws-fpga-sa-demo/eda-workshop 5 | 6 | bsub -R aws -J "setup" ./run-sim.sh --scratch-dir /ec2-nfs/scratch 7 | 8 | bsub -R aws -J "regress[1-100]" -w "done(setup)" ./run-sim.sh --scratch-dir /ec2-nfs/scratch 9 | 10 | bsub -R aws -J "regress[1-100]" ./run-sim.sh --scratch-dir /ec2-nfs/scratch 11 | 12 | bsub -R "select[aws && mem>30000]" ./run-sim.sh --scratch-dir /ec2-nfs/scratch 13 | 14 | bsub -R "select[aws && instance_type==z1d_2xlarge]" ./run-sim.sh --scratch-dir /ec2-nfs/scratch 15 | 16 | 1. Submit single spot job 17 | bsub -R spot sleep 15m 18 | 19 | 2. Manually terminate spot instance 20 | 3. Observe requeue 21 | 22 | bhist -l 23 | 24 | bsub -R spot -J "spot[1-100]" ./run-sim.sh --scratch-dir /ec2-nfs/scratch 25 | 26 | bjobs 27 | bhosts -w 28 | bhosts -rc 29 | bhosts -rconly 30 | lshosts 31 | lshosts -s 32 | badmin showstatus 33 | badmin rc view 34 | badmin rc error 35 | badmin rc view -c templates 36 | 37 | 38 | curl http://169.254.169.254/latest/meta-data/spot 39 | 40 | 41 | 42 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/deploy-environment.md: -------------------------------------------------------------------------------- 1 | # Deploy an LSF-based EDA Computing Environment 2 | 3 | ## Overview 4 | 5 | This tutorial shows you how to deploy an elastic EDA computing cluster on AWS based on the IBM Spectrum LSF workload and resource management software and run an EDA logic verification workload within the environment. The deployed environment installs and configures the LSF software that you provide, using your licenses, and supplies the necessary EDA software and example design data to run an EDA verification workload on the AWS Cloud. Using standard LSF commands, you will be able to submit front-end verification workload into the queue and observe as LSF dynamically adds and removes compute resources as the jobs flow through the system. 6 | 7 | This tutorial is for IT, CAD, and design engineers who are interested in running EDA workloads in the cloud using IBM's Spectrum LSF.
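To make that concrete, here is a small sketch of the kind of submissions this workshop uses; the commands are taken from this workshop's demo command list, and the `run-sim.sh` script and `/ec2-nfs/scratch` path are the workshop's example values:

```
# Submit a single simulation job to AWS-provisioned hosts.
bsub -R aws ./run-sim.sh --scratch-dir /ec2-nfs/scratch

# Submit a 100-job regression array; Resource Connector grows the
# cluster to satisfy the pending demand, then shrinks it when idle.
bsub -R aws -J "regress[1-100]" ./run-sim.sh --scratch-dir /ec2-nfs/scratch
```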
8 | 9 | ### IBM Spectrum LSF on AWS 10 | 11 | The Resource Connector feature in IBM Spectrum LSF 10.1 Standard and Advanced Editions enables LSF clusters to dynamically provision and deprovision right-sized AWS compute resources to satisfy pending demand in the queues. These dynamic hosts join the cluster, accept jobs, and are terminated when all demand has been satisfied. This process happens automatically based on the Resource Connector configuration. 12 | 13 | For more information about IBM Spectrum LSF Resource Connector, please see the [official documentation](https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_welcome/lsf_kc_resource_connector.html) on the IBM website. 14 | 15 | ### Cost and Licenses 16 | 17 | You are responsible for the cost of the AWS services used while running this reference deployment. There is no additional cost for using this tutorial. 18 | 19 | The AWS CloudFormation template for this tutorial includes configuration parameters that you can customize. Some of these settings, such as instance type, will affect the cost of deployment. For cost estimates, see the pricing pages for each AWS service you will be using. Prices are subject to change. 20 | 21 | IBM Spectrum LSF software and licenses are not provided by this tutorial. You must provide the licenses and full distribution packages for the software. We recommend that you verify with your IBM sales team that your LSF license terms allow you to run the software on the AWS Cloud. 22 | 23 | ## Workshop Architecture 24 | 25 | Deploying this cluster in a new virtual private cloud (VPC) with default parameters builds the following EDA computing environment in the AWS Cloud. 26 | 27 | ![diagram](images/eda-lsf-workshop-diagram-3.png "diagram") 28 | 29 | The tutorial sets up the following: 30 | 31 | - A VPC configured with public and private subnets to provide you with your own virtual network on AWS. 32 | - In the public subnet, a managed NAT gateway to allow outbound internet access for resources in the private subnets. 33 | - In the public subnet, a Linux login/submission host running NICE DCV to allow remote desktop and Secure Shell (SSH) access to the environment. 34 | - In the private subnet, an LSF master running IBM Spectrum LSF with the Resource Connector feature enabled to dynamically provision Amazon EC2 compute instances based on demand from the LSF queues. 35 | - An Amazon FSx for NetApp ONTAP file system for shared file storage. 36 | 37 | > NOTE: This tutorial also provides an option to launch the cluster into an existing VPC. 38 | 39 | ## Planning the Deployment 40 | 41 | ### Specialized Knowledge 42 | 43 | This tutorial assumes familiarity with networking, the Linux command line, and EDA workflows. Additionally, intermediate-level experience with NFS storage and LSF operations will be helpful. 44 | 45 | This deployment guide also requires a moderate level of familiarity with AWS services. If you’re new to AWS, visit the [Getting Started Resource Center](https://aws.amazon.com/getting-started/) and the [AWS Training and Certification website](https://aws.amazon.com/training/) for materials and programs that can help you develop the skills to design, deploy, and operate your infrastructure and applications on the AWS Cloud. 46 | 47 | ### IBM LSF Software 48 | 49 | The IBM Spectrum LSF software is not provided in this workshop; you will need to download the LSF 10.1 packages and an associated entitlement file from your IBM Passport Advantage and Fix Central portals to complete this tutorial.
Download the following packages from IBM: 50 | 51 | |Kind|IBM Download Source|Description|Package Name| 52 | |:---|:-----|:----------|------------| 53 | |Install script|Passport Advantage|--|`lsf10.1_lsfinstall_linux_x86_64.tar.Z`| 54 | |Base distribution|Passport Advantage|--|`lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z`| 55 | |Entitlement file|Passport Advantage|--|`lsf_std_entitlement.dat` or `lsf_adv_entitlement.dat`| 56 | |Fix Pack 10|Fix Central|lsf-10.1.0.10-spk build545500|`lsf10.1_linux2.6-glibc2.3-x86_64-545500.tar.Z`| 57 | |Patch|Fix Central|LSF Resource Connector patch to address Log4J CVE-2021-44228 security vulnerability.|`lsf10.1_linux2.6-glibc2.3-x86_64-600877.tar.Z`| 58 | 59 | ### NICE DCV Remote Desktop Client 60 | 61 | NICE DCV is a license-free, high-performance remote display protocol that you'll use for logging into the login server's desktop environment. Download and install the [NICE DCV remote desktop native client](https://download.nice-dcv.com) on the computer you will be using for this workshop. 62 | 63 | ### AWS Account 64 | 65 | If you don’t already have an AWS account, create one at [aws.amazon.com](https://aws.amazon.com) by following the on-screen instructions. Part of the sign-up process involves receiving a phone call and entering a PIN using the phone keypad. Your AWS account is automatically signed up for all AWS services. You are charged only for the services you use. 66 | 67 | Before you launch this tutorial, your account must be configured as specified below. Otherwise, the deployment might fail. 68 | 69 | #### Resources 70 | 71 | If necessary, request [service limit increases](https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase&limitType=service-code-) for the following resources. You might need to do this if you already have an existing deployment that uses these resources, and you think you might exceed the default limits with this deployment. For default limits, see the [AWS documentation](http://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html). 72 | [AWS Trusted Advisor](https://console.aws.amazon.com/trustedadvisor/home?#/category/service-limits) offers a service limits check that displays your usage and limits for some aspects of some services. 73 | 74 | |Resource|This deployment uses| 75 | |:---|:---:| 76 | |VPCs|1| 77 | |Internet gateway|1| 78 | |NAT gateway|1| 79 | |Security groups|4| 80 | |IAM roles|4| 81 | |FSx for NetApp ONTAP file system|1| 82 | |m5.xlarge instance|1| 83 | |m5.2xlarge instance|1| 84 | |c5.2xlarge instance|Up to 20| 85 | 86 | #### Regions 87 | 88 | This deployment includes an Amazon FSx for NetApp ONTAP file system, which isn’t currently supported in all AWS Regions. Please refer to [Regional Products and Services](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) for details of Amazon FSx for NetApp ONTAP service availability by region. 89 | 90 | #### Key pair 91 | 92 | Make sure that at least one Amazon EC2 key pair exists in your AWS account in the region where you are planning to deploy the tutorial. Make note of the key pair name. You’ll be prompted for this information during deployment. To create a key pair, follow the [instructions in the AWS documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html). 93 | If you’re deploying the tutorial for testing or proof-of-concept purposes, we recommend that you create a new key pair instead of specifying a key pair that’s already being used by a production instance.
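If you prefer the command line to the console, a key pair can also be created with the AWS CLI. This is a sketch only; the key name `eda-workshop` and the region are example values:

```
# Create a key pair in the deployment region and save the private key locally.
aws ec2 create-key-pair \
    --region us-east-1 \
    --key-name eda-workshop \
    --query 'KeyMaterial' \
    --output text > eda-workshop.pem

# Restrict permissions so the ssh client will accept the key file.
chmod 400 eda-workshop.pem
```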
94 | 95 | #### IAM permissions 96 | 97 | To deploy the environment, you must log in to the AWS Management Console with IAM permissions for the resources and actions the templates will deploy. You'll need to log in with an IAM user that has the **AdministratorAccess** managed policy within IAM. This managed policy provides sufficient permissions to deploy this workshop, although your organization may choose to use a custom policy with more restrictions. 98 | 99 | ## Deployment Options 100 | 101 | This tutorial provides two deployment options: 102 | 103 | - **Deploy the cluster into a new VPC**. This option builds a new AWS environment consisting of the VPC, subnets, NAT gateways, security groups, an FSx for NetApp ONTAP NFS server, an LSF master, and other infrastructure components, and then installs and configures LSF into this new VPC. 104 | 105 | - **Deploy the cluster into an existing VPC**. This option is the same as above but it provisions the cluster in an existing VPC. 106 | 107 | This tutorial provides separate CloudFormation templates for these options. It also lets you configure CIDR blocks, instance types, and other settings, as discussed later in this guide. 108 | 109 | ## Deployment Steps 110 | 111 | ### Step 1. Sign in to your AWS account 112 | 113 | 1. Sign in to your AWS account with an IAM user that has full administrative permissions. For details, see [Planning the deployment](#planning-the-deployment) earlier in this guide. 114 | 115 | 2. Make sure that your AWS account is configured correctly, as discussed in the [AWS Account](#aws-account) section. 116 | 117 | ### Step 2. Upload LSF Software Packages and Entitlement File 118 | 119 | 1. Upload the three required LSF software packages and the LSF entitlement file to a private S3 bucket in your account. 120 | 121 | ### Step 3. Subscribe to the Required AMIs 122 | 123 | This workshop requires a subscription to the following AMIs in AWS Marketplace. AMIs are images that are used to boot the instances (virtual servers) in AWS. They also contain software required to run the workshop. There is no additional cost to use these AMIs. 124 | 125 | - AWS FPGA Developer AMI 126 | - CentOS 7 (x86_64) - with Updates HVM AMI 127 | 128 | Sign in to your AWS account, and follow these instructions to subscribe: 129 | 130 | 1. Open the page for the [AWS FPGA Developer AMI](https://aws.amazon.com/marketplace/pp/B06VVYBLZZ) AMI in AWS Marketplace, and then choose **Continue to Subscribe**. 131 | 132 | 1. Review the terms and conditions for software usage, and then choose **Accept Terms**. You will get a confirmation page, and an email confirmation will be sent to the account owner. For detailed subscription instructions, see the [AWS Marketplace documentation](https://aws.amazon.com/marketplace/help/200799470). 133 | 134 | 1. When the subscription process is complete, exit out of AWS Marketplace without further action. **Do not** click **Continue to Launch**; the workshop CloudFormation templates will deploy the AMI for you. 135 | 136 | 1. Repeat steps 1 through 3 to subscribe to the [CentOS 7 (x86_64) - with Updates HVM](https://aws.amazon.com/marketplace/pp/B00O7WM7QW) AMI. 137 | 138 | 1. Verify the subscriptions in the [Marketplace dashboard](https://console.aws.amazon.com/marketplace/home) within the AWS Console. 139 | - Click on **Manage subscriptions** to confirm that the two AMI subscriptions are active in your account. 140 | 141 | ### Step 4.
Launch the Cluster 142 | 143 | **Note** The instructions in this section reflect the new version of the AWS CloudFormation console. If you’re using the original console, some of the user interface elements might be different. You can switch to the new console by selecting **New console** from the **CloudFormation** menu. 144 | 145 | 1. Sign in to your AWS account with an IAM user that has full administrative permissions. 146 | 147 | 1. Click one of the deployment options below to start the CloudFormation deployment process. The link will take you to the AWS CloudFormation console with the path to the deployment template preloaded. 148 | 149 | | Deploy into New VPC | Deploy into Existing VPC | 150 | | :---: | :---: | 151 | | [![Launch Stack](../../../shared/images/deploy_to_aws.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=aws-eda-lsf-full-workshop&templateURL=https://s3.amazonaws.com/aws-eda-workshop-files/workshops/eda-workshop-lsf/templates/00-eda-lsf-full-workshop-master.yaml)|[![Launch Stack](../../../shared/images/deploy_to_aws.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=aws-eda-lsf-simple-workshop&templateURL=https://s3.amazonaws.com/aws-eda-workshop-files/workshops/eda-workshop-lsf/templates/eda-lsf-simple-workshop.yaml)| 152 | 153 | Check the region that's displayed in the upper-right corner of the navigation bar, and change it if necessary. This is where the cluster infrastructure will be built. This workshop supports the following regions: 154 | 155 | - US East (Ohio) 156 | - US East (N. Virginia) 157 | - US West (N. California) 158 | - US West (Oregon) 159 | - Asia Pacific (Seoul) 160 | - Asia Pacific (Singapore) 161 | - Asia Pacific (Sydney) 162 | - Asia Pacific (Tokyo) 163 | - EU (Ireland) 164 | 165 | The template is launched in the **US East (N. Virginia)** Region by default. 166 | 167 | **Important** If you're deploying the cluster into an existing VPC, make sure that your VPC has two private subnets, and that the subnets aren't shared. This tutorial doesn't support [shared subnets](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-sharing.html). The CloudFormation template will create a [NAT gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html) and add routes to it in the subnets' route tables so that the instances can download packages and software without being exposed to the internet. You will also need **DNS hostnames** and **DNS resolution** configured in the VPC's DHCP options as explained in the [Amazon VPC documentation](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html). 168 | 169 | 1. In the **Select Template** section of the **Create stack** page, keep the default setting for the template URL, and then choose **Next**. 170 | 171 | 1. On the **Specify stack details** page, change the stack name if desired. Review the parameters for the template. Provide values for the parameters in the table below. For all other parameters, it is recommended that you keep the default settings, but you can customize them as necessary.
172 | 173 | |Parameter|Notes| 174 | |---|---| 175 | |SSH source CIDR|Enter the internet-facing IP from which you will log into the login server| 176 | |EC2 KeyPair|Select the key pair you created in your account| 177 | |Cluster name|Enter a name for the LSF cluster| 178 | |LSF install package|Enter the S3 protocol URL for the `lsf10.1_lsfinstall_linux_x86_64.tar.Z` package| 179 | |LSF distribution package|Enter the S3 protocol URL for the `lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z` package| 180 | |LSF fix pack package|Enter the S3 protocol URL for the `lsf10.1_linux2.6-glibc2.3-x86_64-######.tar.Z` package| 181 | |LSF entitlement file|Enter the S3 protocol URL for the LSF entitlement file. This should be either `lsf_std_entitlement.dat` or `lsf_adv_entitlement.dat`.| 182 | 183 | When you finish reviewing and customizing the parameters, choose **Next**. 184 | 185 | 1. On the **Configure stack options** page, you can specify [tags](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-resource-tags.html) (key-value pairs) for resources in your stack. We recommend setting **key** to `env` and **value** to `aws-lsf-eda-workshop` or something similar. This will help to identify resources created by this tutorial. When you're done, choose **Next**. 186 | 187 | 1. On the **Review** page, review and confirm the template settings. Under **Capabilities** at the very bottom, select the two check boxes to acknowledge that the template will create IAM resources and that it might require the capability to auto-expand macros. 188 | 189 | 1. Choose **Create stack** to deploy the stack. Either deployment option takes approximately 40 minutes to complete. 190 | 191 | 1. Monitor the status of the stack. When the status is **CREATE\_COMPLETE**, the cluster is ready. 192 | 193 | 1. Use the URLs displayed in the **Outputs** tab for the stack to view the resources that were created. 194 | 195 | ### Step 5. Test the Deployment 196 | 197 | 1. Log into the login server via SSH as the `centos` user, using the private key from the key pair you provided in the CloudFormation stack and the IP address found in **LoginServerPublicIp** under the stack's **Outputs** tab. 198 | 199 | `ssh -i /path/to/private_key centos@<login-server-ip>` 200 | 201 | >NOTE: If you have trouble connecting, ensure the security group on the login server includes the IP address of your client. 202 | 203 | 1. Run the `lsid` command to verify that LSF installed properly and is running. You should see something similar to the following: 204 | ``` 205 | IBM Spectrum LSF Standard 10.1.0.11, Nov 12 2020 206 | Copyright International Business Machines Corp. 1992, 2016. 207 | US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. 208 | My cluster name is <cluster name> 209 | My master name is <master host name> 210 | ``` 211 | 212 | ### Step 6. Run workload 213 | 214 | Move on to the [next tutorial](run-workload.md) to run logic simulations in your new elastic LSF cluster in the AWS cloud.
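As an aside for Step 5, you can also pull the login server address with the AWS CLI instead of the console. A minimal sketch, assuming the default stack name from the "Deploy into New VPC" launch link above and that `LoginServerPublicIp` is exposed as an output on the stack you query (in this deployment it may live on the nested login server stack):

```bash
# Print the login server's public IP from the CloudFormation outputs.
# Stack name assumed from the launch link; adjust if you renamed the
# stack or if the output lives on a nested stack.
aws cloudformation describe-stacks \
  --stack-name aws-eda-lsf-full-workshop \
  --query "Stacks[0].Outputs[?OutputKey=='LoginServerPublicIp'].OutputValue" \
  --output text
```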
215 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/deploy-simple.md: -------------------------------------------------------------------------------- 1 | # Deploy an LSF-based EDA Computing Environment 2 | 3 | ## Contents 4 | 5 | * [Workshop Overview](#overview) 6 | 7 | * [Prerequisites](#prerequisites) 8 | 9 | * [Deploy the environment](#deploy-the-environment) 10 | 11 | * [Run example workload](#run-workload) 12 | 13 | ## Overview 14 | 15 | This tutorial shows you how to deploy an elastic EDA computing cluster on AWS based on the IBM Spectrum LSF workload and resource management software and run an EDA logic verification workload within the environment. The deployed environment installs and configures the LSF software that you provide, using your licenses, and supplies the necessary EDA software and design data to run an example EDA verification workload on the AWS Cloud. Using standard LSF commands, you will be able to submit front-end verification workloads into the queue and observe as LSF dynamically adds and removes compute resources as the jobs flow through the system. 16 | 17 | This tutorial is for IT, CAD, and design engineers who are interested in running EDA workloads in the cloud using IBM's Spectrum LSF. 18 | 19 | ### Workshop Architecture 20 | 21 | This environment deploys into an existing virtual private cloud (VPC) and builds the following EDA computing environment in the AWS Cloud. 22 | 23 | ![diagram](images/eda-lsf-workshop-simple-diagram.png) 24 | 25 | The deployed cloud infrastructure consists of: 26 | 27 | * A Linux login/submission host to allow inbound Secure Shell (SSH) and graphical remote desktop access to the environment. 28 | 29 | * An LSF master running IBM Spectrum LSF with the Resource Connector feature enabled 30 | 31 | * Amazon EC2 compute instances that are dynamically provisioned by LSF 32 | 33 | * An Amazon FSx for NetApp ONTAP file system for shared file storage. 34 | 35 | ### Workshop Workflow 36 | 37 | ![workflow-diagram](images/eda-lsf-workshop-simple-workflow.png) 38 | 39 | ### Cost and Licenses 40 | 41 | If you run this tutorial in your own account, you are responsible for the cost of the AWS services used while running this reference deployment. There is no additional cost for using this tutorial. 42 | 43 | IBM Spectrum LSF software and licenses are not provided by this tutorial. You must provide the licenses and full distribution packages for the software. 44 | 45 | ## Prerequisites 46 | 47 | 1. Download the IBM Spectrum LSF software and associated entitlement file. 48 | 49 | The IBM Spectrum LSF software is not provided in this workshop; you will need to download LSF 10.1 Fix Pack 8 or newer and an associated entitlement file from your IBM Passport Advantage portal to complete this tutorial. Download the following packages from the web portal: 50 | 51 | * `lsf10.1_lsfinstall_linux_x86_64.tar.Z` 52 | * `lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z` 53 | * `lsf_std_entitlement.dat` or `lsf_adv_entitlement.dat` 54 | * `lsf10.1_linux2.6-glibc2.3-x86_64-xxxxxx.tar.Z` (latest Fix Pack) 55 | * `lsf10.1_linux2.6-glibc2.3-x86_64-600877.tar.Z` (patch to address Log4J CVE-2021-44228 security vulnerability) 56 | 57 | 1. Download and install the [NICE DCV remote desktop native client](https://download.nice-dcv.com) on the computer you will be using for this workshop. 58 | 59 | 1.
Create an SSH key pair by following the [Amazon EC2 Key Pairs documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair). 60 | 61 | ## Deploy the environment 62 | 63 | ### Step 1. Sign in to your AWS account 64 | 65 | 1. Sign in to the AWS account provided to you for this workshop. 66 | 67 | ### Step 2. Upload LSF Software Packages and Entitlement File 68 | 69 | 1. Create a new S3 bucket in the workshop account. 70 | 1. Upload the required LSF software packages and the LSF entitlement file into the new S3 bucket. 71 | 72 | ### Step 3. Subscribe to the Required AMIs 73 | 74 | This workshop requires a subscription to the following Amazon Machine Images (AMIs) in AWS Marketplace. AMIs are images that are used to boot the instances (virtual servers) in AWS. They also contain software required to run the workshop. There is no additional cost to use these AMIs. 75 | 76 | * **AWS FPGA Developer AMI**. This AMI provides the pre-licensed Xilinx Vivado EDA tool suite running on CentOS 7.7. 77 | * **CentOS 7 (x86_64) - with Updates HVM AMI**. This is the official CentOS 7 image. 78 | 79 | Be sure you are logged into the workshop AWS account, and follow these instructions to subscribe: 80 | 81 | 1. Open the page for the [AWS FPGA Developer AMI](https://aws.amazon.com/marketplace/pp/B06VVYBLZZ) in AWS Marketplace, and then choose **Continue to Subscribe**. 82 | 83 | 1. Review the terms and conditions for software usage, and then choose **Accept Terms**. You will get a confirmation page, and an email confirmation will be sent to the account owner. For detailed subscription instructions, see the [AWS Marketplace documentation](https://aws.amazon.com/marketplace/help/200799470). 84 | 85 | 1. When the subscription process is complete, exit out of AWS Marketplace without further action. **Do not** click **Continue to Launch**; the workshop CloudFormation templates will deploy the AMI for you. 86 | 87 | 1. Repeat steps 1 through 3 to subscribe to the [CentOS 7 (x86_64) - with Updates HVM](https://aws.amazon.com/marketplace/pp/B00O7WM7QW) AMI. 88 | 89 | 1. Verify the subscriptions in the [Marketplace dashboard](https://console.aws.amazon.com/marketplace/home) within the AWS Console. 90 | * Click on **Manage subscriptions** to confirm that the two AMI subscriptions are active in your account. 91 | 92 | ### Step 4. Launch the Cluster 93 | 94 | **Note** The instructions in this section reflect the new version of the AWS CloudFormation console. If you’re using the original console, some of the user interface elements might be different. You can switch to the new console by selecting **New console** from the **CloudFormation** menu. 95 | 96 | 1. Click the **Deploy to AWS** button below to start the CloudFormation deployment process. The link will take you to the AWS CloudFormation console with the path to the deployment template preloaded. 97 | 98 | [![Launch Stack](../../../shared/images/deploy_to_aws.png)](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=aws-eda-lsf-workshop&templateURL=https://aws-eda-workshop-files.s3.amazonaws.com/workshops/eda-workshop-lsf/templates/eda-lsf-simple-workshop.yaml) 99 | 100 | The cluster infrastructure is deployed in the **US East (N. Virginia)** AWS Region by default. 101 | 102 | 1. In the **Specify template** section of the **Create stack** page, keep the default setting for the template URL, and then choose **Next**. 103 | 104 | 1.
On the **Specify stack details** page, provide values for the parameters in the table below. For all other parameters, it is recommended that you keep the default settings. 105 | 106 | |Parameter|Notes| 107 | |---|---| 108 | |Cluster VPC|Select the Virtual Private Cloud network where the infrastructure will be provisioned. You should see only one option in the pop-up menu.| 109 | |LSF master subnet|Select the subnet where the LSF master instance will be launched.| 110 | |Compute node subnet|Select the subnet where LSF will provision its execution hosts. It is recommended to choose the same subnet you selected for the LSF master.| 111 | |Source IP|Enter the internet-facing IP from which you will log into servers in the cloud cluster. You can use http://checkip.amazonaws.com or a similar service to discover your internet-facing IP address.| 112 | |EC2 Key Pair|Select the key pair you created earlier in this tutorial. You should see only one option here.| 113 | |LSF install package|Enter the S3 protocol URL for your `lsf10.1_lsfinstall_linux_x86_64.tar.Z` package. Select the package object in the S3 console, choose Copy Path, and paste it here.| 114 | |LSF binary package|Enter the S3 protocol URL for your `lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z` package. Select the package object in the S3 console, choose Copy Path, and paste it here.| 115 | |LSF entitlement file|Enter the S3 protocol URL for your LSF entitlement file. This should be either `lsf_std_entitlement.dat` or `lsf_adv_entitlement.dat`. Select the package object in the S3 console, choose Copy Path, and paste it here.| 116 | |LSF Fix Pack|Enter the S3 protocol URL for the LSF Fix Pack package. Select the package object in the S3 console, choose Copy Path, and paste it here.| 117 | |DCV login password|Enter a password for the DCV user. Note the password complexity requirements in the parameter description.| 118 | 119 | When you finish reviewing and customizing the parameters, choose **Next**. 120 | 121 | 1. On the **Configure stack options** page, you can specify [tags](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-resource-tags.html) (key-value pairs) for resources in your stack. We recommend setting **key** to `env` and **value** to `aws-lsf-eda-workshop` or something similar. This will help to identify resources created by this tutorial. When you're done, click **Add tag**. 122 | 123 | 1. Under **Stack creation options**, disable **Rollback on failure**. This will leave the resources up for debugging in case an error occurs during the provisioning process. Choose **Next**. 124 | 125 | 1. On the **Review** page, review and confirm the stack details. Click the **Previous** button at the bottom of the page if you need to return to previous pages to make changes. Under **Capabilities** at the very bottom, select the check box to acknowledge that the template will create IAM resources. 126 | 127 | 1. Choose **Create stack** to deploy the stack. Stack creation will take approximately 30 minutes to complete. 128 | 129 | 1. Monitor the status of the stack. When the status is **CREATE\_COMPLETE**, the cluster is ready. 130 | 131 | 1. Use the URLs displayed in the **Outputs** tab for the stack to view the resources that were created. 132 | 133 | ### Step 5. Log into the cluster 134 | 135 | 1. Log into the login/remote desktop server with the DCV client. 136 | 137 | 1. In the CloudFormation console, click on the **aws-eda-lsf-workshop** stack and choose the **Outputs** tab. 138 | 1.
Copy the IP:port in the **Value** column for **LoginServerRemoteDesktop**. This is the public IP of the login server that was deployed by the CloudFormation stack. 139 | 1. Launch the DCV client and paste the IP into the server field. Click **Connect** and then **Proceed**. 140 | 1. Enter the DCV username and password you provided in **Step 4** above in the credentials fields. 141 | 1. Next, you should see a blue lock screen with a clock. Click on it, enter your DCV password once again, and click the **Unlock** button. 142 | 1. Close the **Getting Started** window to reveal the GNOME desktop. 143 | 144 | 1. Open a new terminal and run the `lsid` command to verify that LSF installed properly and is running. You should see output similar to the following: 145 | 146 | ```text 147 | $ lsid 148 | IBM Spectrum LSF Standard 10.1.0.0, Jul 08 2016 149 | Copyright International Business Machines Corp. 1992, 2016. 150 | US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. 151 | 152 | My cluster name is mycluster 153 | My master name is ip-172-31-15-186 154 | ``` 155 | 156 | ## Run workload 157 | 158 | Move on to the [next tutorial](run-simple.md) to run logic simulations in your new elastic LSF cluster in the AWS cloud. 159 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/images/aws-eda-workshop-full-workflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-eda-workshops/6c9a01305c54a3036924fc8d89e7197a2c97d001/workshops/eda-workshop-lsf/docs/images/aws-eda-workshop-full-workflow.png -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/images/eda-lsf-workshop-diagram-3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-eda-workshops/6c9a01305c54a3036924fc8d89e7197a2c97d001/workshops/eda-workshop-lsf/docs/images/eda-lsf-workshop-diagram-3.png -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/images/eda-lsf-workshop-simple-diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-eda-workshops/6c9a01305c54a3036924fc8d89e7197a2c97d001/workshops/eda-workshop-lsf/docs/images/eda-lsf-workshop-simple-diagram.png -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/images/eda-lsf-workshop-workflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-eda-workshops/6c9a01305c54a3036924fc8d89e7197a2c97d001/workshops/eda-workshop-lsf/docs/images/eda-lsf-workshop-workflow.png -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/lsf-spot.md: -------------------------------------------------------------------------------- 1 | # LSF Resource Connector and EC2 Spot 2 | 3 | ## Spot Fleet Allocation Strategies 4 | 5 | LSF Resource Connector uses AWS EC2 Spot Fleet for requesting Spot instances. However, it does not support all of the features of Spot Fleet, which could pose some challenges as you start to scale out on Spot.
Specifically, Spot Fleet allocation strategies determine how Spot Fleet fulfills your request. RC currently supports only the first two allocation strategies below: 6 | 7 | * `lowestPrice` (SUPPORTED IN RC) 8 | 9 | The Spot Instances come from the pool with the lowest price. This is the default strategy. 10 | 11 | * `diversified` (SUPPORTED IN RC) 12 | 13 | The Spot Instances are distributed across all pools. 14 | 15 | * `capacityOptimized` (NOT SUPPORTED IN RC) 16 | 17 | The Spot Instances come from the pool with optimal capacity for the number of instances that are launching. Also, by offering the possibility of fewer interruptions, the capacityOptimized strategy can lower the overall cost of your workload. 18 | 19 | As you start to do large-scale runs on Spot, in most cases `capacityOptimized` will be what you want in order to acquire the most capacity with the fewest Spot interruptions. 20 | 21 | ## Handling of Spot Terminations 22 | 23 | * Jobs are requeued to the top of the queue by default. 24 | 25 | ## Spot Fleet Requests 26 | 27 | * RC's default maximum Spot Fleet request is 300 instances at a time. It will submit additional, smaller requests to reach the target capacity. Override this default with `RC_MAX_REQUESTS` in `lsb.params`. Also, be aware of any instance limits in `policy_config.json`, `awsprov_templates.json`, or `LSB_RC_MAX_INSTANCES_PER_TEMPLATE` in `lsf.conf`. 28 | 29 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/run-simple.md: -------------------------------------------------------------------------------- 1 | # Run Example Simulation Workload 2 | 3 | ## Overview 4 | 5 | This tutorial provides instructions for running an example logic simulation workload in the EDA computing environment created in the [deployment tutorial](deploy-simple.md) included in this workshop. The example workload uses example designs and IP contained in the public **AWS F1 FPGA Development Kit** and the **Xilinx Vivado** EDA software suite provided by the **AWS FPGA Developer AMI** that you subscribed to in the first tutorial. Although you'll be using data and tools from AWS FPGA developer resources, you will not be running on the F1 FPGA instance or executing any type of FPGA workload; we're simply running software simulations on EC2 compute instances using the design data, IP, and EDA tools and flows that these kits provide. 6 | 7 | **Note** There are no additional charges to use the **AWS F1 FPGA Development Kit** or the **Xilinx Vivado** tools in the **AWS FPGA Developer AMI**. You are only charged for the underlying AWS resources consumed by running the AMI and included software. 8 | 9 | ### Step 1. Log into the DCV remote desktop session 10 | 11 | 1. Download and install the [NICE DCV remote desktop native client](https://download.nice-dcv.com) on your local laptop/desktop computer. 12 | 1. Launch the DCV client application. 13 | 1. Paste the public IP address of the **Login Server** into the field. Click "Trust & Connect" when prompted. 14 | 1. Enter the Username and Password. You can find these credentials in AWS Secrets Manager in the AWS Console: 15 | 1. Go to the Secrets Manager service and select the **DCVCredentialsSecret** secret. 16 | 1. Click on the **Retrieve secret value** button. 17 | 1. Copy the **username** and **password** and paste them into the appropriate DCV client fields. 18 | 1. If you have trouble connecting, ensure the security group on the login server includes the IP address of your client.
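If you'd rather skip the console, the same credentials can be fetched with the AWS CLI. A sketch, assuming the secret is named `DCVCredentialsSecret` as shown above (your deployment may prefix the name with the stack or cluster name):

```bash
# Fetch the DCV username/password JSON from AWS Secrets Manager.
# Secret id assumed from the console steps above; adjust for any
# stack- or cluster-name prefix in your account.
aws secretsmanager get-secret-value \
  --secret-id DCVCredentialsSecret \
  --query SecretString \
  --output text
```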
19 | 20 | ### Step 2. Clone the AWS F1 FPGA Development Kit repo 21 | This repo contains example design data, IP, and a simple workflow to execute verification tests at scale. 22 | 23 | 1. Open a new terminal in the DCV remote desktop session that you established in the previous step, and clone the example workload from the `aws-fpga-sa-demo` GitHub repo into the `proj` directory on the NFS file system. The default location is `/fsxn/proj`. 24 | 25 | ```bash 26 | cd /fsxn/proj 27 | git clone https://github.com/morrmt/aws-fpga-sa-demo.git 28 | ``` 29 | 30 | 1. Change into the repo's workshop directory 31 | 32 | `cd /fsxn/proj/aws-fpga-sa-demo/eda-workshop` 33 | 34 | ### Step 3. Run build job 35 | 36 | This first job will set up the runtime environment for the simulations that you will submit to LSF in Step 4 below. 37 | 38 | 1. Run the `bhosts` command. Notice that the LSF master is the only host in the cluster. 39 | 1. **Submit the setup job into LSF**. The `--scratch-dir` should be the path to the scratch directory you defined when launching the CloudFormation stack in the previous tutorial. The default is `/fsxn/scratch`. 40 | 41 | `bsub -R aws -J "setup" ./run-sim.sh --scratch-dir /fsxn/scratch` 42 | 43 | 1. **Watch job status**. This job will generate demand to LSF Resource Connector for an EC2 instance. Shortly after you submit the job, you should see a new "LSF Exec Host" instance in the EC2 Dashboard in the AWS console. It should take 2-5 minutes for this new instance to join the cluster and accept the job. Use the `bjobs` command to watch the status of the job. Once it enters the `RUN` state, move on to the next step. 44 | 45 | ### Step 4. Run verification tests at scale 46 | 47 | Now we are ready to scale out the simulations. As with the build job above, when these jobs hit the queue, LSF will generate demand for EC2 instances, and Resource Connector will start up the appropriate number and type of instances to satisfy the pending jobs in the queue. 48 | 49 | 1. Run the `bhosts` command. You should see two hosts now -- the LSF master and the execution host for the setup job. 50 | 1. **Submit a large job array**. This job array will spawn 100 verification jobs. These jobs use a dependency condition so that they will be dispatched only after the build job above completes successfully. The build job's run time is about 15 minutes. 51 | 52 | `bsub -R aws -J "regress[1-100]" -w "done(setup)" ./run-sim.sh --scratch-dir /fsxn/scratch` 53 | 54 | **Option: Specify an instance type** 55 | LSF will choose the best instance type based on the LSF Resource Connector configuration, but there may be situations where you may want to target a particular instance type for a workload. This workshop has been configured to allow you to override the default behavior and specify the desired instance type. The following instance types are supported in this workshop: `z1d_2xlarge`, `m5_xlarge`, `m5_2xlarge`, `c5_2xlarge`, `r5_12xlarge`, and `r5_24xlarge`. Use the `instance_type` resource in the `bsub` resource requirement string to request one of these instances. For example: 56 | 57 | `bsub -R "select[aws && instance_type==z1d_2xlarge]" -J "regress[1-100]" -w "done(setup)" ./run-sim.sh --scratch-dir /fsxn/scratch` 58 | 59 | 1. Check job status 60 | 61 | `bjobs -A` 62 | 63 | 1. Check execution host status 64 | 65 | `bhosts -w` 66 | 67 | 1. Check cluster status 68 | 69 | `badmin showstatus` 70 | 71 | 1.
Check LSF Resource Connector status 72 | 73 | `badmin rc view` 74 | `badmin rc error` 75 | 76 | About 10 minutes after the jobs complete, LSF Resource Connector will begin terminating the idle EC2 instances. 77 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/docs/run-workload.md: -------------------------------------------------------------------------------- 1 | # Run Example Simulation Workload 2 | 3 | ## Overview 4 | 5 | This tutorial provides instructions for running an example logic simulation workload in the EDA computing environment created in the [deployment tutorial](deploy-environment.md) included in this workshop. The example workload uses the designs and IP contained in the public **AWS F1 FPGA Development Kit** and the **Xilinx Vivado** EDA software suite provided by the **AWS FPGA Developer AMI** that you subscribed to in the first tutorial. Although you'll be using data and tools from AWS FPGA developer resources, you will not be running on the F1 FPGA instance or executing any type of FPGA workload; we're simply running software simulations on EC2 compute instances using the design data, IP, and software that these kits provide for no additional charge. 6 | 7 | **Note** There are no additional charges to use the AWS F1 FPGA Development Kit or the **Xilinx Vivado** tools in the AWS FPGA Developer AMI. You are only charged for the underlying AWS resources consumed by running the AMI and included software. 8 | 9 | ### Step 1. Log into the DCV remote desktop session 10 | 11 | 1. Download and install the [NICE DCV remote desktop native client](https://download.nice-dcv.com) on your local laptop/desktop computer. 12 | 1. Launch the DCV client application. 13 | 1. Paste the public IP address of the **Login Server** into the field. Click "Trust & Connect" when prompted. 14 | 1. Enter the Username and Password. You can find these credentials in **AWS Secrets Manager** in the AWS Console: 15 | 1. Go to the Secrets Manager service and select the secret whose name ends in **DCVCredentialsSecret**. 16 | 1. Click on the **Retrieve secret value** button. 17 | 1. Copy the **username** and **password** and paste them into the appropriate DCV client fields. 18 | 1. If you have trouble connecting, ensure the security group on the login server includes the IP address of your client. 19 | 20 | ### Step 2. Clone the AWS F1 FPGA Development Kit repo 21 | 22 | 23 | 1. Open a new terminal in the DCV remote desktop session and clone the example workload from the `aws-fpga-sa-demo` GitHub repo into the `proj` directory on the NFS file system. The default location is `/fsxn/proj`. 24 | 25 | ```bash 26 | cd /fsxn/proj 27 | git clone https://github.com/morrmt/aws-fpga-sa-demo.git 28 | ``` 29 | 30 | 1. Change into the repo's workshop directory 31 | 32 | `cd /fsxn/proj/aws-fpga-sa-demo/eda-workshop` 33 | 34 | ### Step 3. Run setup job 35 | 36 | This first job will set up the runtime environment for the simulations that you will submit to LSF in Step 4 below. 37 | 38 | 1. **Submit the setup job into LSF**. The `--scratch-dir` should be the path to the scratch directory you defined when launching the CloudFormation stack in the previous tutorial. The default is `/fsxn/scratch`. 39 | 40 | `bsub -R aws -J "setup" ./run-sim.sh --scratch-dir /fsxn/scratch` 41 | 42 | 1. **Watch job status**. This job will generate demand to LSF Resource Connector for an EC2 instance.
Shortly after you submit the job, you should see a new "LSF Exec Host" instance in the EC2 Dashboard in the AWS console. It should take 2-5 minutes for this new instance to join the cluster and accept the job. Use the `bjobs` command to watch the status of the job. Once it enters the `RUN` state, move on to the next step. 43 | 44 | ### Step 4. Run verification tests at scale 45 | 46 | Now we are ready to scale out the simulations. As with the setup job above, when these jobs hit the queue, LSF will generate demand for EC2 instances, and Resource Connector will start up the appropriate number and type of instances to satisfy the pending jobs in the queue. 47 | 48 | 1. **Submit a large job array**. This job array will spawn 100 verification jobs. These jobs will be dispatched only after the setup job above completes successfully. Again, the `--scratch-dir` should be the path to the scratch directory you used above. 49 | 50 | `bsub -R aws -J "regress[1-100]" -w "done(setup)" ./run-sim.sh --scratch-dir /fsxn/scratch` 51 | 52 | **Option: Specify an instance type** 53 | LSF will choose the best instance type based on the LSF Resource Connector configuration, but there may be situations where you may want to target a particular instance type for a workload. This workshop has been configured to allow you to override the default behavior and specify the desired instance type. The following instance types are supported in this workshop: `z1d_2xlarge`, `m5_xlarge`, `m5_2xlarge`, `c5_2xlarge`, `r5_12xlarge`, and `r5_24xlarge`. Use the `instance_type` resource in the `bsub` resource requirement string to request one of these instances. For example: 54 | 55 | `bsub -R "select[aws && instance_type==z1d_2xlarge]" -J "regress[1-100]" -w "done(setup)" ./run-sim.sh --scratch-dir /fsxn/scratch` 56 | 57 | 1. Check job status 58 | 59 | `bjobs -A` 60 | 61 | 1. Check execution host status 62 | 63 | `bhosts -w` 64 | 65 | 1. Check cluster status 66 | 67 | `badmin showstatus` 68 | 69 | 1. Check LSF Resource Connector status 70 | 71 | 72 | `badmin rc view` 73 | 74 | `badmin rc error` 75 | 76 | 1. View LSF Resource Connector template configuration 77 | 78 | `badmin rc view -c templates` 79 | 80 | About 10 minutes after the jobs complete, LSF Resource Connector will begin terminating the idle EC2 instances. 81 | 82 | ### Step 5. Clean up 83 | 84 | To help prevent unwanted charges to your AWS account, you can delete the AWS resources that you used for this tutorial. 85 | 86 | 1. Delete the parent stack. 87 | 88 | 1. Delete orphaned EBS volumes. The FPGA AMI doesn't delete them on instance termination; see the sketch below for a quick way to find them.
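A quick way to find those orphaned volumes before deleting them is to list volumes in the `available` state. A sketch; note that other unattached volumes in the account will also appear, so review the output carefully (the `delete.ebs.vols.py` helper in this repo's `scripts` directory automates a tag-filtered version of this cleanup):

```bash
# List unattached (orphaned) EBS volumes in the current region.
# Review the output before deleting anything -- unrelated volumes
# in the 'available' state will show up here too.
aws ec2 describe-volumes \
  --filters "Name=status,Values=available" \
  --query 'Volumes[].[VolumeId,Size,CreateTime]' \
  --output table
```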
89 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/scripts/config-lsf.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | echo 'CONFIGURE LSF' 3 | source $LSF_INSTALL_DIR/conf/profile.lsf 4 | echo "LSF_ENVDIR=$LSF_ENVDIR" 5 | 6 | # Uncomment params to support dynamic hosts in lsf.cluster.* 7 | sed -i -e 's/#\sLSF_HOST_ADDR_RANGE/LSF_HOST_ADDR_RANGE/' \ 8 | -e 's/#\sFLOAT_CLIENTS/FLOAT_CLIENTS/' \ 9 | $LSF_ENVDIR/lsf.cluster.* && \ 10 | 11 | echo "Updated lsf.cluster file:" && \ 12 | cat $LSF_ENVDIR/lsf.cluster.* && \ 13 | 14 | # lsf.conf 15 | # Set logging to local file system 16 | sed -i -e 's|^LSF_LOGDIR.*|LSF_LOGDIR=/var/log/lsf|' $LSF_ENVDIR/lsf.conf && \ 17 | 18 | cat << EOF >> $LSF_ENVDIR/lsf.conf 19 | LSB_RC_EXTERNAL_HOST_FLAG=aws 20 | LSB_RC_EXTERNAL_HOST_IDLE_TIME=1 # terminate instance after 1 min of idle 21 | LSB_RC_QUERY_INTERVAL=15 22 | LSB_RC_UPDATE_INTERVAL=10 23 | LSF_DYNAMIC_HOST_TIMEOUT=10m # Wait time before removing unavailable dynamic hosts 24 | LSF_DYNAMIC_HOST_WAIT_TIME=3 # time in sec that a dynamic host waits before communicating 25 | LSF_LOCAL_RESOURCES="[resource aws] [type LINUX64]" # Adds 'aws' boolean to dynamic hosts 26 | LSF_MQ_BROKER_HOSTS=$HOSTNAME # start mqtt broker for bhosts -rc and bhosts -rconly commands to work. 27 | LSF_STRIP_DOMAIN=.ec2.internal:.$AWS_CFN_STACK_REGION.compute.internal 28 | MQTT_BROKER_HOST=$HOSTNAME 29 | MQTT_BROKER_PORT=1883 30 | EOF 31 | 32 | # Dedup the lsf.conf file 33 | awk '!a[$0]++' $LSF_ENVDIR/lsf.conf > /tmp/lsf.conf.deduped && \ 34 | mv /tmp/lsf.conf.deduped $LSF_ENVDIR/lsf.conf && \ 35 | 36 | echo "Updated $LSF_ENVDIR/lsf.conf:" && \ 37 | cat $LSF_ENVDIR/lsf.conf 38 | 39 | # Copy other pre-configured lsf config files 40 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/lsf.shared \ 41 | $LSF_ENVDIR/lsf.shared && \ 42 | sed -i -e "s/^_CFN_LSF_CLUSTER_NAME_/$LSF_CLUSTER_NAME/" $LSF_ENVDIR/lsf.shared && \ 43 | 44 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/lsb.queues \ 45 | $LSF_ENVDIR/lsbatch/$LSF_CLUSTER_NAME/configdir/lsb.queues && \ 46 | 47 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/lsb.modules \ 48 | $LSF_ENVDIR/lsbatch/$LSF_CLUSTER_NAME/configdir/lsb.modules && \ 49 | 50 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/lsb.params \ 51 | $LSF_ENVDIR/lsbatch/$LSF_CLUSTER_NAME/configdir/lsb.params 52 | 53 | 54 | # CONFIGURE IBM LSF RESOURCE CONNECTOR FOR AWS 55 | # Sets AWS as the sole host provider 56 | echo 'CONFIGURE IBM LSF RESOURCE CONNECTOR FOR AWS' 57 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/hostProviders.json \ 58 | $LSF_ENVDIR/resource_connector/hostProviders.json && \ 59 | 60 | # awsprov_config.json 61 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/awsprov_config.json \ 62 | $LSF_ENVDIR/resource_connector/aws/conf/awsprov_config.json && \ 63 | sed -i -e "s/_CFN_AWS_REGION_/$AWS_CFN_STACK_REGION/" $LSF_ENVDIR/resource_connector/aws/conf/awsprov_config.json && \ 64 | 65 | # awsprov_templates.json 66 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/awsprov_templates_fleet.json \ 67 | $LSF_ENVDIR/resource_connector/aws/conf/awsprov_templates.json && \ 68 | 69 | sed -i -e "s|%CFN_COMPUTE_AMI%|$CFN_COMPUTE_NODE_AMI|" \ 70 | -e "s|%CFN_COMPUTE_NODE_SUBNET%|$CFN_COMPUTE_NODE_SUBNET|" \ 71 | -e
"s|%CFN_ADMIN_KEYPAIR%|$CFN_EC2_KEY_PAIR|" \ 72 | -e "s|%CFN_COMPUTE_SECURITY_GROUP_ID%|$CFN_COMPUTE_NODE_SG_ID|" \ 73 | -e "s|%CFN_LSF_COMPUTE_NODE_INSTANCE_PROFILE_ARN%|$CFN_COMPUTE_NODE_INSTANCE_PROFILE_ARN|" \ 74 | -e "s|%CFN_LSF_CLUSTER_NAME%|$LSF_CLUSTER_NAME|" \ 75 | -e "s|%CFN_FSXN_SVM_DNS_NAME%|$NFS_DNS_NAME|" \ 76 | -e "s|%CFN_NFS_MOUNT_POINT%|$NFS_MOUNT_POINT|" \ 77 | -e "s|%CFN_LSF_INSTALL_DIR%|$LSF_INSTALL_DIR|" \ 78 | -e "s|%CFN_DCV_USER_NAME%|$CFN_DCV_USERNAME|" \ 79 | -e "s|%CFN_LSF_COMPUTE_NODE_SPOT_FLEET_ROLE_ARN%|$CFN_COMPUTE_SPOT_FLEET_ROLE_ARN|" \ 80 | $LSF_ENVDIR/resource_connector/aws/conf/awsprov_templates.json && \ 81 | 82 | # ec2-fleet-config.json 83 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/ec2-fleet-config.json \ 84 | $LSF_ENVDIR/resource_connector/aws/conf/ec2-fleet-config.json && \ 85 | sed -i -e "s|%CFN_COMPUTE_AMI%|$CFN_COMPUTE_NODE_AMI|" \ 86 | -e "s|%CFN_COMPUTE_NODE_SUBNET%|$CFN_COMPUTE_NODE_SUBNET|" \ 87 | -e "s|%CFN_LAUNCH_TEMPLATE_ID%|$CFN_LAUNCH_TEMPLATE_ID|" $LSF_ENVDIR/resource_connector/aws/conf/ec2-fleet-config.json 88 | 89 | # user_data script that RC executes on compute nodes 90 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/lsf/user_data.sh \ 91 | $LSF_INSTALL_DIR/10.1/resource_connector/aws/scripts/user_data.sh && \ 92 | chmod +x $LSF_INSTALL_DIR/10.1/resource_connector/aws/scripts/user_data.sh -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/scripts/delete.ebs.vols.py: -------------------------------------------------------------------------------- 1 | # Requires python 3.6. 2 | # 3 | # This is example Lambda code for deleting orphaned EBS volumes that are created by 4 | # the AWS FPGA AMI. 5 | # WARNING: 6 | # THIS SCRIPT WILL DELETE EBS VOLUMES. PLEASE TEST BEFORE USE. 7 | 8 | import boto3 9 | import os 10 | import sys 11 | 12 | ec2 = boto3.resource('ec2',region_name='$REGION') 13 | 14 | def lambda_handler(event, context): 15 | for vol in ec2.volumes.all(): 16 | print (vol) 17 | if vol.state=='available': 18 | if vol.tags is None: 19 | print ("No tag. 
Skipping " +vol.id) 20 | continue 21 | for tag in vol.tags: 22 | if tag['Key'] == '$mykey': 23 | value=tag['Value'] 24 | if value == '$myvalue' and vol.state=='available': 25 | vid=vol.id 26 | v=ec2.Volume(vol.id) 27 | print ("Deleting", vid, "based on tag", value, ".") 28 | #v.delete() 29 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/scripts/install-lsf.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | mkdir -p $LSF_INSTALL_DIR 3 | mkdir -p /var/log/lsf && chmod 777 /var/log/lsf 4 | 5 | # Add LSF admin account 6 | id -u $LSF_ADMIN &>/dev/null || adduser -m -u 1500 $LSF_ADMIN 7 | echo "source $LSF_INSTALL_DIR/conf/profile.lsf" >> /etc/bashrc 8 | 9 | # Add to bashrc if not yet exists 10 | grep -qxF "source $LSF_INSTALL_DIR/conf/profile.lsf" /etc/bashrc || \ 11 | echo "source $LSF_INSTALL_DIR/conf/profile.lsf" >> /etc/bashrc 12 | 13 | # Download customer-provided LSF binaries and entitlement file 14 | aws --quiet s3 cp $CFN_LSF_INSTALL_URI /tmp 15 | aws --quiet s3 cp $CFN_LSF_BIN_URI /tmp 16 | aws --quiet s3 cp $CFN_LSF_ENTITLEMENT_URI /tmp 17 | aws --quiet s3 cp $CFN_LSF_FIXPACK_URI /tmp 18 | 19 | cd /tmp 20 | tar xf $LSF_INSTALL_PKG 21 | cp $LSF_BIN_PKG lsf10.1_lsfinstall 22 | cd lsf10.1_lsfinstall 23 | 24 | # Create LSF installer config file 25 | cat < install.config 26 | LSF_TOP="$LSF_INSTALL_DIR" 27 | LSF_ADMINS="$LSF_ADMIN" 28 | LSF_CLUSTER_NAME=$LSF_CLUSTER_NAME 29 | LSF_MASTER_LIST="${HOSTNAME%%.*}" 30 | SILENT_INSTALL="Y" 31 | LSF_SILENT_INSTALL_TARLIST="ALL" 32 | ACCEPT_LICENSE="Y" 33 | LSF_ENTITLEMENT_FILE="/tmp/$LSF_ENTITLEMENT" 34 | EOF 35 | 36 | ./lsfinstall -f install.config 37 | 38 | # Setup LSF environment 39 | source $LSF_INSTALL_DIR/conf/profile.lsf 40 | 41 | # Install fix pack 42 | cd $LSF_INSTALL_DIR/10.1/install 43 | cp /tmp/$LSF_FP_PKG . 
44 | echo "schmod_demand.so" >> patchlib/daemonlists.tbl 45 | ./patchinstall --silent $LSF_FP_PKG 46 | 47 | ## Create Resource Connector config dir 48 | mkdir -p $LSF_ENVDIR/resource_connector/aws/conf 49 | chown -R $LSF_ADMIN:root $LSF_ENVDIR/resource_connector/aws -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/scripts/nfs-bootstrap-master.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Mount NFS file system for LSF install 3 | mkdir -p $NFS_MOUNT_POINT 4 | 5 | # Select mount options based on the file system type ("efs" or "fsx") in NFS_DNS_NAME 6 | if [[ "$NFS_DNS_NAME" == *"efs"* ]]; then 7 | JUNCTION_PATH="/" 8 | NFS_OPTIONS="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport" 9 | # Check if the NFS_DNS_NAME contains "fsx" 10 | elif [[ "$NFS_DNS_NAME" == *"fsx"* ]]; then 11 | JUNCTION_PATH="/vol1" 12 | NFS_OPTIONS="rsize=262144,wsize=262144,hard,vers=3,tcp,mountproto=tcp" 13 | else 14 | # Set a default value or handle other cases if needed 15 | JUNCTION_PATH="unknown" 16 | NFS_OPTIONS="unknown" 17 | fi 18 | 19 | # Mount the NFS export 20 | echo "Mounting $NFS_DNS_NAME:$JUNCTION_PATH" 21 | mount -t nfs -o $NFS_OPTIONS $NFS_DNS_NAME:$JUNCTION_PATH $NFS_MOUNT_POINT 22 | mkdir -p $NFS_MOUNT_POINT/tmp 23 | chmod a+w $NFS_MOUNT_POINT/tmp 24 | 25 | # add to fstab 26 | echo "$NFS_DNS_NAME:$JUNCTION_PATH $NFS_MOUNT_POINT nfs $NFS_OPTIONS 0 0" >> \ 27 | /etc/fstab -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/00-eda-lsf-full-workshop-master.yaml: -------------------------------------------------------------------------------- 1 | 2 | AWSTemplateFormatVersion: 2010-09-09 3 | Description: | 4 | Deploys a full EDA computing infrastructure that includes a new VPC, login server with 5 | remote desktop, LSF management server, and a shared NFS file system. 6 | 7 | This root stack launches a nested stack for each infrastructure component. 8 | 9 | **WARNING** This template creates AWS resources. 10 | You will be billed for the AWS resources used if you create a stack from this template.
11 | 12 | Metadata: 13 | AWS::CloudFormation::Interface: 14 | ParameterGroups: 15 | - Label: 16 | default: Network configuration 17 | Parameters: 18 | - VpcCIDR 19 | - PublicSubnet1CIDR 20 | - PrivateSubnet1CIDR 21 | - VpcAz 22 | - SshSource 23 | - AdminKeyPair 24 | - Label: 25 | default: File system configuration 26 | Parameters: 27 | - FileSystemMountPoint 28 | - ScratchDir 29 | - ProjectDir 30 | - Label: 31 | default: LSF configuration 32 | Parameters: 33 | - LSFInstallPath 34 | - LSFClusterName 35 | - CustomerLSFInstallUri 36 | - CustomerLSFBinsUri 37 | - CustomerLSFEntitlementUri 38 | - CustomerLSFFixPackUri 39 | - MasterInstanceType 40 | - ComputeAMI 41 | - Label: 42 | default: Login server configuration 43 | Parameters: 44 | - LoginServerInstanceType 45 | - LoginServerAMI 46 | - UserName 47 | 48 | ParameterLabels: 49 | AdminKeyPair: 50 | default: EC2 KeyPair 51 | SshSource: 52 | default: SSH source CIDR 53 | MasterInstanceType: 54 | default: Master instance type 55 | ComputeAMI: 56 | default: Compute node AMI 57 | LSFClusterName: 58 | default: Cluster name 59 | LSFInstallPath: 60 | default: LSF install path 61 | FileSystemMountPoint: 62 | default: File system mount point 63 | ScratchDir: 64 | default: Scratch subdirectory 65 | ProjectDir: 66 | default: Project subdirectory 67 | VpcCIDR: 68 | default: VPC CIDR range 69 | VpcAz: 70 | default: Availability zone 71 | PublicSubnet1CIDR: 72 | default: CIDR block for the public subnet 73 | PrivateSubnet1CIDR: 74 | default: CIDR block for the private subnet 75 | CustomerLSFInstallUri: 76 | default: LSF 10.1 install script package 77 | CustomerLSFBinsUri: 78 | default: LSF 10.1 Linux base distribution package 79 | CustomerLSFEntitlementUri: 80 | default: LSF 10.1 entitlement file 81 | CustomerLSFFixPackUri: 82 | default: LSF 10.1 Fix Pack 83 | LoginServerInstanceType: 84 | default: Login server instance type 85 | LoginServerAMI: 86 | default: Login server AMI 87 | UserName: 88 | default: DCV remote desktop login username 89 | 90 | Conditions: 91 | CreateFSxStack: !Equals 92 | - !Ref StorageType 93 | - FSxN 94 | CreateEfsStack: !Equals 95 | - !Ref StorageType 96 | - EFS 97 | 98 | Parameters: 99 | TemplateSourceBucketEndpoint: 100 | Description: Where to pull templates, configs, and scripts from. Format <bucket-name>.s3.<region>.amazonaws.com 101 | Default: 'aws-eda-workshop-files.s3.us-east-1.amazonaws.com' 102 | Type: String 103 | 104 | StorageType: 105 | Description: Storage type. 106 | Default: FSxN 107 | Type: String 108 | AllowedValues: 109 | - FSxN 110 | - EFS 111 | ConstraintDescription: must specify FSxN or EFS. 112 | 113 | AdminKeyPair: 114 | Description: "The name of an existing EC2 KeyPair to enable SSH access to the master server." 115 | Type: "AWS::EC2::KeyPair::KeyName" 116 | AllowedPattern: ".+" 117 | 118 | MasterInstanceType: 119 | Description: "The instance type for the LSF master host of the cluster." 120 | Type: String 121 | Default: "m5.2xlarge" 122 | AllowedValues: 123 | - m5.xlarge 124 | - m5.2xlarge 125 | - m6i.xlarge 126 | - m6i.2xlarge 127 | 128 | MasterServerAMI: 129 | Description: AMI (OS image) for the master server. 130 | Type: String 131 | Default: ALinux2 132 | AllowedValues: 133 | - ALinux2 134 | 135 | ComputeAMI: 136 | Description: |- 137 | 'FPGADev' provides access to the Xilinx Vivado EDA software. Use this when 138 | deploying the AWS EDA workshop. Choose 'CentOS' if you intend to run your 139 | own workloads in this environment. NOTE: You must first subscribe to these 140 | AMIs in the AWS Marketplace.
See included documentation for details. 141 | Type: String 142 | Default: FPGADev 143 | AllowedValues: 144 | - FPGADev 145 | - CentOS75 146 | - ALinux2 147 | 148 | LSFClusterName: 149 | Description: The name of the computing environment. This will also be the name of the LSF cluster. 150 | Type: String 151 | Default: myawscluster 152 | 153 | LSFInstallPath: 154 | Description: Path where LSF will be installed. 155 | Type: "String" 156 | Default: "/tools/ibm/lsf" 157 | AllowedPattern: ^/.+ 158 | 159 | FileSystemMountPoint: 160 | Description: The local directory on which the NFS file system is mounted 161 | Type: String 162 | Default: /fsxn 163 | AllowedPattern: ^/.+ 164 | 165 | ScratchDir: 166 | Description: The name for the runtime scratch data subdirectory 167 | Type: String 168 | Default: scratch 169 | AllowedPattern: ^.+ 170 | 171 | ProjectDir: 172 | Description: The name for the project design data subdirectory 173 | Type: String 174 | Default: proj 175 | AllowedPattern: ^.+ 176 | 177 | SshSource: 178 | Description: The CIDR range that is permitted to ssh into the infrastructure instances. 179 | Use your public IP address (http://checkip.amazonaws.com). 180 | Type: String 181 | Default: 0.0.0.0/32 182 | AllowedPattern: (\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/32 183 | ConstraintDescription: This must be a valid IP CIDR range of the form x.x.x.x/32. 184 | 185 | LoginServerInstanceType: 186 | Description: The instance type for the login server. 187 | Type: String 188 | Default: m5.4xlarge 189 | AllowedValues: 190 | - t3.medium 191 | - t3.xlarge 192 | - m4.xlarge 193 | - m4.2xlarge 194 | - m5.xlarge 195 | - m5.2xlarge 196 | - m5.4xlarge 197 | - c5d.9xlarge 198 | 199 | LoginServerAMI: 200 | Description: AMI (OS image) for the login server. This should be the same AMI that is 201 | used for the compute nodes. NOTE You must first subscribe to this 202 | AMI in the AWS Marketplace at https://aws.amazon.com/marketplace/pp/B06VVYBLZZ 203 | Type: String 204 | Default: FPGADev 205 | AllowedValues: 206 | - CentOS75 207 | - FPGADev 208 | - ALinux2 209 | 210 | VpcAz: 211 | Description: The availability zone for this compute environment 212 | Type: 'AWS::EC2::AvailabilityZone::Name' 213 | 214 | VpcCIDR: 215 | Default: 172.30.0.0/16 216 | Description: The IP range in CIDR notation for the new VPC. This should be a /16. 217 | Type: String 218 | AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ 219 | ConstraintDescription: The CIDR block parameter must be in the form x.x.x.x/16-28 220 | 221 | PublicSubnet1CIDR: 222 | Default: 172.30.32.0/24 223 | Description: The login/remote desktop server will reside in this subnet. 224 | Enter the IP range in CIDR notation for the public subnet. This should be a /24. 225 | Type: String 226 | AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ 227 | ConstraintDescription: The CIDR block parameter must be in the form x.x.x.x/16-28 228 | 229 | PrivateSubnet1CIDR: 230 | Default: 172.30.0.0/19 231 | Description: The LSF management, compute nodes, and NFS server will reside in this subnet. 232 | Enter the IP range in CIDR notation for the private subnet. This should be a /19.
233 | Type: String 234 | AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ 235 | ConstraintDescription: The CIDR block parameter must be in the form x.x.x.x/16-28 236 | 237 | CustomerLSFInstallUri: 238 | Description: The S3 URI to the LSF installer script package, lsf10.1_lsfinstall_linux_x86_64.tar.Z. 239 | Select object in the S3 console and choose Copy Path and paste here. 240 | Type: String 241 | Default: s3:///lsf10.1_lsfinstall_linux_x86_64.tar.Z 242 | AllowedPattern: s3\:\/\/.* 243 | ConstraintDescription: S3 path invalid. Please verify LSF package name matches 244 | the example in the parameter description. 245 | 246 | CustomerLSFBinsUri: 247 | Description: The S3 URI to the LSF 10.1 Linux 2.6 kernel glibc version 2.3 base distribution package, 248 | lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z. This must be a full distribution and not a patch 249 | or Fix Pack package. Select object in the S3 console and choose Copy Path and paste here. 250 | Type: String 251 | Default: s3:///lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z 252 | AllowedPattern: s3\:\/\/.* 253 | ConstraintDescription: S3 path invalid. Please verify LSF package name matches 254 | the example in the parameter description. 255 | 256 | CustomerLSFFixPackUri: 257 | Description: > 258 | The S3 URI to the LSF 10.1 Fix Pack package. This must be the latest cumulative Fix Pack package. 259 | Select package object in the console and choose Copy Path and paste here. 260 | Type: "String" 261 | Default: "s3:///lsf10.1_linux2.6-glibc2.3-x86_64.545500.tar.Z" 262 | AllowedPattern: s3\:\/\/.* 263 | 264 | CustomerLSFEntitlementUri: 265 | Description: The S3 URI to the LSF entitlement file, lsf_std_entitlement.dat or lsf_adv_entitlement.dat. 266 | Select object in the S3 console and choose Copy Path and paste here. 267 | Type: String 268 | Default: s3:///lsf_std_entitlement.dat 269 | AllowedPattern: s3\:\/\/.* 270 | ConstraintDescription: S3 path invalid. Please verify LSF file name matches 271 | the example in the parameter description. 272 | 273 | UserName: 274 | Default: simuser 275 | Description: User name for DCV remote desktop login. Default is "simuser".
276 | MinLength: '4' 277 | Type: String 278 | 279 | Resources: 280 | NetworkStack: 281 | Type: AWS::CloudFormation::Stack 282 | Properties: 283 | Parameters: 284 | VpcCIDR: !Ref VpcCIDR 285 | VpcAz: !Ref VpcAz 286 | PublicSubnet1CIDR: !Ref PublicSubnet1CIDR 287 | PrivateSubnet1CIDR: !Ref PrivateSubnet1CIDR 288 | LSFClusterName: !Ref LSFClusterName 289 | SshSource: !Ref SshSource 290 | TemplateURL: !Sub https://${TemplateSourceBucketEndpoint}/workshops/eda-workshop-lsf/templates/01-network.yaml 291 | 292 | LSFServiceStack: 293 | Type: AWS::CloudFormation::Stack 294 | DependsOn: NFSServerStack 295 | Properties: 296 | Parameters: 297 | TemplateSourceBucketEndpoint: !Ref TemplateSourceBucketEndpoint 298 | MasterInstanceType: !Ref MasterInstanceType 299 | MasterServerAMI: !Ref MasterServerAMI 300 | ComputeAMI: !Ref ComputeAMI 301 | LSFClusterName: !Ref LSFClusterName 302 | LSFInstallPath: !Ref LSFInstallPath 303 | FileSystemMountPoint: !Ref FileSystemMountPoint 304 | CustomerLSFInstallUri: !Ref CustomerLSFInstallUri 305 | CustomerLSFBinsUri: !Ref CustomerLSFBinsUri 306 | CustomerLSFFixPackUri: !Ref CustomerLSFFixPackUri 307 | CustomerLSFEntitlementUri: !Ref CustomerLSFEntitlementUri 308 | AdminKeyPair: !Ref AdminKeyPair 309 | DCVUserName: !Ref UserName 310 | TemplateURL: !Sub https://${TemplateSourceBucketEndpoint}/workshops/eda-workshop-lsf/templates/02-lsf-master.yaml 311 | 312 | NFSServerStack: 313 | Type: AWS::CloudFormation::Stack 314 | DependsOn: NetworkStack 315 | Properties: 316 | Parameters: 317 | LSFClusterName: !Ref LSFClusterName 318 | TemplateURL: !If [ 319 | CreateFSxStack, 320 | !Sub "https://${TemplateSourceBucketEndpoint}/workshops/eda-workshop-lsf/templates/fsxn-filesystem.yaml", 321 | !Sub "https://${TemplateSourceBucketEndpoint}/workshops/eda-workshop-lsf/templates/efs-filesystem.yaml" 322 | ] 323 | 324 | LoginServerStack: 325 | Type: AWS::CloudFormation::Stack 326 | DependsOn: LSFServiceStack 327 | Properties: 328 | Parameters: 329 | LoginServerInstanceType: !Ref LoginServerInstanceType 330 | LoginServerAMI: !Ref LoginServerAMI 331 | LSFClusterName: !Ref LSFClusterName 332 | AdminKeyPair: !Ref AdminKeyPair 333 | LSFInstallPath: !Ref LSFInstallPath 334 | FileSystemMountPoint: !Ref FileSystemMountPoint 335 | ScratchDir: !Ref ScratchDir 336 | ProjectDir: !Ref ProjectDir 337 | UserName: !Ref UserName 338 | TemplateURL: !Sub https://${TemplateSourceBucketEndpoint}/workshops/eda-workshop-lsf/templates/03-dcv-login-server.yaml 339 | 340 | Outputs: 341 | RootStackName: 342 | Description: The name of the root CloudFormation stack 343 | Value: !Ref 'AWS::StackName' 344 | Export: 345 | Name: !Join [ '-', [ !Ref LSFClusterName, "RootStackName" ] ] 346 | LoginServerSsh: 347 | Description: Login server SSH command 348 | Value: !Sub 'ssh -i /path/to/${AdminKeyPair}.pem centos@${LoginServerStack.Outputs.LoginServerPublicIp}' 349 | LoginServerRemoteDesktop: 350 | Description: Connect to the DCV Remote Desktop with this URL via web browser or native DCV client 351 | Value: !GetAtt LoginServerStack.Outputs.DCVConnectionLink 352 | SSHTunnelCommand: 353 | Description: > 354 | Command for setting up an SSH tunnel from your local host to the remote desktop. Use "localhost:18443" as 355 | the connection address in the DCV client. This is helpful if outbound port 8443 is blocked by a proxy. 
356 | Value: !Sub 'ssh -i /path/to/${AdminKeyPair}.pem -L 18443:localhost:8443 -l centos ${LoginServerStack.Outputs.LoginServerPublicIp}' 357 | 358 | 359 | 360 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/01-network.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | Description: | 3 | This template deploys a VPC with a public and private subnet in one 4 | Availability Zone, and an Internet Gateway with a default route to it in the public subnet. 5 | The template also deploys a NAT Gateway and a default route to it 6 | in the private subnet. Also, VPC security groups are created for the instances and FSxN file systems. 7 | 8 | **WARNING** This template creates AWS resources. 9 | You will be billed for the AWS resources used if you create a stack from this template. 10 | 11 | Metadata: 12 | Authors: 13 | Description: Matt Morris (morrmt@amazon.com) 14 | License: 15 | Description: | 16 | Copyright 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. 17 | 18 | Permission is hereby granted, free of charge, to any person obtaining a copy of 19 | this software and associated documentation files (the "Software"), to deal in 20 | the Software without restriction, including without limitation the rights to 21 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 22 | the Software, and to permit persons to whom the Software is furnished to do so. 23 | 24 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 25 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 26 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 27 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 28 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 29 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.' 30 | 31 | Parameters: 32 | VpcAz: 33 | Description: Choose an availability zone for this compute environment. 34 | Type: 'AWS::EC2::AvailabilityZone::Name' 35 | LSFClusterName: 36 | Default: LSFCluster 37 | Description: An environment name that will be prefixed to resource names 38 | Type: String 39 | VpcCIDR: 40 | Default: 172.30.0.0/16 41 | Description: Enter the IP range in CIDR notation for this VPC. This should be a /16. 42 | Type: String 43 | AllowedPattern: "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\\/(1[6-9]|2[0-8]))$" 44 | ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 45 | PublicSubnet1CIDR: 46 | Default: 172.30.32.0/24 47 | Description: Enter the IP range in CIDR notation for the public subnet. This should be a /24. 48 | Type: String 49 | AllowedPattern: "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\\/(1[6-9]|2[0-8]))$" 50 | ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 51 | PrivateSubnet1CIDR: 52 | Default: 172.30.0.0/19 53 | Description: Enter the IP range in CIDR notation for the private subnet. This should be a /19.
54 | Type: String 55 | AllowedPattern: "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\\/(1[6-9]|2[0-8]))$" 56 | ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 57 | SshSource: 58 | Description: "CIDR range that can ssh into the infrastructure instances. Use your public IP address (http://checkip.amazonaws.com)." 59 | Type: String 60 | Default: 0.0.0.0/32 61 | AllowedPattern: (\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2}) 62 | ConstraintDescription: must be a valid IP CIDR range of the form x.x.x.x/x. 63 | 64 | Resources: 65 | EDAVPC: 66 | Type: 'AWS::EC2::VPC' 67 | Properties: 68 | CidrBlock: !Ref VpcCIDR 69 | EnableDnsHostnames: true 70 | EnableDnsSupport: true 71 | Tags: 72 | - Key: Name 73 | Value: !Sub '${AWS::StackName}-${LSFClusterName}' 74 | 75 | DefaultPrivateRoute1: 76 | Type: 'AWS::EC2::Route' 77 | Properties: 78 | DestinationCidrBlock: 0.0.0.0/0 79 | NatGatewayId: !Ref NatGateway1 80 | RouteTableId: !Ref PrivateRouteTable1 81 | 82 | DefaultPublicRoute: 83 | Type: 'AWS::EC2::Route' 84 | DependsOn: InternetGatewayAttachment 85 | Properties: 86 | DestinationCidrBlock: 0.0.0.0/0 87 | GatewayId: !Ref InternetGateway 88 | RouteTableId: !Ref PublicRouteTable 89 | 90 | InternetGateway: 91 | Type: 'AWS::EC2::InternetGateway' 92 | Properties: 93 | Tags: 94 | - Key: Name 95 | Value: !Ref LSFClusterName 96 | 97 | InternetGatewayAttachment: 98 | Type: 'AWS::EC2::VPCGatewayAttachment' 99 | Properties: 100 | InternetGatewayId: !Ref InternetGateway 101 | VpcId: !Ref EDAVPC 102 | 103 | NatGateway1: 104 | Type: 'AWS::EC2::NatGateway' 105 | Properties: 106 | AllocationId: !GetAtt 107 | - NatGateway1EIP 108 | - AllocationId 109 | SubnetId: !Ref PublicSubnet1 110 | 111 | NatGateway1EIP: 112 | Type: 'AWS::EC2::EIP' 113 | DependsOn: InternetGatewayAttachment 114 | Properties: 115 | Domain: vpc 116 | 117 | NoIngressSecurityGroup: 118 | Type: 'AWS::EC2::SecurityGroup' 119 | Properties: 120 | GroupDescription: Security group with no ingress rule 121 | GroupName: no-ingress-sg 122 | VpcId: !Ref EDAVPC 123 | 124 | # Explicit 'retain' required for this route table. 125 | # FSxN can't delete its file systems without a route table. 
126 | PrivateRouteTable1: 127 | Type: 'AWS::EC2::RouteTable' 128 | DeletionPolicy: Retain 129 | UpdateReplacePolicy: Retain 130 | Properties: 131 | Tags: 132 | - Key: Name 133 | Value: !Sub '${LSFClusterName}PrivateRoutes' 134 | VpcId: !Ref EDAVPC 135 | 136 | PublicRouteTable: 137 | Type: 'AWS::EC2::RouteTable' 138 | Properties: 139 | Tags: 140 | - Key: Name 141 | Value: !Sub '${LSFClusterName}PublicRoutes' 142 | VpcId: !Ref EDAVPC 143 | 144 | PrivateSubnet1RouteTableAssociation: 145 | Properties: 146 | RouteTableId: !Ref PrivateRouteTable1 147 | SubnetId: !Ref PrivateSubnet1 148 | Type: 'AWS::EC2::SubnetRouteTableAssociation' 149 | 150 | PublicSubnet1RouteTableAssociation: 151 | Type: 'AWS::EC2::SubnetRouteTableAssociation' 152 | Properties: 153 | RouteTableId: !Ref PublicRouteTable 154 | SubnetId: !Ref PublicSubnet1 155 | 156 | PrivateSubnet1: 157 | Type: 'AWS::EC2::Subnet' 158 | Properties: 159 | AvailabilityZone: !Ref VpcAz 160 | CidrBlock: !Ref PrivateSubnet1CIDR 161 | MapPublicIpOnLaunch: false 162 | Tags: 163 | - Key: Name 164 | Value: !Sub '${LSFClusterName} Private Subnet' 165 | VpcId: !Ref EDAVPC 166 | 167 | PublicSubnet1: 168 | Type: 'AWS::EC2::Subnet' 169 | Properties: 170 | AvailabilityZone: !Ref VpcAz 171 | CidrBlock: !Ref PublicSubnet1CIDR 172 | MapPublicIpOnLaunch: true 173 | Tags: 174 | - Key: Name 175 | Value: !Sub '${LSFClusterName} Public Subnet' 176 | VpcId: !Ref EDAVPC 177 | 178 | S3VPCEndpoint: 179 | Type: 'AWS::EC2::VPCEndpoint' 180 | Properties: 181 | PolicyDocument: 182 | Version: 2012-10-17 183 | Statement: 184 | - Effect: Allow 185 | Principal: '*' 186 | Action: "*" 187 | Resource: "*" 188 | RouteTableIds: 189 | - !Ref PublicRouteTable 190 | ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3' 191 | VpcId: !Ref EDAVPC 192 | 193 | VPCFlowLog: 194 | Type: AWS::EC2::FlowLog 195 | Properties: 196 | DeliverLogsPermissionArn: !GetAtt FlowLogRole.Arn 197 | LogGroupName: !Ref FlowLogGroup 198 | ResourceId: !Ref EDAVPC 199 | ResourceType: VPC 200 | TrafficType: ALL 201 | 202 | FlowLogGroup: 203 | Type: 'AWS::Logs::LogGroup' 204 | Properties: 205 | RetentionInDays: 3 206 | 207 | FlowLogRole: 208 | Type: 'AWS::IAM::Role' 209 | Properties: 210 | AssumeRolePolicyDocument: 211 | Version: '2012-10-17' 212 | Statement: 213 | - Effect: Allow 214 | Principal: 215 | Service: 'vpc-flow-logs.amazonaws.com' 216 | Action: 'sts:AssumeRole' 217 | Policies: 218 | - PolicyName: 'flowlogs-policy' 219 | PolicyDocument: 220 | Version: '2012-10-17' 221 | Statement: 222 | - Effect: Allow 223 | Action: 224 | - 'logs:CreateLogStream' 225 | - 'logs:PutLogEvents' 226 | - 'logs:DescribeLogGroups' 227 | - 'logs:DescribeLogStreams' 228 | Resource: !GetAtt 'FlowLogGroup.Arn' 229 | 230 | LSFMasterSG: 231 | Type: AWS::EC2::SecurityGroup 232 | Properties: 233 | GroupDescription: "SG for LSF Master" 234 | VpcId: !Ref EDAVPC 235 | SecurityGroupEgress: 236 | - CidrIp: 0.0.0.0/0 237 | Description: Allows egress to all ports 238 | IpProtocol: "-1" 239 | 240 | LSFMasterSGRule01: 241 | Type: AWS::EC2::SecurityGroupIngress 242 | Properties: 243 | GroupId: !Ref LSFMasterSG 244 | Description: "SSH ingress" 245 | IpProtocol: tcp 246 | FromPort: 22 247 | ToPort: 22 248 | CidrIp: !Ref SshSource 249 | 250 | LSFMasterSGRule02: 251 | Type: AWS::EC2::SecurityGroupIngress 252 | Properties: 253 | GroupId: !Ref LSFMasterSG 254 | Description: "All traffic from LSF Compute Nodes" 255 | IpProtocol: "-1" 256 | SourceSecurityGroupId: !Ref LSFComputeNodeSG 257 | 258 | LSFMasterSGRule03: 259 | Type: AWS::EC2::SecurityGroupIngress 
260 | Properties: 261 | GroupId: !Ref LSFMasterSG 262 | Description: "All traffic from Login Server" 263 | IpProtocol: "-1" 264 | SourceSecurityGroupId: !Ref LoginServerSG 265 | 266 | LSFComputeNodeSG: 267 | Type: AWS::EC2::SecurityGroup 268 | Properties: 269 | GroupDescription: "SG for LSF Compute Nodes" 270 | VpcId: !Ref EDAVPC 271 | 272 | LSFComputeNodeSGRule01: 273 | Type: AWS::EC2::SecurityGroupIngress 274 | Properties: 275 | GroupId: !Ref LSFComputeNodeSG 276 | Description: "All traffic from LSF Master" 277 | IpProtocol: "-1" 278 | SourceSecurityGroupId: !Ref LSFMasterSG 279 | 280 | LSFComputeNodeSGRule02: 281 | Type: AWS::EC2::SecurityGroupIngress 282 | Properties: 283 | GroupId: !Ref LSFComputeNodeSG 284 | Description: "All traffic from other LSF exec hosts" 285 | IpProtocol: "-1" 286 | SourceSecurityGroupId: !Ref LSFComputeNodeSG 287 | 288 | LSFComputeNodeSGRule03: 289 | Type: AWS::EC2::SecurityGroupIngress 290 | Properties: 291 | GroupId: !Ref LSFComputeNodeSG 292 | Description: "SSH Ingress" 293 | IpProtocol: "tcp" 294 | FromPort: 22 295 | ToPort: 22 296 | CidrIp: !Ref SshSource 297 | 298 | LoginServerSG: 299 | Type: AWS::EC2::SecurityGroup 300 | Properties: 301 | GroupDescription: "SG for Login Servers" 302 | VpcId: !Ref EDAVPC 303 | SecurityGroupIngress: 304 | - IpProtocol: tcp 305 | FromPort: 22 306 | ToPort: 22 307 | CidrIp: !Ref SshSource 308 | Description: "SSH from remote client" 309 | - IpProtocol: tcp 310 | FromPort: 8443 311 | ToPort: 8443 312 | CidrIp: !Ref SshSource 313 | Description: "DCV WebSocket traffic" 314 | - IpProtocol: udp 315 | FromPort: 8443 316 | ToPort: 8443 317 | CidrIp: !Ref SshSource 318 | Description: "DCV QUIC UDP traffic" 319 | 320 | NfsSG: 321 | Type: AWS::EC2::SecurityGroup 322 | Properties: 323 | GroupDescription: "SG for NFS file systems. 
Requires at least ports 22, 111, 635, 2049 for NFSv3" 324 | VpcId: !Ref EDAVPC 325 | SecurityGroupIngress: 326 | - IpProtocol: -1 327 | SourceSecurityGroupId: !Ref LSFMasterSG 328 | Description: "All from LSF Mgmt Server" 329 | - IpProtocol: -1 330 | SourceSecurityGroupId: !Ref LSFComputeNodeSG 331 | Description: "All from compute nodes" 332 | - IpProtocol: -1 333 | SourceSecurityGroupId: !Ref LoginServerSG 334 | Description: "All from login servers" 335 | 336 | Outputs: 337 | NoIngressSecurityGroup: 338 | Description: Security group with no ingress rule 339 | Value: !Ref NoIngressSecurityGroup 340 | 341 | PrivateSubnet1: 342 | Description: A reference to the private subnet 343 | Value: !Ref PrivateSubnet1 344 | 345 | PublicSubnet1: 346 | Description: A reference to the public subnet 347 | Value: !Ref PublicSubnet1 348 | 349 | EnvVpc: 350 | Description: The ID of the VPC 351 | Value: !Ref EDAVPC 352 | Export: 353 | Name: !Join [ '-', [ !Ref LSFClusterName, "EDAVPC" ] ] 354 | 355 | VpcCIDR: 356 | Description: The CIDR of the VPC 357 | Value: !Ref VpcCIDR 358 | Export: 359 | Name: !Join [ '-', [ !Ref LSFClusterName, "VpcCIDR" ] ] 360 | 361 | PublicSubnet: 362 | Description: Public subnet exported for use by other stacks 363 | Value: 364 | Ref: PublicSubnet1 365 | Export: 366 | Name: !Join [ '-', [ !Ref LSFClusterName,"PublicSubnet" ] ] 367 | 368 | PrivateSubnet: 369 | Description: Private subnet exported for use by other stacks 370 | Value: 371 | Ref: PrivateSubnet1 372 | Export: 373 | Name: !Join [ '-', [ !Ref LSFClusterName,"PrivateSubnet" ] ] 374 | 375 | PublicRouteTable: 376 | Description: Route Table for Public Subnet 377 | Value: 378 | Ref: PublicRouteTable 379 | Export: 380 | Name: !Join [ '-', [ !Ref LSFClusterName,"PublicRouteTable" ] ] 381 | 382 | PrivateRouteTable: 383 | Description: Route Table for Private Subnet 384 | Value: 385 | Ref: PrivateRouteTable1 386 | Export: 387 | Name: !Join [ '-', [ !Ref LSFClusterName,"PrivateRouteTable" ] ] 388 | 389 | LSFMasterSG: 390 | Description: Security group for LSF Master 391 | Value: 392 | Ref: LSFMasterSG 393 | Export: 394 | Name: !Join [ '-', [ !Ref LSFClusterName,"LSFMasterSG" ] ] 395 | 396 | LSFComputeNodeSG: 397 | Description: Security group exported for LSF Compute Nodes 398 | Value: 399 | Ref: LSFComputeNodeSG 400 | Export: 401 | Name: !Join [ '-', [ !Ref LSFClusterName,"LSFComputeNodeSG" ] ] 402 | 403 | LoginServerSG: 404 | Description: Security group exported for Login Servers 405 | Value: 406 | Ref: LoginServerSG 407 | Export: 408 | Name: !Join [ '-', [ !Ref LSFClusterName,"LoginServerSG" ] ] 409 | 410 | NfsSG: 411 | Description: Security group for NFS Servers 412 | Value: 413 | Ref: NfsSG 414 | Export: 415 | Name: !Join [ '-', [ !Ref LSFClusterName,"NfsSG" ] ] 416 | 417 | 418 | 419 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/02-lsf-master.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: "2010-09-09" 2 | Description: | 3 | This template deploys an LSF management server, mounts the NFS file system, and 4 | installs the LSF packages provided by the user. 5 | 6 | **WARNING** This template creates AWS resources. 7 | You will be billed for the AWS resources used if you create a stack from this template.
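# A minimal deploy sketch for this template, assuming the 01-network.yaml stack
# has already been created with the same LSFClusterName. The stack name and key
# pair below are illustrative; CAPABILITY_IAM is required because this template
# creates IAM roles:
#
#   aws cloudformation create-stack \
#     --stack-name eda-lsf-master \
#     --template-body file://02-lsf-master.yaml \
#     --capabilities CAPABILITY_IAM \
#     --parameters ParameterKey=AdminKeyPair,ParameterValue=my-key \
#                  ParameterKey=LSFClusterName,ParameterValue=LSFCluster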
8 | 9 | Mappings: 10 | RegionMap: 11 | us-east-1: 12 | CentOS75: ami-9887c6e7 13 | FPGADev: ami-0cf12acd587e51b42 14 | ALinux2: not-used-dynamic-lookup-via-ssm 15 | us-east-2: 16 | CentOS75: ami-0f2b4fc905b0bd1f1 17 | FPGADev: ami-0f522eea547ffbdde 18 | ALinux2: not-used-dynamic-lookup-via-ssm 19 | us-west-1: 20 | CentOS75: ami-074e2d6769f445be5 21 | FPGADev: ami-02ed13c760b58790d 22 | ALinux2: not-used-dynamic-lookup-via-ssm 23 | us-west-2: 24 | CentOS75: ami-3ecc8f46 25 | FPGADev: ami-00736db43ba03656a 26 | ALinux2: not-used-dynamic-lookup-via-ssm 27 | eu-west-1: # Dublin 28 | CentOS75: ami-3548444c 29 | FPGADev: ami-01f373e791bb05667 30 | ALinux2: not-used-dynamic-lookup-via-ssm 31 | ap-southeast-1: # Singapore 32 | CentOS75: ami-8e0205f2 33 | FPGADev: ami-0d2658414ef6f29cf 34 | ALinux2: not-used-dynamic-lookup-via-ssm 35 | ap-southeast-2: # Sydney 36 | CentOS75: ami-d8c21dba 37 | FPGADev: ami-0651d0a596bb7c014 38 | ALinux2: not-used-dynamic-lookup-via-ssm 39 | 40 | ap-northeast-2: # Seoul 41 | CentOS75: ami-06cf2a72dadf92410 42 | FPGADev: ami-03162ccf408e174a1 43 | ALinux2: not-used-dynamic-lookup-via-ssm 44 | ap-northeast-1: # Tokyo 45 | CentOS75: ami-045f38c93733dd48d 46 | FPGADev: ami-051c91d3186bfdb7d 47 | ALinux2: not-used-dynamic-lookup-via-ssm 48 | 49 | Parameters: 50 | TemplateSourceBucketEndpoint: 51 | Description: The S3 bucket from which the nested stacks and scripts are deployed. 52 | Default: 'aws-eda-workshop-files' 53 | Type: String 54 | AdminKeyPair: 55 | Description: "Name of an existing EC2 KeyPair to enable SSH access to the master server." 56 | Type: "AWS::EC2::KeyPair::KeyName" 57 | Default: "morrmt" 58 | AllowedPattern: ".+" 59 | MasterInstanceType: 60 | Description: "The desired instance type for the master node of the cluster." 61 | Type: "String" 62 | Default: "m5.2xlarge" 63 | LatestAmazonLinux2Ami: 64 | Description: AMI (OS image) for the master server. 65 | Type: 'AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>' 66 | Default: '/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2' 67 | MasterServerAMI: 68 | Type: "String" 69 | Default: ALinux2 70 | AllowedValues: 71 | - ALinux2 72 | ComputeAMI: 73 | Description: > 74 | AMI (OS image) for the compute nodes. NOTE - You must first subscribe to this AMI in the AWS Marketplace at https://aws.amazon.com/marketplace/pp/B00O7WM7QW 75 | Type: "String" 76 | Default: "FPGADev" 77 | AllowedValues: 78 | - ALinux2 79 | - CentOS75 80 | - FPGADev 81 | LSFClusterName: 82 | Description: |- 83 | An environment name that will be prefixed to resource names. 84 | Should be equal to the value in the network stack. 85 | Type: "String" 86 | Default: "LSFCluster" 87 | 88 | LSFInstallPath: 89 | Description: Path where LSF will be installed. 90 | Type: "String" 91 | Default: "/tools/ibm/lsf" 92 | FileSystemMountPoint: 93 | Description: Directory where the NFS file system will be mounted (EFS, FSxN or FSxZ). 94 | Type: "String" 95 | Default: "/nfs" 96 | CustomerLSFInstallUri: 97 | Description: > 98 | The S3 URI to the LSF installation script package. 99 | Select object in the console and choose Copy Path and paste here. 100 | Type: "String" 101 | Default: "s3:///lsf10.1_lsfinstall_linux_x86_64.tar.Z" 102 | CustomerLSFBinsUri: 103 | Description: The S3 URI to the LSF 10.1 Linux base distribution package. 104 | Select object in the console and choose Copy Path and paste here.
105 | Type: "String" 106 | Default: "s3:///lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z" 107 | CustomerLSFFixPackUri: 108 | Description: > 109 | The S3 URI to the LSF 10.1 Fix Pack package. This must be the latest cumulative Fix Pack package. 110 | Select package object in the console and choose Copy Path and paste here. 111 | Type: "String" 112 | Default: "s3:///lsf10.1_linux2.6-glibc2.3-x86_64.520009.tar.Z" 113 | CustomerLSFEntitlementUri: 114 | Description: The S3 URI to the LSF entitlement file, lsf_std_entitlement.dat or lsf_adv_entitlement.dat. 115 | Select object in the S3 console and choose Copy Path and paste here. 116 | Type: String 117 | Default: s3:///lsf_std_entitlement.dat 118 | DCVUserName: 119 | Type: String 120 | Default: simuser 121 | 122 | Conditions: 123 | UseAmazonLinux2AMI: !Equals 124 | - !Ref ComputeAMI 125 | - ALinux2 126 | 127 | Resources: 128 | LSFMasterInstance: 129 | Type: "AWS::EC2::Instance" 130 | CreationPolicy: 131 | ResourceSignal: 132 | Count: 1 133 | Timeout: PT15M 134 | Properties: 135 | InstanceType: !Ref MasterInstanceType 136 | ImageId: !If 137 | - UseAmazonLinux2AMI 138 | - Ref: LatestAmazonLinux2Ami 139 | - !FindInMap [ RegionMap, !Ref "AWS::Region", !Ref MasterServerAMI ] 140 | SubnetId: 141 | Fn::ImportValue: !Sub '${LSFClusterName}-PrivateSubnet' 142 | SecurityGroupIds: 143 | - Fn::ImportValue: !Sub '${LSFClusterName}-LSFMasterSG' 144 | KeyName: !Ref AdminKeyPair 145 | IamInstanceProfile: !Ref LSFMasterInstanceProfile 146 | Tags: 147 | - 148 | Key: "Name" 149 | Value: !Join [ '-', [ 'LSF Mgmt Host',!Ref LSFClusterName ] ] 150 | - 151 | Key: "Cluster" 152 | Value: !Ref LSFClusterName 153 | UserData: 154 | Fn::Base64: 155 | Fn::Sub: 156 | - | 157 | Content-Type: multipart/mixed; boundary="//" 158 | MIME-Version: 1.0 159 | 160 | --// 161 | Content-Type: text/cloud-config; charset="us-ascii" 162 | MIME-Version: 1.0 163 | Content-Transfer-Encoding: 7bit 164 | Content-Disposition: attachment; filename="cloud-config.txt" 165 | 166 | #cloud-config 167 | cloud_final_modules: 168 | - [scripts-user, always] 169 | 170 | --// 171 | Content-Type: text/x-shellscript; charset="us-ascii" 172 | MIME-Version: 1.0 173 | Content-Transfer-Encoding: 7bit 174 | Content-Disposition: attachment; filename="userdata.txt" 175 | 176 | #!/bin/bash 177 | 178 | set -x 179 | exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1 180 | 181 | # Print user data 182 | INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id) 183 | cat /var/lib/cloud/instances/$INSTANCE_ID/user-data.txt 184 | 185 | echo "*** BEGIN LSF MASTER BOOTSTRAP ***" 186 | 187 | # Create local environment file to re-use later 188 | cat << EOF > /root/bootstrap.env 189 | AWS_CFN_STACK_NAME=${AWS::StackName} 190 | AWS_CFN_STACK_REGION=${AWS::Region} 191 | AWS_S3_BUCKET_ENDPOINT=${TemplateSourceBucketEndpoint} 192 | AWS_S3_BUCKET_NAME=$(echo "${TemplateSourceBucketEndpoint}" | cut -d '.'
-f 1) 193 | CFN_LSF_INSTALL_URI=${CustomerLSFInstallUri} 194 | CFN_LSF_BIN_URI=${CustomerLSFBinsUri} 195 | CFN_LSF_ENTITLEMENT_URI=${CustomerLSFEntitlementUri} 196 | CFN_LSF_FIXPACK_URI=${CustomerLSFFixPackUri} 197 | CFN_EC2_KEY_PAIR=${AdminKeyPair} 198 | CFN_COMPUTE_NODE_AMI=${LSFComputeNodeAmi} 199 | CFN_COMPUTE_NODE_SUBNET=${LSFComputeNodeSubnet} 200 | CFN_COMPUTE_NODE_SG_ID=${LSFComputeNodeSGGroupId} 201 | CFN_COMPUTE_NODE_INSTANCE_PROFILE_ARN="${LSFComputeNodeInstanceProfileArn}" 202 | CFN_COMPUTE_SPOT_FLEET_ROLE_ARN="${LSFComputeNodeSpotFleetRoleArn}" 203 | CFN_DCV_USERNAME=${DCVUserName} 204 | CFN_LAUNCH_TEMPLATE_ID=${LaunchTemplate} 205 | LSF_INSTALL_DIR="${FileSystemMountPoint}${LSFInstallPath}/${LSFClusterName}" 206 | LSF_ADMIN=lsfadmin 207 | LSF_CLUSTER_NAME=${LSFClusterName} 208 | LSF_INSTALL_PKG=`echo ${CustomerLSFInstallUri} | awk -F "/" '{print $NF}'` 209 | LSF_BIN_PKG=`echo ${CustomerLSFBinsUri} | awk -F "/" '{print $NF}'` 210 | LSF_FP_PKG=`echo ${CustomerLSFFixPackUri} | awk -F "/" '{print $NF}'` 211 | LSF_ENTITLEMENT=`echo ${CustomerLSFEntitlementUri} | awk -F "/" '{print $NF}'` 212 | NFS_DNS_NAME=${NfsDnsName} 213 | NFS_MOUNT_POINT=${FileSystemMountPoint} 214 | EOF 215 | export $(grep -v '^#' /root/bootstrap.env | xargs) 216 | 217 | # Install cfn-signal helper script to signal bootstrap completion to CloudFormation 218 | yum install aws-cfn-bootstrap ed java-1.8.0-openjdk wget vim -q -y && \ 219 | 220 | # Prepare NFS 221 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/scripts/nfs-bootstrap-master.sh - | bash && \ 222 | 223 | # Install LSF using customer-provided packages 224 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/scripts/install-lsf.sh - | bash && \ 225 | 226 | # Configure LSF (lsf.conf, lsf.cluster.*) 227 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/scripts/config-lsf.sh - | bash && \ 228 | 229 | # mosquitto.conf. Enables the mosquitto daemon, which RC uses to show bhosts -rc output. 230 | aws s3 cp s3://$AWS_S3_BUCKET_NAME/workshops/eda-workshop-lsf/config/mosquitto.conf $LSF_ENVDIR/mosquitto.conf && \ 231 | chown $LSF_ADMIN $LSF_ENVDIR/mosquitto.conf && \ 232 | 233 | # Configure system scripts to start LSF at boot time 234 | # Add cshrc.lsf and profile.lsf to system-wide environment 235 | # Start LSF daemons 236 | 237 | source $LSF_INSTALL_DIR/conf/profile.lsf 238 | 239 | $LSF_INSTALL_DIR/10.1/install/hostsetup --top="$LSF_INSTALL_DIR" \ 240 | --profile="y" \ 241 | --start="y" && \ 242 | 243 | # Verify that LSF is up and send signal to CloudFormation 244 | sleep 5 && lsid 245 | /opt/aws/bin/cfn-signal -e $?
--stack $AWS_CFN_STACK_NAME \ 246 | --resource LSFMasterInstance \ 247 | --region $AWS_CFN_STACK_REGION 248 | 249 | echo "*** END LSF MASTER BOOTSTRAP ***" 250 | --//-- 251 | 252 | - LSFComputeNodeInstanceProfileArn: !GetAtt LSFComputeNodeInstanceProfile.Arn 253 | LSFComputeNodeSpotFleetRoleArn: !GetAtt LSFSpotFleetRole.Arn 254 | LSFComputeNodeAmi: !If 255 | - UseAmazonLinux2AMI 256 | - Ref: LatestAmazonLinux2Ami 257 | - !FindInMap [ RegionMap, !Ref "AWS::Region", !Ref ComputeAMI ] 258 | LSFComputeNodeSGGroupId: 259 | Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"LSFComputeNodeSG" ] ] 260 | LSFComputeNodeSubnet: 261 | Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"PrivateSubnet" ] ] 262 | NfsDnsName: 263 | Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"NfsDnsName" ] ] 264 | 265 | LSFMasterRole: 266 | Type: "AWS::IAM::Role" 267 | Properties: 268 | Description: AWS service permissions for LSF Resource Connector 269 | Path: "/" 270 | AssumeRolePolicyDocument: 271 | Version: '2012-10-17' 272 | Statement: 273 | - 274 | Effect: Allow 275 | Principal: 276 | Service: 277 | - "ec2.amazonaws.com" 278 | Action: 279 | - "sts:AssumeRole" 280 | ManagedPolicyArns: 281 | - "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy" 282 | - "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" 283 | Policies: 284 | - PolicyName: LSFResourceConnectorPerms 285 | PolicyDocument: 286 | Version: 2012-10-17 287 | Statement: 288 | - Effect: Allow 289 | Action: 290 | - ec2:AssociateIamInstanceProfile 291 | - ec2:CancelSpotFleetRequests 292 | - ec2:CreateFleet 293 | - ec2:CreateLaunchTemplateVersion 294 | - ec2:CreateTags 295 | - ec2:DeleteLaunchTemplateVersions 296 | - ec2:DescribeFleetInstances 297 | - ec2:DescribeFleets 298 | - ec2:DescribeInstances 299 | - ec2:DescribeInstanceStatus 300 | - ec2:DescribeKeyPairs 301 | - ec2:DescribeLaunchTemplateVersions 302 | - ec2:DescribeSpotFleetInstances 303 | - ec2:DescribeSpotFleetRequestHistory 304 | - ec2:DescribeSpotFleetRequests 305 | - ec2:DescribeSpotInstanceRequests 306 | - ec2:DescribeTags 307 | - ec2:GetLaunchTemplateData 308 | - ec2:ModifyIdFormat 309 | - ec2:ModifySpotFleetRequest 310 | - ec2:ReplaceIamInstanceProfileAssociation 311 | - ec2:RequestSpotFleet 312 | - ec2:RunInstances 313 | - ec2:TerminateInstances 314 | Resource: '*' 315 | - Effect: Allow 316 | Action: 317 | - iam:PassRole 318 | - iam:ListRoles 319 | - iam:ListInstanceProfiles 320 | - iam:CreateServiceLinkedRole 321 | Resource: 322 | - !GetAtt LSFSpotFleetRole.Arn 323 | - !GetAtt LSFComputeNodeRole.Arn 324 | Condition: 325 | StringEquals: 326 | iam:PassedToService: 327 | "ec2.amazonaws.com" 328 | - Effect: Allow 329 | Action: 330 | - s3:GetObject 331 | Resource: '*' 332 | 333 | LSFSpotFleetRole: 334 | Type: "AWS::IAM::Role" 335 | Properties: 336 | Description: Enables EC2 Spot Fleet to work on behalf of LSF Resource Connector 337 | Path: "/" 338 | AssumeRolePolicyDocument: 339 | Version: '2012-10-17' 340 | Statement: 341 | - 342 | Effect: Allow 343 | Principal: 344 | Service: 345 | - "spotfleet.amazonaws.com" 346 | Action: 347 | - "sts:AssumeRole" 348 | ManagedPolicyArns: 349 | - "arn:aws:iam::aws:policy/service-role/AmazonEC2SpotFleetTaggingRole" 350 | 351 | LSFComputeNodeRole: 352 | Type: "AWS::IAM::Role" 353 | Properties: 354 | Description: AWS service permissions for LSF compute nodes 355 | Path: "/" 356 | AssumeRolePolicyDocument: 357 | Version: '2012-10-17' 358 | Statement: 359 | - 360 | Effect: Allow 361 | Principal: 362 | Service: 363 | - "ec2.amazonaws.com" 364 | Action: 
365 | - "sts:AssumeRole" 366 | ManagedPolicyArns: 367 | - "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy" 368 | - "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" 369 | Policies: 370 | - PolicyName: DownloadS3Packages 371 | PolicyDocument: 372 | Version: 2012-10-17 373 | Statement: 374 | - Effect: Allow 375 | Action: 376 | - s3:GetObject 377 | Resource: '*' 378 | 379 | LSFMasterInstanceProfile: 380 | Type: "AWS::IAM::InstanceProfile" 381 | Properties: 382 | Path: "/" 383 | Roles: 384 | - !Ref LSFMasterRole 385 | 386 | LSFComputeNodeInstanceProfile: 387 | Type: "AWS::IAM::InstanceProfile" 388 | Properties: 389 | Path: "/" 390 | Roles: 391 | - !Ref LSFComputeNodeRole 392 | 393 | CloudWatchAgentConfiguration: 394 | Type: AWS::SSM::Parameter 395 | Properties: 396 | Description: SSM Parameter holding CloudWatchAgent configuration 397 | Name: !Sub ${AWS::StackName}-AmazonCloudWatch 398 | Type: String 399 | Value: | 400 | { 401 | "agent": { 402 | "metrics_collection_interval": 60, 403 | "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log", 404 | "debug": false 405 | }, 406 | "logs": { 407 | "logs_collected": { 408 | "files": { 409 | "collect_list": [ 410 | { 411 | "file_path": "/var/log/lsf/aws-provider.log*", 412 | "log_group_name": "/var/log/lsf/rc/aws-provider.log", 413 | "log_stream_name": "{instance_id}" 414 | }, 415 | { 416 | "file_path": "/var/log/lsf/aws-provider.log*", 417 | "log_group_name": "/var/log/lsf/rc/ebrokerd", 418 | "log_stream_name": "{instance_id}" 419 | }, 420 | { 421 | "file_path": "/var/log/lsf/Install.log*", 422 | "log_group_name": "/var/log/lsf/Install.log", 423 | "log_stream_name": "{instance_id}" 424 | }, 425 | { 426 | "file_path": "/var/log/lsf/lim.log*", 427 | "log_group_name": "/var/log/lsf/lim.log", 428 | "log_stream_name": "{instance_id}" 429 | }, 430 | { 431 | "file_path": "/var/log/lsf/mbatchd.log*", 432 | "log_group_name": "/var/log/lsf/mbatchd.log", 433 | "log_stream_name": "{instance_id}" 434 | }, 435 | { 436 | "file_path": "/var/log/lsf/mbschd.log*", 437 | "log_group_name": "/var/log/lsf/mbschd.log", 438 | "log_stream_name": "{instance_id}" 439 | }, 440 | { 441 | "file_path": "/var/log/lsf/mosquitto.log*", 442 | "log_group_name": "/var/log/lsf/mosquitto.log", 443 | "log_stream_name": "{instance_id}" 444 | }, 445 | { 446 | "file_path": "/var/log/lsf/pim.log*", 447 | "log_group_name": "/var/log/lsf/pim.log", 448 | "log_stream_name": "{instance_id}" 449 | }, 450 | { 451 | "file_path": "/var/log/lsf/res.log*", 452 | "log_group_name": "/var/log/lsf/res.log", 453 | "log_stream_name": "{instance_id}" 454 | }, 455 | { 456 | "file_path": "/var/log/lsf/sbatchd.log*", 457 | "log_group_name": "/var/log/lsf/sbatchd.log", 458 | "log_stream_name": "{instance_id}" 459 | }, 460 | { 461 | "file_path": "/var/log/user-data.log*", 462 | "log_group_name": "/var/log/lsf/user-data.log", 463 | "log_stream_name": "{instance_id}" 464 | } 465 | ] 466 | } 467 | } 468 | }, 469 | "metrics": { 470 | "namespace": "CWAgent", 471 | "append_dimensions": { 472 | "InstanceId": "${aws:InstanceId}" 473 | }, 474 | "metrics_collected": { 475 | "mem": { 476 | "measurement": ["mem_used_percent"], 477 | "metrics_collection_interval": 60 478 | } 479 | } 480 | } 481 | } 482 | 483 | AmazonCloudWatchSetConfig: 484 | Type: AWS::SSM::Association 485 | Properties: 486 | ApplyOnlyAtCronInterval: false 487 | AssociationName: AmazonCloudWatchSetConfig 488 | Name: AmazonCloudWatch-ManageAgent 489 | Parameters: 490 | action: 491 | - configure 492 | mode: 493 | - ec2 494 | 
optionalConfigurationLocation: 495 | - !Ref CloudWatchAgentConfiguration 496 | optionalConfigurationSource: 497 | - ssm 498 | optionalRestart: 499 | - "yes" 500 | ScheduleExpression: cron(0 */30 * * * ? *) 501 | Targets: 502 | - Key: tag:Name 503 | Values: 504 | - !Join [ '-', [ 'LSF Mgmt Host',!Ref LSFClusterName ] ] 505 | 506 | InstallCloudWatch: 507 | Type: AWS::SSM::Association 508 | Properties: 509 | Name: AWS-ConfigureAWSPackage 510 | ApplyOnlyAtCronInterval: false 511 | AssociationName: InstallUpgradeCloudWatch 512 | Parameters: 513 | action: 514 | - Install 515 | additionalArguments: 516 | - "{}" 517 | installationType: 518 | - Uninstall and reinstall 519 | name: 520 | - AmazonCloudWatchAgent 521 | ScheduleExpression: cron(0 */30 * * * ? *) 522 | Targets: 523 | - Key: tag:Name 524 | Values: 525 | - !Join [ '-', [ 'LSF Mgmt Host',!Ref LSFClusterName ] ] 526 | 527 | LaunchTemplate: 528 | Type: AWS::EC2::LaunchTemplate 529 | Properties: 530 | LaunchTemplateName: !Sub ${AWS::StackName}-launch-template 531 | LaunchTemplateData: 532 | ImageId: !If 533 | - UseAmazonLinux2AMI 534 | - Ref: LatestAmazonLinux2Ami 535 | - !FindInMap [ RegionMap, !Ref "AWS::Region", !Ref ComputeAMI ] 536 | KeyName: !Ref AdminKeyPair 537 | IamInstanceProfile: 538 | Arn: !GetAtt LSFComputeNodeInstanceProfile.Arn 539 | SecurityGroupIds: 540 | - Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"LSFComputeNodeSG" ] ] -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/03-dcv-login-server.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | Description: | 3 | This CloudFormation template deploys a login/remote desktop server. 4 | This host will be a submission client to the LSF cluster. 5 | 6 | **WARNING** This template creates AWS resources. 7 | You will be billed for the AWS resources used if you create a stack from this template. 8 | 9 | Metadata: 10 | License: 11 | Description: | 12 | Copyright 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. 13 | 14 | Permission is hereby granted, free of charge, to any person obtaining a copy of 15 | this software and associated documentation files (the "Software"), to deal in 16 | the Software without restriction, including without limitation the rights to 17 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 18 | the Software, and to permit persons to whom the Software is furnished to do so. 19 | 20 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 21 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 22 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 23 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 24 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 25 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
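# Once this stack is up, the remote desktop is reachable either directly via the
# DCVConnectionLink output (https://<LoginServerPublicIp>:8443) or through an
# SSH tunnel like the one in the full-workshop output. A sketch (the key path
# is illustrative):
#
#   ssh -i ~/keys/admin-key.pem -L 18443:localhost:8443 centos@<LoginServerPublicIp>
#   # ...then browse to https://localhost:18443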
26 | 27 | Mappings: 28 | RegionMap: 29 | us-east-1: 30 | CentOS75: ami-9887c6e7 31 | FPGADev: ami-0cf12acd587e51b42 32 | ALinux2: not-used-dynamic-lookup-via-ssm 33 | us-east-2: 34 | CentOS75: ami-0f2b4fc905b0bd1f1 35 | FPGADev: ami-0f522eea547ffbdde 36 | ALinux2: not-used-dynamic-lookup-via-ssm 37 | us-west-1: 38 | CentOS75: ami-074e2d6769f445be5 39 | FPGADev: ami-02ed13c760b58790d 40 | ALinux2: not-used-dynamic-lookup-via-ssm 41 | us-west-2: 42 | CentOS75: ami-3ecc8f46 43 | FPGADev: ami-00736db43ba03656a 44 | ALinux2: not-used-dynamic-lookup-via-ssm 45 | eu-west-1: # Dublin 46 | CentOS75: ami-3548444c 47 | FPGADev: ami-01f373e791bb05667 48 | ALinux2: not-used-dynamic-lookup-via-ssm 49 | ap-southeast-1: # Singapore 50 | CentOS75: ami-8e0205f2 51 | FPGADev: ami-0d2658414ef6f29cf 52 | ALinux2: not-used-dynamic-lookup-via-ssm 53 | ap-southeast-2: # Sydney 54 | CentOS75: ami-d8c21dba 55 | FPGADev: ami-0651d0a596bb7c014 56 | ALinux2: not-used-dynamic-lookup-via-ssm 57 | 58 | ap-northeast-2: # Seoul 59 | CentOS75: ami-06cf2a72dadf92410 60 | FPGADev: ami-03162ccf408e174a1 61 | ALinux2: not-used-dynamic-lookup-via-ssm 62 | ap-northeast-1: # Tokyo 63 | CentOS75: ami-045f38c93733dd48d 64 | FPGADev: ami-051c91d3186bfdb7d 65 | ALinux2: not-used-dynamic-lookup-via-ssm 66 | 67 | 68 | Parameters: 69 | AdminKeyPair: 70 | Description: "Name of an existing EC2 KeyPair to enable SSH access to this instance." 71 | Type: "AWS::EC2::KeyPair::KeyName" 72 | Default: "morrmt" 73 | AllowedPattern: ".+" 74 | LoginServerInstanceType: 75 | Description: "The desired instance type for this instance." 76 | Type: "String" 77 | Default: "m5.xlarge" 78 | AllowedValues: 79 | - t3.medium 80 | - t3.xlarge 81 | - m4.xlarge 82 | - m4.2xlarge 83 | - m5.xlarge 84 | - m5.2xlarge 85 | - m5.4xlarge 86 | - c5d.9xlarge 87 | LoginServerAMI: 88 | Description: "This should be the same AMI that is used for the compute nodes." 89 | Type: "String" 90 | Default: FPGADev 91 | AllowedValues: 92 | - ALinux2 93 | - CentOS75 94 | - FPGADev 95 | LSFClusterName: 96 | Default: LSFCluster 97 | Description: An environment name that will be prefixed to resource names 98 | Type: String 99 | LSFInstallPath: 100 | Description: "From NFS template. Shared NFS file system for installing LSF. Derive this from an Export or Parameter Store key." 101 | Type: "String" 102 | Default: "/tools/ibm/lsf" 103 | FileSystemMountPoint: 104 | Description: The local directory on which the NFS file system is mounted 105 | Type: String 106 | Default: /ec2-nfs 107 | AllowedPattern: ^/.+ 108 | ScratchDir: 109 | Description: The name for the runtime scratch data subdirectory 110 | Type: String 111 | Default: scratch 112 | AllowedPattern: ^.+ 113 | ProjectDir: 114 | Description: The name for the project design data subdirectory 115 | Type: String 116 | Default: proj 117 | AllowedPattern: ^.+ 118 | UserName: 119 | Default: simuser 120 | Description: User name for DCV remote desktop login. Default is "simuser". 121 | MinLength: '4' 122 | Type: String 123 | LatestAmazonLinux2Ami: 124 | Description: AMI (OS image) for the login server.
125 | Type: 'AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>' 126 | Default: '/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2' 127 | 128 | Conditions: 129 | UseAmazonLinux2AMI: !Equals 130 | - !Ref LoginServerAMI 131 | - ALinux2 132 | 133 | Resources: 134 | InstanceWaitHandle: 135 | Type: AWS::CloudFormation::WaitConditionHandle 136 | 137 | InstanceWaitCondition: 138 | DependsOn: LoginServerInstance 139 | Properties: 140 | Handle: !Ref 'InstanceWaitHandle' 141 | Timeout: '3600' 142 | Type: AWS::CloudFormation::WaitCondition 143 | 144 | DCVCredentialsSecret: 145 | Type: AWS::SecretsManager::Secret 146 | Properties: 147 | Name: !Sub '${AWS::StackName}/DCVCredentialsSecret' 148 | GenerateSecretString: 149 | SecretStringTemplate: !Sub '{"username": "${UserName}"}' 150 | GenerateStringKey: password 151 | PasswordLength: 16 152 | ExcludeCharacters: '"@/\' 153 | 154 | LoginServerInstance: 155 | Type: "AWS::EC2::Instance" 156 | Properties: 157 | InstanceType: !Ref LoginServerInstanceType 158 | ImageId: !If 159 | - UseAmazonLinux2AMI 160 | - Ref: LatestAmazonLinux2Ami 161 | - !FindInMap [ RegionMap, !Ref "AWS::Region", !Ref LoginServerAMI ] 162 | SubnetId: 163 | Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"PublicSubnet" ] ] 164 | SecurityGroupIds: 165 | - Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"LoginServerSG" ] ] 166 | KeyName: !Ref AdminKeyPair 167 | IamInstanceProfile: !Ref LoginServerInstanceProfile 168 | Tags: 169 | - 170 | Key: "Name" 171 | Value: !Join [ '-', [ 'Login Server',!Ref LSFClusterName ] ] 172 | - 173 | Key: "Cluster" 174 | Value: !Ref LSFClusterName 175 | UserData: 176 | Fn::Base64: 177 | Fn::Sub: 178 | - | 179 | #!/bin/bash 180 | 181 | set -x 182 | exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1 183 | 184 | echo "*** BEGIN LOGIN SERVER BOOTSTRAP - `/bin/date` ***" 185 | 186 | export LSF_INSTALL_DIR="${FileSystemMountPoint}${LSFInstallPath}/${LSFClusterName}" 187 | 188 | # Update AWS CLI 189 | pip3 -q install awscli 190 | 191 | # Set DCV username and password 192 | export SM_PASSWORD=`/usr/local/bin/aws secretsmanager get-secret-value \ 193 | --region ${AWS::Region} \ 194 | --secret-id ${DCVCredentialsSecret} \ 195 | --output text --query 'SecretString' \ 196 | | python -c 'import json, sys; print(json.load(sys.stdin)["password"])'` 197 | 198 | user_name="${UserName}" 199 | user_pass=$SM_PASSWORD 200 | 201 | 202 | my_wait_handle="${InstanceWaitHandle}" 203 | 204 | # Install SSM so we can use SSM Session Manager and avoid ssh logins. 205 | yum install -q -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm 206 | systemctl enable amazon-ssm-agent 207 | systemctl start amazon-ssm-agent 208 | 209 | ## Mount NFS file system for LSF install 210 | ## and create working directories 211 | 212 | mkdir -p ${FileSystemMountPoint} 213 | 214 | # mount points 215 | mount -t nfs -o rw,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 ${NfsDnsName}:/vol1 ${FileSystemMountPoint} 216 | 217 | #add to fstab 218 | echo "${NfsDnsName}:/vol1 ${FileSystemMountPoint} nfs nfsvers=3,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 0 0" >> \ 219 | /etc/fstab 220 | 221 | # create project and scratch directories 222 | mkdir -p ${FileSystemMountPoint}/{${ScratchDir},${ProjectDir}} \ 223 | && chmod 777 ${FileSystemMountPoint}/{${ScratchDir},${ProjectDir}} 224 | 225 | # Set up LSF environment 226 | echo "source $LSF_INSTALL_DIR/conf/profile.lsf" > /etc/profile.d/lsf.sh 227 | 228 | ### Install DCV ### 229 | echo "Installing DCV..."
230 | 231 | function install_prereqs { 232 | # Exclude non-responsive mirror. 233 | sed -i -e "s/^#exclude.*/exclude=mirror.es.its.nyu.edu/" /etc/yum/pluginconf.d/fastestmirror.conf 234 | sudo yum clean all 235 | sudo yum -y upgrade 236 | sudo yum -y groupinstall "GNOME Desktop" 237 | } 238 | 239 | function install_dcv { 240 | mkdir -p /tmp/dcv-inst.d 241 | pushd /tmp/dcv-inst.d 242 | rpm --import https://d1uj6qtbmh3dt5.cloudfront.net/NICE-GPG-KEY 243 | wget https://d1uj6qtbmh3dt5.cloudfront.net/2022.1/Servers/nice-dcv-2022.1-13300-el7-x86_64.tgz 244 | tar -xvzf nice-dcv*.tgz 245 | cd nice-dcv* 246 | yum -y install nice-dcv-server*.rpm \ 247 | nice-dcv-web-viewer*.rpm \ 248 | nice-xdcv*.rpm 249 | sed -i -e 's/#enable-quic-frontend=true/enable-quic-frontend=true/' /etc/dcv/dcv.conf 250 | popd 251 | } 252 | 253 | function add_user { 254 | 255 | user_name=${!user_name} 256 | user_pass=${!user_pass} 257 | 258 | groupadd ${!user_name} 259 | useradd -u 1501 -m -g ${!user_name} ${!user_name} 260 | echo "${!user_name}:${!user_pass}" | chpasswd 261 | echo "Created user ${!user_name}" 262 | 263 | } 264 | 265 | function cr_post_reboot { 266 | 267 | if [[ ! -d /opt/dcv-install ]]; then 268 | mkdir -p /opt/dcv-install 269 | fi 270 | 271 | cat << EOF > /opt/dcv-install/post_reboot.sh 272 | #!/usr/bin/env bash 273 | 274 | function stop_disable_svc() { 275 | systemctl stop \$1 276 | systemctl disable \$1 277 | } 278 | 279 | stop_disable_svc firewalld 280 | stop_disable_svc libvirtd 281 | sudo systemctl set-default multi-user.target 282 | #systemctl isolate multi-user.target 283 | #systemctl isolate graphical.target 284 | #DISPLAY=:0 XAUTHORITY=\$(ps aux | grep "X.*\\-auth" | grep -v grep | awk -F"-auth " '{print \$2}' | awk '{print \$1}') xhost | grep "SI:localuser:dcv$" 285 | dcv create-session --type=virtual --owner ${!user_name} --user ${!user_name} --gl off simuser 286 | dcv list-sessions 287 | 288 | my_wait_handle="${!my_wait_handle}" 289 | 290 | if [[ !
-f /tmp/wait-handle-sent ]]; then 291 | exit 0 292 | else 293 | wait_handle_status=\$(cat /tmp/wait-handle-sent) 294 | if [[ \${!wait_handle_status} == "true" ]]; then 295 | rm /tmp/wait-handle-sent 296 | exit 0 297 | elif [[ \${!wait_handle_status} == "false" && \${!my_wait_handle} != "" ]] ; then 298 | echo "Sending success to wait handle" 299 | curl -X PUT -H 'Content-Type:' --data-binary '{ "Status" : "SUCCESS", "Reason" : "instance launched", "UniqueId" : "inst001", "Data" : "instance launched."}' "\${!my_wait_handle}" 300 | echo "true" > /tmp/wait-handle-sent 301 | fi 302 | fi 303 | 304 | EOF 305 | 306 | chmod 744 /opt/dcv-install/post_reboot.sh 307 | 308 | } 309 | 310 | function cr_service { 311 | 312 | cat << EOF > /etc/systemd/system/post-reboot.service 313 | [Unit] 314 | Description=Post reboot service 315 | 316 | [Service] 317 | ExecStart=/opt/dcv-install/post_reboot.sh 318 | 319 | [Install] 320 | WantedBy=multi-user.target 321 | EOF 322 | 323 | chmod 664 /etc/systemd/system/post-reboot.service 324 | systemctl daemon-reload 325 | systemctl enable post-reboot.service 326 | 327 | } 328 | 329 | function stop_disable_svc() { 330 | systemctl stop $1 331 | systemctl disable $1 332 | } 333 | 334 | function main { 335 | 336 | install_prereqs 337 | install_dcv 338 | add_user 339 | cr_post_reboot 340 | cr_service 341 | 342 | systemctl enable dcvserver 343 | echo "false" > /tmp/wait-handle-sent 344 | stop_disable_svc firewalld 345 | stop_disable_svc libvirtd 346 | echo "*** END LOGIN SERVER BOOTSTRAP - `/bin/date` ***" 347 | echo "Rebooting" 348 | reboot 349 | 350 | } 351 | 352 | main 353 | 354 | ### End Install DCV ### 355 | 356 | - 357 | NfsDnsName: 358 | Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"NfsDnsName" ] ] 359 | 360 | LoginServerRole: 361 | Type: "AWS::IAM::Role" 362 | Properties: 363 | Path: "/" 364 | AssumeRolePolicyDocument: 365 | Version: '2012-10-17' 366 | Statement: 367 | - 368 | Effect: Allow 369 | Principal: 370 | Service: 371 | - "ec2.amazonaws.com" 372 | Action: 373 | - "sts:AssumeRole" 374 | ManagedPolicyArns: 375 | - "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy" 376 | - "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" 377 | - "arn:aws:iam::aws:policy/SecretsManagerReadWrite" 378 | Policies: 379 | - PolicyName: DcvLicenseBucketPolicy 380 | PolicyDocument: 381 | Version: 2012-10-17 382 | Statement: 383 | - Effect: Allow 384 | Action: 385 | - s3:GetObject 386 | Resource: arn:aws:s3:::dcv-license.us-east-1/* 387 | 388 | LoginServerInstanceProfile: 389 | Type: "AWS::IAM::InstanceProfile" 390 | Properties: 391 | Path: "/" 392 | Roles: 393 | - !Ref LoginServerRole 394 | 395 | Outputs: 396 | LoginServerPublicIp: 397 | Description: Login Server Public IP 398 | Value: !GetAtt LoginServerInstance.PublicIp 399 | DCVConnectionLink: 400 | Description: Connect to the DCV Remote Desktop with this URL 401 | Value: !Sub 'https://${LoginServerInstance.PublicIp}:8443' 402 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/efs-filesystem.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: "2010-09-09" 2 | Description: | 3 | This template deploys an EFS file system for the LSF binaries and logs. 4 | 5 | **WARNING** This template creates an EFS file system. 6 | You will be billed for the AWS resources used if you create a stack from this template.
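# A mount sketch for the file system this template creates. EFS exports its
# root as "/", so unlike the FSxN stack there is no /vol1 junction; the DNS
# name below is illustrative:
#
#   sudo mkdir -p /nfs
#   sudo mount -t nfs -o rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
#       fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ /nfs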
7 | 8 | Metadata: 9 | Authors: 10 | Description: Matt Morris (morrmt@amazon.com) 11 | License: 12 | Description: | 13 | Copyright 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. 14 | 15 | Permission is hereby granted, free of charge, to any person obtaining a copy of 16 | this software and associated documentation files (the "Software"), to deal in 17 | the Software without restriction, including without limitation the rights to 18 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 19 | the Software, and to permit persons to whom the Software is furnished to do so. 20 | 21 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 23 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 24 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 25 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 26 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 27 | 28 | Parameters: 29 | LSFClusterName: 30 | Default: LSFCluster 31 | Description: |- 32 | An environment name that will be prefixed to resource names. 33 | Should be equal to the value in the network stack. 34 | Type: String 35 | EFSPerformanceMode: 36 | Description: "Select the performance mode of the file system." 37 | Type: "String" 38 | AllowedValues: 39 | - generalPurpose 40 | - maxIO 41 | Default: generalPurpose 42 | EFSThroughputMode: 43 | Type: "String" 44 | AllowedValues: 45 | - bursting 46 | - elastic 47 | - provisioned 48 | Default: "elastic" 49 | 50 | Resources: 51 | LSFElasticFileSystem: 52 | Type: AWS::EFS::FileSystem 53 | Properties: 54 | PerformanceMode: !Ref EFSPerformanceMode 55 | ThroughputMode: !Ref EFSThroughputMode 56 | FileSystemTags: 57 | - Key: Name 58 | Value: !Ref 'AWS::StackName' 59 | 60 | LSFEFSMountTarget: 61 | Type: "AWS::EFS::MountTarget" 62 | Properties: 63 | FileSystemId: !Ref LSFElasticFileSystem 64 | SubnetId: 65 | Fn::ImportValue: !Sub '${LSFClusterName}-PrivateSubnet' 66 | SecurityGroups: 67 | - Fn::ImportValue: !Sub '${LSFClusterName}-NfsSG' 68 | 69 | Outputs: 70 | MountTargetID: 71 | Description: "Mount target ID" 72 | Value: !Ref LSFEFSMountTarget 73 | FileSystemID: 74 | Description: "File system ID" 75 | Value: !Ref LSFElasticFileSystem 76 | EfsDnsName: 77 | Description: DNS name of the EFS file system 78 | Value: !Sub "${LSFElasticFileSystem}.efs.${AWS::Region}.amazonaws.com" 79 | Export: 80 | Name: !Join [ '-', [ !Ref LSFClusterName, "NfsDnsName" ] ] -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/fsxn-filesystem.yaml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | Description: | 3 | This stack deploys a single-AZ Amazon FSx for NetApp ONTAP file system. 4 | 5 | **WARNING** This template creates AWS resources. 6 | You will be billed for the AWS resources used if you create a stack from this template. 7 | 8 | Parameters: 9 | 10 | LSFClusterName: 11 | Description: "The name of the LSF cluster."
12 | Type: "String" 13 | Default: "LSFCluster" 14 | 15 | Resources: 16 | 17 | FSxOntapFS: 18 | Type: "AWS::FSx::FileSystem" 19 | Properties: 20 | FileSystemType: "ONTAP" 21 | StorageType: SSD 22 | StorageCapacity: 1024 23 | SubnetIds: 24 | - Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"PrivateSubnet" ] ] 25 | SecurityGroupIds: 26 | - Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"NfsSG" ] ] 27 | OntapConfiguration: 28 | DeploymentType: "SINGLE_AZ_1" 29 | AutomaticBackupRetentionDays: 0 30 | PreferredSubnetId: 31 | Fn::ImportValue: !Join [ '-', [ !Ref LSFClusterName,"PrivateSubnet" ] ] 32 | ThroughputCapacity: 512 33 | DiskIopsConfiguration: 34 | Iops: 10000 35 | Mode: "USER_PROVISIONED" 36 | FsxAdminPassword: !Join ['', ['{{resolve:secretsmanager:', !Ref FSxAdminSecret, ':SecretString:password}}' ]] 37 | Tags: 38 | - Key: "Name" 39 | Value: !Join [ '-', [ !Ref LSFClusterName,"FSxN-FS" ] ] 40 | FSxOntapStorageVirtualMachine: 41 | Type: "AWS::FSx::StorageVirtualMachine" 42 | Properties: 43 | FileSystemId: !Ref FSxOntapFS 44 | Name: "svm1" 45 | RootVolumeSecurityStyle: "UNIX" 46 | Tags: 47 | - Key: "Name" 48 | Value: !Join [ '-', [ !Ref LSFClusterName,"FSxN-SVM" ] ] 49 | FSxOntapVolume: 50 | Type: "AWS::FSx::Volume" 51 | Properties: 52 | Name: "vol1" 53 | VolumeType: "ONTAP" 54 | OntapConfiguration: 55 | JunctionPath: "/vol1" 56 | SecurityStyle: "UNIX" 57 | SizeInMegabytes: 512000 58 | StorageEfficiencyEnabled: False 59 | StorageVirtualMachineId: !Ref FSxOntapStorageVirtualMachine 60 | TieringPolicy: 61 | CoolingPeriod: 41 62 | Name: "AUTO" 63 | Tags: 64 | - Key: "Name" 65 | Value: !Join [ '-', [ !Ref LSFClusterName,"FSxN-vol1" ] ] 66 | 67 | FSxAdminSecret: 68 | Type: AWS::SecretsManager::Secret 69 | Properties: 70 | Name: !Sub '${AWS::StackName}/FSxCredentialsSecret' 71 | GenerateSecretString: 72 | SecretStringTemplate: '{"username": "fsxadmin"}' 73 | GenerateStringKey: password 74 | PasswordLength: 16 75 | RequireEachIncludedType: True 76 | ExcludeCharacters: '"@/\!' 77 | 78 | Outputs: 79 | FSxNDnsName: 80 | Description: FSxN file system SVM DNS name 81 | Value: !Join [ '', [ !Ref FSxOntapStorageVirtualMachine, ".", !Ref FSxOntapFS, !Sub ".fsx.${AWS::Region}.amazonaws.com" ] ] 82 | Export: 83 | Name: !Join [ '-', [ !Ref LSFClusterName, "NfsDnsName" ] ] 84 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/license-server.yaml: -------------------------------------------------------------------------------- 1 | # Start up license server instance and attach flex ENI to it. 2 | # Create ENI -- Type: AWS::EC2::NetworkInterface. Name flexlic01, set term protection. 3 | # Use user-data script to attach and configure ENI 4 | # Based on https://github.com/aws-quickstart/quickstart-vfx-ise/blob/develop/templates/license-server.template 5 | --- 6 | AWSTemplateFormatVersion: 2010-09-09 7 | Description: Deploys an HA license server 8 | Metadata: 9 | Authors: 10 | Description: Matt Morris (morrmt@amazon.com) 11 | License: 12 | Description: | 13 | Copyright 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
14 | 15 | Permission is hereby granted, free of charge, to any person obtaining a copy of 16 | this software and associated documentation files (the "Software"), to deal in 17 | the Software without restriction, including without limitation the rights to 18 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 19 | the Software, and to permit persons to whom the Software is furnished to do so. 20 | 21 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 23 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 24 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 25 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 26 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 27 | AWS::CloudFormation::Interface: 28 | ParameterGroups: 29 | - Label: 30 | default: Region Configuration 31 | Parameters: 32 | - pAvailabilityZones 33 | 34 | - Label: 35 | default: Network (existing Management VPC config) 36 | Parameters: 37 | - pManagementVPC 38 | - pMgmtAppPrivateSubnetA 39 | 40 | - Label: 41 | default: License Server Configuration 42 | Parameters: 43 | - pLicenseServerInstanceType 44 | - pLicenseServerAmi 45 | - pLicenseServerMinCapacity 46 | - pLicenseServerDesiredCapacity 47 | - pLicenseServerMaxCapacity 48 | 49 | Parameters: 50 | pAvailabilityZones: 51 | Description: The list of Availability Zones to use for the subnets in the VPC. 52 | This template uses the first Availability Zone in your list and preserves the logical order you specify. 53 | Type: List<AWS::EC2::AvailabilityZone::Name> 54 | 55 | pManagementVPC: 56 | Description: Management VPC 57 | Type: AWS::EC2::VPC::Id 58 | 59 | pMgmtAppPrivateSubnetA: 60 | Description: License Server Subnet A 61 | Type: AWS::EC2::Subnet::Id 62 | 63 | pLicenseServerInstanceType: 64 | Description: Instance type for the license server 65 | Type: String 66 | 67 | pLicenseServerAmi: 68 | Description: Which License Server AMI do you want to use?
69 | Type: AWS::EC2::Image::Id 70 | 71 | pLicenseServerMinCapacity: 72 | Description: The minimum number of instances that can run in your auto scale group 73 | Type: String 74 | 75 | pLicenseServerDesiredCapacity: 76 | Description: The desired capacity must be less than or equal to the maximum capacity 77 | Type: String 78 | 79 | pLicenseServerMaxCapacity: 80 | Description: The maximum number of instances that you can run in your auto scale group 81 | Type: String 82 | 83 | pEnvironment: 84 | AllowedValues: 85 | - DEV 86 | - TEST 87 | - PROD 88 | Default: DEV 89 | Description: Environment (Dev, Test or Prod) 90 | Type: String 91 | 92 | Resources: 93 | rLicenseServerSecurityGroup: 94 | Type: AWS::EC2::SecurityGroup 95 | Properties: 96 | GroupDescription: Security group for License server Instances 97 | VpcId: !Ref pManagementVPC 98 | Tags: 99 | - Key: Name 100 | Value: license-server-sg 101 | - Key: Environment 102 | Value: !Ref pEnvironment 103 | 104 | rLicenseServerInstanceRole: 105 | Type: AWS::IAM::Role 106 | Properties: 107 | AssumeRolePolicyDocument: 108 | Version: 2012-10-17 109 | Statement: 110 | - Effect: Allow 111 | Principal: 112 | Service: 113 | - ec2.amazonaws.com 114 | Action: 115 | - sts:AssumeRole 116 | Path: / 117 | Policies: 118 | - PolicyName: AttachENi 119 | PolicyDocument: 120 | Version: 2012-10-17 121 | Statement: 122 | - Effect: Allow 123 | Action: 124 | - ec2:AttachNetworkInterface 125 | Resource: '*' 126 | ManagedPolicyArns: 127 | - arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM 128 | - arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess 129 | 130 | rLicenseServerInstanceProfile: 131 | Type: AWS::IAM::InstanceProfile 132 | Properties: 133 | Path: / 134 | Roles: 135 | - !Ref rLicenseServerInstanceRole 136 | 137 | rLicenseServerLogGroup: 138 | Type: AWS::Logs::LogGroup 139 | Properties: 140 | RetentionInDays: 90 141 | 142 | rLicenseServerSecondaryEni: 143 | Type: AWS::EC2::NetworkInterface 144 | Properties: 145 | GroupSet: 146 | - !Ref rLicenseServerSecurityGroup 147 | SubnetId: !Ref pMgmtAppPrivateSubnetA 148 | Tags: 149 | - Key: Name 150 | Value: vfx-secondary-eni 151 | - Key: Environment 152 | Value: !Ref pEnvironment 153 | 154 | rLicenseServerLaunchConfiguration: 155 | Type: AWS::AutoScaling::LaunchConfiguration 156 | Metadata: 157 | AWS::CloudFormation::Init: 158 | config: 159 | packages: 160 | yum: 161 | jq: [] 162 | awslogs: [] 163 | files: 164 | /etc/awslogs/awslogs.conf: 165 | content: !Sub | 166 | [general] 167 | state_file = /var/lib/awslogs/agent-state 168 | 169 | [/var/log/messages] 170 | file = /var/log/messages 171 | log_group_name = ${rLicenseServerLogGroup} 172 | log_stream_name = %INSTANCE_ID/var/log/messages 173 | datetime_format = %b %d %H:%M:%S 174 | initial_position = start_of_file 175 | /tmp/awslog_init.sh: 176 | content: !Sub | 177 | #!/bin/bash -xe 178 | INSTANCE_ID=$(curl -s http://instance-data/latest/meta-data/instance-id) 179 | sed -i "s|%INSTANCE_ID|$INSTANCE_ID|g" /etc/awslogs/awslogs.conf 180 | sed -i -e "s/region = us-east-1/region = ${AWS::Region}/g" /etc/awslogs/awscli.conf 181 | systemctl enable awslogsd.service 182 | systemctl start awslogsd.service 183 | mode: '755' 184 | commands: 185 | 01_check_data: 186 | command: !Sub | 187 | #!/bin/bash -x 188 | EC2_INSTANCE_ID=$(curl -s http://instance-data/latest/meta-data/instance-id) 189 | 190 | # Volume /dev/sdh (which will get created as /dev/xvdh on Amazon Linux) 191 | DATA_STATE="unknown" 192 | until [ "${!DATA_STATE}" == "attached" ]; do 193 | DATA_STATE=$(aws ec2 
describe-volumes \ 194 | --region ${AWS::Region} \ 195 | --filters \ 196 | Name=attachment.instance-id,Values=${!EC2_INSTANCE_ID} \ 197 | Name=attachment.device,Values=/dev/sdh \ 198 | --query Volumes[].Attachments[].State \ 199 | --output text) 200 | sleep 5 201 | done 202 | 02_mkfs: 203 | command: | 204 | #!/bin/bash -x 205 | # Format /dev/xvdh if it does not contain a partition yet 206 | if [ "$(file -b -s /dev/xvdh)" == "data" ]; 207 | then mkfs -t ext4 /dev/xvdh 208 | fi 209 | 03_mkdir: 210 | command: mkdir -p /data 211 | 04_mount: 212 | command: mount /dev/xvdh /data 213 | 05_fstab: 214 | command: echo '/dev/xvdh /data ext4 defaults,nofail 0 2' >> /etc/fstab 215 | 06_attach_eni: 216 | command: !Sub | 217 | #!/bin/bash -x 218 | EC2_INSTANCE_ID=$(curl -s http://instance-data/latest/meta-data/instance-id) 219 | aws ec2 attach-network-interface --network-interface-id ${rLicenseServerSecondaryEni} --instance-id $EC2_INSTANCE_ID --device-index 1 --region ${AWS::Region} 220 | 07_awslogs: 221 | command: /tmp/awslog_init.sh 222 | Properties: 223 | AssociatePublicIpAddress: false 224 | ImageId: !Ref pLicenseServerAmi 225 | IamInstanceProfile: !Ref rLicenseServerInstanceProfile 226 | InstanceType: !Ref pLicenseServerInstanceType 227 | BlockDeviceMappings: 228 | - DeviceName: /dev/sdh 229 | Ebs: 230 | VolumeSize: 10 231 | VolumeType: gp2 232 | Encrypted: true 233 | SecurityGroups: 234 | - !Ref rLicenseServerSecurityGroup 235 | UserData: 236 | Fn::Base64: !Sub | 237 | #!/bin/bash -x 238 | yum update --security -y 239 | yum update aws-cfn-bootstrap -y 240 | 241 | /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource rLicenseServerLaunchConfiguration --region ${AWS::Region} 242 | # Signal the status from cfn-init 243 | /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource rLicenseServerAutoScalingGroup --region ${AWS::Region} 244 | 245 | rLicenseServerAutoScalingGroup: 246 | Type: AWS::AutoScaling::AutoScalingGroup 247 | Properties: 248 | AvailabilityZones: 249 | - !Select [0, !Ref pAvailabilityZones] 250 | VPCZoneIdentifier: 251 | - !Ref pMgmtAppPrivateSubnetA 252 | LaunchConfigurationName: !Ref rLicenseServerLaunchConfiguration 253 | MinSize: !Ref pLicenseServerMinCapacity 254 | DesiredCapacity: !Ref pLicenseServerDesiredCapacity 255 | MaxSize: !Ref pLicenseServerMaxCapacity 256 | HealthCheckType: EC2 257 | HealthCheckGracePeriod: 0 258 | Tags: 259 | - Key: Name 260 | Value: vfx-license-server 261 | PropagateAtLaunch: true 262 | - Key: Environment 263 | Value: !Ref pEnvironment 264 | PropagateAtLaunch: true 265 | CreationPolicy: 266 | ResourceSignal: 267 | Count: 1 268 | Timeout: 'PT5M' 269 | UpdatePolicy: 270 | AutoScalingRollingUpdate: 271 | MinInstancesInService: 0 272 | MaxBatchSize: 1 273 | PauseTime: 'PT2M' 274 | WaitOnResourceSignals: true 275 | AutoScalingReplacingUpdate: 276 | WillReplace: true 277 | 278 | Outputs: 279 | rLicenseServerSecurityGroup: 280 | Value: !Ref rLicenseServerSecurityGroup 281 | Export: 282 | Name: eLicenseServerSecurityGroup 283 | 284 | rLicenseServerInstanceRole: 285 | Value: !Ref rLicenseServerInstanceRole 286 | 287 | rLicenseServerAutoScalingGroup: 288 | Value: !Ref rLicenseServerAutoScalingGroup 289 | 290 | rLicenseServerLaunchConfiguration: 291 | Value: !Ref rLicenseServerLaunchConfiguration 292 | 293 | -------------------------------------------------------------------------------- /workshops/eda-workshop-lsf/templates/secrets.yaml: -------------------------------------------------------------------------------- 1 | 2 | 
AWSTemplateFormatVersion: 2010-09-09 3 | 4 | Parameters: 5 | LSFClusterName: 6 | Description: "The name of the LSF cluster." 7 | Type: "String" 8 | Default: "cde-1" 9 | 10 | Resources: 11 | DCVCredentialsSecret: 12 | Type: AWS::SecretsManager::Secret 13 | Properties: 14 | Name: !Sub '${AWS::StackName}/DCVCredentialsSecret' 15 | GenerateSecretString: 16 | SecretStringTemplate: '{"username": "simuser"}' 17 | GenerateStringKey: "password" 18 | PasswordLength: 16 19 | ExcludeCharacters: '"@/\' 20 | 21 | FSxCredentialsSecret: 22 | Type: AWS::SecretsManager::Secret 23 | Properties: 24 | Name: !Sub '${AWS::StackName}/FSxCredentialsSecret' 25 | GenerateSecretString: 26 | SecretStringTemplate: '{"username": "fsxadmin"}' 27 | GenerateStringKey: "password" 28 | PasswordLength: 16 29 | RequireEachIncludedType: True 30 | 31 | Outputs: 32 | DCVCredentialsSecretArn: 33 | Description: DCV password in Secrets Manager 34 | Value: !Ref DCVCredentialsSecret 35 | Export: 36 | Name: !Join [ '-', [ !Ref LSFClusterName, "DCVSecretARN" ] ] 37 | 38 | FSxCredentialsSecretArn: 39 | Description: FSxN admin password in Secrets Manager 40 | Value: !Ref FSxCredentialsSecret 41 | Export: 42 | Name: !Join [ '-', [ !Ref LSFClusterName, "FSxNAdminSecretARN" ] ] 43 | -------------------------------------------------------------------------------- /workshops/eda-workshop-parallel-cluster/README.md: -------------------------------------------------------------------------------- 1 | Coming soon -------------------------------------------------------------------------------- /workshops/eda-workshop-soca/scripts/create_alwayson_nodes.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | source /etc/environment 4 | HOSTNAME=$(hostname | awk '{split($0,a,"."); print a[1]}') 5 | SCHEDULER_HOSTNAME=$(/opt/pbs/bin/qstat -Bf | grep "Server:" | awk '{print $2}') 6 | if [[ "$HOSTNAME" != "$SCHEDULER_HOSTNAME" ]]; then 7 | CMD_PREFIX="ssh $SCHEDULER_HOSTNAME" 8 | else 9 | CMD_PREFIX="" 10 | fi 11 | 12 | TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") 13 | AMI_ID=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/ami-id) 14 | MAC_ID=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/network/interfaces/macs) 15 | SUBNET_ID=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/network/interfaces/macs/$MAC_ID/subnet-id) 16 | 17 | eval $CMD_PREFIX "/apps/soca/$SOCA_CONFIGURATION/python/latest/bin/python3 \ 18 | /apps/soca/$SOCA_CONFIGURATION/cluster_manager/add_nodes.py \ 19 | --ht_support true --instance_ami $AMI_ID --base_os \"centos7\" \ 20 | --root_size 10 --subnet_id $SUBNET_ID --instance_type t3.large \ 21 | --desired_capacity 1 --queue alwayson --job_owner $USER --job_name $USER \ 22 | --keep_forever false --terminate_when_idle 10" 23 | -------------------------------------------------------------------------------- /workshops/eda-workshop-soca/scripts/idea_custom_ami_imagebuilder.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -x 4 | if [[ $EUID -ne 0 ]]; then 5 | echo "Error: This script must be run as root" 6 | exit 1 7 | fi 8 | 9 | mkdir -p /root/bootstrap 10 | exec > >(tee /root/bootstrap/idea_preinstalled_packages.log ) 2>&1 11 | 12 | source /etc/os-release 13 | if [[ "$NAME" == "Red Hat Enterprise Linux Server" ]] && [[ "${VERSION_ID}" == 
"7.9" ]]; then 14 | OS="rhel7" 15 | elif [[ "$NAME" == "CentOS Linux" ]] && [[ "${VERSION_ID}" == "7" ]]; then 16 | OS="centos7" 17 | elif [[ "$NAME" == "Amazon Linux" ]] && [[ "${VERSION_ID}" == "2" ]]; then 18 | OS="amazonlinux2" 19 | else 20 | echo "Unsupported OS! NAME: $NAME, VERSION: ${VERSION_ID}" 21 | exit 22 | fi 23 | 24 | echo "Installing System packages" 25 | yum install -y wget deltarpm 26 | cd /root 27 | wget https://raw.githubusercontent.com/awslabs/scale-out-computing-on-aws/master/source/scripts/config.cfg 28 | source /root/config.cfg 29 | if [ $OS == "centos7" ]; then 30 | yum install -y epel-release 31 | yum install -y $(echo ${SYSTEM_PKGS[*]} ${SCHEDULER_PKGS[*]} ${OPENLDAP_SERVER_PKGS[*]} ${SSSD_PKGS[*]}) 32 | yum groupinstall -y "GNOME Desktop" 33 | elif [ $OS == "amazonlinux2" ]; then 34 | yum install -y epel-release 35 | yum install -y $(echo ${SYSTEM_PKGS[*]} ${SCHEDULER_PKGS[*]} ${OPENLDAP_SERVER_PKGS[*]} ${SSSD_PKGS[*]} ${DCV_AMAZONLINUX_PKGS[*]}) 36 | #amazon-linux-extras install mate-desktop1.x 37 | #bash -c 'echo PREFERRED=/usr/bin/mate-session > /etc/sysconfig/desktop' 38 | elif [ $OS == "rhel7" ]; then 39 | # Tested only on RHEL7.9 40 | yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm 41 | yum install -y $(echo ${SYSTEM_PKGS[*]} ${SCHEDULER_PKGS[*]}) --enablerepo rhel-7-server-rhui-optional-rpms 42 | yum install -y $(echo ${OPENLDAP_SERVER_PKGS[*]} ${SSSD_PKGS[*]}) 43 | yum groupinstall -y "Server with GUI" 44 | fi 45 | 46 | echo "Installing Packages typically needed for EDA applications" 47 | yum install -y vim vim-X11 xterm compat-db47 glibc glibc.i686 openssl098e compat-expat1.i686 dstat \ 48 | motif libXp libXaw libICE.i686 libpng.i686 libXau.i686 libuuid.i686 libSM.i686 libxcb.i686 \ 49 | plotutils libXext.i686 libXt.i686 libXmu.i686 libXp.i686 libXrender.i686 bzip2-libs.i686 \ 50 | freetype.i686 fontconfig.i686 libXft.i686 libjpeg-turbo.i686 motif.i686 apr.i686 libdb \ 51 | libdb.i686 libdb-utils apr-util.i686 libXp.i686 qt qt-x11 qtwebkit apr-util gnuplot \ 52 | libXScrnSaver tbb compat-libtiff3 arts SDL qt5-qtsvg 53 | 54 | #Install OpenPBS 55 | echo "Installing OpenPBS" 56 | OPENPBS_URL="https://github.com/openpbs/openpbs/archive/v22.05.11.tar.gz" 57 | OPENPBS_TGZ="v22.05.11.tar.gz" 58 | OPENPBS_VERSION="22.05.11" 59 | wget $OPENPBS_URL 60 | tar zxvf $OPENPBS_TGZ 61 | cd openpbs-$OPENPBS_VERSION 62 | ./autogen.sh 63 | ./configure PBS_VERSION=${OPENPBS_VERSION} --prefix=/opt/pbs 64 | local NUM_PROCS=`nproc --all` 65 | local MAKE_FLAGS="-j${NUM_PROCS}" 66 | make ${MAKE_FLAGS} 67 | make install ${MAKE_FLAGS} 68 | /opt/pbs/libexec/pbs_postinstall 69 | chmod 4755 /opt/pbs/sbin/pbs_iff /opt/pbs/sbin/pbs_rcp 70 | systemctl disable pbs 71 | systemctl disable libvirtd 72 | systemctl disable firewalld 73 | 74 | rm -rf /root/${OPENPBS_TGZ} /root/openpbs-${OPENPBS_VERSION} 75 | 76 | # Disable SELinux 77 | sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config 78 | 79 | # Install pip and awscli 80 | echo "Installing pip and awscli" 81 | yum install -y python3-pip 82 | PIP=$(which pip3) 83 | $PIP install awscli 84 | 85 | # Configure system limits 86 | echo "Configuring system limits" 87 | echo -e "net.core.somaxconn=65535 88 | net.ipv4.tcp_max_syn_backlog=163840 89 | net.core.rmem_default=31457280 90 | net.core.rmem_max=67108864 91 | net.core.wmem_default = 31457280 92 | net.core.wmem_max = 67108864 93 | fs.file-max=1048576 94 | fs.nr_open=1048576" >> /etc/sysctl.conf 95 | echo -e "* hard memlock unlimited 96 | * 
95 | echo -e "* hard memlock unlimited
96 | * soft memlock unlimited
97 | * soft nproc 3061780
98 | * hard nproc 3061780
99 | * soft sigpending 3061780
100 | * hard sigpending 3061780
101 | * soft nofile 1048576
102 | * hard nofile 1048576" >> /etc/security/limits.conf
103 | echo -e "ulimit -l unlimited
104 | ulimit -u 3061780
105 | ulimit -i 3061780
106 | ulimit -n 1048576" >> /opt/pbs/lib/init.d/limits.pbs_mom
107 |
108 | # Install and configure Amazon CloudWatch Agent
109 | echo "Install and configure Amazon CloudWatch Agent"
110 | machine=$(uname -m)
111 | if [[ $machine == "x86_64" ]]; then
112 | yum install -y https://s3.amazonaws.com/amazoncloudwatch-agent/redhat/amd64/latest/amazon-cloudwatch-agent.rpm
113 | elif [[ $machine == "aarch64" ]]; then
114 | yum install -y https://s3.amazonaws.com/amazoncloudwatch-agent/redhat/arm64/latest/amazon-cloudwatch-agent.rpm
115 | fi
116 | echo -e "{
117 | \"agent\": {
118 | \"metrics_collection_interval\": 60,
119 | \"run_as_user\": \"root\"
120 | },
121 | \"metrics\": {
122 | \"append_dimensions\": {
123 | \"InstanceId\": \"$\{aws:InstanceId\}\",
124 | \"InstanceType\": \"$\{aws:InstanceType\}\"
125 | },
126 | \"metrics_collected\": {
127 | \"cpu\": {
128 | \"measurement\": [
129 | \"cpu_usage_idle\",
130 | \"cpu_usage_iowait\",
131 | \"cpu_usage_user\",
132 | \"cpu_usage_system\"
133 | ],
134 | \"metrics_collection_interval\": 60,
135 | \"totalcpu\": true
136 | },
137 | \"mem\": {
138 | \"measurement\": [
139 | \"mem_used_percent\"
140 | ],
141 | \"metrics_collection_interval\": 60
142 | },
143 | \"netstat\": {
144 | \"measurement\": [
145 | \"tcp_established\",
146 | \"tcp_time_wait\"
147 | ],
148 | \"metrics_collection_interval\": 60
149 | },
150 | \"swap\": {
151 | \"measurement\": [
152 | \"swap_used_percent\"
153 | ],
154 | \"metrics_collection_interval\": 60
155 | }
156 | }
157 | }
158 | }" > /opt/aws/amazon-cloudwatch-agent/bin/config.json
159 | sed -i 's/\(Instance.*\)\\{\(.*\)\\}/\1{\2}/g' /opt/aws/amazon-cloudwatch-agent/bin/config.json  # restores the ${aws:...} placeholders whose braces were escaped to survive echo -e
160 | /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
161 |
162 | # Install DCV
163 | echo "Install DCV"
164 | cd ~
165 | machine=$(uname -m)
166 | DCV_X86_64_URL="https://d1uj6qtbmh3dt5.cloudfront.net/2022.1/Servers/nice-dcv-2022.1-13216-el7-x86_64.tgz"
167 | DCV_X86_64_TGZ="nice-dcv-2022.1-13216-el7-x86_64.tgz"
168 | DCV_X86_64_VERSION="2022.1-13216-el7-x86_64"
169 | if [[ $machine == "x86_64" ]]; then
170 | wget $DCV_X86_64_URL
171 | #if [[ $(md5sum $DCV_X86_64_TGZ | awk '{print $1}') != $DCV_X86_64_HASH ]]; then
172 | # echo -e "FATAL ERROR: Checksum for DCV failed. File may be compromised." > /etc/motd
173 | # exit 1
174 | #fi
175 | tar zxvf $DCV_X86_64_TGZ
176 | cd nice-dcv-$DCV_X86_64_VERSION
177 | elif [[ $machine == "aarch64" ]]; then
178 | # Derive the aarch64 artifact names from the x86_64 ones;
179 | # this script defines no separate DCV_AARCH64_* variables.
180 | DCV_AARCH64_URL=$(echo $DCV_X86_64_URL | sed 's/x86_64/aarch64/')
181 | DCV_AARCH64_TGZ=$(echo $DCV_X86_64_TGZ | sed 's/x86_64/aarch64/')
182 | DCV_AARCH64_VERSION=$(echo $DCV_X86_64_VERSION | sed 's/x86_64/aarch64/')
183 | wget $DCV_AARCH64_URL
184 | # No aarch64 md5 hash is defined here, so the checksum check is skipped,
185 | # matching the commented-out check in the x86_64 branch above.
186 | tar zxvf $DCV_AARCH64_TGZ
187 | cd nice-dcv-$DCV_AARCH64_VERSION
188 | fi
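# The three DCV RPMs below are installed with --nodeps, presumably because their X11
# dependencies were already satisfied by the desktop group installed earlier in this
# script. A hedged alternative that keeps dependency checking would be:
#   yum localinstall -y nice-xdcv-*.rpm nice-dcv-server*.rpm nice-dcv-web-viewer-*.rpm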
189 | rpm -ivh nice-xdcv-*.${machine}.rpm --nodeps
190 | rpm -ivh nice-dcv-server*.${machine}.rpm --nodeps
191 | rpm -ivh nice-dcv-web-viewer-*.${machine}.rpm --nodeps
192 |
193 | rm -rf /root/${DCV_X86_64_TGZ} /root/nice-dcv*
194 |
195 | DCV_SESSION_MANAGER_AGENT_X86_64_URL="https://d1uj6qtbmh3dt5.cloudfront.net/2022.1/SessionManagerAgents/nice-dcv-session-manager-agent-2022.1.592-1.el7.x86_64.rpm"
196 | DCV_SESSION_MANAGER_AGENT_X86_64_VERSION="2022.1.592-1.el7.x86_64"
197 |
198 | echo "# installing dcv agent ..."
199 | if [[ $machine == "x86_64" ]]; then
200 | # x86_64
201 | AGENT_URL=${DCV_SESSION_MANAGER_AGENT_X86_64_URL}
202 | AGENT_VERSION=${DCV_SESSION_MANAGER_AGENT_X86_64_VERSION}
203 | else
204 | # aarch64: derive from the x86_64 values (assumes the same agent build is published for aarch64)
205 | AGENT_URL=$(echo ${DCV_SESSION_MANAGER_AGENT_X86_64_URL} | sed 's/x86_64/aarch64/')
206 | AGENT_VERSION=$(echo ${DCV_SESSION_MANAGER_AGENT_X86_64_VERSION} | sed 's/x86_64/aarch64/')
207 | fi
208 |
209 | wget ${AGENT_URL}
210 | yum install -y nice-dcv-session-manager-agent-${AGENT_VERSION}.rpm
211 | echo "# installing dcv agent complete ..."
212 | rm -rf nice-dcv-session-manager-agent-${AGENT_VERSION}.rpm
213 |
214 | echo "Installing microphone redirect..."
215 | yum install -y pulseaudio-utils
216 |
217 | echo "Creating post_reboot.sh script: /root/post_reboot.sh"
218 | echo -e "#!/bin/bash
219 | set -x
220 | exec > >(tee /root/bootstrap/post_reboot.sh.log ) 2>&1
221 |
222 | if [[ \$EUID -ne 0 ]]; then
223 | echo \"Error: This script must be run as root\"
224 | exit 1
225 | fi
226 |
227 | # Enable DCV support for USB remotization
228 | yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
229 | yum install -y dkms
230 | DCVUSBDRIVERINSTALLER=\$(which dcvusbdriverinstaller)
231 | \$DCVUSBDRIVERINSTALLER --quiet
232 |
233 | echo \"Installing FSx lustre client\"
234 | kernel=\$(uname -r)
235 | machine=\$(uname -m)
236 | echo \"Found kernel version: \${kernel} running on: \${machine}\"
237 | if [ $OS == "centos7" ] || [ $OS == "rhel7" ]; then
238 | if [[ \$kernel == *\"3.10.0-957\"*\$machine ]]; then
239 | yum -y install https://downloads.whamcloud.com/public/lustre/lustre-2.10.8/el7/client/RPMS/x86_64/kmod-lustre-client-2.10.8-1.el7.x86_64.rpm
240 | yum -y install https://downloads.whamcloud.com/public/lustre/lustre-2.10.8/el7/client/RPMS/x86_64/lustre-client-2.10.8-1.el7.x86_64.rpm
241 | elif [[ \$kernel == *\"3.10.0-1062\"*\$machine ]]; then
242 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
243 | rpm --import /tmp/fsx-rpm-public-key.asc
244 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/el/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
245 | sed -i 's#7#7.7#' /etc/yum.repos.d/aws-fsx.repo
246 | yum clean all
247 | yum install -y kmod-lustre-client lustre-client
248 | elif [[ \$kernel == *\"3.10.0-1127\"*\$machine ]]; then
249 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
250 | rpm --import /tmp/fsx-rpm-public-key.asc
251 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/el/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
252 | sed -i 's#7#7.8#' /etc/yum.repos.d/aws-fsx.repo
253 | yum clean all
254 | yum install -y kmod-lustre-client lustre-client
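# (Kernel 3.10.0-1062 shipped with EL 7.7 and 3.10.0-1127 with EL 7.8, which is why
# the sed calls above pin the FSx client repo to those minor releases; kernel
# 3.10.0-1160 below belongs to EL 7.9, the repo default, so no pin is needed.)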
255 | elif [[ \$kernel == *\"3.10.0-1160\"*\$machine ]]; then
256 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
257 | rpm --import /tmp/fsx-rpm-public-key.asc
258 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/el/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
259 | yum clean all
260 | yum install -y kmod-lustre-client lustre-client
261 | elif [[ \$kernel == *\"4.18.0-193\"*\$machine ]]; then
262 | # FSx for Lustre on aarch64 is supported only on 4.18.0-193
263 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
264 | rpm --import /tmp/fsx-rpm-public-key.asc
265 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/centos/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
266 | yum clean all
267 | yum install -y kmod-lustre-client lustre-client
268 | else
269 | echo \"ERROR: Cannot install the FSx for Lustre client: kernel version \${kernel} does not match the expected versions (x86_64: 3.10.0-957, -1062, -1127, -1160; aarch64: 4.18.0-193)!\"
270 | fi
271 | elif [ $OS == "amazonlinux2" ]; then
272 | amazon-linux-extras install -y lustre2.10
273 | fi" > /root/post_reboot.sh
274 | chmod +x /root/post_reboot.sh
275 |
276 | echo "Creating /usr/local/sbin/cleanup_ami.sh"
277 | echo -e "#!/bin/bash
278 | rm -rf /var/tmp/* /tmp/* /var/crash/*
279 | rm -rf /var/lib/cloud/instances/*
280 | rm -f /var/lib/cloud/instance
281 | rm -rf /etc/ssh/ssh_host_*
282 | rm -f /etc/udev/rules.d/70-persistent-net.rules
283 | grep -l \"Created by cloud-init on instance boot automatically\" /etc/sysconfig/network-scripts/ifcfg-* | xargs rm -f
284 | " > /usr/local/sbin/cleanup_ami.sh
285 | chmod +x /usr/local/sbin/cleanup_ami.sh
286 |
--------------------------------------------------------------------------------
/workshops/eda-workshop-soca/scripts/modify_group_name.ldif:
--------------------------------------------------------------------------------
1 | dn: cn=,ou=Group,dc=soca,dc=local
2 | changetype: modify
3 | replace: cn
4 | cn:
5 |
--------------------------------------------------------------------------------
/workshops/eda-workshop-soca/scripts/modify_user_shell.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | echo "dn: uid=,ou=People,dc=soca,dc=local
4 | changetype: modify
5 | replace: loginShell
6 | loginShell: /bin/tcsh" > /root/modify_user_shell.ldif
7 |
8 | echo "Modify the username in /root/modify_user_shell.ldif"
9 | echo "Then run this command: ldapmodify -x -D cn=admin,dc=soca,dc=local -y /root/OpenLdapAdminPassword.txt -f /root/modify_user_shell.ldif"
10 |
--------------------------------------------------------------------------------
/workshops/eda-workshop-soca/scripts/soca_custom_ami.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -x
4 | if [[ $EUID -ne 0 ]]; then
5 | echo "Error: This script must be run as root"
6 | exit 1
7 | fi
8 |
9 | exec > >(tee /root/soca_preinstalled_packages.log ) 2>&1
10 |
11 | OS_NAME=$(awk -F= '/^NAME=/{print $2}' /etc/os-release)
12 | OS_VERSION=$(awk -F= '/^VERSION_ID=/{print $2}' /etc/os-release)
13 | if [ "$OS_NAME" == "\"Red Hat Enterprise Linux Server\"" ] && [ "$OS_VERSION" == "\"7.9\"" ]; then
14 | OS="rhel7"
15 | elif [ "$OS_NAME" == "\"CentOS Linux\"" ] && [ "$OS_VERSION" == "\"7\"" ]; then
16 | OS="centos7"
17 | elif [ "$OS_NAME" == "\"Amazon Linux\"" ] && [ "$OS_VERSION" == "\"2\"" ]; then
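# NOTE: awk -F= keeps the literal quotes from /etc/os-release, so the
# comparisons above intentionally match quoted values such as "\"7.9\"".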
OS="amazonlinux2" 19 | else 20 | echo "Unsupported OS! OS_NAME: $OS_NAME, OS_VERSION: $OS_VERSION" 21 | exit 22 | fi 23 | 24 | echo "Installing System packages" 25 | yum install -y wget 26 | cd /root 27 | wget https://raw.githubusercontent.com/awslabs/scale-out-computing-on-aws/master/source/scripts/config.cfg 28 | source /root/config.cfg 29 | if [ $OS == "centos7" ]; then 30 | yum install -y epel-release 31 | yum install -y $(echo ${SYSTEM_PKGS[*]} ${SCHEDULER_PKGS[*]} ${OPENLDAP_SERVER_PKGS[*]} ${SSSD_PKGS[*]}) 32 | yum groupinstall -y "GNOME Desktop" 33 | elif [ $OS == "amazonlinux2" ]; then 34 | yum install -y epel-release 35 | yum install -y $(echo ${SYSTEM_PKGS[*]} ${SCHEDULER_PKGS[*]} ${OPENLDAP_SERVER_PKGS[*]} ${SSSD_PKGS[*]} ${DCV_AMAZONLINUX_PKGS[*]}) 36 | #amazon-linux-extras install mate-desktop1.x 37 | #bash -c 'echo PREFERRED=/usr/bin/mate-session > /etc/sysconfig/desktop' 38 | elif [ $OS == "rhel7" ]; then 39 | # Tested only on RHEL7.9 40 | yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm 41 | yum install -y $(echo ${SYSTEM_PKGS[*]} ${SCHEDULER_PKGS[*]}) --enablerepo rhel-7-server-rhui-optional-rpms 42 | yum install -y $(echo ${OPENLDAP_SERVER_PKGS[*]} ${SSSD_PKGS[*]}) 43 | yum groupinstall -y "Server with GUI" 44 | fi 45 | 46 | echo "Installing Packages typically needed for EDA applications" 47 | yum install -y vim vim-X11 xterm tcsh compat-db47 glibc glibc.i686 openssl098e compat-expat1.i686 dstat \ 48 | motif libXp libXaw libICE.i686 libpng.i686 libXau.i686 libuuid.i686 libSM.i686 libxcb.i686 \ 49 | plotutils libXext.i686 libXt.i686 libXmu.i686 libXp.i686 libXrender.i686 bzip2-libs.i686 \ 50 | freetype.i686 fontconfig.i686 libXft.i686 libjpeg-turbo.i686 motif.i686 apr.i686 libdb \ 51 | libdb.i686 libdb-utils apr-util.i686 libXp.i686 qt qt-x11 qtwebkit apr-util gnuplot \ 52 | libXScrnSaver tbb compat-libtiff3 arts SDL qt5-qtsvg 53 | 54 | #Install OpenPBS 55 | echo "Installing OpenPBS" 56 | wget $OPENPBS_URL 57 | tar zxvf $OPENPBS_TGZ 58 | cd openpbs-$OPENPBS_VERSION 59 | ./autogen.sh 60 | ./configure PBS_VERSION=${OPENPBS_VERSION} --prefix=/opt/pbs 61 | make -j6 62 | make install -j6 63 | /opt/pbs/libexec/pbs_postinstall 64 | chmod 4755 /opt/pbs/sbin/pbs_iff /opt/pbs/sbin/pbs_rcp 65 | systemctl disable pbs 66 | systemctl disable libvirtd 67 | systemctl disable firewalld 68 | 69 | # Disable SELinux 70 | sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config 71 | 72 | # Install EFA 73 | echo "Installing EFA" 74 | EFA_VERSION="1.29.0" 75 | EFA_TGZ="aws-efa-installer-1.29.0.tar.gz" 76 | EFA_URL="https://efa-installer.amazonaws.com/aws-efa-installer-1.29.0.tar.gz" 77 | EFA_HASH="39d06a002154d94cd982ed348133f385" 78 | cd /root/ 79 | curl --silent -O $EFA_URL 80 | if [[ $(md5sum $EFA_TGZ | awk '{print $1}') != $EFA_HASH ]]; then 81 | echo -e "FATAL ERROR: Checksum for EFA failed. File may be compromised." 
82 | exit 1
83 | fi
84 | tar -xf $EFA_TGZ
85 | cd aws-efa-installer
86 | /bin/bash efa_installer.sh -y
87 |
88 | # Install awscli
89 | cd ~
90 | echo "Installing awscliv2"
91 | AWSCLI_X86_64_URL="https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip"
92 | AWSCLI_AARCH64_URL="https://awscli.amazonaws.com/awscli-exe-linux-aarch64.zip"
93 | if [[ "$OS" == "amazonlinux2" ]]; then
94 | yum remove -y awscli
95 | fi
96 | machine=$(uname -m)
97 | if [[ $machine == "x86_64" ]]; then
98 | curl -s $AWSCLI_X86_64_URL -o "awscliv2.zip"
99 | elif [[ $machine == "aarch64" ]]; then
100 | curl -s $AWSCLI_AARCH64_URL -o "awscliv2.zip"
101 | fi
102 | which unzip > /dev/null 2>&1
103 | if [[ "$?" != "0" ]]; then
104 | yum install -y unzip
105 | fi
106 | unzip -q awscliv2.zip
107 | ./aws/install --bin-dir /bin --update
108 |
109 | # Configure system limits
110 | echo "Configuring system limits"
111 | echo -e "net.core.somaxconn=65535
112 | net.ipv4.tcp_max_syn_backlog=163840
113 | net.core.rmem_default=31457280
114 | net.core.rmem_max=67108864
115 | net.core.wmem_default=31457280
116 | net.core.wmem_max=67108864
117 | fs.file-max=1048576
118 | fs.nr_open=1048576" >> /etc/sysctl.conf
119 | echo -e "* hard memlock unlimited
120 | * soft memlock unlimited
121 | * soft nproc 3061780
122 | * hard nproc 3061780
123 | * soft sigpending 3061780
124 | * hard sigpending 3061780
125 | * soft nofile 1048576
126 | * hard nofile 1048576" >> /etc/security/limits.conf
127 | echo -e "ulimit -l unlimited
128 | ulimit -u 3061780
129 | ulimit -i 3061780
130 | ulimit -n 1048576" >> /opt/pbs/lib/init.d/limits.pbs_mom
131 |
132 | # Install and configure Amazon CloudWatch Agent
133 | echo "Install and configure Amazon CloudWatch Agent"
134 | machine=$(uname -m)
135 | if [[ $machine == "x86_64" ]]; then
136 | yum install -y https://s3.amazonaws.com/amazoncloudwatch-agent/redhat/amd64/latest/amazon-cloudwatch-agent.rpm
137 | elif [[ $machine == "aarch64" ]]; then
138 | yum install -y https://s3.amazonaws.com/amazoncloudwatch-agent/redhat/arm64/latest/amazon-cloudwatch-agent.rpm
139 | fi
140 | echo -e "{
141 | \"agent\": {
142 | \"metrics_collection_interval\": 60,
143 | \"run_as_user\": \"root\"
144 | },
145 | \"metrics\": {
146 | \"append_dimensions\": {
147 | \"InstanceId\": \"$\{aws:InstanceId\}\",
148 | \"InstanceType\": \"$\{aws:InstanceType\}\"
149 | },
150 | \"metrics_collected\": {
151 | \"cpu\": {
152 | \"measurement\": [
153 | \"cpu_usage_idle\",
154 | \"cpu_usage_iowait\",
155 | \"cpu_usage_user\",
156 | \"cpu_usage_system\"
157 | ],
158 | \"metrics_collection_interval\": 60,
159 | \"totalcpu\": true
160 | },
161 | \"mem\": {
162 | \"measurement\": [
163 | \"mem_used_percent\"
164 | ],
165 | \"metrics_collection_interval\": 60
166 | },
167 | \"netstat\": {
168 | \"measurement\": [
169 | \"tcp_established\",
170 | \"tcp_time_wait\"
171 | ],
172 | \"metrics_collection_interval\": 60
173 | },
174 | \"swap\": {
175 | \"measurement\": [
176 | \"swap_used_percent\"
177 | ],
178 | \"metrics_collection_interval\": 60
179 | }
180 | }
181 | }
182 | }" > /opt/aws/amazon-cloudwatch-agent/bin/config.json
183 | sed -i 's/\(Instance.*\)\\{\(.*\)\\}/\1{\2}/g' /opt/aws/amazon-cloudwatch-agent/bin/config.json  # restores the ${aws:...} placeholders whose braces were escaped to survive echo -e
184 | /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
185 |
186 | # Install DCV
187 | echo "Install DCV"
188 | cd ~
189 | machine=$(uname -m)
190 | if [[ $machine == "x86_64" ]]; then
191 | wget $DCV_X86_64_URL
192 | if [[ $(md5sum $DCV_X86_64_TGZ | awk '{print $1}') != $DCV_X86_64_HASH ]]; then
193 | echo -e "FATAL ERROR: Checksum for DCV failed. File may be compromised." > /etc/motd
194 | exit 1
195 | fi
196 | tar zxvf $DCV_X86_64_TGZ
197 | cd nice-dcv-$DCV_X86_64_VERSION
198 | elif [[ $machine == "aarch64" ]]; then
199 | DCV_URL=$(echo $DCV_AARCH64_URL | sed 's/x86_64/aarch64/')
200 | wget $DCV_AARCH64_URL
201 | if [[ $(md5sum $DCV_AARCH64_TGZ | awk '{print $1}') != $DCV_AARCH64_HASH ]]; then
202 | echo -e "FATAL ERROR: Checksum for DCV failed. File may be compromised." > /etc/motd
203 | exit 1
204 | fi
205 | DCV_TGZ=$(echo $DCV_AARCH64_TGZ | sed 's/x86_64/aarch64/')
206 | tar zxvf $DCV_AARCH64_TGZ
207 | DCV_VERSION=$(echo $DCV_AARCH64_VERSION | sed 's/x86_64/aarch64/')
208 | cd nice-dcv-$DCV_AARCH64_VERSION
209 | fi
210 | rpm -ivh nice-xdcv-*.${machine}.rpm --nodeps
211 | rpm -ivh nice-dcv-server*.${machine}.rpm --nodeps
212 | rpm -ivh nice-dcv-web-viewer-*.${machine}.rpm --nodeps
213 |
214 | echo "Creating post_reboot.sh script: /root/post_reboot.sh"
215 | echo -e "#!/bin/bash
216 |
217 | set -x
218 | crontab -r  # remove the @reboot entry that launched this script so it only runs once
219 | if [[ \$EUID -ne 0 ]]; then
220 | echo \"Error: This script must be run as root\"
221 | exit 1
222 | fi
223 |
224 | # Enable DCV support for USB remotization
225 | yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
226 | yum install -y dkms
227 | DCVUSBDRIVERINSTALLER=\$(which dcvusbdriverinstaller)
228 | \$DCVUSBDRIVERINSTALLER --quiet
229 |
230 | REQUIRE_REBOOT=0
231 | echo \"Installing FSx lustre client\"
232 | kernel=\$(uname -r)
233 | machine=\$(uname -m)
234 | echo \"Found kernel version: \${kernel} running on: \${machine}\"
235 | if [ $OS == "centos7" ] || [ $OS == "rhel7" ]; then
236 | if [[ \$kernel == *\"3.10.0-957\"*\$machine ]]; then
237 | yum -y install https://downloads.whamcloud.com/public/lustre/lustre-2.10.8/el7/client/RPMS/x86_64/kmod-lustre-client-2.10.8-1.el7.x86_64.rpm
238 | yum -y install https://downloads.whamcloud.com/public/lustre/lustre-2.10.8/el7/client/RPMS/x86_64/lustre-client-2.10.8-1.el7.x86_64.rpm
239 | REQUIRE_REBOOT=1
240 | elif [[ \$kernel == *\"3.10.0-1062\"*\$machine ]]; then
241 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
242 | rpm --import /tmp/fsx-rpm-public-key.asc
243 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/el/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
244 | sed -i 's#7#7.7#' /etc/yum.repos.d/aws-fsx.repo
245 | yum clean all
246 | yum install -y kmod-lustre-client lustre-client
247 | REQUIRE_REBOOT=1
248 | elif [[ \$kernel == *\"3.10.0-1127\"*\$machine ]]; then
249 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
250 | rpm --import /tmp/fsx-rpm-public-key.asc
251 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/el/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
252 | sed -i 's#7#7.8#' /etc/yum.repos.d/aws-fsx.repo
253 | yum clean all
254 | yum install -y kmod-lustre-client lustre-client
255 | REQUIRE_REBOOT=1
256 | elif [[ \$kernel == *\"3.10.0-1160\"*\$machine ]]; then
257 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
258 | rpm --import /tmp/fsx-rpm-public-key.asc
259 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/el/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
260 | yum clean all
261 | yum install -y kmod-lustre-client lustre-client
262 | REQUIRE_REBOOT=1
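# (Each branch sets REQUIRE_REBOOT=1 so the reboot at the end of this generated
# script loads the freshly installed FSx for Lustre kernel modules.)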
263 | elif [[ \$kernel == *\"4.18.0-193\"*\$machine ]]; then
264 | # FSx for Lustre on aarch64 is supported only on 4.18.0-193
265 | wget https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-rpm-public-key.asc -O /tmp/fsx-rpm-public-key.asc
266 | rpm --import /tmp/fsx-rpm-public-key.asc
267 | wget https://fsx-lustre-client-repo.s3.amazonaws.com/centos/7/fsx-lustre-client.repo -O /etc/yum.repos.d/aws-fsx.repo
268 | yum clean all
269 | yum install -y kmod-lustre-client lustre-client
270 | REQUIRE_REBOOT=1
271 | else
272 | echo \"ERROR: Cannot install the FSx for Lustre client: kernel version \${kernel} does not match the expected versions (x86_64: 3.10.0-957, -1062, -1127, -1160; aarch64: 4.18.0-193)!\"
273 | fi
274 | elif [ $OS == "amazonlinux2" ]; then
275 | amazon-linux-extras install -y lustre
276 | REQUIRE_REBOOT=1
277 | fi
278 | if [[ \$REQUIRE_REBOOT -eq 1 ]]; then
279 | echo \"Rebooting to load FSx for Lustre drivers!\"
280 | /sbin/reboot
281 | fi" > /root/post_reboot.sh
282 | chmod +x /root/post_reboot.sh
283 |
284 | echo "Creating /usr/local/sbin/cleanup_ami.sh"
285 | echo -e "#!/bin/bash
286 |
287 | rm -rf /var/tmp/* /tmp/* /var/crash/*
288 | cloud-init clean
289 | rm -rf /etc/ssh/ssh_host_*
290 | rm -f /etc/udev/rules.d/70-persistent-net.rules
291 | grep -l \"Created by cloud-init on instance boot automatically\" /etc/sysconfig/network-scripts/ifcfg-* | xargs rm -f
292 | " > /usr/local/sbin/cleanup_ami.sh
293 | chmod +x /usr/local/sbin/cleanup_ami.sh
294 |
295 | echo "Will reboot instance now to load the new kernel! After the reboot, the script at /root/post_reboot.sh will install the FSx for Lustre client matching the new kernel version. See details in /root/post_reboot.sh.log"
296 | echo "@reboot /bin/bash /root/post_reboot.sh >> /root/post_reboot.sh.log 2>&1" | crontab -
297 | /sbin/reboot
298 |
--------------------------------------------------------------------------------