├── .github
│   └── workflows
│       └── main.yml
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Doc
│   └── img
│       ├── Abaqus
│       │   ├── Abaqus-e13-Hpc6aVsHpc7a.png
│       │   ├── Abaqus-e13-Hpc6id.png
│       │   ├── Abaqus-e13-Hpc7a.png
│       │   ├── Abaqus-e14_DropTest_v0-1-Hpc6avsHpc7a.png
│       │   ├── Abaqus-e14_DropTest_v0-1-Hpc6id.png
│       │   ├── Abaqus-e14_DropTest_v0-1-Hpc7a.png
│       │   ├── Abaqus-s4e-Hpc6id.png
│       │   ├── Abaqus-s4e-Hpc7a.png
│       │   ├── Abaqus-s4e-Hpc7aVsHpc6a.png
│       │   ├── Abaqus-s9-Hpc6aVsHpc7a.png
│       │   ├── Abaqus-s9-Hpc6id.png
│       │   ├── Abaqus-s9-Hpc7a.png
│       │   └── s4e.png
│       ├── CloudFormationOutput.png
│       └── Fluent
│           ├── f1_racecar_140mXcoreAMD.png
│           ├── f1_racecar_140mXcoreAtScaleAMD.png
│           ├── f1_racecar_140mXinstanceAMD.png
│           └── f1_racecar_140mXinstanceINTEL.png
├── LICENSE
├── ParallelCluster
│   ├── CloudFormation
│   │   ├── HPC-Networking.yaml
│   │   ├── HPC-Storage.yaml
│   │   ├── eu-north-1.GPU.yaml
│   │   ├── eu-north-1.x86.yaml
│   │   ├── us-east-1.Arm.yaml
│   │   └── us-east-2.x86.yaml
│   ├── README.md
│   ├── config
│   │   ├── eu-north-1.GPU.yaml
│   │   ├── eu-north-1.x86.yaml
│   │   ├── us-east-1.Arm.yaml
│   │   └── us-east-2.x86.yaml
│   └── post-install
│       ├── post-install.compute.sh
│       └── post-install.headnode.sh
├── README.md
└── apps
    ├── Abaqus
    │   ├── Abaqus.sbatch
    │   └── README.md
    ├── AnsysMechanical
    │   └── AnsysMechanical.sbatch
    ├── CFX
    │   ├── CFX.md
    │   ├── CFX.sbatch
    │   └── Readme.md
    ├── Fluent
    │   ├── Arm
    │   │   ├── Fluent-benchmark.sbatch
    │   │   └── Fluent.sbatch
    │   ├── Fluent-Benchmark.md
    │   ├── Fluent-Install.sh
    │   ├── Fluent.md
    │   ├── Readme.md
    │   ├── gpu
    │   │   ├── Fluent-Benchmark-GPU.sbatch
    │   │   ├── Fluent-GPU.OMPI.sbatch
    │   │   └── Fluent-GPU.sbatch
    │   └── x86
    │       ├── Fluent-Benchmark.OMPI.sbatch
    │       ├── Fluent-Benchmark.sbatch
    │       └── Fluent.sbatch
    ├── LS-Dyna
    │   ├── LS-Dyna.sbatch
    │   └── Readme.md
    ├── Optistruct
    │   ├── Optistruct.2024_1.sbatch
    │   └── Optistruct.sbatch
    └── StarCCM
        ├── Readme.md
        ├── STARCCM_Submit_Server.sbatch
        ├── STARCCM_Submit_Simple.sbatch
        ├── gpu
        │   └── STARCCM_GPGPU_Submit.sbatch
        └── macros
            └── setMax2HrRuntime.java
/.github/workflows/main.yml:
--------------------------------------------------------------------------------
1 | name: CloudFormation Template S3 upload
2 |
3 | on:
4 | push:
5 | branches:
6 | - main
7 |
8 | jobs:
9 | deploy:
10 | runs-on: ubuntu-latest
11 | steps:
12 | - uses: actions/checkout@master
13 | - uses: jakejarvis/s3-sync-action@master
14 | with:
15 | args: --acl authenticated-read
16 | env:
17 | AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}
18 | AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
19 | AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
20 | AWS_REGION: ${{ secrets.AWS_REGION }}
21 | SOURCE_DIR: ${{ secrets.SOURCE_DIR }}
22 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing Guidelines
2 |
3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
4 | documentation, we greatly value feedback and contributions from our community.
5 |
6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
7 | information to effectively respond to your bug report or contribution.
8 |
9 |
10 | ## Reporting Bugs/Feature Requests
11 |
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 |
14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 |
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 |
22 |
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 |
26 | 1. You are working against the latest source on the *main* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 |
30 | To send us a pull request, please:
31 |
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 |
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 |
42 |
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. As our projects use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
45 |
46 |
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 |
52 |
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 |
56 |
57 | ## Licensing
58 |
59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 |
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-e13-Hpc6aVsHpc7a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-e13-Hpc6aVsHpc7a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-e13-Hpc6id.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-e13-Hpc6id.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-e13-Hpc7a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-e13-Hpc7a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-e14_DropTest_v0-1-Hpc6avsHpc7a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-e14_DropTest_v0-1-Hpc6avsHpc7a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-e14_DropTest_v0-1-Hpc6id.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-e14_DropTest_v0-1-Hpc6id.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-e14_DropTest_v0-1-Hpc7a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-e14_DropTest_v0-1-Hpc7a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-s4e-Hpc6id.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-s4e-Hpc6id.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-s4e-Hpc7a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-s4e-Hpc7a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-s4e-Hpc7aVsHpc6a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-s4e-Hpc7aVsHpc6a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-s9-Hpc6aVsHpc7a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-s9-Hpc6aVsHpc7a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-s9-Hpc6id.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-s9-Hpc6id.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/Abaqus-s9-Hpc7a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/Abaqus-s9-Hpc7a.png
--------------------------------------------------------------------------------
/Doc/img/Abaqus/s4e.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Abaqus/s4e.png
--------------------------------------------------------------------------------
/Doc/img/CloudFormationOutput.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/CloudFormationOutput.png
--------------------------------------------------------------------------------
/Doc/img/Fluent/f1_racecar_140mXcoreAMD.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Fluent/f1_racecar_140mXcoreAMD.png
--------------------------------------------------------------------------------
/Doc/img/Fluent/f1_racecar_140mXcoreAtScaleAMD.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Fluent/f1_racecar_140mXcoreAtScaleAMD.png
--------------------------------------------------------------------------------
/Doc/img/Fluent/f1_racecar_140mXinstanceAMD.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Fluent/f1_racecar_140mXinstanceAMD.png
--------------------------------------------------------------------------------
/Doc/img/Fluent/f1_racecar_140mXinstanceINTEL.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/hpc-applications/8535162e6398b57fc0ea82d8f7fb6c14a158a3b4/Doc/img/Fluent/f1_racecar_140mXinstanceINTEL.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT No Attribution
2 |
3 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy of
6 | this software and associated documentation files (the "Software"), to deal in
7 | the Software without restriction, including without limitation the rights to
8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
9 | the Software, and to permit persons to whom the Software is furnished to do so.
10 |
11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
12 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
13 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
14 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
15 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
16 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
17 |
18 |
--------------------------------------------------------------------------------
/ParallelCluster/CloudFormation/HPC-Networking.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: '2010-09-09'
2 | Description: HPC-Networking
3 |
4 | Parameters:
5 | CidrBlockVPC:
6 | AllowedPattern: '((\d{1,3})\.){3}\d{1,3}/\d{1,2}'
7 | Default: 10.3.0.0/16
8 | Description: VPC CIDR Block (eg 10.3.0.0/16)
9 | Type: String
10 | CidrPublicSubnetA:
11 | AllowedPattern: '((\d{1,3})\.){3}\d{1,3}/\d{1,2}'
12 | Default: 10.3.128.0/20
13 | Description: VPC CIDR Block for the Public Subnet A (eg 10.3.128.0/20)
14 | Type: String
15 | CidrPublicSubnetB:
16 | AllowedPattern: '((\d{1,3})\.){3}\d{1,3}/\d{1,2}'
17 | Default: 10.3.144.0/20
18 | Description: VPC CIDR Block for the Public Subnet B (eg 10.3.144.0/20)
19 | Type: String
20 | CidrPrivateSubnetA:
21 | AllowedPattern: '((\d{1,3})\.){3}\d{1,3}/\d{1,2}'
22 | Default: 10.3.0.0/18
23 | Description: VPC CIDR Block for the Private Subnet A (eg 10.3.0.0/18)
24 | Type: String
25 | CidrPrivateSubnetB:
26 | AllowedPattern: '((\d{1,3})\.){3}\d{1,3}/\d{1,2}'
27 | Default: 10.3.64.0/18
28 | Description: VPC CIDR Block for the Private Subnet B (eg 10.3.64.0/18)
29 | Type: String
30 |
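# The RegionMap below pins Availability Zone IDs (rather than AZ names),
# because the mapping between AZ names and physical zones differs from one AWS account to another.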
31 | Mappings:
32 | RegionMap:
33 | us-east-1:
34 | ZoneIdA: use1-az6
35 | ZoneIdB: use1-az4
36 | us-east-2:
37 | ZoneIdA: use2-az2
38 | ZoneIdB: use2-az3
39 | us-west-1:
40 | ZoneIdA: usw1-az1
41 | ZoneIdB: usw1-az3
42 | us-west-2:
43 | ZoneIdA: usw2-az1
44 | ZoneIdB: usw2-az2
45 | eu-central-1:
46 | ZoneIdA: euc1-az3
47 | ZoneIdB: euc1-az2
48 | eu-west-1:
49 | ZoneIdA: euw1-az1
50 | ZoneIdB: euw1-az2
51 | eu-north-1:
52 | ZoneIdA: eun1-az2
53 | ZoneIdB: eun1-az1
54 | ca-central-1:
55 | ZoneIdA: cac1-az2
56 | ZoneIdB: cac1-az1
57 | eu-south-1:
58 | ZoneIdA: eus1-az2
59 | ZoneIdB: eus1-az1
60 | ap-east-1:
61 | ZoneIdA: ape1-az3
62 | ZoneIdB: ape1-az2
63 | ap-northeast-1:
64 | ZoneIdA: apne1-az4
65 | ZoneIdB: apne1-az1
66 | ap-northeast-2:
67 | ZoneIdA: apne2-az1
68 | ZoneIdB: apne2-az3
69 | ap-south-1:
70 | ZoneIdA: aps1-az2
71 | ZoneIdB: aps1-az3
72 |
73 | Resources:
74 |
75 | VPC:
76 | Type: AWS::EC2::VPC
77 | Properties:
78 | CidrBlock: !Ref CidrBlockVPC
79 | EnableDnsHostnames: true
80 | EnableDnsSupport: true
81 | Tags:
82 | - Key: "Name"
83 | Value: !Sub '${AWS::StackName}-HPC-VPC'
84 |
85 | PublicSubnetA:
86 | Type: AWS::EC2::Subnet
87 | Properties:
88 | VpcId: !Ref VPC
89 | CidrBlock: !Ref CidrPublicSubnetA
90 | AvailabilityZoneId: !FindInMap [RegionMap, !Ref "AWS::Region", ZoneIdA]
91 | MapPublicIpOnLaunch: true
92 | Tags:
93 | - Key: Name
94 | Value: !Sub '${AWS::StackName}-Public-SubnetA'
95 |
96 | PublicSubnetB:
97 | Type: AWS::EC2::Subnet
98 | Properties:
99 | VpcId: !Ref VPC
100 | CidrBlock: !Ref CidrPublicSubnetB
101 | AvailabilityZoneId: !FindInMap [RegionMap, !Ref "AWS::Region", ZoneIdB]
102 | MapPublicIpOnLaunch: true
103 | Tags:
104 | - Key: Name
105 | Value: !Sub '${AWS::StackName}-Public-SubnetB'
106 |
107 | InternetGateway:
108 | Type: AWS::EC2::InternetGateway
109 |
110 | AttachGateway:
111 | Type: AWS::EC2::VPCGatewayAttachment
112 | Properties:
113 | VpcId: !Ref VPC
114 | InternetGatewayId: !Ref InternetGateway
115 |
116 | PublicRouteTable:
117 | Type: AWS::EC2::RouteTable
118 | Properties:
119 | VpcId: !Ref VPC
120 | Tags:
121 | - Key: Name
122 | Value: !Sub '${AWS::StackName}-Public-Route'
123 | PublicRoute1:
124 | Type: AWS::EC2::Route
125 | Properties:
126 | RouteTableId: !Ref PublicRouteTable
127 | DestinationCidrBlock: 0.0.0.0/0
128 | GatewayId: !Ref InternetGateway
129 |
130 | PublicSubnetARouteTableAssociation:
131 | Type: AWS::EC2::SubnetRouteTableAssociation
132 | Properties:
133 | SubnetId: !Ref PublicSubnetA
134 | RouteTableId: !Ref PublicRouteTable
135 |
136 | PublicSubnetBRouteTableAssociation:
137 | Type: AWS::EC2::SubnetRouteTableAssociation
138 | Properties:
139 | SubnetId: !Ref PublicSubnetB
140 | RouteTableId: !Ref PublicRouteTable
141 |
142 | PrivateSubnetA:
143 | Type: AWS::EC2::Subnet
144 | Properties:
145 | VpcId: !Ref VPC
146 | AvailabilityZoneId: !FindInMap [RegionMap, !Ref "AWS::Region", ZoneIdA]
147 | CidrBlock: !Ref CidrPrivateSubnetA
148 | MapPublicIpOnLaunch: false
149 | Tags:
150 | - Key: Name
151 | Value: !Sub '${AWS::StackName}-Private-SubnetA'
152 |
153 | PrivateSubnetB:
154 | Type: AWS::EC2::Subnet
155 | Properties:
156 | VpcId: !Ref VPC
157 | AvailabilityZoneId: !FindInMap [RegionMap, !Ref "AWS::Region", ZoneIdB]
158 | CidrBlock: !Ref CidrPrivateSubnetB
159 | MapPublicIpOnLaunch: false
160 | Tags:
161 | - Key: Name
162 | Value: !Sub '${AWS::StackName}-Private-SubnetB'
163 |
164 | NatGatewayAEIP:
165 | Type: AWS::EC2::EIP
166 | DependsOn: AttachGateway
167 | Properties:
168 | Domain: vpc
169 |
170 | NatGatewayBEIP:
171 | Type: AWS::EC2::EIP
172 | DependsOn: AttachGateway
173 | Properties:
174 | Domain: vpc
175 |
176 | NatGatewayA:
177 | Type: AWS::EC2::NatGateway
178 | Properties:
179 | AllocationId: !GetAtt NatGatewayAEIP.AllocationId
180 | SubnetId: !Ref PublicSubnetA
181 |
182 | NatGatewayB:
183 | Type: AWS::EC2::NatGateway
184 | Properties:
185 | AllocationId: !GetAtt NatGatewayBEIP.AllocationId
186 | SubnetId: !Ref PublicSubnetB
187 |
188 | PrivateRouteTableA:
189 | Type: AWS::EC2::RouteTable
190 | Properties:
191 | VpcId: !Ref VPC
192 | Tags:
193 | - Key: Name
194 | Value: !Sub '${AWS::StackName}-Private-Route-A'
195 |
196 | PrivateRouteTableB:
197 | Type: AWS::EC2::RouteTable
198 | Properties:
199 | VpcId: !Ref VPC
200 | Tags:
201 | - Key: Name
202 | Value: !Sub '${AWS::StackName}-Private-Route-B'
203 |
204 | DefaultPrivateRouteA:
205 | Type: AWS::EC2::Route
206 | Properties:
207 | RouteTableId: !Ref PrivateRouteTableA
208 | DestinationCidrBlock: 0.0.0.0/0
209 | NatGatewayId: !Ref NatGatewayA
210 |
211 | DefaultPrivateRouteB:
212 | Type: AWS::EC2::Route
213 | Properties:
214 | RouteTableId: !Ref PrivateRouteTableB
215 | DestinationCidrBlock: 0.0.0.0/0
216 | NatGatewayId: !Ref NatGatewayB
217 |
218 | PrivateSubnetARouteTableAssociation:
219 | Type: AWS::EC2::SubnetRouteTableAssociation
220 | Properties:
221 | RouteTableId: !Ref PrivateRouteTableA
222 | SubnetId: !Ref PrivateSubnetA
223 |
224 | PrivateSubnetBRouteTableAssociation:
225 | Type: AWS::EC2::SubnetRouteTableAssociation
226 | Properties:
227 | RouteTableId: !Ref PrivateRouteTableB
228 | SubnetId: !Ref PrivateSubnetB
229 |
230 | S3Endpoint:
231 | Type: 'AWS::EC2::VPCEndpoint'
232 | Properties:
233 | VpcEndpointType: 'Gateway'
234 | ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
235 | RouteTableIds:
236 | - !Ref PublicRouteTable
237 | - !Ref PrivateRouteTableA
238 | - !Ref PrivateRouteTableB
239 | VpcId: !Ref VPC
240 |
241 | localSG:
242 | Type: AWS::EC2::SecurityGroup
243 | Properties:
244 | GroupDescription: Allow all traffic from resources in VPC
245 | VpcId:
246 | Ref: VPC
247 | SecurityGroupIngress:
248 | - IpProtocol: -1
249 | CidrIp: !Ref CidrBlockVPC
250 | SecurityGroupEgress:
251 | - IpProtocol: -1
252 | CidrIp: !Ref CidrBlockVPC
253 |
254 | Outputs:
255 | VPC:
256 | Description: The ID of the VPC
257 | Value: !Ref VPC
258 | Export:
259 | Name: !Sub "${AWS::StackName}-VPC"
260 | PrivateSubnetA:
261 | Description: The ID of the PrivateSubnetA
262 | Value: !Ref PrivateSubnetA
263 | Export:
264 | Name: !Sub "${AWS::StackName}-PrivateSubnetA"
265 | PrivateSubnetB:
266 | Description: The ID of the PrivateSubnetB
267 | Value: !Ref PrivateSubnetB
268 | Export:
269 | Name: !Sub "${AWS::StackName}-PrivateSubnetB"
270 | PublicSubnetA:
271 | Description: The ID of the PublicSubnetA
272 | Value: !Ref PublicSubnetA
273 | Export:
274 | Name: !Sub "${AWS::StackName}-PublicSubnetA"
275 | PublicSubnetB:
276 | Description: The ID of the PublicSubnetB
277 | Value: !Ref PublicSubnetB
278 | Export:
279 | Name: !Sub "${AWS::StackName}-PublicSubnetB"
280 | localSG:
281 | Description: The ID of the localSG
282 | Value: !Ref localSG
283 | Export:
284 | Name: !Sub "${AWS::StackName}-localSG"
--------------------------------------------------------------------------------
/ParallelCluster/CloudFormation/HPC-Storage.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: '2010-09-09'
2 | Description: HPC-Storage
3 | Parameters:
4 | PrivateSubnet:
5 | Description: The ID of your private subnet.
6 | Type: String
7 | AllowedPattern: ^(subnet-[0-9a-z]+)$
8 | SecurityGroup:
9 | Description: The ID of the SecurityGroup you want to attach.
10 | Type: String
11 | AllowedPattern: ^(sg-[0-9a-z]+)$
12 | Resources:
13 | FSxLustre:
14 | Type: AWS::FSx::FileSystem
15 | Properties:
16 | FileSystemType: LUSTRE
17 | StorageCapacity: 1200
18 | StorageType: SSD
19 | SubnetIds:
20 | - !Ref PrivateSubnet
21 | SecurityGroupIds:
22 | - !Ref SecurityGroup
23 | LustreConfiguration:
24 | WeeklyMaintenanceStartTime: "1:00:00"
25 | DeploymentType: PERSISTENT_2
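# PerUnitStorageThroughput is expressed in MB/s per TiB of storage; PERSISTENT_2 supports 125, 250, 500, or 1000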
26 | PerUnitStorageThroughput: 1000
27 | DataCompressionType: LZ4
28 | FileSystemTypeVersion: "2.15"
29 | Outputs:
30 | FileSystemId:
31 | Value: !Ref FSxLustre
--------------------------------------------------------------------------------
/ParallelCluster/CloudFormation/eu-north-1.GPU.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: '2010-09-09'
2 | Description: HPC-Application-Best-Practice
3 | Metadata:
4 | AWS::CloudFormation::Interface:
5 | ParameterGroups:
6 | -
7 | Label:
8 | default: ""
9 | Parameters:
10 | - PrivateSubnet
11 | - FSx
12 | - ClusterSecurityGroup
13 | ParameterLabels:
14 | PrivateSubnet:
15 | default: 'Private Subnet:'
16 | FSx:
17 | default: 'FSx:'
18 | ClusterSecurityGroup:
19 | default: 'Security Group:'
20 | Parameters:
21 | PrivateSubnet:
22 | Description: 'Please enter the ID of the Private Subnet you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
23 | Type: String
24 | AllowedPattern: ^(AUTO|subnet-[0-9a-z]+)$
25 | Default: AUTO
26 | ClusterSecurityGroup:
27 | Description: 'Please enter the ID of the Security Group you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
28 | Type: String
29 | AllowedPattern: ^(AUTO|sg-[0-9a-z]+)$
30 | Default: AUTO
31 | FSx:
32 | Description: 'Please enter the ID of the FSx file system you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
33 | Type: String
34 | AllowedPattern: ^(AUTO|fs-[0-9a-z]+)$
35 | Default: AUTO
36 | Conditions:
37 | CreateVpc: !Equals [!Ref PrivateSubnet, AUTO]
38 | CreateStorage: !Equals [!Ref FSx, AUTO]
39 | Resources:
40 | HPCNetworkStack:
41 | Type: AWS::CloudFormation::Stack
42 | Condition: CreateVpc
43 | Properties:
44 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Networking.yaml
45 | HPCStorageStack:
46 | Type: AWS::CloudFormation::Stack
47 | Condition: CreateStorage
48 | Properties:
49 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Storage.yaml
50 | Parameters:
51 | PrivateSubnet: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetB, !Ref PrivateSubnet]
52 | SecurityGroup: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
53 | PClusterPolicy:
54 | Type: AWS::IAM::ManagedPolicy
55 | Properties:
56 | ManagedPolicyName: !Sub 'pc-attach-policy-${AWS::StackName}'
57 | PolicyDocument:
58 | Version: 2012-10-17
59 | Statement:
60 | - Effect: Allow
61 | Action:
62 | - iam:AttachRolePolicy
63 | - iam:DetachRolePolicy
64 | Resource: "*"
65 | PclusterClusterProvider:
66 | Type: AWS::CloudFormation::Stack
67 | Properties:
68 | Parameters:
69 | AdditionalIamPolicies: !Ref PClusterPolicy
70 | TemplateURL: !Sub
71 | - https://${AWS::Region}-aws-parallelcluster.s3.${AWS::Region}.${AWS::URLSuffix}/parallelcluster/${Version}/templates/custom_resource/cluster.yaml
72 | - { Version: 3.9.1, Region: !Ref AWS::Region }
73 | PclusterCluster:
74 | Type: Custom::PclusterCluster
75 | Properties:
76 | ServiceToken: !GetAtt [ PclusterClusterProvider , Outputs.ServiceToken ]
77 | ClusterName: !Sub 'pc-${AWS::StackName}'
78 | ClusterConfiguration:
79 | Imds:
80 | ImdsSupport: v2.0
81 | Region: eu-north-1
82 | Image:
83 | Os: alinux2
84 | HeadNode:
85 | InstanceType: m6i.2xlarge
86 | Networking:
87 | SubnetId: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetB, !Ref PrivateSubnet]
88 | AdditionalSecurityGroups:
89 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
90 | DisableSimultaneousMultithreading: false
91 | SharedStorageType: Efs
92 | LocalStorage:
93 | RootVolume:
94 | Size: 100
95 | Encrypted: true
96 | VolumeType: gp3
97 | DeleteOnTermination: true
98 | CustomActions:
99 | OnNodeConfigured:
100 | Script: s3://hpc-applications-best-practice/post-install.headnode.sh
101 | Iam:
102 | AdditionalIamPolicies:
103 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
104 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
105 | S3Access:
106 | - BucketName: '*'
107 | EnableWriteAccess: true
108 | Imds:
109 | Secured: True
110 | Scheduling:
111 | Scheduler: slurm
112 | SlurmSettings:
113 | ScaledownIdletime: 10
114 | QueueUpdateStrategy: TERMINATE
115 | EnableMemoryBasedScheduling: true
116 | SlurmQueues:
117 | - Name: g5
118 | CapacityType: ONDEMAND
119 | Networking:
120 | SubnetIds:
121 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetB, !Ref PrivateSubnet]
122 | AdditionalSecurityGroups:
123 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
124 | PlacementGroup:
125 | Enabled: true
126 | ComputeResources:
127 | - Name: g5-48xlarge
128 | DisableSimultaneousMultithreading: true
129 | InstanceType: g5.48xlarge
130 | MinCount: 0
131 | MaxCount: 200
132 | Efa:
133 | Enabled: true
134 | GdrSupport: true
135 | - Name: g5-16xlarge
136 | DisableSimultaneousMultithreading: true
137 | InstanceType: g5.16xlarge
138 | MinCount: 0
139 | MaxCount: 200
140 | Efa:
141 | Enabled: true
142 | GdrSupport: true
143 | HealthChecks:
144 | Gpu:
145 | Enabled: false
146 | Iam:
147 | AdditionalIamPolicies:
148 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
149 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
150 | S3Access:
151 | - BucketName: '*'
152 | EnableWriteAccess: true
153 | CustomActions:
154 | OnNodeConfigured:
155 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
156 | - Name: p5
157 | CapacityType: ONDEMAND
158 | Networking:
159 | SubnetIds:
160 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetB, !Ref PrivateSubnet]
161 | AdditionalSecurityGroups:
162 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
163 | PlacementGroup:
164 | Enabled: true
165 | ComputeResources:
166 | - Name: p5-48xlarge
167 | DisableSimultaneousMultithreading: true
168 | InstanceType: p5.48xlarge
169 | MinCount: 0
170 | MaxCount: 200
171 | Efa:
172 | Enabled: true
173 | GdrSupport: true
174 | HealthChecks:
175 | Gpu:
176 | Enabled: false
177 | Iam:
178 | AdditionalIamPolicies:
179 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
180 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
181 | S3Access:
182 | - BucketName: '*'
183 | EnableWriteAccess: true
184 | CustomActions:
185 | OnNodeConfigured:
186 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
187 | - Name: p5-spot
188 | CapacityType: SPOT
189 | Networking:
190 | SubnetIds:
191 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetB, !Ref PrivateSubnet]
192 | AdditionalSecurityGroups:
193 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
194 | PlacementGroup:
195 | Enabled: true
196 | ComputeResources:
197 | - Name: p5-48xlarge
198 | DisableSimultaneousMultithreading: true
199 | InstanceType: p5.48xlarge
200 | MinCount: 0
201 | MaxCount: 200
202 | Efa:
203 | Enabled: true
204 | GdrSupport: true
205 | HealthChecks:
206 | Gpu:
207 | Enabled: false
208 | Iam:
209 | AdditionalIamPolicies:
210 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
211 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
212 | S3Access:
213 | - BucketName: '*'
214 | EnableWriteAccess: true
215 | CustomActions:
216 | OnNodeConfigured:
217 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
218 | SharedStorage:
219 | - Name: FsxLustre
220 | StorageType: FsxLustre
221 | MountDir: /fsx
222 | FsxLustreSettings:
223 | FileSystemId: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
224 | Monitoring:
225 | DetailedMonitoring: true
226 | Logs:
227 | CloudWatch:
228 | Enabled: true
229 | RetentionInDays: 30
230 | DeletionPolicy: Retain
231 | Rotation:
232 | Enabled: true
233 | Dashboards:
234 | CloudWatch:
235 | Enabled: true
236 | Outputs:
237 | HeadNodeIp:
238 | Description: The Private IP address of the HeadNode
239 | Value: !GetAtt [ PclusterCluster, headNode.privateIpAddress ]
240 | FileSystemId:
241 | Value: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
242 | SystemManagerUrl:
243 | Description: URL to access the HeadNode via AWS Systems Manager
244 | Value: !Sub
245 | - https://${ConsoleDomain}/systems-manager/session-manager/${InstanceId}?region=${AWS::Region}
246 | - { ConsoleDomain: !Sub '${AWS::Region}.console.aws.amazon.com',
247 | InstanceId: !GetAtt [ PclusterCluster, headNode.instanceId ]
248 | }
--------------------------------------------------------------------------------
/ParallelCluster/CloudFormation/eu-north-1.x86.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: '2010-09-09'
2 | Description: HPC-Application-Best-Practice
3 | Metadata:
4 | AWS::CloudFormation::Interface:
5 | ParameterGroups:
6 | -
7 | Label:
8 | default: ""
9 | Parameters:
10 | - PrivateSubnet
11 | - FSx
12 | - ClusterSecurityGroup
13 | ParameterLabels:
14 | PrivateSubnet:
15 | default: 'Private Subnet:'
16 | FSx:
17 | default: 'FSx:'
18 | ClusterSecurityGroup:
19 | default: 'Security Group:'
20 | Parameters:
21 | PrivateSubnet:
22 | Description: 'Please enter the ID of the Private Subnet you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
23 | Type: String
24 | AllowedPattern: ^(AUTO|subnet-[0-9a-z]+)$
25 | Default: AUTO
26 | ClusterSecurityGroup:
27 | Description: 'Please enter the ID of the Security Group you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
28 | Type: String
29 | AllowedPattern: ^(AUTO|sg-[0-9a-z]+)$
30 | Default: AUTO
31 | FSx:
32 | Description: 'Please enter the ID of the FSx file system you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
33 | Type: String
34 | AllowedPattern: ^(AUTO|fs-[0-9a-z]+)$
35 | Default: AUTO
36 | Conditions:
37 | CreateVpc: !Equals [!Ref PrivateSubnet, AUTO]
38 | CreateStorage: !Equals [!Ref FSx, AUTO]
39 | Resources:
40 | HPCNetworkStack:
41 | Type: AWS::CloudFormation::Stack
42 | Condition: CreateVpc
43 | Properties:
44 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Networking.yaml
45 | HPCStorageStack:
46 | Type: AWS::CloudFormation::Stack
47 | Condition: CreateStorage
48 | Properties:
49 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Storage.yaml
50 | Parameters:
51 | PrivateSubnet: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
52 | SecurityGroup: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
53 | PClusterPolicy:
54 | Type: AWS::IAM::ManagedPolicy
55 | Properties:
56 | ManagedPolicyName: !Sub 'pc-attach-policy-${AWS::StackName}'
57 | PolicyDocument:
58 | Version: 2012-10-17
59 | Statement:
60 | - Effect: Allow
61 | Action:
62 | - iam:AttachRolePolicy
63 | - iam:DetachRolePolicy
64 | Resource: "*"
65 | PclusterClusterProvider:
66 | Type: AWS::CloudFormation::Stack
67 | Properties:
68 | Parameters:
69 | AdditionalIamPolicies: !Ref PClusterPolicy
70 | TemplateURL: !Sub
71 | - https://${AWS::Region}-aws-parallelcluster.s3.${AWS::Region}.${AWS::URLSuffix}/parallelcluster/${Version}/templates/custom_resource/cluster.yaml
72 | - { Version: 3.9.1, Region: !Ref AWS::Region }
73 | PclusterCluster:
74 | Type: Custom::PclusterCluster
75 | Properties:
76 | ServiceToken: !GetAtt [ PclusterClusterProvider , Outputs.ServiceToken ]
77 | ClusterName: !Sub 'pc-${AWS::StackName}'
78 | ClusterConfiguration:
79 | Imds:
80 | ImdsSupport: v2.0
81 | Region: eu-north-1
82 | Image:
83 | Os: alinux2
84 | HeadNode:
85 | InstanceType: m6i.2xlarge
86 | Networking:
87 | SubnetId: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
88 | AdditionalSecurityGroups:
89 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
90 | DisableSimultaneousMultithreading: false
91 | SharedStorageType: Efs
92 | LocalStorage:
93 | RootVolume:
94 | Size: 100
95 | Encrypted: true
96 | VolumeType: gp3
97 | DeleteOnTermination: true
98 | CustomActions:
99 | OnNodeConfigured:
100 | Script: s3://hpc-applications-best-practice/post-install.headnode.sh
101 | Iam:
102 | AdditionalIamPolicies:
103 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
104 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
105 | S3Access:
106 | - BucketName: '*'
107 | EnableWriteAccess: true
108 | Imds:
109 | Secured: True
110 | Scheduling:
111 | Scheduler: slurm
112 | SlurmSettings:
113 | ScaledownIdletime: 10
114 | QueueUpdateStrategy: TERMINATE
115 | EnableMemoryBasedScheduling: true
116 | SlurmQueues:
117 | - Name: hpc7a
118 | CapacityType: ONDEMAND
119 | Networking:
120 | SubnetIds:
121 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
122 | AdditionalSecurityGroups:
123 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
124 | PlacementGroup:
125 | Enabled: true
126 | ComputeResources:
127 | - Name: hpc7a-96xlarge
128 | InstanceType: hpc7a.96xlarge
129 | MinCount: 0
130 | MaxCount: 200
131 | Efa:
132 | Enabled: true
133 | - Name: hpc7a-48xlarge
134 | InstanceType: hpc7a.48xlarge
135 | MinCount: 0
136 | MaxCount: 200
137 | Efa:
138 | Enabled: true
139 | - Name: hpc7a-24xlarge
140 | InstanceType: hpc7a.24xlarge
141 | MinCount: 0
142 | MaxCount: 200
143 | Efa:
144 | Enabled: true
145 | - Name: hpc7a-12xlarge
146 | InstanceType: hpc7a.12xlarge
147 | MinCount: 0
148 | MaxCount: 200
149 | Efa:
150 | Enabled: true
151 | Iam:
152 | AdditionalIamPolicies:
153 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
154 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
155 | S3Access:
156 | - BucketName: '*'
157 | EnableWriteAccess: true
158 | CustomActions:
159 | OnNodeConfigured:
160 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
161 | - Name: hpc6id
162 | CapacityType: ONDEMAND
163 | Networking:
164 | SubnetIds:
165 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
166 | AdditionalSecurityGroups:
167 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
168 | PlacementGroup:
169 | Enabled: true
170 | ComputeResources:
171 | - Name: hpc6id-32xlarge
172 | InstanceType: hpc6id.32xlarge
173 | MinCount: 0
174 | MaxCount: 200
175 | Efa:
176 | Enabled: true
177 | Iam:
178 | AdditionalIamPolicies:
179 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
180 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
181 | S3Access:
182 | - BucketName: '*'
183 | EnableWriteAccess: true
184 | CustomActions:
185 | OnNodeConfigured:
186 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
187 | - Name: hpc6a
188 | CapacityType: ONDEMAND
189 | Networking:
190 | SubnetIds:
191 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
192 | AdditionalSecurityGroups:
193 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
194 | PlacementGroup:
195 | Enabled: true
196 | ComputeResources:
197 | - Name: hpc6a-48xlarge
198 | InstanceType: hpc6a.48xlarge
199 | MinCount: 0
200 | MaxCount: 200
201 | Efa:
202 | Enabled: true
203 | Iam:
204 | AdditionalIamPolicies:
205 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
206 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
207 | S3Access:
208 | - BucketName: '*'
209 | EnableWriteAccess: true
210 | CustomActions:
211 | OnNodeConfigured:
212 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
213 | - Name: c5n
214 | CapacityType: ONDEMAND
215 | Networking:
216 | SubnetIds:
217 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
218 | AdditionalSecurityGroups:
219 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
220 | PlacementGroup:
221 | Enabled: true
222 | ComputeResources:
223 | - Name: c5n-18xlarge
224 | InstanceType: c5n.18xlarge
225 | MinCount: 0
226 | MaxCount: 200
227 | Efa:
228 | Enabled: true
229 | Iam:
230 | AdditionalIamPolicies:
231 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
232 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
233 | S3Access:
234 | - BucketName: '*'
235 | EnableWriteAccess: true
236 | CustomActions:
237 | OnNodeConfigured:
238 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
239 | SharedStorage:
240 | - Name: FsxLustre
241 | StorageType: FsxLustre
242 | MountDir: /fsx
243 | FsxLustreSettings:
244 | FileSystemId: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
245 | Monitoring:
246 | DetailedMonitoring: true
247 | Logs:
248 | CloudWatch:
249 | Enabled: true
250 | RetentionInDays: 30
251 | DeletionPolicy: Retain
252 | Rotation:
253 | Enabled: true
254 | Dashboards:
255 | CloudWatch:
256 | Enabled: true
257 | Outputs:
258 | HeadNodeIp:
259 | Description: The Private IP address of the HeadNode
260 | Value: !GetAtt [ PclusterCluster, headNode.privateIpAddress ]
261 | FileSystemId:
262 | Value: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
263 | SystemManagerUrl:
264 | Description: URL to access the HeadNode via AWS Systems Manager
265 | Value: !Sub
266 | - https://${ConsoleDomain}/systems-manager/session-manager/${InstanceId}?region=${AWS::Region}
267 | - { ConsoleDomain: !Sub '${AWS::Region}.console.aws.amazon.com',
268 | InstanceId: !GetAtt [ PclusterCluster, headNode.instanceId ]
269 | }
--------------------------------------------------------------------------------
/ParallelCluster/CloudFormation/us-east-1.Arm.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: '2010-09-09'
2 | Description: HPC-Application-Best-Practice
3 | Metadata:
4 | AWS::CloudFormation::Interface:
5 | ParameterGroups:
6 | -
7 | Label:
8 | default: ""
9 | Parameters:
10 | - PrivateSubnet
11 | - FSx
12 | - ClusterSecurityGroup
13 | ParameterLabels:
14 | PrivateSubnet:
15 | default: 'Private Subnet:'
16 | FSx:
17 | default: 'FSx:'
18 | ClusterSecurityGroup:
19 | default: 'Security Group:'
20 | Parameters:
21 | PrivateSubnet:
22 | Description: 'Please enter the ID of the Private Subnet you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
23 | Type: String
24 | AllowedPattern: ^(AUTO|subnet-[0-9a-z]+)$
25 | Default: AUTO
26 | ClusterSecurityGroup:
27 | Description: 'Please enter the ID of the Security Group you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
28 | Type: String
29 | AllowedPattern: ^(AUTO|sg-[0-9a-z]+)$
30 | Default: AUTO
31 | FSx:
32 | Description: 'Please enter the ID of the FSx file system you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
33 | Type: String
34 | AllowedPattern: ^(AUTO|fs-[0-9a-z]+)$
35 | Default: AUTO
36 | Conditions:
37 | CreateVpc: !Equals [!Ref PrivateSubnet, AUTO]
38 | CreateStorage: !Equals [!Ref FSx, AUTO]
39 | Resources:
40 | HPCNetworkStack:
41 | Type: AWS::CloudFormation::Stack
42 | Condition: CreateVpc
43 | Properties:
44 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Networking.yaml
45 | HPCStorageStack:
46 | Type: AWS::CloudFormation::Stack
47 | Condition: CreateStorage
48 | Properties:
49 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Storage.yaml
50 | Parameters:
51 | PrivateSubnet: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
52 | SecurityGroup: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
53 | PClusterPolicy:
54 | Type: AWS::IAM::ManagedPolicy
55 | Properties:
56 | ManagedPolicyName: !Sub 'pc-attach-policy-${AWS::StackName}'
57 | PolicyDocument:
58 | Version: 2012-10-17
59 | Statement:
60 | - Effect: Allow
61 | Action:
62 | - iam:AttachRolePolicy
63 | - iam:DetachRolePolicy
64 | Resource: "*"
65 | PclusterClusterProvider:
66 | Type: AWS::CloudFormation::Stack
67 | Properties:
68 | Parameters:
69 | AdditionalIamPolicies: !Ref PClusterPolicy
70 | TemplateURL: !Sub
71 | - https://${AWS::Region}-aws-parallelcluster.s3.${AWS::Region}.${AWS::URLSuffix}/parallelcluster/${Version}/templates/custom_resource/cluster.yaml
72 | - { Version: 3.9.1, Region: !Ref AWS::Region }
73 | PclusterCluster:
74 | Type: Custom::PclusterCluster
75 | Properties:
76 | ServiceToken: !GetAtt [ PclusterClusterProvider , Outputs.ServiceToken ]
77 | ClusterName: !Sub 'pc-${AWS::StackName}'
78 | ClusterConfiguration:
79 | Imds:
80 | ImdsSupport: v2.0
81 | Region: us-east-1
82 | Image:
83 | Os: alinux2
84 | HeadNode:
85 | InstanceType: m7g.4xlarge
86 | Networking:
87 | SubnetId: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
88 | AdditionalSecurityGroups:
89 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
90 | #SharedStorageType: Efs
91 | LocalStorage:
92 | RootVolume:
93 | Size: 100
94 | Encrypted: true
95 | VolumeType: gp3
96 | DeleteOnTermination: true
97 | CustomActions:
98 | OnNodeConfigured:
99 | Script: s3://hpc-applications-best-practice/post-install.headnode.sh
100 | Iam:
101 | AdditionalIamPolicies:
102 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
103 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
104 | S3Access:
105 | - BucketName: '*'
106 | EnableWriteAccess: true
107 | Imds:
108 | Secured: True
109 | Scheduling:
110 | Scheduler: slurm
111 | SlurmSettings:
112 | ScaledownIdletime: 10
113 | QueueUpdateStrategy: TERMINATE
114 | EnableMemoryBasedScheduling: true
115 | SlurmQueues:
116 | - Name: hpc7g
117 | CapacityType: ONDEMAND
118 | Networking:
119 | SubnetIds:
120 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
121 | AdditionalSecurityGroups:
122 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
123 | PlacementGroup:
124 | Enabled: true
125 | ComputeResources:
126 | - Name: hpc7g-16xlarge
127 | InstanceType: hpc7g.16xlarge
128 | MinCount: 0
129 | MaxCount: 200
130 | Efa:
131 | Enabled: true
132 | - Name: hpc7g-8xlarge
133 | InstanceType: hpc7g.8xlarge
134 | MinCount: 0
135 | MaxCount: 200
136 | Efa:
137 | Enabled: true
138 | - Name: hpc7g-4xlarge
139 | InstanceType: hpc7g.4xlarge
140 | MinCount: 0
141 | MaxCount: 200
142 | Efa:
143 | Enabled: true
144 | Iam:
145 | AdditionalIamPolicies:
146 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
147 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
148 | S3Access:
149 | - BucketName: '*'
150 | EnableWriteAccess: true
151 | CustomActions:
152 | OnNodeConfigured:
153 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
154 | SharedStorage:
155 | - Name: FsxLustre
156 | StorageType: FsxLustre
157 | MountDir: /fsx
158 | FsxLustreSettings:
159 | FileSystemId: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
160 | Monitoring:
161 | DetailedMonitoring: true
162 | Logs:
163 | CloudWatch:
164 | Enabled: true
165 | RetentionInDays: 30
166 | DeletionPolicy: Retain
167 | Rotation:
168 | Enabled: true
169 | Dashboards:
170 | CloudWatch:
171 | Enabled: true
172 | Outputs:
173 | HeadNodeIp:
174 | Description: The Private IP address of the HeadNode
175 | Value: !GetAtt [ PclusterCluster, headNode.privateIpAddress ]
176 | FileSystemId:
177 | Value: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
178 | SystemManagerUrl:
179 | Description: URL to access the HeadNode via AWS Systems Manager
180 | Value: !Sub
181 | - https://${ConsoleDomain}/systems-manager/session-manager/${InstanceId}?region=${AWS::Region}
182 | - { ConsoleDomain: !Sub '${AWS::Region}.console.aws.amazon.com',
183 | InstanceId: !GetAtt [ PclusterCluster, headNode.instanceId ]
184 | }
--------------------------------------------------------------------------------
/ParallelCluster/CloudFormation/us-east-2.x86.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: '2010-09-09'
2 | Description: HPC-Application-Best-Practice
3 | Metadata:
4 | AWS::CloudFormation::Interface:
5 | ParameterGroups:
6 | -
7 | Label:
8 | default: ""
9 | Parameters:
10 | - PrivateSubnet
11 | - FSx
12 | - ClusterSecurityGroup
13 | ParameterLabels:
14 | PrivateSubnet:
15 | default: 'Private Subnet:'
16 | FSx:
17 | default: 'FSx:'
18 | ClusterSecurityGroup:
19 | default: 'Security Group:'
20 | Parameters:
21 | PrivateSubnet:
22 | Description: 'Please enter the ID of the Private Subnet you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
23 | Type: String
24 | AllowedPattern: ^(AUTO|subnet-[0-9a-z]+)$
25 | Default: AUTO
26 | ClusterSecurityGroup:
27 | Description: 'Please enter the ID of the Security Group you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
28 | Type: String
29 | AllowedPattern: ^(AUTO|sg-[0-9a-z]+)$
30 | Default: AUTO
31 | FSx:
32 | Description: 'Please enter the ID of the FSx file system you wish to use (if you want to re-use an existing one), or just leave "AUTO".'
33 | Type: String
34 | AllowedPattern: ^(AUTO|fs-[0-9a-z]+)$
35 | Default: AUTO
36 | Conditions:
37 | CreateVpc: !Equals [!Ref PrivateSubnet, AUTO]
38 | CreateStorage: !Equals [!Ref FSx, AUTO]
39 | Resources:
40 | HPCNetworkStack:
41 | Type: AWS::CloudFormation::Stack
42 | Condition: CreateVpc
43 | Properties:
44 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Networking.yaml
45 | HPCStorageStack:
46 | Type: AWS::CloudFormation::Stack
47 | Condition: CreateStorage
48 | Properties:
49 | TemplateURL: https://hpc-applications-best-practice.s3.eu-west-1.amazonaws.com/HPC-Storage.yaml
50 | Parameters:
51 | PrivateSubnet: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
52 | SecurityGroup: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
53 | PClusterPolicy:
54 | Type: AWS::IAM::ManagedPolicy
55 | Properties:
56 | ManagedPolicyName: !Sub 'pc-attach-policy-${AWS::StackName}'
57 | PolicyDocument:
58 | Version: 2012-10-17
59 | Statement:
60 | - Effect: Allow
61 | Action:
62 | - iam:AttachRolePolicy
63 | - iam:DetachRolePolicy
64 | Resource: "*"
65 | PclusterClusterProvider:
66 | Type: AWS::CloudFormation::Stack
67 | Properties:
68 | Parameters:
69 | AdditionalIamPolicies: !Ref PClusterPolicy
70 | TemplateURL: !Sub
71 | - https://${AWS::Region}-aws-parallelcluster.s3.${AWS::Region}.${AWS::URLSuffix}/parallelcluster/${Version}/templates/custom_resource/cluster.yaml
72 | - { Version: 3.9.1, Region: !Ref AWS::Region }
73 | PclusterCluster:
74 | Type: Custom::PclusterCluster
75 | Properties:
76 | ServiceToken: !GetAtt [ PclusterClusterProvider , Outputs.ServiceToken ]
77 | ClusterName: !Sub 'pc-${AWS::StackName}'
78 | ClusterConfiguration:
79 | Imds:
80 | ImdsSupport: v2.0
81 | Region: us-east-2
82 | Image:
83 | Os: alinux2
84 | HeadNode:
85 | InstanceType: m6i.2xlarge
86 | Networking:
87 | SubnetId: !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
88 | AdditionalSecurityGroups:
89 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
90 | DisableSimultaneousMultithreading: false
91 | SharedStorageType: Efs
92 | LocalStorage:
93 | RootVolume:
94 | Size: 100
95 | Encrypted: true
96 | VolumeType: gp3
97 | DeleteOnTermination: true
98 | CustomActions:
99 | OnNodeConfigured:
100 | Script: s3://hpc-applications-best-practice/post-install.headnode.sh
101 | Iam:
102 | AdditionalIamPolicies:
103 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
104 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
105 | S3Access:
106 | - BucketName: '*'
107 | EnableWriteAccess: true
108 | Imds:
109 | Secured: True
110 | Scheduling:
111 | Scheduler: slurm
112 | SlurmSettings:
113 | ScaledownIdletime: 10
114 | QueueUpdateStrategy: TERMINATE
115 | EnableMemoryBasedScheduling: true
116 | SlurmQueues:
117 | - Name: hpc7a
118 | CapacityType: ONDEMAND
119 | Networking:
120 | SubnetIds:
121 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
122 | AdditionalSecurityGroups:
123 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
124 | PlacementGroup:
125 | Enabled: true
126 | ComputeResources:
127 | - Name: hpc7a-96xlarge
128 | InstanceType: hpc7a.96xlarge
129 | MinCount: 0
130 | MaxCount: 200
131 | Efa:
132 | Enabled: true
133 | - Name: hpc7a-48xlarge
134 | InstanceType: hpc7a.48xlarge
135 | MinCount: 0
136 | MaxCount: 200
137 | Efa:
138 | Enabled: true
139 | - Name: hpc7a-24xlarge
140 | InstanceType: hpc7a.24xlarge
141 | MinCount: 0
142 | MaxCount: 200
143 | Efa:
144 | Enabled: true
145 | - Name: hpc7a-12xlarge
146 | InstanceType: hpc7a.12xlarge
147 | MinCount: 0
148 | MaxCount: 200
149 | Efa:
150 | Enabled: true
151 | Iam:
152 | AdditionalIamPolicies:
153 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
154 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
155 | S3Access:
156 | - BucketName: '*'
157 | EnableWriteAccess: true
158 | CustomActions:
159 | OnNodeConfigured:
160 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
161 | - Name: hpc6id
162 | CapacityType: ONDEMAND
163 | Networking:
164 | SubnetIds:
165 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
166 | AdditionalSecurityGroups:
167 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
168 | PlacementGroup:
169 | Enabled: true
170 | ComputeResources:
171 | - Name: hpc6id-32xlarge
172 | InstanceType: hpc6id.32xlarge
173 | MinCount: 0
174 | MaxCount: 200
175 | Efa:
176 | Enabled: true
177 | Iam:
178 | AdditionalIamPolicies:
179 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
180 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
181 | S3Access:
182 | - BucketName: '*'
183 | EnableWriteAccess: true
184 | CustomActions:
185 | OnNodeConfigured:
186 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
187 | - Name: hpc6a
188 | CapacityType: ONDEMAND
189 | Networking:
190 | SubnetIds:
191 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
192 | AdditionalSecurityGroups:
193 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
194 | PlacementGroup:
195 | Enabled: true
196 | ComputeResources:
197 | - Name: hpc6a-48xlarge
198 | InstanceType: hpc6a.48xlarge
199 | MinCount: 0
200 | MaxCount: 200
201 | Efa:
202 | Enabled: true
203 | Iam:
204 | AdditionalIamPolicies:
205 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
206 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
207 | S3Access:
208 | - BucketName: '*'
209 | EnableWriteAccess: true
210 | CustomActions:
211 | OnNodeConfigured:
212 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
213 | - Name: c5n
214 | CapacityType: ONDEMAND
215 | Networking:
216 | SubnetIds:
217 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.PrivateSubnetA, !Ref PrivateSubnet]
218 | AdditionalSecurityGroups:
219 | - !If [CreateVpc, !GetAtt HPCNetworkStack.Outputs.localSG, !Ref ClusterSecurityGroup]
220 | PlacementGroup:
221 | Enabled: true
222 | ComputeResources:
223 | - Name: c5n-18xlarge
224 | InstanceType: c5n.18xlarge
225 | MinCount: 0
226 | MaxCount: 200
227 | Efa:
228 | Enabled: true
229 | Iam:
230 | AdditionalIamPolicies:
231 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
232 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
233 | S3Access:
234 | - BucketName: '*'
235 | EnableWriteAccess: true
236 | CustomActions:
237 | OnNodeConfigured:
238 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
239 | SharedStorage:
240 | - Name: FsxLustre
241 | StorageType: FsxLustre
242 | MountDir: /fsx
243 | FsxLustreSettings:
244 | FileSystemId: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
245 | Monitoring:
246 | DetailedMonitoring: true
247 | Logs:
248 | CloudWatch:
249 | Enabled: true
250 | RetentionInDays: 30
251 | DeletionPolicy: Retain
252 | Rotation:
253 | Enabled: true
254 | Dashboards:
255 | CloudWatch:
256 | Enabled: true
257 | Outputs:
258 | HeadNodeIp:
259 | Description: The Private IP address of the HeadNode
260 | Value: !GetAtt [ PclusterCluster, headNode.privateIpAddress ]
261 | FileSystemId:
262 | Value: !If [CreateStorage, !GetAtt HPCStorageStack.Outputs.FileSystemId, !Ref FSx]
263 | SystemManagerUrl:
264 | Description: URL to access the HeadNode via AWS Systems Manager
265 | Value: !Sub
266 | - https://${ConsoleDomain}/systems-manager/session-manager/${InstanceId}?region=${AWS::Region}
267 | - { ConsoleDomain: !Sub '${AWS::Region}.console.aws.amazon.com',
268 | InstanceId: !GetAtt [ PclusterCluster, headNode.instanceId ]
269 | }
--------------------------------------------------------------------------------
/ParallelCluster/README.md:
--------------------------------------------------------------------------------
1 | ## AWS ParallelCluster
2 |
3 | We have implemented our HPC application best practices using [AWS ParallelCluster](https://aws.amazon.com/hpc/parallelcluster/).
4 | AWS ParallelCluster is an open source cluster management tool that makes it easy for you to deploy and manage High Performance Computing (HPC) clusters on AWS.
5 | ParallelCluster offers a simple graphical user interface ([GUI](https://docs.aws.amazon.com/parallelcluster/latest/ug/pcui-using-v3.html)) or a text-based configuration file to model and provision the resources needed for your HPC applications in an automated and secure manner.
6 | It also supports multiple instance types, job submission queues, and job schedulers such as AWS Batch and Slurm.
7 |
8 | ### Build your cluster using the CLI
9 | You can build your cluster using the AWS ParallelCluster [CLI](https://docs.aws.amazon.com/parallelcluster/latest/ug/pcluster-v3.html).
10 | Find example ParallelCluster configuration files under the [config](config) directory.
11 | These configuration files are not working examples, as you need to replace a few tokens (like `subnet-1234567890` or `sg-1234567890`) with the resources you want to use in your AWS account.
12 |
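   | For example, once the CLI is installed and configured, a cluster can be created from one of these files along these lines (the cluster name here is just an illustration):
   |
   | ```bash
   | # Create a cluster from one of the example configuration files
   | pcluster create-cluster \
   |     --cluster-name hpc-best-practice \
   |     --cluster-configuration config/us-east-2.x86.yaml
   |
   | # Follow the creation status
   | pcluster describe-cluster --cluster-name hpc-best-practice
   | ```
   |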
13 | ### Build your cluster using CloudFormation (1-Click)
14 | In addition, we have built a few working [CloudFormation](https://aws.amazon.com/cloudformation/) templates that help you create a new HPC cluster with a single click.
15 | Select your preferred AWS Region among the supported ones. You will be asked a few questions about Networking and Storage;
16 | if you have no idea how to answer, or what these services are, just leave the default values: `AUTO`.
17 | The 1-Click deployment procedure will take care of creating everything needed for your HPC cluster to run properly.
18 |
19 |
20 | | Region | Type | Launch |
21 | |--------------| --- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
22 | | US | --- | --- |
23 | | N. Virginia (us-east-1) | Arm | [](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/quickcreate?templateURL=https%3A%2F%2Fhpc-applications-best-practice.s3.eu-west-1.amazonaws.com%2Fus-east-1.Arm.yaml&stackName=hpc-best-practice¶m_PrivateSubnet=AUTO¶m_FSx=AUTO¶m_ClusterSecurityGroup=AUTO) |
24 | | Ohio (us-east-2) | x86 | [](https://us-east-2.console.aws.amazon.com/cloudformation/home?region=us-east-2#/stacks/quickcreate?templateURL=https%3A%2F%2Fhpc-applications-best-practice.s3.eu-west-1.amazonaws.com%2Fus-east-2.x86.yaml&stackName=hpc-best-practice¶m_PrivateSubnet=AUTO¶m_FSx=AUTO¶m_ClusterSecurityGroup=AUTO) |
25 | | EU | --- | --- |
26 | | Stockholm (eu-north-1) | x86 | [](https://eu-north-1.console.aws.amazon.com/cloudformation/home?region=eu-north-1#/stacks/quickcreate?templateURL=https%3A%2F%2Fhpc-applications-best-practice.s3.eu-west-1.amazonaws.com%2Feu-north-1.x86.yaml&stackName=hpc-best-practice¶m_PrivateSubnet=AUTO¶m_FSx=AUTO¶m_ClusterSecurityGroup=AUTO) |
27 | | Stockholm (eu-north-1) | GPU | [](https://eu-north-1.console.aws.amazon.com/cloudformation/home?region=eu-north-1#/stacks/quickcreate?templateURL=https%3A%2F%2Fhpc-applications-best-practice.s3.eu-west-1.amazonaws.com%2Feu-north-1.GPU.yaml&stackName=hpc-best-practice¶m_PrivateSubnet=AUTO¶m_FSx=AUTO¶m_ClusterSecurityGroup=AUTO) |
28 | | Ireland (eu-west-1) | x86 | [](https://console.aws.amazon.com/) |
29 | | Ireland (eu-west-1) | Arm | [](https://console.aws.amazon.com/) |
30 | | APJ | --- | --- |
31 | | Tokyo (ap-northeast-1) | x86 | [](https://console.aws.amazon.com/) |
32 |
33 |
34 |
35 | > **_NOTE:_** Please refer to [aws-hpc-recipes](https://github.com/aws-samples/aws-hpc-recipes/) in case you need to leverage advanced AWS ParallelCluster functionalities through interoperable CloudFormation templates designed to work together to build complete HPC environments.
36 |
37 |
38 |
39 | After the CloudFormation stack is complete, you can go to the `Outputs` tab and click on the `SystemManagerUrl` link.
40 | This link will let you access the HeadNode via AWS Systems Manager, without using any password or certificate
41 | (see the image below).
42 |
43 | 
44 |
45 | You will find a clone of this repository under `/fsx`.
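   |
   | If you prefer the command line over the console link, the same session can be opened with the AWS CLI (this assumes the Session Manager plugin is installed; replace the placeholders with the HeadNode instance id and your Region):
   |
   | ```bash
   | # Open an interactive shell on the HeadNode via AWS Systems Manager Session Manager
   | aws ssm start-session --target <instance-id> --region <region>
   | ```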
--------------------------------------------------------------------------------
/ParallelCluster/config/eu-north-1.GPU.yaml:
--------------------------------------------------------------------------------
1 | Imds:
2 | ImdsSupport: v2.0
3 | Region: eu-north-1
4 | Image:
5 | Os: alinux2
6 | HeadNode:
7 | InstanceType: m6i.2xlarge
8 | Networking:
9 | SubnetId: <>
10 | AdditionalSecurityGroups:
11 | - <>
12 | DisableSimultaneousMultithreading: false
13 | SharedStorageType: Efs
14 | LocalStorage:
15 | RootVolume:
16 | Size: 100
17 | Encrypted: true
18 | VolumeType: gp3
19 | DeleteOnTermination: true
20 | CustomActions:
21 | OnNodeConfigured:
22 |       Script: s3://hpc-applications-best-practice/post-install.headnode.sh
23 | Iam:
24 | AdditionalIamPolicies:
25 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
26 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
27 | S3Access:
28 | - BucketName: '*'
29 | EnableWriteAccess: true
30 | Imds:
31 | Secured: True
32 | Scheduling:
33 | Scheduler: slurm
34 | SlurmSettings:
35 | ScaledownIdletime: 10
36 | QueueUpdateStrategy: TERMINATE
37 | EnableMemoryBasedScheduling: true
38 | SlurmQueues:
39 | - Name: g5
40 | CapacityType: ONDEMAND
41 | Networking:
42 | SubnetIds:
43 | - <>
44 | AdditionalSecurityGroups:
45 | - <>
46 | PlacementGroup:
47 | Enabled: true
48 | ComputeResources:
49 | - Name: g5-48xlarge
50 | DisableSimultaneousMultithreading: true
51 | InstanceType: g5.48xlarge
52 | MinCount: 0
53 | MaxCount: 200
54 | Efa:
55 | Enabled: true
56 | GdrSupport: true
57 | - Name: g5-16xlarge
58 | DisableSimultaneousMultithreading: true
59 | InstanceType: g5.16xlarge
60 | MinCount: 0
61 | MaxCount: 200
62 | Efa:
63 | Enabled: true
64 | GdrSupport: true
65 | HealthChecks:
66 | Gpu:
67 | Enabled: false
68 | Iam:
69 | AdditionalIamPolicies:
70 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
71 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
72 | S3Access:
73 | - BucketName: '*'
74 | EnableWriteAccess: true
75 | CustomActions:
76 | OnNodeConfigured:
77 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
78 | - Name: p5
79 | CapacityType: ONDEMAND
80 | Networking:
81 | SubnetIds:
82 | - <>
83 | AdditionalSecurityGroups:
84 | - <>
85 | PlacementGroup:
86 | Enabled: true
87 | ComputeResources:
88 | - Name: p5-48xlarge
89 | DisableSimultaneousMultithreading: true
90 | InstanceType: p5.48xlarge
91 | MinCount: 0
92 | MaxCount: 200
93 | Efa:
94 | Enabled: true
95 | GdrSupport: true
96 | HealthChecks:
97 | Gpu:
98 | Enabled: false
99 | Iam:
100 | AdditionalIamPolicies:
101 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
102 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
103 | S3Access:
104 | - BucketName: '*'
105 | EnableWriteAccess: true
106 | CustomActions:
107 | OnNodeConfigured:
108 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
109 | - Name: p5-spot
110 | CapacityType: SPOT
111 | Networking:
112 | SubnetIds:
113 | - <>
114 | AdditionalSecurityGroups:
115 | - <>
116 | PlacementGroup:
117 | Enabled: true
118 | ComputeResources:
119 | - Name: p5-48xlarge
120 | DisableSimultaneousMultithreading: true
121 | InstanceType: p5.48xlarge
122 | MinCount: 0
123 | MaxCount: 200
124 | Efa:
125 | Enabled: true
126 | GdrSupport: true
127 | HealthChecks:
128 | Gpu:
129 | Enabled: false
130 | Iam:
131 | AdditionalIamPolicies:
132 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
133 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
134 | S3Access:
135 | - BucketName: '*'
136 | EnableWriteAccess: true
137 | CustomActions:
138 | OnNodeConfigured:
139 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
140 | SharedStorage:
141 | - Name: FsxLustre
142 | StorageType: FsxLustre
143 | MountDir: /fsx
144 | FsxLustreSettings:
145 | FileSystemId: <>
146 | Monitoring:
147 | DetailedMonitoring: true
148 | Logs:
149 | CloudWatch:
150 | Enabled: true
151 | RetentionInDays: 30
152 | DeletionPolicy: Retain
153 | Rotation:
154 | Enabled: true
155 | Dashboards:
156 | CloudWatch:
157 | Enabled: true
--------------------------------------------------------------------------------
/ParallelCluster/config/eu-north-1.x86.yaml:
--------------------------------------------------------------------------------
1 | Imds:
2 | ImdsSupport: v2.0
3 | Region: eu-north-1
4 | Image:
5 | Os: alinux2
6 | HeadNode:
7 | InstanceType: m6i.2xlarge
8 | Networking:
9 | SubnetId: <>
10 | AdditionalSecurityGroups:
11 | - <>
12 | DisableSimultaneousMultithreading: false
13 | SharedStorageType: Efs
14 | LocalStorage:
15 | RootVolume:
16 | Size: 100
17 | Encrypted: true
18 | VolumeType: gp3
19 | DeleteOnTermination: true
20 | CustomActions:
21 | OnNodeConfigured:
22 | Script: s3://hpc-applications-best-practice/post-install.headnode.sh
23 | Iam:
24 | AdditionalIamPolicies:
25 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
26 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
27 | S3Access:
28 | - BucketName: '*'
29 | EnableWriteAccess: true
30 | Imds:
31 | Secured: True
32 | Scheduling:
33 | Scheduler: slurm
34 | SlurmSettings:
35 | ScaledownIdletime: 10
36 | QueueUpdateStrategy: TERMINATE
37 | EnableMemoryBasedScheduling: true
38 | SlurmQueues:
39 | - Name: hpc7a
40 | CapacityType: ONDEMAND
41 | Networking:
42 | SubnetIds:
43 | - <>
44 | AdditionalSecurityGroups:
45 | - <>
46 | PlacementGroup:
47 | Enabled: true
48 | ComputeResources:
49 | - Name: hpc7a-96xlarge
50 | InstanceType: hpc7a.96xlarge
51 | MinCount: 0
52 | MaxCount: 200
53 | Efa:
54 | Enabled: true
55 | - Name: hpc7a-48xlarge
56 | InstanceType: hpc7a.48xlarge
57 | MinCount: 0
58 | MaxCount: 200
59 | Efa:
60 | Enabled: true
61 | - Name: hpc7a-24xlarge
62 | InstanceType: hpc7a.24xlarge
63 | MinCount: 0
64 | MaxCount: 200
65 | Efa:
66 | Enabled: true
67 | - Name: hpc7a-12xlarge
68 | InstanceType: hpc7a.12xlarge
69 | MinCount: 0
70 | MaxCount: 200
71 | Efa:
72 | Enabled: true
73 | Iam:
74 | AdditionalIamPolicies:
75 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
76 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
77 | S3Access:
78 | - BucketName: '*'
79 | EnableWriteAccess: true
80 | CustomActions:
81 | OnNodeConfigured:
82 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
83 | - Name: hpc6id
84 | CapacityType: ONDEMAND
85 | Networking:
86 | SubnetIds:
87 | - <>
88 | AdditionalSecurityGroups:
89 | - <>
90 | PlacementGroup:
91 | Enabled: true
92 | ComputeResources:
93 | - Name: hpc6id-32xlarge
94 | InstanceType: hpc6id.32xlarge
95 | MinCount: 0
96 | MaxCount: 200
97 | Efa:
98 | Enabled: true
99 | Iam:
100 | AdditionalIamPolicies:
101 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
102 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
103 | S3Access:
104 | - BucketName: '*'
105 | EnableWriteAccess: true
106 | CustomActions:
107 | OnNodeConfigured:
108 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
109 | - Name: hpc6a
110 | CapacityType: ONDEMAND
111 | Networking:
112 | SubnetIds:
113 | - <>
114 | AdditionalSecurityGroups:
115 | - <>
116 | PlacementGroup:
117 | Enabled: true
118 | ComputeResources:
119 | - Name: hpc6a-48xlarge
120 | InstanceType: hpc6a.48xlarge
121 | MinCount: 0
122 | MaxCount: 200
123 | Efa:
124 | Enabled: true
125 | Iam:
126 | AdditionalIamPolicies:
127 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
128 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
129 | S3Access:
130 | - BucketName: '*'
131 | EnableWriteAccess: true
132 | CustomActions:
133 | OnNodeConfigured:
134 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
135 | - Name: c5n
136 | CapacityType: ONDEMAND
137 | Networking:
138 | SubnetIds:
139 | - <>
140 | AdditionalSecurityGroups:
141 | - <>
142 | PlacementGroup:
143 | Enabled: true
144 | ComputeResources:
145 | - Name: c5n-18xlarge
146 | InstanceType: c5n.18xlarge
147 | MinCount: 0
148 | MaxCount: 200
149 | Efa:
150 | Enabled: true
151 | Iam:
152 | AdditionalIamPolicies:
153 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
154 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
155 | S3Access:
156 | - BucketName: '*'
157 | EnableWriteAccess: true
158 | CustomActions:
159 | OnNodeConfigured:
160 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
161 | SharedStorage:
162 | - Name: FsxLustre
163 | StorageType: FsxLustre
164 | MountDir: /fsx
165 | FsxLustreSettings:
166 | FileSystemId: <>
167 | Monitoring:
168 | DetailedMonitoring: true
169 | Logs:
170 | CloudWatch:
171 | Enabled: true
172 | RetentionInDays: 30
173 | DeletionPolicy: Retain
174 | Rotation:
175 | Enabled: true
176 | Dashboards:
177 | CloudWatch:
178 | Enabled: true
--------------------------------------------------------------------------------
/ParallelCluster/config/us-east-1.Arm.yaml:
--------------------------------------------------------------------------------
1 | Imds:
2 | ImdsSupport: v2.0
3 | Region: us-east-1
4 | Image:
5 | Os: alinux2
6 | HeadNode:
7 | InstanceType: m7g.4xlarge
8 | Networking:
9 | SubnetId: <>
10 | AdditionalSecurityGroups:
11 | - <>
12 | #SharedStorageType: Efs
13 | LocalStorage:
14 | RootVolume:
15 | Size: 100
16 | Encrypted: true
17 | VolumeType: gp3
18 | DeleteOnTermination: true
19 | CustomActions:
20 | OnNodeConfigured:
21 | Script: s3://hpc-applications-best-practice/post-install.headnode.sh
22 | Iam:
23 | AdditionalIamPolicies:
24 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
25 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
26 | S3Access:
27 | - BucketName: '*'
28 | EnableWriteAccess: true
29 | Imds:
30 | Secured: True
31 | Scheduling:
32 | Scheduler: slurm
33 | SlurmSettings:
34 | ScaledownIdletime: 10
35 | QueueUpdateStrategy: TERMINATE
36 | EnableMemoryBasedScheduling: true
37 | SlurmQueues:
38 | - Name: hpc7g
39 | CapacityType: ONDEMAND
40 | Networking:
41 | SubnetIds:
42 | - <>
43 | AdditionalSecurityGroups:
44 | - <>
45 | PlacementGroup:
46 | Enabled: true
47 | ComputeResources:
48 | - Name: hpc7g-16xlarge
49 | InstanceType: hpc7g.16xlarge
50 | MinCount: 0
51 | MaxCount: 200
52 | Efa:
53 | Enabled: true
54 | - Name: hpc7g-8xlarge
55 | InstanceType: hpc7g.8xlarge
56 | MinCount: 0
57 | MaxCount: 200
58 | Efa:
59 | Enabled: true
60 | - Name: hpc7g-4xlarge
61 | InstanceType: hpc7g.4xlarge
62 | MinCount: 0
63 | MaxCount: 200
64 | Efa:
65 | Enabled: true
66 | Iam:
67 | AdditionalIamPolicies:
68 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
69 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
70 | S3Access:
71 | - BucketName: '*'
72 | EnableWriteAccess: true
73 | CustomActions:
74 | OnNodeConfigured:
75 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
76 | SharedStorage:
77 | - Name: FsxLustre
78 | StorageType: FsxLustre
79 | MountDir: /fsx
80 | FsxLustreSettings:
81 | FileSystemId: <>
82 | Monitoring:
83 | DetailedMonitoring: true
84 | Logs:
85 | CloudWatch:
86 | Enabled: true
87 | RetentionInDays: 30
88 | DeletionPolicy: Retain
89 | Rotation:
90 | Enabled: true
91 | Dashboards:
92 | CloudWatch:
93 | Enabled: true
--------------------------------------------------------------------------------
/ParallelCluster/config/us-east-2.x86.yaml:
--------------------------------------------------------------------------------
1 | Imds:
2 | ImdsSupport: v2.0
3 | Region: us-east-2
4 | Image:
5 | Os: alinux2
6 | HeadNode:
7 | InstanceType: m6i.2xlarge
8 | Networking:
9 | SubnetId: <>
10 | AdditionalSecurityGroups:
11 | - <>
12 | DisableSimultaneousMultithreading: false
13 | SharedStorageType: Efs
14 | LocalStorage:
15 | RootVolume:
16 | Size: 100
17 | Encrypted: true
18 | VolumeType: gp3
19 | DeleteOnTermination: true
20 | CustomActions:
21 | OnNodeConfigured:
22 | Script: s3://hpc-applications-best-practice/post-install.headnode.sh
23 | Iam:
24 | AdditionalIamPolicies:
25 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
26 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
27 | S3Access:
28 | - BucketName: '*'
29 | EnableWriteAccess: true
30 | Imds:
31 | Secured: True
32 | Scheduling:
33 | Scheduler: slurm
34 | SlurmSettings:
35 | ScaledownIdletime: 10
36 | QueueUpdateStrategy: TERMINATE
37 | EnableMemoryBasedScheduling: true
38 | SlurmQueues:
39 | - Name: hpc7a
40 | CapacityType: ONDEMAND
41 | Networking:
42 | SubnetIds:
43 | - <>
44 | AdditionalSecurityGroups:
45 | - <>
46 | PlacementGroup:
47 | Enabled: true
48 | ComputeResources:
49 | - Name: hpc7a-96xlarge
50 | InstanceType: hpc7a.96xlarge
51 | MinCount: 0
52 | MaxCount: 200
53 | Efa:
54 | Enabled: true
55 | - Name: hpc7a-48xlarge
56 | InstanceType: hpc7a.48xlarge
57 | MinCount: 0
58 | MaxCount: 200
59 | Efa:
60 | Enabled: true
61 | - Name: hpc7a-24xlarge
62 | InstanceType: hpc7a.24xlarge
63 | MinCount: 0
64 | MaxCount: 200
65 | Efa:
66 | Enabled: true
67 | - Name: hpc7a-12xlarge
68 | InstanceType: hpc7a.12xlarge
69 | MinCount: 0
70 | MaxCount: 200
71 | Efa:
72 | Enabled: true
73 | Iam:
74 | AdditionalIamPolicies:
75 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
76 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
77 | S3Access:
78 | - BucketName: '*'
79 | EnableWriteAccess: true
80 | CustomActions:
81 | OnNodeConfigured:
82 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
83 | - Name: hpc6id
84 | CapacityType: ONDEMAND
85 | Networking:
86 | SubnetIds:
87 | - <>
88 | AdditionalSecurityGroups:
89 | - <>
90 | PlacementGroup:
91 | Enabled: true
92 | ComputeResources:
93 | - Name: hpc6id-32xlarge
94 | InstanceType: hpc6id.32xlarge
95 | MinCount: 0
96 | MaxCount: 200
97 | Efa:
98 | Enabled: true
99 | Iam:
100 | AdditionalIamPolicies:
101 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
102 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
103 | S3Access:
104 | - BucketName: '*'
105 | EnableWriteAccess: true
106 | CustomActions:
107 | OnNodeConfigured:
108 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
109 | - Name: hpc6a
110 | CapacityType: ONDEMAND
111 | Networking:
112 | SubnetIds:
113 | - <>
114 | AdditionalSecurityGroups:
115 | - <>
116 | PlacementGroup:
117 | Enabled: true
118 | ComputeResources:
119 | - Name: hpc6a-48xlarge
120 | InstanceType: hpc6a.48xlarge
121 | MinCount: 0
122 | MaxCount: 200
123 | Efa:
124 | Enabled: true
125 | Iam:
126 | AdditionalIamPolicies:
127 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
128 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
129 | S3Access:
130 | - BucketName: '*'
131 | EnableWriteAccess: true
132 | CustomActions:
133 | OnNodeConfigured:
134 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
135 | - Name: c5n
136 | CapacityType: ONDEMAND
137 | Networking:
138 | SubnetIds:
139 | - <>
140 | AdditionalSecurityGroups:
141 | - <>
142 | PlacementGroup:
143 | Enabled: true
144 | ComputeResources:
145 | - Name: c5n-18xlarge
146 | InstanceType: c5n.18xlarge
147 | MinCount: 0
148 | MaxCount: 200
149 | Efa:
150 | Enabled: true
151 | Iam:
152 | AdditionalIamPolicies:
153 | - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
154 | - Policy: arn:aws:iam::aws:policy/SecretsManagerReadWrite
155 | S3Access:
156 | - BucketName: '*'
157 | EnableWriteAccess: true
158 | CustomActions:
159 | OnNodeConfigured:
160 | Script: s3://hpc-applications-best-practice/post-install.compute.sh
161 | SharedStorage:
162 | - Name: FsxLustre
163 | StorageType: FsxLustre
164 | MountDir: /fsx
165 | FsxLustreSettings:
166 | FileSystemId: <>
167 | Monitoring:
168 | DetailedMonitoring: true
169 | Logs:
170 | CloudWatch:
171 | Enabled: true
172 | RetentionInDays: 30
173 | DeletionPolicy: Retain
174 | Rotation:
175 | Enabled: true
176 | Dashboards:
177 | CloudWatch:
178 | Enabled: true
--------------------------------------------------------------------------------
/ParallelCluster/post-install/post-install.compute.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Put your post-install commands for the compute nodes here
--------------------------------------------------------------------------------
/ParallelCluster/post-install/post-install.headnode.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Install Spack on the HeadNode using the post-install script from the spack-configs repository
3 | curl -sL -O https://raw.githubusercontent.com/spack/spack-configs/main/AWS/parallelcluster/postinstall.sh
4 | sudo bash postinstall.sh
5 |
6 | #git clone https://github.com/aws-samples/hpc-applications.git
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # HPC application best practices on AWS
2 | This repository contains HPC application best practices, specifically designed for, and optimized to run on AWS.
3 | In particular, these best practices take into account the peculiarities of AWS HPC-specific services and EC2 instances, in order to get the best out of them.
4 | This repo is maintained by AWS HPC Solution Architects, who will take care of updating and improving these best practices as AWS services evolve or new settings/tunings are discovered. This is **not intended** to be an AWS supported product or service, though.
5 |
6 | ## HPC application best practices included (to date):
7 | 1. [ANSYS Fluent](https://github.com/aws-samples/hpc-applications/tree/main/apps/Fluent)
8 | 2. [Abaqus](https://github.com/aws-samples/hpc-applications/tree/main/apps/Abaqus)
9 | 3. [LS-Dyna](https://github.com/aws-samples/hpc-applications/tree/main/apps/LS-Dyna)
10 | 4. [Optistruct](https://github.com/aws-samples/hpc-applications/tree/main/apps/Optistruct)
11 | 5. [STAR-CCM+](https://github.com/aws-samples/hpc-applications/tree/main/apps/StarCCM)
12 | 6. [ANSYS CFX](https://github.com/aws-samples/hpc-applications/tree/main/apps/CFX)
13 | 7. [ANSYS Mechanical](https://github.com/aws-samples/hpc-applications/tree/main/apps/AnsysMechanical)
14 |
15 | ### Request an HPC application best practice
16 |
17 | We're starting with the most common HPC applications, specifically in the CAE market.
18 | We're also updating the list of included HPC application best practices regularly, based on your feedback.
19 | So, feel free to request a new HPC application best practice by opening an [issue](https://github.com/aws-samples/hpc-applications/issues).
20 | We will do our best to satisfy your requests.
21 |
22 | ## Get Started
23 |
24 | ### On existing Clusters
25 |
26 | If you already have a cluster up and running and want to try these best practices, you can `git clone` this repository:
27 | ```
28 | git clone https://github.com/aws-samples/hpc-applications.git
29 | ```
30 | Then, customize the example launch script for the application you want to try (if needed), and run it as shown below.
31 |
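   | For example, a typical submission looks like this (the input argument is illustrative; check each application's README and launch script for the exact ones):
   |
   | ```
   | cd hpc-applications
   | sbatch apps/Abaqus/Abaqus.sbatch /fsx/abaqus/s4e/Model/s4e.inp
   | ```
   |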
32 | ### Build a new HPC Cluster (1-Click deployment)
33 |
34 | You can follow the instructions [here](ParallelCluster/README.md) to build a new cluster with just 1-Click.
35 |
36 |
37 | ## HPC application Benchmarks
38 | In addition to application best practices, this repo will include some HPC application benchmarks. For all the included applications, we've run some benchmarks using public datasets. We'll publish our data and some charts to show the performance and scalability you should be aiming to achieve.
39 |
40 | ## AWS HPC products and services being used
41 | * [AWS ParallelCluster](https://aws.amazon.com/hpc/parallelcluster/) and its [Documentation](https://docs.aws.amazon.com/parallelcluster/latest/ug/what-is-aws-parallelcluster.html)
42 | * [Elastic Fabric Adapter](https://aws.amazon.com/hpc/efa/) (EFA) and its [Documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa.html)
43 | * [Amazon FSx for Lustre](https://aws.amazon.com/fsx/lustre/) and its [Documentation](https://docs.aws.amazon.com/fsx/latest/LustreGuide/what-is.html)
44 | * AWS EC2 [Hpc7a](https://aws.amazon.com/ec2/instance-types/hpc7a/) Instances
45 | * AWS EC2 [Hpc6a](https://aws.amazon.com/ec2/instance-types/hpc6a/) Instances
46 | * AWS EC2 [Hpc6id](https://aws.amazon.com/ec2/instance-types/hpc6i/) Instances
47 | * AWS EC2 [Hpc7g](https://aws.amazon.com/ec2/instance-types/hpc7g/) Instances
48 |
49 | ## Security
50 |
51 | See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
52 |
53 | ## License
54 |
55 | This library is licensed under the MIT-0 License. [See the LICENSE file](LICENSE).
--------------------------------------------------------------------------------
/apps/Abaqus/Abaqus.sbatch:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #SBATCH --exclusive
3 | #SBATCH -t 99:00:00
4 | #SBATCH --nodes=2
5 | #SBATCH --ntasks=128
6 | #SBATCH --constraint=hpc6id.32xlarge
7 | #SBATCH --partition=hpc6id
8 |
9 | input_file=${1:-"/fsx/abaqus/s4e/Model/s4e.inp"}
10 | abaqus_version=${2:-"2022"}
11 | basedir=${BASE_DIR:-"/fsx"}
12 | abaqus_bin=${ABAQUS_BIN:-"${basedir}/SIMULIA/EstProducts/${abaqus_version}/linux_a64/code/bin/SMALauncher"}
13 | mp_host_split=${MP_HOST_SPLIT:-""}
14 | threads_mpi=${THREADS_MPI:-""}
15 | verbose=${VERBOSE:-""}
16 | dsls_license=${DSLS_LICENSE:-"/fsx/SIMULIA/EstProducts/2022/linux_a64/SMA/site/DSLicSrv.txt"}
17 |
18 | module load intelmpi
19 | export I_MPI_FABRICS=shm:ofi
20 | export I_MPI_OFI_PROVIDER=efa
21 | export I_MPI_MULTIRAIL=1
22 | module load libfabric-aws
23 |
24 | TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
25 | workdir="$(readlink -m "${basedir}/${SLURM_JOB_NAME%.*}")/$(basename ${input_file%.*})/${SLURM_JOB_ID}-${SLURM_JOB_NUM_NODES}x$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type)-${SLURM_NPROCS}-$(date "+%d-%m-%Y-%H-%M")"
26 |
27 | echo "Execution directory is: ${workdir}"
28 | mkdir -p "${workdir}" && cd "${workdir}"
29 |
30 | export cores_x_node=$(((SLURM_NPROCS / SLURM_JOB_NUM_NODES ) + ( SLURM_NPROCS % SLURM_JOB_NUM_NODES > 0 )))
31 |
32 | # Use the local NVMe (instance store) volume for the Abaqus scratch when one is
33 | # mounted; otherwise fall back to a directory inside the job's working directory.
34 | # The scratch path is also exported as TMPDIR so temporary files land there too.
35 | SCRATCH_BASE=$(mount | grep ephemeral | awk '{print $3}')
36 | if [[ -n "$SCRATCH_BASE" ]]; then
37 |     export SCRATCH_DIR="${SCRATCH_BASE}/scratch-$SLURM_JOB_ID"
38 |     mkdir -p "${SCRATCH_DIR}"
39 |     export TMPDIR="${SCRATCH_DIR}"
40 | else
41 |     export SCRATCH_DIR="${workdir}/scratch-$SLURM_JOB_ID"
42 |     mkdir -p "${SCRATCH_DIR}"
43 |     export TMPDIR="${SCRATCH_DIR}"
44 | fi
45 |
46 | cp $0 .
47 | ln -s "${input_file}" .
48 |
49 | echo "Drop caches on all nodes"
50 | mpirun -np $SLURM_JOB_NUM_NODES -ppn 1 /bin/bash -c "sync && echo 3 | sudo tee /proc/sys/vm/drop_caches"
51 |
52 | echo "Enabling Transparent Huge Pages (THP)"
53 | mpirun -np $SLURM_JOB_NUM_NODES -ppn 1 /bin/bash -c "echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled"
54 |
55 |
56 | cat > abaqus_v6.env << EOF
57 | # System-Wide ABAQUS Environment File
58 | dsls_license_config="${dsls_license}"
59 | mp_rsh_command='ssh -n -l %U %H %C'
60 | mp_host_list=$(scontrol show nodes $SLURM_NODELIST | grep NodeHostName= | awk -v cores="$cores_x_node" -F" |=" 'BEGIN { printf "["; out="" } { out=out"[\x27"$7"\x27,"cores"]," } END { t=length(out); printf substr(out,0,t-1); print "]"; }')
61 | EOF
62 |
63 | [ -z "$mp_host_split" ] || echo "mp_host_split=$mp_host_split" >> abaqus_v6.env
64 | [ -z "$threads_mpi" ] || echo "threads_per_mpi_process=$threads_mpi" >> abaqus_v6.env
65 | [ -z "$verbose" ] || echo "verbose=$verbose" >> abaqus_v6.env
66 | [ -z "$SCRATCH_DIR" ] || echo "scratch=\"${SCRATCH_DIR}\"" >> abaqus_v6.env
67 |
68 | cat >> abaqus_v6.env << EOF
69 | mp_mpi_implementation = IMPI
70 | cpus=${SLURM_NPROCS}
71 | mp_mpirun_path = {IMPI: "$(which mpirun)"}
72 | mp_mpirun_options='-bootstrap ssh'
73 | mp_mode = MPI
74 | standard_parallel = ALL
75 | EOF
76 |
77 | "${abaqus_bin}" input=$(basename ${input_file}) interactive -job $(basename ${input_file%.*}) | tee $(basename ${input_file%.*}).log
78 |
79 | rm -rf "${SCRATCH_DIR}"
--------------------------------------------------------------------------------
/apps/Abaqus/README.md:
--------------------------------------------------------------------------------
1 | # Abaqus
2 |
3 | [Abaqus](https://www.3ds.com/products-services/simulia/products/abaqus/) is a software suite for finite element analysis and computer-aided engineering developed by [Dassault Systèmes](https://www.3ds.com/) (DS).
4 |
5 | Abaqus is part of the [Simulia](https://www.3ds.com/products-services/simulia/) product suite. It consists of various core software products aimed at accelerating the process of evaluating the performance, reliability, and safety of materials and products before committing to physical prototypes.
6 |
7 | This repository focuses on two specific products:
8 |
9 | * Abaqus/Standard (or Implicit), a general-purpose finite-element analyzer that employs an implicit integration scheme (traditional).
10 | * Abaqus/Explicit, a special-purpose finite-element analyzer that employs an explicit integration scheme to solve highly nonlinear systems with many complex contacts under transient loads.
11 |
12 | # Versions
13 |
14 | In this repository we provide best practices for Abaqus versions 2021 and newer.
15 |
16 |
17 | # Installation
18 |
19 | The Abaqus installer only supports a few selected Linux distributions (`RHEL`, `CentOS`, and `SUSE`).
20 | Abaqus is typically distributed as `tar` files.
21 | For example, Abaqus 2021 consists of 5 tar files, named `2021.AM_SIM_Abaqus_Extend.AllOS.1-5.tar` through `2021.AM_SIM_Abaqus_Extend.AllOS.5-5.tar`.
22 | Uncompress each of them with `tar xvf`, as shown below.
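   | For example, all five archives can be extracted with a small loop (shown for the 2021 media; adjust the file names for your release):
   |
   | ```bash
   | # Extract all five Abaqus distribution archives into the current directory
   | for n in 1 2 3 4 5; do
   |     tar xvf "2021.AM_SIM_Abaqus_Extend.AllOS.${n}-5.tar"
   | done
   | ```
   |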
23 | To install Abaqus, go into `AM_SIM_Abaqus_Extend.AllOS/1`. The installation process is interactive and can be done in a terminal by running `./StartTUI.sh`, or via the graphical interface with `./StartGUI.sh`.
24 |
25 | The installer will ask you to choose the installation directory and will suggest `/usr/SIMULIA/EstProducts/`; please change that to your shared filesystem ([Amazon FSx for Lustre](https://aws.amazon.com/fsx/lustre/)), e.g. `/fsx/SIMULIA/EstProducts/`.
26 |
27 | The same applies to the `CAE commands directory path`: please change the suggested directory `/var/DassaultSystemes/SIMULIA/Commands` to `/fsx/DassaultSystemes/SIMULIA/Commands`,
28 |
29 | and for the `SIMULIA Established Products` directory, please change `/var/DassaultSystemes/SIMULIA/CAE/plugins/` to `/fsx/DassaultSystemes/SIMULIA/CAE/plugins/`.
30 |
31 | Although Abaqus runs fine on it, [Amazon Linux 2](https://aws.amazon.com/amazon-linux-2/) (AL2) is currently not one of the supported operating systems.
32 | If you are planning to install Abaqus on AL2 (or, in general, on an unsupported operating system), you need to work around the installer's OS check:
33 |
34 | ```bash
35 | export DSY_Force_OS=linux_a64   # force the installer to treat the OS as a supported 64-bit Linux
36 | export DSYAuthOS_`lsb_release --short --id | sed 's/ //g'`=1   # authorize the detected distribution
37 |
38 | ..
39 | ...
40 | # then run the interactive installer
41 |
42 | ./StartTUI.sh
43 |
44 | ```
45 |
46 | **_NOTE:_** For the full Abaqus installation guide, please refer to the official documentation.
47 |
48 | # Key settings & tips (performance related)
49 |
50 | * Abaqus is a memory-bound and (for Standard/Implicit simulations) IO-bound code, so the best instance types are the ones with a high memory per-core ratio (8:1 or 16:1). This allows `in-core` simulations.
51 | * Abaqus Standard/Implicit can also benefit from fast IO; see below how to use the local NVMe disk (when available) of some AWS EC2 instances for the Abaqus scratch.
52 | * The best instance type for running Abaqus Standard/Implicit simulations is [Hpc6id](https://aws.amazon.com/ec2/instance-types/hpc6i).
53 | * The best instance type for running Abaqus Explicit simulations is [Hpc7a](https://aws.amazon.com/ec2/instance-types/hpc7a).
54 |
55 |
56 | Abaqus key settings are stored and managed in the Abaqus configuration file, **abaqus_v6.env**.
57 |
58 | This file needs to be stored in the job execution directory.
59 | Below is an example **abaqus_v6.env** file:
60 |
61 | ```bash
62 | license_server_type=DSLS
63 | dsls_license_config="/fsx/DassaultSystemes/Licenses/DSLicSrv.txt"
64 | mp_rsh_command='ssh -n -l %U %H %C'
65 | mp_host_list=[['compute-od-1-dy-hpc6id-32xlarge-2',64],['compute-od-1-dy-hpc6id-32xlarge-3',64]]
66 | mp_host_split=4
67 | mp_mpi_implementation = IMPI
68 | verbose=0
69 | cpus=128
70 | mp_mpirun_path = {IMPI: "/opt/intel/mpi/2021.6.0/bin/mpirun"}
71 | mp_mpirun_options='-bootstrap ssh'
72 | scratch="/scratch"
73 | mp_mode = MPI
74 | standard_parallel = ALL
75 | ```
76 |
77 | The most important parameters are:
78 | * `mp_host_list` defines the hosts (and the number of available cores per host) to be used for the simulation.
79 | * `cpus` defines the total number of cores to be used. It might differ from the total number of cores available in `mp_host_list`.
80 | * `mp_mpi_implementation` defines the MPI library being used.
81 | * `mp_mpirun_path` contains the path of the corresponding `mpirun` executable.
82 | * `mp_mpirun_options` contains the MPI parameters to pass to the MPI executable.
83 | * `mp_host_split` should be set to an integer equal to the desired number of MPI processes to run on each node.
84 | * `scratch` defines the path of the scratch directory, used by Implicit simulations only. It is typically set to the local NVMe disk or to FSx for Lustre.
85 |
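   | With this file in the job execution directory, a run is then launched along these lines (the binary path assumes an Abaqus 2022 installation under `/fsx` as described above; the input file name is illustrative):
   |
   | ```bash
   | # Run Abaqus from the directory containing abaqus_v6.env and the input file
   | cd /fsx/my-job-dir
   | /fsx/SIMULIA/EstProducts/2022/linux_a64/code/bin/SMALauncher input=s4e.inp interactive -job s4e | tee s4e.log
   | ```
   |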
86 |
87 | **_NOTE:_** For the full Abaqus guide please refer to the official documentation.
88 |
89 |
90 | # Performance
91 |
92 | This section shows the benchmark results of Abaqus 2024 running common datasets for Standard/Implicit and Explicit simulations: `s4e`, `s9`, `e14_DropTest_v0`, and `e13`.
93 | For more information about these benchmarks, please refer to the official Abaqus documentation.
94 |
95 | **_NOTE:_** The benchmark results are based on normalized wall-clock times.
96 |
97 | ## s4e
98 |
99 |
100 | This chart shows the per-core performance of Abaqus 2024 running a Standard/Implicit simulation using the s4e dataset on older and newer generations of AMD-based AWS EC2 [Hpc6a](https://aws.amazon.com/ec2/instance-types/hpc6a/) Instances.
101 | 
102 |
103 | This chart shows the per-core performance of Abaqus 2024 running a Standard/Implicit simulation using the s4e dataset on all the different sizes of the AWS EC2 [Hpc7a](https://aws.amazon.com/ec2/instance-types/hpc7a/) Instances.
104 | 
105 |
106 | This chart shows the per-core performance of Abaqus 2024 running a Standard/Implicit simulation using the s4e dataset on older and newer generations of Intel-based AWS EC2 [Hpc6id](https://aws.amazon.com/ec2/instance-types/hpc6id/) Instances.
107 | 
108 |
109 | ## s9
110 |
111 |
112 | This chart shows the per-core performance of Abaqus 2024 running a Standard/Implicit simulation using the s9 dataset on older and newer generations of AMD-based AWS EC2 [Hpc6a](https://aws.amazon.com/ec2/instance-types/hpc6a/) Instances.
113 | 
114 |
115 | This chart shows the per-core performance of Abaqus 2024 running a Standard/Implicit simulation using the s9 dataset on all the different sizes of the AWS EC2 [Hpc7a](https://aws.amazon.com/ec2/instance-types/hpc7a/) Instances.
116 | 
117 |
118 | This chart shows the per-core performance of Abaqus 2024 running a Standard/Implicit simulation using the s9 dataset on older and newer generations of Intel-based AWS EC2 [Hpc6id](https://aws.amazon.com/ec2/instance-types/hpc6id/) Instances.
119 | 
120 |
121 |
122 | ## e14_DropTest_v0
123 |
124 |
125 | This chart shows the per-core performance of Abaqus 2024 running an Explicit simulation using the e14_DropTest_v0 dataset on older and newer generations of AMD-based AWS EC2 [Hpc6a](https://aws.amazon.com/ec2/instance-types/hpc6a/) Instances.
126 | 
127 |
128 | This chart shows the per-core performance of Abaqus 2024 running an Explicit simulation using the e14_DropTest_v0 dataset on all the different sizes of the AWS EC2 [Hpc7a](https://aws.amazon.com/ec2/instance-types/hpc7a/) Instances.
129 | 
130 |
131 | This chart shows the per-core performance of Abaqus 2024 running an Explicit simulation using the e14_DropTest_v0 dataset on older and newer generations of Intel-based AWS EC2 [Hpc6id](https://aws.amazon.com/ec2/instance-types/hpc6id/) Instances.
132 | 
133 |
134 |
135 | ## e13
136 |
137 |
138 | This chart shows the per-core performance of Abaqus 2024 running an Explicit simulation using the e13 dataset on older and newer generations of AMD-based AWS EC2 [Hpc6a](https://aws.amazon.com/ec2/instance-types/hpc6a/) Instances.
139 | 
140 |
141 | This chart shows the per-core performance of Abaqus 2024 running an Explicit simulation using the e13 dataset on all the different sizes of the AWS EC2 [Hpc7a](https://aws.amazon.com/ec2/instance-types/hpc7a/) Instances.
142 | 
143 |
144 | This chart shows the per-core performance of Abaqus 2024 running an Explicit simulation using the e13 dataset on older and newer generations of Intel-based AWS EC2 [Hpc6id](https://aws.amazon.com/ec2/instance-types/hpc6id/) Instances.
145 | 
146 |
--------------------------------------------------------------------------------
/apps/AnsysMechanical/AnsysMechanical.sbatch:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #SBATCH --exclusive
3 | #SBATCH --nodes=1
4 | #SBATCH --ntasks=64
5 | #SBATCH --constraint=hpc6id.32xlarge
6 | #SBATCH --partition=hpc6id
7 | #SBATCH --chdir=/fsx/slurm-out/mechanical
8 |
9 |
10 | ######################## EFA settings ########################
11 | export I_MPI_OFI_LIBRARY_INTERNAL=0
12 | module load intelmpi
13 | export I_MPI_DEBUG=5
14 | export I_MPI_FABRICS=shm:ofi
15 | export I_MPI_OFI_PROVIDER=efa
16 | export I_MPI_MULTIRAIL=1
17 | #export FI_EFA_RECVWIN_SIZE=65536
18 | module load libfabric-aws
19 | export INTELMPI_ROOT="$(dirname $(dirname $(which mpirun)))"
20 | ######################## EFA settings ########################
21 |
22 |
23 | export ANSYSLMD_LICENSE_FILE=${ANSYSLMD_LICENSE_FILE:-"1055@172.31.89.151"}
24 |
25 | mechanicalversion=${1:-"231"}
26 | input_file=${2:-"/fsx/.../your_file.dat"}
27 | basedir=${BASE_DIR:-"/fsx"}
28 |
29 | APP_BIN_PATH="${basedir}/ansys_inc/v${mechanicalversion}/ansys/bin/ansys${mechanicalversion}"
30 |
31 | TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
32 | workdir="$(readlink -m "${basedir}/${SLURM_JOB_NAME%.*}")/Run/${SLURM_JOB_ID}-${SLURM_JOB_NUM_NODES}x$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type)-${SLURM_NPROCS}-$(date "+%d-%m-%Y-%H-%M")"
33 |
34 | mkdir -p "${workdir}" && cd "${workdir}"
35 |
36 | scontrol show nodes $SLURM_NODELIST | grep NodeHostName= | awk '{print $2}' | sed 's/NodeHostName=//' >hostfile
37 | export cores_x_node=$(((SLURM_NPROCS / SLURM_JOB_NUM_NODES ) + ( SLURM_NPROCS % SLURM_JOB_NUM_NODES > 0 )))
38 |
39 | MACHINES=$(xargs printf "%s:${cores_x_node}, " < hostfile)   # build the host:cores list for the -machines option
40 |
41 | cp $0 .
42 | ln -s "${input_file}" .
43 |
44 |
45 | echo "Drop caches on all nodes"
46 | mpirun -np $SLURM_JOB_NUM_NODES -ppn 1 /bin/bash -c "sync && echo 3 | sudo tee /proc/sys/vm/drop_caches"
47 |
48 | echo "Enabling Transparent Huge Pages (THP)"
49 | mpirun -np $SLURM_JOB_NUM_NODES -ppn 1 /bin/bash -c "echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled"
50 |
51 | echo "Install missing lib"
52 | mpirun -np $SLURM_JOB_NUM_NODES -ppn 1 /bin/bash -c "sudo yum -y install mesa-libGLU"
53 |
54 |
55 | "${APP_BIN_PATH}" -b -dis -mpi intelmpi -ssh -np ${SLURM_NPROCS} -machines $MACHINES -i $(basename ${input_file}) -o output.log
--------------------------------------------------------------------------------
/apps/CFX/CFX.md:
--------------------------------------------------------------------------------
1 | ```
2 | Name:
3 |
4 |
5 | cfx5solve
6 |
7 |
8 | Purpose:
9 |
10 |
11 | Run the ANSYS CFX Solver or the ANSYS CFX Solver Manager.
12 |
13 |
14 | If no Solver Input File is specified on the command line the
15 | ANSYS CFX Solver Manager will be started to allow you to select an ANSYS
16 | CFX Solver Input File and then run the ANSYS CFX Solver. Otherwise the
17 | ANSYS CFX Solver will be run directly unless the -interactive option is
18 | also specified on the command line.
19 |
20 |
21 | Synopsis:
22 |
23 |
24 | cfx5solve -def <file> [<options>]
25 |
26 |
27 | cfx5solve -mdef <file> [<options>]
28 | [-config <config> [<options>]]...
29 |
30 |
31 | [] denotes a discretionary option.
32 | | separates mutually exclusive options.
33 | ... following an option indicates that it may be repeated
34 | <> denotes that substitution of a suitable value is required.
35 | All other options are keywords, some of which have a short form.
36 |
37 |
38 | Options:
39 |
40 |
41 | Options to cfx5solve may be reordered where necessary. If an option
42 | is specified multiple times (e.g. for the same configuration) the
43 | option on the right will override those to its left.
44 |
45 |
46 | -------------------------------------------------------------------------
47 | GENERAL
48 | -------------------------------------------------------------------------
49 |
50 |
51 | -batch
52 | Run the ANSYS CFX Solver in batch mode (i.e. without starting the
53 | ANSYS CFX Solver Manager).
54 |
55 |
56 | -chdir
57 | Set the working directory as specified.
58 |
59 |
60 | -check-only
61 | When running in batch mode, this will cause cfx5solve to verify
62 | its options, but exit before starting any processes, and is
63 | mostly for use by the ANSYS CFX Solver Manager.
64 |
65 |
66 | -display <display>
67 | Use the <display> X11 server instead of the X11 server
68 | defined by the DISPLAY environment variable.
69 |
70 |
71 | -graphics ogl|mesa
72 | -gr ogl|mesa
73 | Set the graphics mode to use for the GUI to the OpenGL (hardware)
74 | or MESA (software)-based version. This affects any CFX
75 | application started from within the ANSYS CFX Solver Manager.
76 |
77 |
78 | -interactive
79 | -int
80 | -manager
81 | Run the interactive ANSYS CFX Solver Manager to start a new run
82 | and/or manage or monitor an existing run.
83 | This option may be combined with specific options only
84 | (-chdir -definition -display -eg -graphics -monitor -verbose).
85 | Other options will have no effect.
86 |
87 |
88 | -monitor <file>
89 | When starting the Solver Manager, monitor the run represented
90 | by <file>, which may be an ANSYS CFX Results File or Output File.
91 |
92 |
93 | -output-summary-option <option>
94 | Specify the job summary format in the solver output file.
95 | <option> may be set to:
96 | 0: minimal
97 | 1: terse format (default, no information per partition)
98 | 2: compact format (one line per partition)
99 | 3: verbose format (default prior to CFX-15.0)
100 |
101 |
102 | -verbose
103 | -v
104 | Specifying this option may result in additional output being sent to
105 | the Unix "standard output" file (normally the screen).
106 |
107 |
108 | -------------------------------------------------------------------------
109 | CFX RUN SPECIFICATION
110 | -------------------------------------------------------------------------
111 |
112 |
113 | -bak-elapsed-time <elapsed time>
114 | -baket <elapsed time>
115 | This will cause the ANSYS CFX Solver to write a backup file every
116 | <elapsed time> (minutes, hours, seconds, etc...). Elapsed
117 | time must be in quotes and have units in square brackets.
118 | eg: -baket "10 [min]" or -baket "5 [hr]"
119 |
120 |
121 | -bg-ccl
122 | Reads Command Language from the named file, and uses it to
123 | provide defaults for the current run. If the file specifies a
124 | definition file for the run, the command language contained in
125 | that definition file will take precedence over that supplied.
126 | See also -ccl.
127 |
128 |
129 | -ccl
130 | Reads Command Language from the named file, and includes it in
131 | the setup of the run. If is the single character '-',
132 | the Command Language is read from the standard input (usually
133 | the terminal). If any settings are made in the Command
134 | Language file which also occur on the command line to the left
135 | of the -ccl switch, the settings in the file will take
136 | precedence, as stated above. This switch may be repeated to
137 | include Command Language from more than one file.
138 | Changes that affect the way the mesh is defined, or that affect the
139 | way the physics CCL relates to the topology of the mesh stored
140 | in the solver input file, cannot be made using the -ccl option.
141 | For example, locators for applying physics cannot be modified using
142 | the -ccl option. Such changes can, however, be made in CFX-Pre.
143 |
144 |
145 | -config
146 | Apply subsequent options to the specified configuration.
147 |
148 |
149 | -continue-from-file
150 | -cont-from-file
151 | Use initial values and continue the run history from the specified
152 | ANSYS CFX Results File. The mesh from the Solver Input File is used
153 | unless the -use-mesh-from-iv option is also specified.
154 |
155 |
156 | -continue-from-configuration
157 | -cont-from-config
158 | Use initial values and continue the run history from the most recent
159 | results for the specified configuration. The mesh from the
160 | configuration's Solver Input File is used unless the -use-mesh-from-iv
161 | option is also specified.
162 |
163 |
164 | -definition
165 | -def
166 | Use specified file as the Solver Input File for a single configuration
167 | simulation. Specifying an ANSYS CFX Results File (.res) here produces
168 | a restart. See also "-mdef".
169 |
170 |
171 | -eg
172 | -example
173 | Run the ANSYS CFX Solver on one of the Example Definition
174 | files provided with the product.
175 |
176 |
177 | -fullname
178 | Choose names for the Output File, ANSYS CFX Results File, and the temporary
179 | directory based on instead of the Solver Input File name.
180 | No numerical suffix (e.g. _001) is added to the specified name.
181 |
182 |
183 | -initial
184 | -ini
185 | Use initial values from the specified ANSYS CFX Results File. The mesh
186 | and run history from the specified file is also used unless the
187 | -interpolate-iv option is also specified. This option has been deprecated
188 | and should be replaced by "-initial-file" or "-continue-from-file"
189 | as appropriate.
190 |
191 |
192 | -initial-configuration
193 | -ini-conf
194 | Use initial values and continue the run history from the most recent
195 | results for the specified configuration. The mesh from the Solver Input File
196 | is used unless the -use-mesh-from-iv option is also specified.
197 |
198 |
199 | -initial-file
200 | -ini-file
201 | Use initial values but discard the run history from the specified
202 | ANSYS CFX Results File. The mesh from the Solver Input File is used
203 | unless the -use-mesh-from-iv option is also specified.
204 |
205 |
206 | -interpolate-iv
207 | -interp-iv
208 | Interpolate the solution from the initial values file, if one
209 | is supplied (using the -initial option), onto the mesh in the
210 | Solver Input File, rather than using the mesh from the initial
211 | values file. This option has been deprecated and should be
212 | replaced by "-initial-file" or "-continue-from-file" as appropriate.
213 |
214 |
215 | -maxet
216 | -max-elapsed-time
217 | Set the maximum elapsed time (wall clock time) that the ANSYS CFX
218 | Solver will run. Elapsed time must be in quotes and have correct
219 | units in square brackets (eg: -maxet "10 [min]" or -maxet "5 [hr]")
220 |
221 |
222 | -mdefinition
223 | -mdef
224 | Use specified file as the Solver Input File for a multi-configuration
225 | or operating points simulation. Specifying a multi-configuration or
226 | operating points ANSYS CFX Results File (.mres) here produces a restart.
227 |
228 |
229 | -mcontinuation
230 | -mcont
231 | Specify continuation of an operating points run from the specified
232 | operating points ANSYS CFX Results File (.mres).
233 |
234 |
235 | -multiconfig
236 | Treat the Solver Input File as a multi-configuration input file.
237 |
238 |
239 | -name
240 | Choose names for the output file, ANSYS CFX Results File, and the temporary
241 | directory based on the problem name instead of the
242 | Solver Input File name.
243 |
244 |
245 | -use-mesh-from-iv
246 | Use the mesh from the source initial values (i.e. file or configuration)
247 | rather than from the Solver Input File. This is only valid if a single
248 | initial values source is specified.
249 |
250 |
251 | ADVANCED RUN SPECIFICATION
252 |
253 |
254 | -indirect-startup
255 | Run the solver using the run directory, output file, and ccl specified by
256 | the -indirect-startup-path
257 |
258 |
259 | -job
260 | Keep job file after an ANSYS CFX Solver run. This file contains a
261 | brief summary of various solution values, and is most useful for
262 | regression purposes.
263 |
264 |
265 | -job-part
266 | -jobp
267 | Keep job file after an ANSYS CFX Partitioner run. This file contains a
268 | brief summary of various solution values, and is most useful for
269 | regression purposes.
270 |
271 |
272 | -lpf
273 | Specify a license preference file.
274 |
275 |
276 | -norun
277 | Preprocess the Solver Input File only;
278 | Do not run the ANSYS CFX solver executable.
279 | When used with a multi-configuration Solver Input File, this option will
280 | produce complete Solver Input Files for the individual configurations.
281 | When used with the "-config" option, only the specified configuration
282 | is preprocessed.
283 |
284 |
285 | -preferred-license
286 | -P
287 | License used first by the ANSYS CFX Solver, given the availability of
288 | multiple useable licenses.
289 | This option has no effect and has been superseded by the -lpf option.
290 |
291 |
292 | -respect-suffix-history
293 | Consider results files referenced by initial values files
294 | when choosing the numerical suffix (e.g. _001) added to the run name.
295 |
296 |
297 | -save
298 | Do not delete any temporary files after the run. Normally the standard
299 | temporary files created by the ANSYS CFX Solver are deleted
300 | automatically after each run.
301 |
302 |
303 | -------------------------------------------------------------------------
304 | EXECUTION
305 | -------------------------------------------------------------------------
306 |
307 |
308 | -priority
309 | -pri
310 | Set the run priority for the ANSYS CFX Solver. should
311 | be one of:
312 | CFX Levels Nice increment Windows Priority
313 | Idle (0) 19 Low
314 | Low (1) 7 BelowNormal
315 | Standard (2) 0 Normal
316 | High (3) 0 AboveNormal
317 | This applies to all processes in a parallel run. A numeric
318 | setting is also accepted as shown in the CFX column. The default
319 | CFX setting is Standard (2), corresponding to a nice increment
320 | of 0 on UNIX platforms and a priority level of Normal on Windows.
321 |
322 |
323 | -size
324 | -S
325 | -s
326 | Change the memory estimates used by the ANSYS CFX Solver by a
327 | factor of . By default the memory estimates contained
328 | in the Solver Input File are used. Sometimes these are
329 | inaccurate and this option needs to be used to increase the
330 | memory allocated.
331 |
332 |
333 | -size-cat
334 | -size-nr
335 | -size-ni
336 | -size-nd
337 | -size-nc
338 | -size-nl
339 | -scat
340 | -nr
341 | -ni
342 | -nd
343 | -nc
344 | -nl
345 | These options are for advanced users to change the memory allocation
346 | parameters for the ANSYS CFX Solver. Usually, you should use
347 | the -size option instead. is the desired memory
348 | allocation in words, and may have K or M appended for kilo- or
349 | mega-. If the suffix is 'x', the number is treated as a
350 | multiplier.
351 |
352 |
353 | -size-mms
354 | -smms
355 | Change the initial MMS catalogue size estimate used by the
356 | ANSYS CFX Solver by a factor of . This option has been
357 | deprecated and should be replaced by -size-cat.
358 |
359 |
360 | -size-part-mms
361 | -smmspar
362 | Change the initial MMS catalogue size estimate used by the
363 | ANSYS CFX Partitioner by a factor of . This option has
364 | been deprecated and should be replaced by -size-part-cat.
365 |
366 |
367 | -size-cclsetup
368 | -sizeccl
369 | Change the memory estimates used by the ANSYS CFX cclsetup executable
370 | by a factor of .
371 |
372 |
373 | -size-cclsetup-cat
374 | -size-cclsetup-nr
375 | -size-cclsetup-ni
376 | -size-cclsetup-nd
377 | -size-cclsetup-nc
378 | -size-cclsetup-nl
379 | -scatccl
380 | -nrccl
381 | -niccl
382 | -ndccl
383 | -ncccl
384 | -nlccl
385 | These options are the same as the -size-* options above, but give the
386 | sizes needed for the ANSYS CFX CCL Setup executable.
387 |
388 |
389 | -size-interp
390 | -sizeint
391 | Change the memory estimates used by the ANSYS CFX Interpolator by a
392 | factor of .
393 |
394 |
395 | -size-interp-cat
396 | -size-interp-nr
397 | -size-interp-ni
398 | -size-interp-nd
399 | -size-interp-nc
400 | -size-interp-nl
401 | -scatint
402 | -nrint
403 | -niint
404 | -ndint
405 | -ncint
406 | -nlint
407 | These options are the same as the -size-* options above, but give the
408 | sizes needed for the ANSYS CFX Interpolator.
409 |
410 |
411 | -size-part
412 | -sizepart
413 | -size-par
414 | -sizepar
415 | Change the memory estimates used by the ANSYS CFX Partitioner by a
416 | factor of .
417 |
418 |
419 | -size-part-cat
420 | -size-part-nr
421 | -size-part-ni
422 | -size-part-nd
423 | -size-part-nc
424 | -size-part-nl
425 | -scatpar
426 | -nrpar
427 | -nipar
428 | -ndpar
429 | -ncpar
430 | -nlpar
431 | These options are the same as the -size-* options above, but give the
432 | sizes needed for the ANSYS CFX Partitioner.
433 |
434 |
435 | -size-maximal []
436 | Use a 'maximal' memory estimate.
437 | The fraction of system memory may be optionally specified.
438 | If not specified, a default system memory fraction is used.
439 |
440 |
441 |
442 |
443 | -size-maximal-part []
444 | This option is similar to the -size-maximal option above,
445 | but uses a 'maximal' memory estimate for the partitioner.
446 |
447 |
448 |
449 |
450 | -numa
451 | Set the option for NUMA memory containment
452 | Valid options are:
453 | none (NUMA containment disabled)
454 | auto (NUMA containment enabled)
455 |
456 |
457 | -affinity
458 | Set the option for process affinity control
459 | implicit (affinity not set by the solver)
460 | explicit (affinity explicitly set by the solver)
461 |
462 |
463 | -thread-count-interp
464 | Set the maximum number of threads used by the interpolator.
465 |
466 |
467 |
468 |
469 | -thread-hwcap-interp
470 | Set the number of threads used by the interpolator
471 | as a fraction of those available on the current hardware.
472 |
473 |
474 |
475 |
476 | -------------------------------------------------------------------------
477 | PARALLEL AND PARTITIONING
478 | -------------------------------------------------------------------------
479 |
480 |
481 | -op-concurrency []
482 | Run operating point jobs concurrently.
483 | The maximum number of concurrent jobs may be optionally specified.
484 | If not specified, the number of concurrent jobs is unlimited.
485 |
486 |
487 | -parallel
488 | -par
489 | Run the ANSYS CFX Solver in parallel mode. This option can be combined
490 | with the -part (-partition) option for a partitioning run. If
491 | the -part switch is not given, the -parfile-read switch must be used
492 | to specify a valid partitioning information file.
493 |
494 |
495 | -parfile-read
496 | Set the name of an input partition file, used to set up
497 | a partitioning or parallel run.
498 |
499 |
500 | -parfile-save
501 | When used with a parallel run, save the partitioning
502 | information to a file with the same basename as the results
503 | file, and the extension .par.
504 |
505 |
506 | -parfile-write
507 | Give the name of a partition file to write containing the
508 | information from a partitioning run.
509 |
510 |
511 | -partition
512 | -part
513 | Run the ANSYS CFX Solver in partitioning mode. This option should not
514 | be used if an existing partition file is also specified.
515 |
516 |
517 | -part-only
518 | Run the ANSYS CFX Solver in partitioning mode only, but do not run
519 | the solver.
520 |
521 |
522 | -part-coupled
523 | Activate coupled partitioning mode for multidomain problems.
524 | This is the default.
525 |
526 |
527 | -part-independent
528 | Activate independent partitioning mode for multidomain problems.
529 | This is not activated by default.
530 |
531 |
532 | -part-mode
533 | Set the partitioning mode to use when running the partitioner.
534 | Valid options are:
535 | metis-kway (MeTiS k-way)
536 | metis-rec (MeTiS Recursive Bisection)
537 | simple (Simple Assignment)
538 | drcb (Directional Recursive Coordinate Bisection)
539 | orcb (Optimized Recursive Coordinate Bisection)
540 | rcb (Recursive Coordinate Bisection)
541 | Finer control over the partitioning method is available through
542 | the Command Language.
543 |
544 |
545 | -part-remap
546 | Remap parallel processes to maximize intra-host communication
547 | and minimize inter-host communication.
548 |
549 |
550 |
551 |
552 | -par-dist <host list>
553 | Set the comma-separated <host list> in the same form as is
554 | used in the Command Language definition. This option does not
555 | require the -partition switch, as one partition is run on each
556 | host mentioned in the list. To run multiple partitions on the
557 | same host it may be listed multiple times, or an asterisk may
558 | be used with the count, as in "wallaby*3,kangaroo*4" for a
559 | 7-partition run.
560 |
561 |
562 | Host details are taken from the hostinfo.ccl file, if they are
563 | there; otherwise, if possible, the required information will be
564 | automatically detected. <host> may be specified as
565 | [<user>@]<hostname>[:<cfxroot>] if the user name or the
566 | ANSYS CFX installation root directory differ from the local host.
567 |
568 |
569 | -par-host-list <host1>[,<host2>[,...]]
570 | When running in parallel, use the given host list. See the
571 | -par-dist switch for details of the host list.
572 |
573 |
574 | -par-local
575 | When running in parallel, use only the local host. This will
576 | override the -par-dist or -par-host-list switches.
577 |
578 |
579 | -serial
580 | Explicitly specify that a serial run is required. Normally
581 | this is the default, but when restarting from a results file
582 | from a parallel run, the new run will also be parallel by
583 | default, and this switch can be used to override it.
584 |
585 |
586 | -start-method <name>
587 | Use the named start method to start the ANSYS CFX Solver. This option
588 | allows you to use different parallel methods, as listed in the
589 | ANSYS CFX Solver Manager GUI or in the etc/start-methods.ccl file,
590 | instead of the defaults. For parallel start methods, you must also
591 | provide the -part or -par-dist option.
592 |
593 |
594 | -------------------------------------------------------------------------
595 | EXECUTABLE SELECTION
596 | -------------------------------------------------------------------------
597 |
598 |
599 | Single precision is the default for the ANSYS CFX Partitioner, Interpolator
600 | and Solver. The -double and -single options set the default precision for
601 | all stages but may be overridden for each stage. E.g. to use the double
602 | precision solver and single precision interpolator use:
603 | -double -interp-single
604 |
605 |
606 | These options may also be used to override settings in a CFX Command
607 | Language file from previous runs. E.g. if the previous run used a double
608 | precision solver, -solver-single would force the single precision solver
609 | to be used. Note: -single/-double will not override existing settings that
610 | have been defined for each step.
611 |
612 |
613 | -ccl2flow <executable>
614 | Run <executable> instead of the standard ANSYS CFX ccl2flow.
615 |
616 |
617 | -ccl2flow-double
618 | Use the double precision ANSYS CFX ccl2flow executable.
619 |
620 |
621 | -ccl2flow-single
622 | Use the single precision ANSYS CFX ccl2flow executable.
623 |
624 |
625 | -cclsetup <executable>
626 | Run <executable> instead of the standard ANSYS CFX cclsetup.
627 |
628 |
629 | -cclsetup-double
630 | Use the double precision ANSYS CFX cclsetup executable.
631 |
632 |
633 | -cclsetup-single
634 | Use the single precision ANSYS CFX cclsetup executable.
635 |
636 |
637 | -double
638 | Default to the double-precision version of the ANSYS CFX Partitioner,
639 | Interpolator and Solver.
640 |
641 |
642 | -large
643 | Default to the large problem version of the ANSYS CFX Partitioner,
644 | Interpolator and Solver.
645 |
646 |
647 | -interpolator <executable>
648 | Run <executable> instead of the standard ANSYS CFX Interpolator.
649 |
650 |
651 | -interp-double
652 | Use the double precision ANSYS CFX Interpolator.
653 |
654 |
655 | -interp-single
656 | Use the single precision ANSYS CFX Interpolator.
657 |
658 |
659 | -interp-large
660 | Run the large problem interpolator for problems too large for
661 | the default executable. This interpolator uses 64 bit integer
662 | and logical variables so it will allocate more memory than the
663 | default interpolator executable.
664 |
665 |
666 | -partitioner <executable>
667 | Run <executable> instead of the standard ANSYS CFX Partitioner.
668 |
669 |
670 | -part-double
671 | Use the double precision ANSYS CFX Partitioner.
672 |
673 |
674 | -part-single
675 | Use the single precision ANSYS CFX Partitioner.
676 |
677 |
678 | -part-large
679 | Run the large problem partitioner which can partition problems
680 | up to 2^32-1 elements. This partitioner uses 64 bit integer
681 | and logical variables so it will allocate more memory than the
682 | default partitioning executable.
683 |
684 |
685 | -single
686 | Default to the single precision version of the ANSYS CFX Partitioner,
687 | Interpolator and Solver.
688 |
689 |
690 | -solver <executable>[,<os>=<executable>[,...]]
691 | -exec <executable>[,<os>=<executable>[,...]]
692 | Run <executable> instead of the standard ANSYS CFX Solver on <os>.
693 | If <os> is omitted the current os is assumed.
694 | E.g. -solver "linux-amd64/mysolver.exe,linux=linux/mysolver.exe"
695 |
696 |
697 | -solver-double
698 | Use the double precision ANSYS CFX Solver.
699 |
700 |
701 | -solver-single
702 | Use the single precision ANSYS CFX Solver.
703 |
704 |
705 | -solver-large
706 | Run the large problem solver for problems too large for
707 | the default executable. This executable uses 64 bit integer
708 | and logical variables so it will allocate more memory than the
709 | default solver executable.
710 |
711 |
712 | -------------------------------------------------------------------------
713 | ANSYS SYSTEM COUPLING
714 | -------------------------------------------------------------------------
715 |
716 |
717 | -scport <port>
718 | For a coupled CFX Solver/System Coupling run, specify the port
719 | number for the ANSYS CFX Solver to establish a connection to.
720 |
721 |
722 | -schost <hostname>
723 | For a coupled CFX Solver/System Coupling run, specify the host
724 | name for the ANSYS CFX Solver to establish a connection to.
725 |
726 |
727 | -scname <name>
728 | For a coupled CFX Solver/System Coupling run, specify the name
729 | that System Coupling has assigned to identify the CFX Solver
730 | participant.
731 |
732 |
733 | -------------------------------------------------------------------------
734 | ANSYS MULTIFIELD AND PROCESS COUPLING
735 | -------------------------------------------------------------------------
736 |
737 |
738 | -ansys-arguments <arguments>
739 | For an ANSYS Multi-field run, set any additional options for
740 | the ANSYS Solver.
741 |
742 |
743 | -ansys-input <file>
744 | For an ANSYS Multi-field run, set the ANSYS input file to use.
745 |
746 |
747 | -ansys-input-is-complete
748 | For an ANSYS Multi-field run, treat the ANSYS input file as being
749 | complete. Do not pre-process using CCL2MF.
750 |
751 |
752 | -ansys-installation <directory>
753 | For an ANSYS Multi-field run, set the ansys installation
754 | directory if not installed in a standard location.
755 |
756 |
757 | -ansys-jobname <jobname>
758 | For an ANSYS Multi-field run, set the jobname to use. Default is
759 | ANSYS. For restarts, the jobname must be the same as the initial run.
760 |
761 |
762 | -ansys-license <license>
763 | For an ANSYS Multi-field run, set the license that the ANSYS
764 | Solver should use.
765 |
766 |
767 | -ansys-restart <file>
768 | For an ANSYS Multi-field run, set a restart database for the ANSYS
769 | Solver to use.
770 |
771 |
772 | -cplg-host <port>@<hostname>
773 | For a coupled solver run/ANSYS Multi-field run, specify the port number
774 | and hostname for the ANSYS CFX Solver to establish a connection to.
775 |
776 |
777 | -mfx-run-mode <mode>
778 | For an ANSYS Multi-field run, specify the run mode. Valid modes are:
779 | "Start ANSYS and CFX"
780 | "Start ANSYS only"
781 | "Start CFX only"
782 | "Process Input File only"
783 | ```
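784 |
785 | As a quick reference, a typical parallel launch combining the options above might look like the sketch below. The definition file and host names are illustrative placeholders; see `CFX.sbatch` in this directory for a complete Slurm example.
786 |
787 | ```
788 | # Hypothetical example: 2 hosts, 4 partitions each (8 partitions total), double precision
789 | cfx5solve -batch -def my_model.def \
790 |   -parallel \
791 |   -start-method 'Intel MPI Distributed Parallel' \
792 |   -par-dist "host-a*4,host-b*4" \
793 |   -double
794 | ```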
--------------------------------------------------------------------------------
/apps/CFX/CFX.sbatch:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #SBATCH --exclusive
3 | #SBATCH --nodes=4
4 | #SBATCH --ntasks=192
5 | #SBATCH --constraint=hpc7a.24xlarge
6 | #SBATCH --partition=hpc7a
7 | #SBATCH --chdir=/fsx/slurm-out/cfx
8 |
9 | ######################## EFA settings ########################
10 | #export I_MPI_DEBUG=5
11 | export I_MPI_FABRICS=shm:ofi      # shared memory within a node, libfabric (OFI) across nodes
12 | export I_MPI_OFI_PROVIDER=efa     # use the EFA libfabric provider
13 | export I_MPI_MULTIRAIL=1          # allow Intel MPI to use multiple network rails
14 | export I_MPI_ADJUST_REDUCE=2      # pin specific collective algorithms
15 | export I_MPI_ADJUST_ALLREDUCE=2
16 | export I_MPI_ADJUST_BCAST=1
17 | ######################## EFA settings ########################
18 |
19 | cfxversion=${cfxversion:-"v242"}
20 | def_file=${def_file:-"/fsx/CFX/Dataset/perf_Airfoil_100M_R16.def"}
21 | basedir=${basedir:-"/fsx"}
22 |
23 | CFX_DIR="/fsx/ansys_inc/${cfxversion}/CFX/bin"
24 |
25 | export ANSYSLMD_LICENSE_FILE=${ANSYSLMD_LICENSE_FILE:-"1055@172.31.89.151"}
26 |
27 | TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
28 | workdir="$(readlink -m "${basedir}/${SLURM_JOB_NAME%.*}")/Run/${SLURM_JOB_ID}-${SLURM_JOB_NUM_NODES}x$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type)-${SLURM_NPROCS}-$(date "+%d-%m-%Y-%H-%M")"
29 |
30 | echo "Execution directory is: ${workdir}"
31 | mkdir -p "${workdir}" && cd "${workdir}"
32 |
33 | echo "building the MPI hostfile"
34 | HOST_LIST=$(scontrol show hostname $SLURM_NODELIST)
35 |
36 | cp $0 .
37 | ln -s "${def_file}" .
38 |
39 | "${CFX_DIR}/cfx5solve" -batch -def $def_file -parallel -start-method 'Intel MPI Distributed Parallel' -par-dist "$HOST_LIST" -part $SLURM_NPROCS -part-large
--------------------------------------------------------------------------------
/apps/CFX/Readme.md:
--------------------------------------------------------------------------------
1 | # ANSYS CFX
2 |
3 | Ansys [CFX](https://www.ansys.com/products/fluids/ansys-cfx) is a CFD software for turbomachinery applications. It offers streamlined workflows, advanced physics modeling capabilities, and accurate results.
4 |
5 | # Versions
6 |
7 | This repository provides best practices for CFX versions 2023 and newer.
8 |
9 | **_NOTE:_** We will provide best practices for AWS Graviton instances as soon as CFX officially supports Arm-based CPUs.
10 |
11 |
12 | # Installation
13 |
14 | CFX is supported on both Windows and Linux machines.
15 | In this repository we share an example script to install CFX on a Linux system.
16 | CFX installation is relatively easy, as it is part of the ANSYS `FLUIDSTRUCTURES` package. You can have a look at [this example script](https://github.com/aws-samples/hpc-applications/blob/main/apps/Fluent/Fluent-Install.sh) to create your own installation procedure, or you can execute the script as follows:
17 |
18 | ```
19 | ./Fluent-Install.sh /fsx s3://your_bucket/FLUIDSTRUCTURES_2024R2_LINX64.tgz
20 | ```
21 |
22 | * This is a working example installation script that runs unattended.
23 | * The first parameter is the base directory where you want to install CFX. If you pass `/fsx`, CFX will be installed under `/fsx/ansys_inc`.
24 | * The second parameter is the [S3](https://aws.amazon.com/pm/serv-s3/) URI pointing to the installation package (tar.gz).
25 |
26 |
27 |
28 | To run CFX on multiple nodes, it must be installed in a shared directory, ideally on a parallel file system.
29 | We strongly recommend [Amazon FSx for Lustre](https://aws.amazon.com/fsx/lustre/); more info is available in the official [documentation](https://docs.aws.amazon.com/fsx/latest/LustreGuide/what-is.html).
30 |
31 | # Key settings & tips (performance related)
32 |
33 | * CFX is a compute- and memory-bandwidth-bound code.
34 | * The best instance types for running it are those with a higher number of cores and higher memory bandwidth per core.
35 | * As of today, the instance that shows the **best price/performance** is the [Hpc7a](https://aws.amazon.com/ec2/instance-types/hpc7a/).
36 | * CFX scales across multiple nodes: the simulation time decreases as the number of cores increases (typically not proportionally).
37 |
38 | * `-parallel` This parameter tells CFX to run in parallel on multiple nodes.
39 | * `-start-method 'Intel MPI Distributed Parallel'` This parameter specifies the MPI implementation. At the moment, Intel MPI is the MPI library that offers the best performance on AWS.
40 | * `-par-dist "$HOST_LIST"` This parameter specifies the hosts where the simulation runs, as a comma-separated list.
41 | * `-part $SLURM_NPROCS` This parameter specifies the number of partitions (here, one per core) used to run the simulation.
42 | * `-part-large` This parameter selects the large-problem partitioner, needed for very large models; see the example at the end of this page.
43 |
44 | # Performance
45 |
46 | TBC
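47 |
48 | # Example
49 |
50 | As a reference, here is a minimal sketch of a `cfx5solve` command combining the settings above. The installation path, version, definition file, and host list are placeholders; adapt them to your cluster (see `CFX.sbatch` in this directory for a complete Slurm example).
51 |
52 | ```
53 | # Hypothetical paths and hosts, shown for illustration only
54 | CFX_DIR="/fsx/ansys_inc/v242/CFX/bin"
55 | HOST_LIST="node-1,node-2,node-3,node-4"   # comma-separated, one entry per host
56 |
57 | "${CFX_DIR}/cfx5solve" -batch -def /fsx/CFX/my_model.def \
58 |   -parallel \
59 |   -start-method 'Intel MPI Distributed Parallel' \
60 |   -par-dist "$HOST_LIST" \
61 |   -part 192 \
62 |   -part-large
63 | ```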
--------------------------------------------------------------------------------
/apps/Fluent/Arm/Fluent-benchmark.sbatch:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #SBATCH --exclusive
3 | #SBATCH --nodes=4
4 | #SBATCH --ntasks=256
5 | #SBATCH --constraint=hpc7g.16xlarge
6 | #SBATCH --partition=hpc7g
7 | #SBATCH --time=12:00:00
8 |
9 | ######################## EFA settings ########################
10 | module load openmpi
11 | module load libfabric-aws
12 | #export FI_LOG_LEVEL=warn
13 | #export OMPI_MCA_mtl_ofi_verbose=100
14 | export OPENMPI_ROOT="$(dirname $(dirname $(which mpirun)))"
15 | ######################## EFA settings ########################
16 |
17 | fluentversion=${1:-"v251"}
18 | benchmark_uri=${2:-"s3://YOUR_BUCKET/f1_racecar_140m.tar"}
19 | basedir=${BASE_DIR:-"/fsx"}
20 | export ANSYSLMD_LICENSE_FILE=${ANSYSLMD_LICENSE_FILE:-"1055@XXX.YYY.ZZZ.XYZ"}
21 |
22 | benchmark_file=$(basename ${benchmark_uri})
23 | APP_BIN_PATH="/fsx/ansys_inc/${fluentversion}/fluent/bin/fluentbench.pl"
24 | export FLUENT_ARCH=lnarm64
25 |
26 | TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
27 | workdir="$(readlink -m "${basedir}/${SLURM_JOB_NAME%.*}")/${benchmark_file%.*}/${SLURM_JOB_ID}-${SLURM_JOB_NUM_NODES}x$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type)-${SLURM_NPROCS}-$(date "+%d-%m-%Y-%H-%M")"
28 |
29 | echo "Execution directory is: ${workdir}"
30 | mkdir -p "${workdir}" && cd "${workdir}"
31 |
32 | echo "building the MPI hostfile"
33 | scontrol show nodes $SLURM_NODELIST | grep NodeHostName= | awk '{print $2}' | sed 's/NodeHostName=//' >hostfile
34 |
35 | cp $0 .
36 | aws s3 cp --quiet "${benchmark_uri}" .
37 | tar xf "${benchmark_file}"
38 | ln -s bench/fluent/v6/${benchmark_file%.*}/cas_dat/* .
39 |
40 |
41 | echo "Drop caches on all nodes"
42 | mpirun -n $SLURM_JOB_NUM_NODES --map-by ppr:1:node /bin/bash -c "sync && echo 3 | sudo tee /proc/sys/vm/drop_caches"
43 |
44 | echo "Enabling Transparent Huge Pages (THP)"
45 | mpirun -n $SLURM_JOB_NUM_NODES --map-by ppr:1:node /bin/bash -c "echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled"
46 |
47 | echo "Install missing libraries"
48 | mpirun -n $SLURM_JOB_NUM_NODES --map-by ppr:1:node /bin/bash -c "sudo yum -y install libXtst libXtst-devel libvorbis libvorbis-devel"
49 |
50 | "${APP_BIN_PATH}" ${benchmark_file%.*} -t${SLURM_NPROCS} -cnf=hostfile -part=4 -nosyslog -noloadchk -ssh -p=eth.efa -mpi=openmpi
--------------------------------------------------------------------------------
/apps/Fluent/Arm/Fluent.sbatch:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #SBATCH --exclusive
3 | #SBATCH -t 24:00:00
4 | #SBATCH --nodes=4
5 | #SBATCH --ntasks=256
6 | #SBATCH --constraint=hpc7g.16xlarge
7 | #SBATCH --partition=hpc7g
8 |
9 | ######################## EFA settings ########################
10 | module load openmpi
11 | module load libfabric-aws
12 | #export FI_LOG_LEVEL=warn
13 | #export OMPI_MCA_mtl_ofi_verbose=100
14 | export OPENMPI_ROOT="$(dirname $(dirname $(which mpirun)))"
15 | ######################## EFA settings ########################
16 |
17 | fluentversion=${1:-"v251"}
18 | journal_file=${2:-"/fsx/MY_FLUENT_EXAMPLE/MY_JOURNAL_FILE.jou"}
19 | dataset_cas=${3:-"/fsx/MY_FLUENT_EXAMPLE/MY_CAS_FILE.cas.gz"}
20 | dataset_dat=${4:-"/fsx/MY_FLUENT_EXAMPLE/MY_DAT_FILE.dat.gz"}
21 | basedir=${BASE_DIR:-"/fsx"}
22 | fluent_mode=${FLUENT_MODE:-"3d"}
23 |
24 | export ANSYSLMD_LICENSE_FILE=${ANSYSLMD_LICENSE_FILE:-"1055@XXX.YYY.ZZZ.XYZ"}
25 |
26 | APP_BIN_PATH="/fsx/ansys_inc/${fluentversion}/fluent/bin/fluent"
27 | export FLUENT_ARCH=lnarm64
28 |
29 | TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
30 | workdir="$(readlink -m "${basedir}/${SLURM_JOB_NAME%.*}")/$(basename "${journal_file%.*}")/Run/${SLURM_JOB_ID}-${SLURM_JOB_NUM_NODES}x$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type)-${SLURM_NPROCS}-$(date "+%d-%m-%Y-%H-%M")"
31 |
32 | echo "Execution directory is: ${workdir}"
33 | mkdir -p "${workdir}" && cd "${workdir}"
34 |
35 | echo "building the MPI hostfile"
36 | scontrol show nodes $SLURM_NODELIST | grep NodeHostName= | awk '{print $2}' | sed 's/NodeHostName=//' >hostfile
37 |
38 | cp $0 .
39 | ln -s "${journal_file}" .
40 | ln -s "${dataset_cas}" .
41 | ln -s "${dataset_dat}" .
42 |
43 | echo "Drop caches on all nodes"
44 | mpirun -n $SLURM_JOB_NUM_NODES --map-by ppr:1:node /bin/bash -c "sync && echo 3 | sudo tee /proc/sys/vm/drop_caches"
45 |
46 | echo "Enabling Transparent Huge Pages (THP)"
47 | mpirun -n $SLURM_JOB_NUM_NODES --map-by ppr:1:node /bin/bash -c "echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled"
48 |
49 | "${APP_BIN_PATH}" ${fluent_mode} -g -t${SLURM_NPROCS} -cnf=hostfile -cflush -env -ssh -part=4 -p=eth.efa -mpi=openmpi -i ${journal_file} | tee output.$SLURM_JOBID.out
50 |
--------------------------------------------------------------------------------
/apps/Fluent/Fluent-Benchmark.md:
--------------------------------------------------------------------------------
1 | ```
2 | -------------------------------------------------------------
3 | This is the standard ANSYS FLUENT benchmarks suite.
4 | For permission to use or publish please contact ANSYS Inc.
5 |
6 | Running FLUENT benchmarks...
7 | Host: ip-10-42-0-23
8 | Date: Sun Feb 25 09:49:22 2024
9 | Creating benchmarks archive fluent_benchmarks.zip
10 | On successful completion, please send this file to ANSYS Inc.
11 | -------------------------------------------------------------
12 |
13 | ANSYS FLUENT problem:
14 | open_racecar_280m|f1_racecar_140m|combustor_71m|exhaust_system_33m|lm6000_16m|landing_gear_15m|aircraft_wing_14m|combustor_12m|oil_rig_7m|sedan_4m|rotor_3m|aircraft_wing_2m|fluidized_bed_2m|ice_2m|pump_2m
15 |
16 | options:
17 | -t<n> Specify number of processors (parallel).
18 | -gpgpu Specify number of GPGPUs per machine (parallel).
19 | -out Post-process transcript files.
20 | -res Post-process output files.
21 | -p Specify communicator (parallel).
22 | -casdat=<dir> Specify directory which contains case and data files.
23 | -cnf=<hostfile> Specify hostfile (parallel).
24 | -path<x> Specify root path <x> to Fluent.Inc
25 | -iter=<n> Specify iteration count.
26 | -flexible-cycle Use flexible cycle
27 | -io Write case and data files at end of benchmark.
28 | -hdfio=<mode> Write case and data files using HDF5 io at end of benchmark. <mode>: 1=controller (default), 2=node0, 3=independent, 4=collective.
29 | -pio Write data file using parallel io at end of benchmark.
30 | -norm Do not remove case and data files at end of benchmark.
31 | -zip Compress the case and dat files during IO measurements.
32 | -hdfcompr=<level> Specify HDF5 compression level. <level>: [0-9]. 0 (default) signifies no compression. 1 is fastest compression. 9 is slowest.
33 | -part=<method> Partition method: 1=metis, 2=metis-zone, 3=force-auto (default), 4=auto.
34 | -pre=<n> Specify number of pre-iterations.
35 | -prod=<product> Specify product.
36 | -pthreads=<n> Specify number of pthreads for DPM model.
37 | -ri=<n> Specify reporting interval (default=5).
38 | -server Include only server runs in results file.
39 | -solver=<solver> Specify the solver seg|segregated or cpld|coupled to be used.
40 | -init Initialize solution, do not read data file.
41 | -time=<n> Specify number of timesteps.
42 | -ver=<version> Fluent version 3d or 2d or 2ddp ....
43 | -nosyslog Do not collect benchmarking system information
44 | -noloadchk Do not check the system load on benchmarking system
45 | -help Print this message.
46 | -quiet No chatter.
47 | -verbose More chatter.
48 | -ssh Use ssh instead of rsh.
49 | ```
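50 |
51 | As a reference, a typical invocation of the benchmark driver looks like the sketch below. The install path and hostfile are placeholders; see the sbatch scripts in the `Arm/` and `x86/` subdirectories for complete Slurm examples.
52 |
53 | ```
54 | # Hypothetical example: f1_racecar_140m on 256 ranks over EFA with Open MPI
55 | /fsx/ansys_inc/v242/fluent/bin/fluentbench.pl f1_racecar_140m \
56 |   -t256 -cnf=hostfile -part=4 -nosyslog -noloadchk -ssh -p=eth.efa -mpi=openmpi
57 | ```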
--------------------------------------------------------------------------------
/apps/Fluent/Fluent-Install.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | ##how to use it: ./Fluent-Install.sh /fsx s3://your_bucket/FLUIDSTRUCTURES_2022R1_LINX64.tgz
4 |
5 | root_dir=${1:-"/fsx"}
6 | fluent_s3_path=${2:-"s3://your_bucket/Ansys/FLUIDSTRUCTURES_2023R1_LINX64.tgz"}
7 |
8 | #check the installation directory
9 | if [ ! -d "${root_dir}" -o -z "${root_dir}" ]; then
10 | echo "Error: please check the install dir"
11 | exit 1
12 | fi
13 |
14 | #check s3 path
15 | if [ -z "${fluent_s3_path}" ]; then
16 | echo "Error: please check the s3 path"
17 | exit 1
18 | fi
19 |
20 | check=$(aws s3 ls "${fluent_s3_path}")
21 | if [ -z "${check}" ]; then
22 | echo "Error: please check your file on S3"
23 | exit 1
24 | fi
25 |
26 | tmpInstallDir="${root_dir}/ansys_tmp"
27 | ansysDir="${root_dir}/ansys_inc"
28 |
29 | mkdir -p "${tmpInstallDir}"
30 | cd "${tmpInstallDir}"
31 |
32 | echo "Downloading file from ${fluent_s3_path} ..."
33 | aws s3 cp "${fluent_s3_path}" .
34 |
35 | tar -xzvf *.tgz
36 |
37 | # INSTALL Ansys WB
38 | echo "Installing Ansys"
39 | "./INSTALL" -silent -install_dir "${ansysDir}"
40 |
41 | echo "Ansys installed"
42 |
43 | rm -rf ${tmpInstallDir}
44 |
45 | echo "Installation process completed!"
46 |
--------------------------------------------------------------------------------
/apps/Fluent/Fluent.md:
--------------------------------------------------------------------------------
1 | ```
2 | Usage: fluent [version] [-help] [options]
3 | options:
4 | -aas start Fluent in server mode,
5 | -act load ACT Start page,
6 | -affinity=<x> set processor affinity; <x>={core | sock | off},
7 | -app=flremote launches the Remote Visualization Client,
8 | -appscript=<scriptname>
9 | run the specified script in App,
10 | -case <casefile> [-data]
11 | reads the case file immediately after Fluent
12 | launches; can include the data file if it
13 | shares the same name as the case file,
14 | -cflush flush the file cache buffer,
15 | -cnf=<hostfile> specify the hosts file,
16 | -command="<TUI command>"
17 | run TUI command on startup,
18 | -driver <name> sets the graphics driver;
19 | <name>={opengl | opengl2 | x11 | null},
20 | -env show environment variables,
21 | -g run without GUI or graphics,
22 | -gpgpu=<n> specify number of GPGPUs per machine,
23 | -gpu[=<list>] run with GPU Solver, and specify devices to
24 | use as needed (where <list> is a comma-separated
25 | list of devices),
26 | -gr run without graphics,
27 | -gu run without GUI,
28 | -gui_machine=<hostname>
29 | specify the machine to be used for running
30 | graphics-related process,
31 | -h specify heap size for Cortex,
32 | -help this listing,
33 | -hidden fluent window is created but hidden,
34 | -host_ip=<ip>
35 | specify the ip interface to be used by the
36 | host process,
37 | -i <journal> read the specified journal file,
38 | -license=<x> specify the license capability;
39 | <x>={enterprise | premium},
40 | -meshing run Fluent in meshing mode,
41 | -mpi=<mpi> specify MPI implementation;
42 | <mpi>={openmpi | intel | ...},
43 | -mpitest run the mpitest program instead of Fluent
44 | to test the network,
45 | -nm don't display mesh after reading,
46 | -pcheck check the network connections before spawning
47 | compute node processes,
48 | -platform=intel use AVX2 optimized binary;
49 | This option is for processors that can
50 | support AVX2 instruction set,
51 | -post run a post-processing-only executable,
52 | -prepost run a pre/post-processing-only executable,
53 | -p specify interconnect;
54 |