├── .gitignore
├── README.md
├── arm_conda.sh
├── batch_run.sh
├── data.config
├── hpcbench
├── hpcbench.py
├── init.sh
├── package
│   ├── common
│   │   ├── check_deps.sh
│   │   ├── check_root.sh
│   │   └── download.sh
│   ├── ior
│   │   └── master
│   │       └── install.sh
│   ├── openblas
│   │   └── 0.3.18
│   │       └── install.sh
│   └── osu
│       └── 7.0.1
│           └── install.sh
├── requirements.yaml
├── result
│   ├── test_result.json
│   └── test_score.json
├── run.AI
├── run.balance
├── run.compute
├── run.network
├── run.storage
├── run.system
├── setting.py
├── templates
│   ├── AI
│   │   ├── maskrcnn.aarch64.config
│   │   ├── maskrcnn.x86_64.config
│   │   ├── resnet.aarch64.config
│   │   └── resnet.x86_64.config
│   ├── balance
│   │   ├── balance.linux64.config
│   │   └── stream
│   │       └── main
│   │           └── stream.linux64.config
│   ├── compute
│   │   ├── hpcg.aarch64.config
│   │   ├── hpcg.x86_64.config
│   │   ├── hpl.aarch64.config
│   │   └── hpl.x86_64.config
│   ├── network
│   │   ├── osu.aarch64.config
│   │   └── osu.x86_64.config
│   ├── storage
│   │   ├── ior.aarch64.config
│   │   ├── ior.x86_64.config
│   │   └── protocol
│   │       ├── hadoop.aarch64.config
│   │       ├── hadoop.x86_64.config
│   │       ├── nfs.aarch64.config
│   │       ├── nfs.x86_64.config
│   │       ├── nfs_environment.md
│   │       ├── nfs_environment.sh
│   │       ├── posix.aarch64.config
│   │       ├── posix.x86_64.config
│   │       ├── warp.aarch64.config
│   │       └── warp.x86_64.config
│   └── system
│       └── system.linux64.config
└── utils
    ├── app.py
    ├── build.py
    ├── config.py
    ├── download.py
    ├── execute.py
    ├── install.py
    ├── invoke.py
    ├── machine.py
    ├── report_tmp.html
    ├── result.py
    ├── scheduler.py
    ├── score.py
    ├── standard_score.json
    └── tool.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # Compiled source #
2 | ###################
3 | *.a
4 | *.com
5 | *.class
6 | *.dll
7 | *.exe
8 | *.o
9 | *.o.d
10 | *.py[ocd]
11 | *.so
12 |
13 | # Packages #
14 | ############
15 | # it's better to unpack these files and commit the raw source
16 | # git has its own built in compression methods
17 | *.7z
18 | *.bz2
19 | *.bzip2
20 | *.dmg
21 | *.gz
22 | *.iso
23 | *.jar
24 | *.rar
25 | *.tar
26 | *.tbz2
27 | *.tgz
28 | *.zip
29 |
30 | # Python files #
31 | ################
32 | # setup.py working directory
33 | build
34 | # sphinx build directory
35 | _build
36 | # setup.py dist directory
37 | dist
38 | doc/build
39 | doc/cdoc/build
40 | # Egg metadata
41 | *.egg-info
42 | # The shelf plugin uses this dir
43 | ./.shelf
44 | MANIFEST
45 | .cache
46 | pip-wheel-metadata
47 | .python-version
48 |
49 | # Logs and databases #
50 | ######################
51 | *.log
52 | *.sql
53 | *.sqlite
54 | # Things specific to this project #
55 | ######################
56 | env.sh
57 | build.sh
58 | test.py
59 | hostfile
60 | .vscode
61 | tmp
62 | downloads/*
63 | depend_install.sh
64 | build.sh
65 | .meta
66 | software/*
67 | test.sh
68 | *.swp
69 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | hpcbenchmark is a benchmarking toolkit for high-performance computing clusters. It introduces performance metrics tied to real usage scenarios and, through a composite scoring method, produces a score for each key dimension of a cluster: compute, storage, network, and efficiency.
2 |
3 | The toolkit provides test modules for six evaluation dimensions:
4 |
5 | 1. Compute performance 2. AI compute performance 3. Storage performance 4. Network performance 5. System energy efficiency 6. System balance
6 |
7 | Users only need to adapt the template files to their own cluster configuration to benchmark the cluster.
8 |
9 | # Project directory structure
10 |
11 | The project folder contains the following files and directories:
12 |
13 | ```
14 | # module test directory
15 | benchmark
16 | # data download directory
17 | downloads
18 | # configuration templates directory
19 | templates
20 | # test results directory
21 | result
22 | # software installation directory
23 | software
24 | # environment initialization script
25 | init.sh
26 | # main program
27 | hpcbench
28 | ```
29 |
30 | # Test environments
31 | HPCBench has been verified in the following environments:
32 | 1. x86_64 + NVIDIA GPU
33 | OS: CentOS 8
34 | CPU: Intel Xeon Platinum 8358
35 | GPU: NVIDIA A100
36 |
37 | 2. ARM + Ascend NPU
38 | OS: openEuler 22.03
39 | CPU: Kunpeng 920
40 | NPU: Ascend 910
41 |
42 | # Using the benchmark toolkit
43 |
44 | ## Requirements
45 |
46 | Runtime: Python 3
47 | Compiler: GCC 11.2.0
48 | MPI: OpenMPI 4.1.1
49 | CUDA: CUDA 11.8
50 | Scheduler: Slurm
51 |
52 | ## Installing hpcbenchmark
53 | Clone the hpcbenchmark repository and install the required dependencies with the following commands:
54 |
55 | ```
56 | $ git clone https://github.com/SJTU-HPC/hpcbenchmarks.git
57 | $ conda env create -f requirements.yaml
58 | $ conda activate hpcbench
59 | ```
60 |
61 | ## Initializing the environment
62 | Edit the environment file `init.sh` to match your cluster: fill in the cluster information and the commands that load GCC and OpenMPI (a sketch of the fields that typically change follows the command below).
63 |
64 | Initialize the environment with:
65 |
66 | ```
67 | $ source init.sh
68 | ```
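
A minimal sketch of the entries most installations need to touch. The variable names come from the `init.sh` shipped in this repository; the module names, partition names, node count, and power figures below are placeholders for your own cluster, not defaults:

```
## User environment: load your site's toolchain
module purge
module load gcc/11.2.0 openmpi/4.1.1   # example module names

## Cluster info
export CLUSTER_NAME=mycluster
export CPU_PARTITION=cpu64c            # Slurm partition for CPU jobs
export GPU_PARTITION=gpu               # Slurm partition for GPU/NPU jobs
export TOTAL_NODES=4                   # compute nodes used by the tests
export CPU_MAX_CORES=64                # cores per compute node
export CLUSTER_POWER=500               # per-node power (W)
export STORAGE_POWER=200               # storage system power (kW)
export PARA_STORAGE_PATH=/lustre       # parallel file system mount point
```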
69 |
70 | ## Command-line options
71 | After initialization, view HPCBench's command-line options:
72 |
73 | ```
74 | $ ./hpcbench -h
75 | usage: hpcbench [-h] [--build] [--clean] [...]
76 |
77 | Run this from the project (CASE) directory; used to compile, clean, and run benchmark apps
78 |
79 | options:
80 | -h, --help show this help message and exit
81 | -v, --version get version info
82 | -use USE, --use USE Switch config file...
83 | -i, --info get machine info
84 | -l, --list get installed package info
85 | -install INSTALL [INSTALL ...], --install INSTALL [INSTALL ...]
86 | install dependency
87 | -remove REMOVE, --remove REMOVE
88 | remove software
89 | -find FIND, --find FIND
90 | find software
91 | -dp, --depend App dependency install
92 | -e, --env set environment App
93 | -b, --build compile App
94 | -cls, --clean clean App
95 | -r, --run run App
96 | -j, --job run job App
97 | -rb, --rbatch run batch App
98 | -d, --download Batch Download...
99 | -u, --update start update hpcbench...
100 | -check, --check start check hpcbench download url...
101 | -s, --score Calculate the score and output benchmark report
102 |
103 | ```
104 | ## Module test example
105 | The following walks through ``HPL`` from the ``COMPUTE`` module:
106 |
107 | The example runs on two nodes with 64 cores and 512 GB of memory per node; adjust the configuration file below to your own cluster (a sketch of the usual changes follows the path).
108 |
109 | ```
110 | templates/compute/hpl.x86_64.config
111 | ```
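
The parts that usually need editing are the Slurm resources in the `[JOB1]` section and the problem size written into `HPL.dat` by the `[BUILD]` section. A sketch matching the two-node, 64-cores-per-node example (values are illustrative; scale Ns with available memory and keep P x Q equal to the total MPI rank count):

```
[JOB1]
#SBATCH -N 2                     # nodes
#SBATCH --ntasks-per-node 64     # MPI ranks per node
#SBATCH -p {{ CPU_PARTITION }}   # rendered from CPU_PARTITION exported in init.sh

# inside the HPL.dat heredoc of [BUILD]:
176640       Ns                  # problem size
256          NBs                 # block size
8            Ps                  # P x Q = 128 ranks (2 nodes x 64 cores)
16           Qs
```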
112 |
113 | ### Select the configuration file
114 |
115 | ```
116 | $ ./hpcbench -use templates/compute/hpl.x86_64.config
117 | Switch config file to templates/compute/hpl.x86_64.config
118 | Successfully switched. config file saved in file .meta
119 | ```
120 |
121 | ### Download dependencies
122 |
123 | ```
124 | $ ./hpcbench -d
125 | ```
126 |
127 | ### Install dependency libraries
128 |
129 | ```
130 | $ ./hpcbench -dp
131 | ```
132 |
133 | ### Build HPL
134 |
135 | ```
136 | $ ./hpcbench -b
137 | ```
138 |
139 | ### Submit the job
140 |
141 | ```
142 | $ ./hpcbench -j
143 | ```
144 |
145 | ### Check the results
146 | After the run completes, HPL writes an `hpl.txt` file under `result/compute`; in this example it reports a floating-point rate of about 6303 Gflops and the residual check passes (a one-line extraction example follows the listing).
147 |
148 | ```
149 | $ tail -n30 result/compute/hpl.txt
150 | Column=000175872 Fraction=99.6% Gflops=6.307e+03
151 | Column=000176128 Fraction=99.7% Gflops=6.307e+03
152 | Column=000176384 Fraction=99.9% Gflops=6.307e+03
153 | ==============================================================
154 | T/V N NB P Q Time Gflops
155 | --------------------------------------------------------------
156 | WR00R2R4 176640 256 8 16 582.89 6.3037e+03
157 | HPL_pdgesv() start time Tue Aug 22 00:24:39 2023
158 |
159 | HPL_pdgesv() end time Tue Aug 22 00:34:21 2023
160 |
161 | --VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
162 | Max aggregated wall time rfact . . . : 3.56
163 | + Max aggregated wall time pfact . . : 2.67
164 | + Max aggregated wall time mxswp . . : 2.50
165 | Max aggregated wall time update . . : 533.48
166 | + Max aggregated wall time laswp . . : 56.77
167 | Max aggregated wall time up tr sv . : 0.26
168 | --------------------------------------------------------------
169 | ||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 8.52254435e-04 ...... PASSED
170 | ===========================================================
171 | ```
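
If you only need the headline number, the Gflops column of the `WR...` result line can be pulled out directly; a small sketch assuming the output format shown above:

```
$ grep "^WR" result/compute/hpl.txt | awk '{print $7}'
6.3037e+03
```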
172 |
173 | ## Running all modules
174 |
175 | Before testing a module, adjust its configuration files to your cluster's actual setup; each module can then be launched with a shortcut command:
176 | ```
177 | # Compute performance
178 | $ ./run.compute
179 |
180 | # Network performance
181 | $ ./run.network
182 |
183 | # Storage performance
184 | $ ./run.storage
185 |
186 | # AI compute performance
187 | $ ./run.AI
188 |
189 | # System balance
190 | $ ./run.balance
191 |
192 | # System energy efficiency
193 | $ ./run.system
194 | ```
195 |
196 | ## Generating a visual report
197 | After all modules have finished, run the following command to generate a Report.html file that can be opened in a browser.
198 |
199 | ```
200 | $ ./hpcbench -s
201 | ```
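
If the report is generated on a remote login node, one option is to serve it over HTTP and open it from your workstation (assuming Report.html is written to the current directory; the port is arbitrary):

```
$ python3 -m http.server 8000
# then browse to http://<login-node>:8000/Report.html
```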
202 |
--------------------------------------------------------------------------------
/arm_conda.sh:
--------------------------------------------------------------------------------
1 | wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py39_4.9.2-Linux-aarch64.sh
2 | sh Miniconda3-py39_4.9.2-Linux-aarch64.sh
3 |
--------------------------------------------------------------------------------
/batch_run.sh:
--------------------------------------------------------------------------------
1 | source ./init.sh
2 | ./hpcbench -e
3 | source ./env.sh
4 |
5 | cd $HPCbench_ROOT
6 |
7 | cd $RESULT_DIR
8 | exec 1>$RESULT_DIR/system.log 2>/dev/null
9 | # compute_efficiency
10 | echo "Calculating Compute_Effiency"
11 | CLUSTER_POWER=314.25 #w
12 | TOTAL_NODES=5
13 | TOTAL_CLUSTER_POWER=$(echo "scale=2; $CLUSTER_POWER*$TOTAL_NODES*0.875/1000"|bc)
14 | CLUSTER_HPL=$(python -c "from utils.result import extract_pflops;print(extract_pflops('$HPCbench_ROOT/result/compute/hpl.txt'))") #Pflops
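# Note: extract_pflops reports Pflops and TOTAL_CLUSTER_POWER is in kW, so the
# ratio below (Pflops * 1000 / kW) comes out in Tflops per kW.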
15 | COMPUTE_EFFIENCY=$(echo "scale=2;$CLUSTER_HPL*1000/$TOTAL_CLUSTER_POWER"|bc)
16 | echo COMPUTE_EFFIENCY=$COMPUTE_EFFIENCY
17 | # IO_operation_rate
18 | echo "Calculating IO_OPERATION_RATE"
19 | IOPS=`cat $HPCbench_ROOT/result/storage/ior/iops.txt |grep write |awk 'NR==2 {print $3}'`
20 | STORAGE_POWER=384
21 | STORAGE_POWER=$(echo "scale=2; $STORAGE_POWER*0.8"|bc)
22 | IO_operation_rate=$(echo "scale=2; $IOPS/$STORAGE_POWER/1000"|bc)
23 | echo "IO_operation_rate=$IO_operation_rate"
--------------------------------------------------------------------------------
/data.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | 11.11.11.11
3 |
4 | [DOWNLOAD]
5 | stream_mpi.c/2014.10.21 https://www.cs.virginia.edu/stream/FTP/Code/Versions/stream_mpi.c
6 | stream_mpi.f/2014.2.14 https://www.cs.virginia.edu/stream/FTP/Code/Versions/stream_mpi.f
7 | mysecond.c/2009.2.19 https://www.cs.virginia.edu/stream/FTP/Code/mysecond.c
8 |
9 | [DEPENDENCY]
10 | set -x
11 | set -e
12 |
13 | export CC=`which gcc`
14 | export CXX=`which g++`
15 | export FC=`which gfortran`
16 |
17 | mkdir -p ${HPCbench_TMP}/stream-1.8
18 | cd ${HPCbench_TMP}
19 | mv ${HPCbench_DOWNLOAD}/stream_mpi.c ${HPCbench_TMP}/stream-1.8
20 | mv ${HPCbench_DOWNLOAD}/stream_mpi.f ${HPCbench_TMP}/stream-1.8
21 | mv ${HPCbench_DOWNLOAD}/mysecond.c ${HPCbench_TMP}/stream-1.8
22 |
23 | [ENV]
24 | module purge
25 | module load intel-oneapi-compilers/2021.4.0
26 | module load intel-oneapi-mpi/2021.4.0
27 | export CC=mpiicc FC=mpiifort F77=mpiifort
28 |
29 | [APP]
30 | app_name = stream
31 | build_dir = ${HPCbench_TMP}/stream-1.8
32 | binary_dir = ${HPCbench_LIBS}/stream-1.8
33 | case_dir =
34 |
35 | [BUILD]
36 | mpiicc -O3 -ffreestanding -qopenmp -qopt-streaming-stores=always \
37 | -DSTREAM_ARRAY_SIZE=8650752 -DNTIMES=20 -DVERBOSE \
38 | stream_mpi.c -o stream_mpi_c
39 | icc -c mysecond.c
40 | mpiifort -c stream_mpi.f
41 | mpiifort -O3 -qopenmp -qopt-streaming-stores=always stream_mpi.o mysecond.o -o stream_mpi_f
42 | mkdir -p ${HPCbench_LIBS}/stream-1.8
43 | cp -r stream_mpi_* ${HPCbench_LIBS}/stream-1.8
44 |
45 | [RUN]
46 | run = ${HPCbench_LIBS}/stream-1.8/stream_mpi_f
47 | binary =
48 | nodes = 1
49 |
--------------------------------------------------------------------------------
/hpcbench:
--------------------------------------------------------------------------------
1 | hpcbench.py
--------------------------------------------------------------------------------
/hpcbench.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | import argparse
4 | from utils.scheduler import Scheduler
5 |
6 | parser = argparse.ArgumentParser(description='Run this from the project (CASE) directory; used to compile, clean, and run benchmark apps',
7 | usage='%(prog)s [-h] [--build] [--clean] [...]')
8 | parser.add_argument("-v","--version", help=f"get version info", action="store_true")
9 | parser.add_argument("-use","--use", help="Switch config file...", nargs=1)
10 | parser.add_argument("-i","--info", help=f"get machine info", action="store_true")
11 | parser.add_argument("-l","--list", help=f"get installed package info", action="store_true")
12 | parser.add_argument("-install","--install", help=f"install dependency", nargs='+')
13 | parser.add_argument("-remove","--remove", help=f"remove software", nargs=1)
14 | parser.add_argument("-find","--find", help=f"find software", nargs=1)
15 | # dependency install
16 | parser.add_argument("-dp","--depend", help=f"App dependency install", action="store_true")
17 | parser.add_argument("-e","--env", help=f"set environment App", action="store_true")
18 | parser.add_argument("-b","--build", help=f"compile App", action="store_true")
19 | parser.add_argument("-cls","--clean", help=f"clean App", action="store_true")
20 | parser.add_argument("-r","--run", help=f"run App", action="store_true")
21 | parser.add_argument("-j","--job", help=f"run job App", action="store_true")
22 | # batch run
23 | parser.add_argument("-rb","--rbatch", help=f"run batch App", action="store_true")
24 | # batch download
25 | parser.add_argument("-d","--download", help="Batch Download...", action="store_true")
26 | # update modulefile path
27 | parser.add_argument("-u","--update", help="start update hpcbench...", action="store_true")
28 | # check download url is good or not
29 | parser.add_argument("-check","--check", help="start check hpcbench download url...", action="store_true")
30 | parser.add_argument("-s","--score", help="Calculate the score and output benchmark report", action="store_true")
31 | args = parser.parse_args()
32 |
33 |
34 | if __name__ == '__main__':
35 | Scheduler(args).main()
--------------------------------------------------------------------------------
/init.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | ## User environment
4 | #module purge
5 | #module load openmpi/4.0.3-gcc-10.3.1
6 | export UCX_NET_DEVICES=mlx5_0:1
7 | export OMPI_MCA_btl=self,vader,tcp
8 |
9 | ## Cluster info
10 | export CLUSTER_NAME=kp920
11 | export GPU_PARTITION=asend01
12 | export CPU_PARTITION=arm128c256g
13 | export PARA_STORAGE_PATH=/lustre
14 | export TOTAL_NODES=5
15 | # A computing node's total cores
16 | export CPU_MAX_CORES=128
17 | # A computing node's power (W)
18 | export CLUSTER_POWER=314.25
19 | # The storage system total power (KW)
20 | export STORAGE_POWER=192
21 | # need testing
22 | export CLUSTER_BURSTBUFFER=111616
23 | export BW_BURSTBUFFER=12100.38
24 |
25 | ## Default setting
26 | CUR_PATH=$(pwd)
27 | export HPCbench_ROOT=${CUR_PATH}
28 | export HPCbench_COMPILER=${CUR_PATH}/software/compiler
29 | export HPCbench_MPI=${CUR_PATH}/software/mpi
30 | export HPCbench_LIBS=${CUR_PATH}/software/libs
31 | export HPCbench_UTILS=${CUR_PATH}/software/utils
32 | export HPCbench_DOWNLOAD=${CUR_PATH}/downloads
33 | export HPCbench_MODULES=${CUR_PATH}/software/modulefiles
34 | export HPCbench_MODULEDEPS=${CUR_PATH}/software/moduledeps
35 | export HPCbench_BENCHMARK=${CUR_PATH}/benchmark
36 | export HPCbench_TMP=${CUR_PATH}/tmp
37 | export HPCbench_RESULT=${CUR_PATH}/result
38 | export DOWNLOAD_TOOL=${CUR_PATH}/package/common/download.sh
39 | export CHECK_DEPS=${CUR_PATH}/package/common/check_deps.sh
40 | export CHECK_ROOT=${CUR_PATH}/package/common/check_root.sh
41 | export gcc_version_number=$(gcc --version |grep GCC | awk '{ match($0, /[0-9]+\.[0-9]+\.[0-9]+/, version); print version[0] }')
42 | export arch=$(lscpu |grep Architecture|awk '{print $2}')
43 | export HADOOP_DATA=${CUR_PATH}/benchmark/storage/protocol/hadoop_data
44 |
45 | mkdir -p tmp downloads software
46 | if [ ! -d benchmark ];then
47 | mkdir -p benchmark/AI benchmark/compute benchmark/jobs benchmark/network benchmark/storage/ior benchmark/storage/protocol
48 | fi
49 | if [ ! -d result ];then
50 | mkdir -p result/AI result/balance result/compute result/network result/storage/ior result/storage/protocol result/system
51 | fi
52 |
--------------------------------------------------------------------------------
/package/common/check_deps.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Iterate over the script arguments and check that each required item is present
3 | if [ $# -eq 0 ];then
4 | echo "Usage: $0 para1 para2"
5 | exit 1
6 | fi
7 | flag=0
8 | result=''
9 | echo "Start checking dependency..."
10 | for i in $* # loop over the arguments in $*; each one is handled separately, $# iterations in total
11 | do
12 | result=$(env|grep $i)
13 | if [ -z "$result" ];then
14 | echo "Please load $i first."
15 | flag=1
16 | else
17 | echo "$i detected."
18 | fi
19 | done
20 |
21 | if [ $flag == 0 ]; then
22 | echo 'CHECK SUCCESS'
23 | else
24 | echo 'CHECK FAILED'
25 | exit 1
26 | fi
--------------------------------------------------------------------------------
/package/common/check_root.sh:
--------------------------------------------------------------------------------
1 | if [[ $EUID -ne 0 ]]; then
2 |     echo "Warning: elevated permissions are required; some packages may need to be installed by root or with sudo."
3 | return 1
4 | fi
5 | return 0
--------------------------------------------------------------------------------
/package/common/download.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | download_path=$HPCbench_DOWNLOAD
3 | type_=wget
4 | url=
5 | filename=
6 | OPTIND=1
7 |
8 | while getopts ":u:f:t:" opt;
9 | do
10 | case $opt in
11 | # URL to download
12 | u) url=$OPTARG;;
13 | # download method to use; defaults to wget
14 | t) type_=$OPTARG;;
15 | # optional filename to save as after download
16 | f) filename=$OPTARG;;
17 | ?) echo -e "\033[0;31m[Error]\033[0m:Unknown parameter:"$opt
18 | exit 0;;
19 | esac
20 |
21 | done
22 |
23 | if [ ! "$url" ];then
24 | echo "Error: No available download link found"
25 | exit 0
26 | fi
27 | # if a rename is requested, adjust the expected path accordingly
28 | if [ "$filename" ];then
29 | exist_path=$download_path/$filename
30 | else
31 | if [ "$type_" == "git" ];then
32 | url=$(echo $url|sed 's/\.[^./]*$//')
33 | fi
34 | exist_path=$download_path/${url##*/}
35 | fi
36 |
37 | # check whether the file already exists
38 | if [ ! -e $exist_path ];then
39 | if [ "$type_" == "wget" ];then
40 | echo -e "\033[0;32m[Info]\033[0m:Using commands: wget $url -O $exist_path --no-check-certificate"
41 | wget $url -O $exist_path --no-check-certificate || rm -rf $exist_path
42 | elif [ "$type_" == "git" ];then
43 | echo -e "\033[0;32m[Info]\033[0m:Using commands: git clone $url $exist_path"
44 | git clone $url $exist_path
45 | else
46 | echo -e "\033[0;31m[Error]\033[0m:Unsupported download mode:"$type_
47 | exit 0
48 | fi
49 |
50 |     # download failed
51 | if [ $? != 0 ];then
52 | rm -rf $exist_path
53 | echo -e "\033[0;31m[Error]\033[0m:Download failed:"$url
54 | exit 0
55 | fi
56 | else
57 |     echo -e "\033[0;32m[Info]\033[0m:"$exist_path" already exists"
58 | fi
59 |
--------------------------------------------------------------------------------
/package/ior/master/install.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | set -x
3 | set -e
4 |
5 | #Download the IOR source:
6 | git clone https://github.com/hpc/ior
7 | #Compile the software:
8 | cd ior
9 | ./bootstrap
10 | ./configure CC=mpicc --prefix=$1
11 |
12 | make
13 | make install
14 |
--------------------------------------------------------------------------------
/package/openblas/0.3.18/install.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | set -x
3 | set -e
4 | . ${DOWNLOAD_TOOL} -u https://github.com/xianyi/OpenBLAS/archive/refs/tags/v0.3.18.tar.gz -f OpenBLAS-0.3.18.tar.gz
5 | cd ${HPCbench_TMP}
6 | rm -rf OpenBLAS-0.3.18
7 | tar -xzvf ${HPCbench_DOWNLOAD}/OpenBLAS-0.3.18.tar.gz
8 | cd OpenBLAS-0.3.18
9 | make -j
10 | make PREFIX=$1 install
11 |
--------------------------------------------------------------------------------
/package/osu/7.0.1/install.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | set -x
3 | set -e
4 | . ${DOWNLOAD_TOOL} -u http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-7.0.1.tar.gz -f osu-micro-benchmarks-7.0.1.tar.gz
5 | cd ${HPCbench_TMP}
6 | tar -xvf ${HPCbench_DOWNLOAD}/osu-micro-benchmarks-7.0.1.tar.gz
7 | cd osu-micro-benchmarks-7.0.1/
8 | ./configure --prefix=$1 CC=mpicc CXX=mpicxx
9 | make
10 | make install
11 |
--------------------------------------------------------------------------------
/requirements.yaml:
--------------------------------------------------------------------------------
1 | name: hpcbench
2 | channels:
3 | - defaults
4 | - conda-forge
5 | - bioconda
6 | dependencies:
7 | - _libgcc_mutex=0.1=main
8 | - _openmp_mutex=5.1=1_gnu
9 | - bzip2=1.0.8=h5eee18b_6
10 | - ca-certificates=2024.3.11=h06a4308_0
11 | - ld_impl_linux-64=2.38=h1181459_1
12 | - libffi=3.4.4=h6a678d5_1
13 | - libgcc-ng=11.2.0=h1234567_1
14 | - libgomp=11.2.0=h1234567_1
15 | - libstdcxx-ng=11.2.0=h1234567_1
16 | - libuuid=1.41.5=h5eee18b_0
17 | - ncurses=6.4=h6a678d5_0
18 | - openssl=3.0.13=h7f8727e_2
19 | - pip=24.0=py310h06a4308_0
20 | - python=3.10.14=h955ad1f_1
21 | - readline=8.2=h5eee18b_0
22 | - setuptools=69.5.1=py310h06a4308_0
23 | - sqlite=3.45.3=h5eee18b_0
24 | - tk=8.6.14=h39e8969_0
25 | - tzdata=2024a=h04d1e81_0
26 | - wheel=0.43.0=py310h06a4308_0
27 | - xz=5.4.6=h5eee18b_1
28 | - zlib=1.2.13=h5eee18b_1
29 | - pip:
30 | - environs==11.0.0
31 | - jinja2==3.1.4
32 | - loguru==0.7.2
33 | - markupsafe==2.1.5
34 | - marshmallow==3.21.3
35 | - packaging==24.0
36 | - prettytable==3.10.0
37 | - pyecharts==2.0.5
38 | - python-dotenv==1.0.1
39 | - simplejson==3.19.2
40 | - wcwidth==0.2.13
41 |
--------------------------------------------------------------------------------
/result/test_result.json:
--------------------------------------------------------------------------------
1 | {
2 | "compute":{
3 | "HPL":1.69,
4 | "HPCG":30
5 | },
6 | "AI":{
7 | "infering":5216,
8 | "training":1085
9 | },
10 | "storage":{
11 | "single_client_single_fluence":2,
12 | "single_client_multi_fluence":5.6,
13 | "aggregation_bandwidth":40,
14 | "IO_rate":2000000,
15 | "multi_request":57.2
16 | },
17 | "network":{
18 | "P2P_network_bandwidth":100,
19 | "P2P_message_latency":"1/1.2",
20 | "ratio":0.5
21 | },
22 | "system":{
23 | "compute_efficiency":7.3,
24 | "IO_operation_rate":50
25 | },
26 | "balance":{
27 | "mem2cpu":4,
28 | "buffer2mem":0.49,
29 | "file2buffer":30,
30 | "mem2buffer":3526,
31 | "buffer2file":3
32 | }
33 | }
34 |
--------------------------------------------------------------------------------
/result/test_score.json:
--------------------------------------------------------------------------------
1 | {"compute": {"HPL": {"name": "HPL双精度浮点计算性能", "weights": 0.6, "large": 148.6, "medium": 14.01, "small": 6.0, "mini": 0.3, "score": 35.20833333333332}, "HPCG": {"name": "HPCG双精度浮点计算性能", "weights": 0.4, "large": 2725.75, "medium": 355.44, "small": 175, "mini": 6, "score": 21.428571428571427}, "issue_score": 28.865864453564903}, "AI": {"infering": {"name": "图像推理任务的计算性能", "weights": 0.5, "large": 2000000, "medium": 1500000, "small": 100000, "mini": 750, "score": 6.52}, "training": {"name": "图像训练任务的计算性能", "weights": 0.5, "large": 10802, "medium": 254, "small": 10000, "mini": 560, "score": 13.5625}, "issue_score": 9.403589740093938}, "storage": {"single_client_single_fluence": {"name": "文件系统单客户端单流带宽", "weights": 0.2, "large": 8, "medium": 9, "small": 6, "mini": 1, "score": 41.666666666666664}, "single_client_multi_fluence": {"name": "文件系统单客户端多流带宽", "weights": 0.2, "large": 13, "medium": 21, "small": 11, "mini": 5, "score": 63.636363636363626}, "aggregation_bandwidth": {"name": "文件系统聚合带宽", "weights": 0.2, "large": 2500, "medium": 1760, "small": 200, "mini": 80, "score": 25.0}, "IO_rate": {"name": "文件系统聚合IO操作速率", "weights": 0.2, "large": 26000000, "medium": 14000000, "small": 17500000, "mini": 4300000, "score": 14.285714285714285}, "multi_request": {"name": "多协议平均访问效率", "weights": 0.2, "large": 62.0, "medium": 64.6, "small": 65, "mini": 65, "score": 100}, "issue_score": 39.37922967655749}, "network": {"P2P_network_bandwidth": {"name": "点对点网络带宽", "weights": 0.4, "large": 200, "medium": 200, "small": 200, "mini": 100, "score": 62.5}, "P2P_message_latency": {"name": "点对点消息延迟", "weights": 0.3, "large": "1/1.67", "medium": "1/3.7", "small": "1/4.0", "mini": "1/2.0", "score": 100}, "ratio": {"name": "网络对分带宽与注入带宽比值", "weights": 0.3, "large": 1.022, "medium": 2.06, "small": 1.5, "mini": 1, "score": 41.666666666666664}, "issue_score": 63.721887926789826}, "system": {"compute_efficiency": {"name": "单位功耗的浮点计算性能", "weights": 0.6, "large": 14.719, "medium": 3.56, "small": 20, "mini": 6, "score": 45.625}, "IO_operation_rate": {"name": "单位功耗的文件系统聚合IO速率", "weights": 0.4, "large": 2.57, "medium": 3.55, "small": 200, "mini": 100, "score": 31.25}, "issue_score": 39.215859247314846}, "balance": {"mem2cpu": {"name": "内存容量与处理器核心数比", "weights": 0.2, "large": 9.64, "medium": 1.66, "small": 4, "mini": 3.93, "score": 100}, "buffer2mem": {"name": "BurstBuffer与内存的容量比", "weights": 0.2, "large": 3.78, "medium": 2.3, "small": 2, "mini": 2.7, "score": 30.624999999999996}, "file2buffer": {"name": "并行文件系统与BurstBuffer的容量比", "weights": 0.2, "large": 23.87, "medium": 15, "small": 10, "mini": 17, "score": 100}, "mem2buffer": {"name": "内存与BurstBuffer的带宽比", "weights": 0.2, "large": 6000, "medium": 4000, "small": 1000, "mini": 125, "score": 100}, "buffer2file": {"name": "BurstBuffer与并行文件系统的带宽比", "weights": 0.2, "large": 4, "medium": 3, "small": 10, "mini": 5.5, "score": 37.5}, "issue_score": 64.86665050691026}, "sum_score": 40.90884692520521}
--------------------------------------------------------------------------------
/run.AI:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Load environment
4 | source init.sh
5 |
6 | # maskrcnn
7 | ./hpcbench -use templates/AI/maskrcnn.$arch.config
8 | ./hpcbench -d
9 | ./hpcbench -dp
10 | ./hpcbench -b
11 | ./hpcbench -rb
12 | ./hpcbench -j
13 |
14 | # resnet
15 | ./hpcbench -use templates/AI/resnet.$arch.config
16 | ./hpcbench -d
17 | ./hpcbench -dp
18 | ./hpcbench -b
19 | ./hpcbench -rb
20 | ./hpcbench -j
21 |
--------------------------------------------------------------------------------
/run.balance:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | source init.sh
4 | #scratch bw test
5 | module use ./software/moduledeps/gcc${gcc_version_number}/
6 | module load ior/master
7 |
8 | cat << \EOF > scratch_bw.slurm
9 | #!/bin/bash
10 | #SBATCH --job-name="aggregate_bandwidth"
11 | #SBATCH -N 2
12 | #SBATCH --ntasks-per-node=64
13 | #SBATCH --output=logs/scratch_bandwidth.out
14 | #SBATCH --error=logs/scratch_bandwidth.out
15 | #SBATCH -p {{ CPU_PARTITION }}
16 | #SBATCH --exclusive
17 |
18 | NCT=2
19 |
20 | # Date Stamp for benchmark
21 | SEQ=64
22 | MAXPROCS=128
23 | DATA_SIZE=128
24 |
25 | BASE_DIR=$SCRATCH/iortest
26 | RESULT_DIR=$HPCbench_ROOT/result/balance
27 |
28 | NCT=2 #`grep -v ^# hfile |wc -l`
29 | DS=`date +"%F_%H:%M:%S"`
30 | # Overall data set size in GiB. Must be >=MAXPROCS. Should be a power of 2.
31 |
32 | while [ ${SEQ} -le ${MAXPROCS} ]; do
33 | NPROC=`expr ${NCT} \* ${SEQ}`
34 | BSZ=`expr ${DATA_SIZE} / ${SEQ}`"g"
35 | # Alternatively, set to a static value and let the data size increase.
36 | # BSZ="1g"
37 | # BSZ="${DATA_SIZE}"
38 | mpirun \
39 | ior -v -w -r -i 4 -F \
40 | -o ${BASE_DIR}/ior-test3.file \
41 | -t 1m -b ${BSZ} | tee ${RESULT_DIR}/aggregation_bandwidth.txt
42 | SEQ=`expr ${SEQ} \* 2`
43 | done
44 | EOF
45 |
46 | sbatch scratch_bw.slurm
47 |
48 |
49 | ./hpcbench -use templates/balance/balance.linux64.config
50 | ./hpcbench -rb
51 |
52 |
--------------------------------------------------------------------------------
/run.compute:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Load environment
4 | source init.sh
5 |
6 | # hpl
7 | ./hpcbench -install openblas/0.3.18 gcc
8 | ./hpcbench -use templates/compute/hpl.$arch.config
9 | ./hpcbench -d
10 | ./hpcbench -dp
11 | ./hpcbench -cls
12 | ./hpcbench -b
13 | ./hpcbench -j
14 |
15 | # hpcg
16 | ./hpcbench -use templates/compute/hpcg.$arch.config
17 | ./hpcbench -d
18 | ./hpcbench -dp
19 | ./hpcbench -cls
20 | ./hpcbench -b
21 | ./hpcbench -j
22 |
23 |
--------------------------------------------------------------------------------
/run.network:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Load environment
4 | source init.sh
5 |
6 | # osu
7 | ./hpcbench -use templates/network/osu.$arch.config
8 | ./hpcbench -d
9 | ./hpcbench -dp
10 | ./hpcbench -cls
11 | ./hpcbench -b
12 | ./hpcbench -j
13 |
--------------------------------------------------------------------------------
/run.storage:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Load environment
4 | source init.sh
5 |
6 | # ior
7 | ./hpcbench -use templates/storage/ior.$arch.config
8 | ./hpcbench -d
9 | ./hpcbench -dp
10 | ./hpcbench -cls
11 | ./hpcbench -b
12 | ./hpcbench -j
13 |
14 | # protocol
15 | ## posix
16 | ./hpcbench -use templates/storage/protocol/posix.$arch.config
17 | ./hpcbench -rb
18 |
19 | ## hadoop
20 | ./hpcbench -use templates/storage/protocol/hadoop.$arch.config
21 | ./hpcbench -d
22 | ./hpcbench -dp
23 | ./hpcbench -b
24 | ./hpcbench -rb
25 |
26 | ## warp
27 | ./hpcbench -use templates/storage/protocol/warp.$arch.config
28 | ./hpcbench -d
29 | ./hpcbench -dp
30 | ./hpcbench -b
31 | ./hpcbench -rb
32 |
33 | ## nfs should be run with root or sudo
34 | ./hpcbench -use templates/storage/protocol/nfs.$arch.config
35 | ./hpcbench -d
36 | ./hpcbench -dp
37 | ./hpcbench -b
38 | ./hpcbench -rb
39 |
--------------------------------------------------------------------------------
/run.system:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | source init.sh
4 | ./hpcbench -use templates/system/system.linux64.config
5 | ./hpcbench -rb
6 |
7 |
--------------------------------------------------------------------------------
/setting.py:
--------------------------------------------------------------------------------
1 | import platform
2 | import sys
3 | from os.path import dirname, abspath, join
4 | from environs import Env
5 | from loguru import logger
6 | import shutil
7 |
8 | env = Env()
9 | env.read_env()
10 |
11 | # definition of flags
12 | IS_WINDOWS = platform.system().lower() == 'windows'
13 |
14 | # definition of dirs
15 | ROOT_DIR = dirname(abspath(__file__))
16 | LOG_DIR = join(ROOT_DIR, env.str('LOG_DIR', 'logs'))
17 |
18 | # definition of environments
19 | CLUSTER_SCALE = None
20 | CLUSTER_NAME = env.str('CLUSTER_NAME')
21 | APP_DEBUG = env.bool('APP_DEBUG', False)
22 | APP_CONFIG = env.str('APP_CONFIG', None)
23 |
24 | HPCbench_RESULT = env.str('HPCbench_RESULT',join(ROOT_DIR, 'result'))
25 | HPCbench_BENCHMARK = env.str('HPCbench_BENCHMARK',join(ROOT_DIR, 'benchmark'))
26 |
27 | GPU_PARTITION = env.str('GPU_PARTITION')
28 | CPU_PARTITION = env.str('CPU_PARTITION')
29 | CPU_MAX_CORES = env.str('CPU_MAX_CORES')
30 | HADOOP_DATA = env.str('HADOOP_DATA')
31 | CLUSTER_POWER = env.str('CLUSTER_POWER', 10000)
32 | STORAGE_POWER = env.str('STORAGE_POWER', 10000)
33 | CLUSTER_BURSTBUFFER = env.str('CLUSTER_BURSTBUFFER', 10000)
34 | # CLUSTER_MEMORY = env.int('CLUSTER_MEMORY', 10000)
35 | BW_BURSTBUFFER = env.str('BW_BURSTBUFFER', 10000)
36 | PARA_STORAGE_PATH = env.str('PARA_STORAGE_PATH')
37 | TOTAL_NODES = env.str('TOTAL_NODES')
38 |
39 | ENABLE_LOG_FILE = env.bool('ENABLE_LOG_FILE', True)
40 | ENABLE_LOG_RUNTIME_FILE = env.bool('ENABLE_LOG_RUNTIME_FILE', True)
41 | ENABLE_LOG_ERROR_FILE = env.bool('ENABLE_LOG_ERROR_FILE', True)
42 |
43 | LOG_LEVEL = "DEBUG" if APP_DEBUG else "INFO"
44 | LOG_ROTATION = env.str('LOG_ROTATION', '100MB')
45 | LOG_RETENTION = env.str('LOG_RETENTION', '1 week')
46 |
47 | logger.remove()
48 | logger.add(sys.stderr, level='INFO')
49 |
50 | if ENABLE_LOG_FILE:
51 | if ENABLE_LOG_RUNTIME_FILE:
52 | logger.add(env.str('LOG_RUNTIME_FILE', join(LOG_DIR, 'runtime.log')),
53 | level=LOG_LEVEL, rotation=LOG_ROTATION, retention=LOG_RETENTION)
54 | if ENABLE_LOG_ERROR_FILE:
55 | logger.add(env.str('LOG_ERROR_FILE', join(LOG_DIR, 'error.log')),
56 | level='ERROR', rotation=LOG_ROTATION)
57 | else:
58 | shutil.rmtree(LOG_DIR, ignore_errors=True)
59 |
--------------------------------------------------------------------------------
/templates/AI/maskrcnn.aarch64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 | train2017.zip http://images.cocodataset.org/zips/train2017.zip
6 | val2017.zip http://images.cocodataset.org/zips/val2017.zip
7 | annotations_trainval2017.zip http://images.cocodataset.org/annotations/annotations_trainval2017.zip
8 | maskrcnn.sif https://afdata.sjtu.edu.cn/files/maskrcnn_latest.sif
9 |
10 | [DEPENDENCY]
11 | mkdir -p ./benchmark/AI/maskrcnn/data
12 |
13 | cp -rfv ./downloads/train2017.zip ./benchmark/AI/maskrcnn/data
14 | cp -rfv ./downloads/val2017.zip ./benchmark/AI/maskrcnn/data/
15 | cp -rfv ./downloads/annotations_trainval2017.zip ./benchmark/AI/maskrcnn/data/
16 | cd ./benchmark/AI/maskrcnn/data
17 | unzip train2017.zip
18 | unzip val2017.zip
19 | unzip annotations_trainval2017.zip
20 | cd ..
21 |
22 | [ENV]
23 |
24 | [APP]
25 | app_name = maskrcnn
26 | build_dir = ${HPCbench_ROOT}/benchmark/AI/maskrcnn
27 | binary_dir = ${HPCbench_ROOT}/benchmark/AI/maskrcnn
28 | case_dir = ${HPCbench_ROOT}/benchmark/AI/maskrcnn
29 |
30 | [BUILD]
31 | # MaskRCNN for Ascend
32 |
33 | ## environment
34 | ### miniconda-aarch64
35 | ### conda environment
36 | conda create -n maskrcnn-torch1.11 python=3.7
37 | conda activate maskrcnn-torch1.11
38 |
39 | ### dependency
40 | pip3 install attrs numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py tqdm pyyaml wheel typing_extensions
41 |
42 | ### maskrcnn src
43 | cd ${HPCbench_ROOT}/benchmark/AI/maskrcnn
44 | git clone https://gitee.com/ascend/ModelZoo-PyTorch.git
45 | cd ModelZoo-PyTorch/PyTorch/built-in/cv/detection/MaskRCNN_for_Pytorch/
46 |
47 | ### torch-1.11
48 | wget --no-check-certificate https://repo.huaweicloud.com/kunpeng/archive/Ascend/PyTorch/torch-1.11.0-cp37-cp37m-linux_aarch64.whl
49 | pip3 install torch-1.11.0-cp37-cp37m-linux_aarch64.whl
50 |
51 | ### extra requirements
52 | pip install -r requirements.txt
53 |
54 | ### cocoapi installation
55 | git clone https://github.com/cocodataset/cocoapi.git
56 | cd cocoapi/PythonAPI
57 |
58 | python setup.py build_ext install
59 | cd ../..
60 |
61 | ### torch-npu torchvision apex installation
62 | wget --no-check-certificate https://gitee.com/ascend/pytorch/releases/download/v5.0.rc1-pytorch1.11.0/torch_npu-1.11.0-cp37-cp37m-linux_aarch64.whl
63 | pip3 install torch_npu-1.11.0-cp37-cp37m-linux_aarch64.whl
64 |
65 | git clone https://github.com/pytorch/vision.git
66 | cd vision
67 | git checkout v0.12.0
68 | python setup.py bdist_wheel
69 | cd dist
70 | pip3 install torchvision-0.12.*.whl
71 | cd ../..
72 |
73 | git clone -b v0.12.0 https://gitee.com/ascend/vision.git vision_npu
74 | cd vision_npu
75 | source /opt/Ascend/ascend-toolkit/set_env.sh
76 | python setup.py bdist_wheel
77 | cd dist
78 | pip install torchvision_npu-0.12.*.whl
79 | cd ../..
80 |
81 | pip3 install apex --no-index --find-links https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/MindX/OpenSource/pytorch1_11_0/index.html --trusted-host ascend-repo.obs.cn-east-2.myhuaweicloud.com
82 |
83 | ### maskrcnn
84 | python setup.py build develop
85 |
86 | [RUN]
87 | binary = maskrcnn
88 |
89 | [BATCH]
90 | ## training
91 | bash test/train_full_8p.sh --data_path=${HPCbench_ROOT}/benchmark/AI/maskrcnn/data
92 |
93 |
--------------------------------------------------------------------------------
/templates/AI/maskrcnn.x86_64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 | train2017.zip http://images.cocodataset.org/zips/train2017.zip
6 | val2017.zip http://images.cocodataset.org/zips/val2017.zip
7 | annotations_trainval2017.zip http://images.cocodataset.org/annotations/annotations_trainval2017.zip
8 | maskrcnn.sif https://afdata.sjtu.edu.cn/files/maskrcnn_latest.sif
9 |
10 | [DEPENDENCY]
11 | mkdir -p ./benchmark/AI/maskrcnn/data
12 |
13 | cp -rfv ./downloads/train2017.zip ./benchmark/AI/maskrcnn/data
14 | cp -rfv ./downloads/val2017.zip ./benchmark/AI/maskrcnn/data/
15 | cp -rfv ./downloads/annotations_trainval2017.zip ./benchmark/AI/maskrcnn/data/
16 | cd ./benchmark/AI/maskrcnn/data
17 | unzip train2017.zip
18 | unzip val2017.zip
19 | unzip annotations_trainval2017.zip
20 | cd ..
21 | git clone https://github.com/NVIDIA/DeepLearningExamples.git
22 | mv ${HPCbench_ROOT}/benchmark/AI/maskrcnn/data ${HPCbench_ROOT}/benchmark/AI/maskrcnn/DeepLearningExamples/PyTorch/Segmentation/MaskRCNN/pytorch
23 |
24 | [ENV]
25 |
26 | [APP]
27 | app_name = maskrcnn
28 | build_dir = ${HPCbench_ROOT}/benchmark/AI/maskrcnn
29 | binary_dir = ${HPCbench_ROOT}/benchmark/AI/maskrcnn
30 | case_dir = ${HPCbench_ROOT}/benchmark/AI/maskrcnn
31 |
32 | [BUILD]
33 |
34 | [RUN]
35 | binary = maskrcnn
36 |
37 | [JOB1]
38 | #!/bin/bash
39 | #SBATCH -J maskrcnn
40 | #SBATCH -p {{ GPU_PARTITION }} #GPU partition
41 | #SBATCH -N 1
42 | #SBATCH -n 64
43 | #SBATCH --gres=gpu:4
44 | #SBATCH --exclusive
45 | #SBATCH -o result/AI/maskrcnn.txt
46 |
47 | image=${HPCbench_ROOT}/downloads/maskrcnn_latest.sif
48 | cd ${HPCbench_ROOT}/benchmark/AI/maskrcnn/DeepLearningExamples/PyTorch/Segmentation/MaskRCNN/pytorch
49 | mkdir -p results
50 | singularity exec --nv --bind `pwd`:/datasets,`pwd`/results:/results ${image} bash -c "./scripts/train.sh"
51 |
52 |
--------------------------------------------------------------------------------
/templates/AI/resnet.aarch64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 | ILSVRC2012_img_val https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar
6 | resnet50_v1.pb https://zenodo.org/record/2535873/files/resnet50_v1.pb
7 | val_map.txt https://github.com/microsoft/Swin-Transformer/files/8529898/val_map.txt
8 |
9 | [DEPENDENCY]
10 | mkdir -p ./benchmark/AI/resnet/data/val
11 | tar -xvf ./downloads/ILSVRC2012_img_val.tar -C ./benchmark/AI/resnet/data/val/
12 | cp -rfv ./downloads/resnet50_v1.pb ./benchmark/AI/resnet/data/
13 | cp -rfv ./downloads/val_map.txt ./benchmark/AI/resnet/data/
14 | cd ./benchmark/AI/resnet/
15 | git clone https://gitee.com/ascend/ModelZoo-PyTorch.git
16 |
17 | [ENV]
18 |
19 | [APP]
20 | app_name = resnet
21 | build_dir = ${HPCbench_ROOT}/benchmark/AI/resnet
22 | binary_dir = ${HPCbench_ROOT}/benchmark/AI/resnet
23 | case_dir = ${HPCbench_ROOT}/benchmark/AI/resnet
24 |
25 | [BUILD]
26 | # ResNet50 MLperf for Ascend
27 | ## environment
28 | ### miniconda-aarch64
29 | wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py39_4.9.2-Linux-aarch64.sh
30 | sh Miniconda3-py39_4.9.2-Linux-aarch64.sh
31 | source activate
32 | ### conda environment
33 | conda create -n resnet50-torch1.11 python=3.7
34 | conda activate resnet50-torch1.11
35 | ### dependency
36 | pip3 install attrs numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py tqdm pyyaml wheel typing_extensions cloudpickle tornado synr==0.5.0
37 | cd ${HPCbench_ROOT}/benchmark/AI/resnet/ModelZoo-PyTorch/ACL_PyTorch/built-in/cv/Resnet50_Pytorch_Infer
38 | pip3 install -r requirements.txt
39 | python3 imagenet_torch_preprocess.py resnet ${HPCbench_ROOT}/benchmark/AI/resnet/data/val/ ./prep_dataset
40 | wget --no-check-certificate https://download.pytorch.org/models/resnet50-0676ba61.pth
41 | source /opt/Ascend/ascend-toolkit/set_env.sh
42 | atc --model=resnet50_official.onnx --framework=5 --output=resnet50_bs64 --input_format=NCHW --input_shape="actual_input_1:64,3,224,224" --enable_small_channel=1 --log=error --soc_version=Ascend910B --insert_op_conf=aipp_resnet50.aippconfig
43 | wget --no-check-certificate https://aisbench.obs.myhuaweicloud.com/packet/ais_bench_infer/0.0.2/aclruntime-0.0.2-cp37-cp37m-linux_aarch64.whl
44 | wget --no-check-certificate https://aisbench.obs.myhuaweicloud.com/packet/ais_bench_infer/0.0.2/ais_bench-0.0.2-py3-none-any.whl
45 | pip3 install aclruntime-0.0.2-cp37-cp37m-linux_aarch64.whl
46 | pip3 install ais_bench-0.0.2-py3-none-any.whl
47 |
48 |
49 | [RUN]
50 | binary = resnet
51 |
52 |
53 | [BATCH]
54 | python3 -m ais_bench --model ./resnet50_bs64.om --input ./prep_dataset/ --output ./ --output_dirname result --outfmt TXT | tee ${RESULT_DIR}/inference.log
55 | python3 vision_metric_ImageNet.py ./result ./ImageNet/val_label.txt ./ result.json
56 |
--------------------------------------------------------------------------------
/templates/AI/resnet.x86_64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 | ILSVRC2012_img_val https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar
6 | resnet50_v1.pb https://zenodo.org/record/2535873/files/resnet50_v1.pb
7 | val_map.txt https://github.com/microsoft/Swin-Transformer/files/8529898/val_map.txt
8 | resnet50.sif https://afdata.sjtu.edu.cn/files/resnet_latest.sif
9 |
10 | [DEPENDENCY]
11 | mkdir -p ./benchmark/AI/resnet/data/val
12 | tar -xvf ./downloads/ILSVRC2012_img_val.tar -C ./benchmark/AI/resnet/data/val/
13 | cp -rfv ./downloads/resnet50_v1.pb ./benchmark/AI/resnet/data/
14 | cp -rfv ./downloads/val_map.txt ./benchmark/AI/resnet/data/
15 | cd ./benchmark/AI/resnet/
16 | git clone https://github.com/mlcommons/inference.git
17 |
18 | [ENV]
19 |
20 | [APP]
21 | app_name = resnet
22 | build_dir = ${HPCbench_ROOT}/benchmark/AI/resnet
23 | binary_dir = ${HPCbench_ROOT}/benchmark/AI/resnet
24 | case_dir = ${HPCbench_ROOT}/benchmark/AI/resnet
25 |
26 | [BUILD]
27 |
28 | [RUN]
29 | binary = resnet
30 |
31 |
32 | [JOB1]
33 | #!/bin/bash
34 | #SBATCH -J inference
35 | #SBATCH -p {{ GPU_PARTITION }}
36 | #SBATCH -n 16
37 | #SBATCH --gres=gpu:1
38 | #SBATCH -o result/AI/inference.txt
39 |
40 | source init.sh
41 | module load cuda/11.8.0 cudnn
42 | export MODEL_DIR=${HPCbench_ROOT}/benchmark/AI/resnet/data/
43 | export DATA_DIR=${HPCbench_ROOT}/benchmark/AI/resnet/data/
44 | export IMAGE=${HPCbench_ROOT}/downloads/resnet_latest.sif
45 | cd ./benchmark/AI/resnet/inference/vision/classification_and_detection
46 |
47 | singularity exec --nv $IMAGE bash -c "./run_local.sh tf resnet50 gpu --count 50000 --time 1200 --scenario Offline --qps 200 --max-latency 0.1"
48 | singularity exec --nv $IMAGE bash -c "./run_local.sh tf resnet50 gpu --accuracy --time 60 --scenario Offline --qps 200 --max-latency 0.2"
49 |
--------------------------------------------------------------------------------
/templates/balance/balance.linux64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | 11.11.11.11
3 |
4 | [DOWNLOAD]
5 |
6 | [DEPENDENCY]
7 |
8 | [ENV]
9 | export RESULT_DIR=$HPCbench_ROOT/result/balance
10 | mkdir -p $RESULT_DIR
11 |
12 | [APP]
13 | app_name = balance
14 | build_dir = $HPCbench_ROOT
15 | binary_dir = $HPCbench_ROOT
16 | case_dir = $HPCbench_ROOT
17 |
18 | [BUILD]
19 |
20 | [CLEAN]
21 |
22 | [RUN]
23 | binary = balance
24 | run = echo
25 | nodes = 1
26 |
27 | [BATCH]
28 | cd $RESULT_DIR
29 | exec 1>$RESULT_DIR/balance.log 2>/dev/null
30 | # ratio of memory capacity to core count
31 | echo "Ratio of memory capacity to core count"
32 | TotalMemPerNode=`grep MemTotal /proc/meminfo|awk -F " " '{print $2}'`
33 | let TotalMemPerNode=$TotalMemPerNode/1024/1024
34 | echo TotalMemPerNode : $TotalMemPerNode GB
35 | TotalCorePerNode=`cat /proc/cpuinfo | grep "processor" | wc -l`
36 | echo TotalCorePerNode : $TotalCorePerNode
37 | mem2cpu=$(echo "scale=2; $TotalMemPerNode/$TotalCorePerNode" | bc)
38 | echo mem2cpu=$mem2cpu
39 | echo " "
40 |
41 | # ratio of BurstBuffer capacity to memory capacity
42 | echo "Ratio of BurstBuffer capacity to memory capacity"
43 | BurstBuffer={{ CLUSTER_BURSTBUFFER }}
44 | TotalNodeNum={{ TOTAL_NODES }}
45 | let TotalMemAllNode=$TotalMemPerNode*$TotalNodeNum
46 | buffer2mem=$(echo "scale=2; $BurstBuffer/$TotalMemAllNode"|bc)
47 | echo BurstBuffer : $BurstBuffer GB
48 | echo TotalNodeNum : $TotalNodeNum
49 | echo TotalMemPerNode : $TotalMemPerNode GB
50 | echo buffer2mem=$buffer2mem
51 | echo " "
52 |
53 | # ratio of parallel file system capacity to BurstBuffer capacity
54 | echo "Ratio of parallel file system capacity to BurstBuffer capacity"
55 |
56 | ParaName={{ PARA_STORAGE_PATH }}
57 | echo $ParaName
58 | ParaSize=`df -a |grep $ParaName|awk '{print $2}'`
59 | let ParaSize=$ParaSize/1024/1024
60 | echo ParaSize : $ParaSize GB
61 | echo BurstBuffer : $BurstBuffer GB
62 | file2buffer=$(echo "scale=2; $ParaSize/$BurstBuffer"|bc)
63 | echo file2buffer=$file2buffer
64 | echo " "
65 |
66 | # ratio of memory bandwidth to BurstBuffer bandwidth
67 | echo "Ratio of memory bandwidth to BurstBuffer bandwidth"
68 | rm stream.c stream.log
69 | wget --no-check-certificate https://raw.githubusercontent.com/jeffhammond/STREAM/master/stream.c > /dev/null 2>&1 &
70 | wait
71 | gcc -mtune=native -march=native -O3 -mcmodel=medium -fopenmp \
72 | -DSTREAM_ARRAY_SIZE=200000000 -DNTIMES=30 -DOFFSET=4096 stream.c \
73 | -o stream.o > /dev/null 2>&1 &
74 | wait
75 | ./stream.o > stream.log 2>&1 &
76 | wait
77 | RateOfMem=`cat stream.log |grep Triad|awk '{print $2}'`
78 | echo RateOfMem:${RateOfMem}
79 |
80 | # BurstBuffer bandwidth test; must be run on a node with flash/burst-buffer storage
81 | #bash scrath-ior.sh
82 | BW_BURSTBUFFER={{ BW_BURSTBUFFER }}
83 | mem2buffer=$(echo "scale=2; $RateOfMem*$TotalNodeNum/$BW_BURSTBUFFER"|bc)
84 | echo mem2buffer=$mem2buffer
85 | echo " "
86 |
87 | echo "BurstBuffer与并行文件系统的带宽比"
88 | echo "running bandwidth test of ParaFileSystem"
89 | BW_ParaFile=`cat $HPCbench_ROOT/result/storage/ior/aggregation_bandwidth.txt |grep Write|awk 'NR==1 {print $3}'`
90 | echo BW_ParaFile : $BW_ParaFile
91 | buffer2file=$(echo "scale=2; $BW_BURSTBUFFER/$BW_ParaFile"|bc)
92 | echo buffer2file=$buffer2file
93 |
94 |
--------------------------------------------------------------------------------
/templates/balance/stream/main/stream.linux64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | 11.11.11.11
3 |
4 | [DOWNLOAD]
5 | stream/5.10 https://github.com/jeffhammond/STREAM/archive/refs/heads/master.zip STREAM.zip
6 |
7 | [DEPENDENCY]
8 | export CC=`which gcc`
9 | export CXX=`which g++`
10 | export FC=`which gfortran`
11 | if [ ! -d "STREAM-master" ]; then
12 | unzip ./downloads/STREAM.zip
13 | fi
14 |
15 | [ENV]
16 | export STREAM_HOME=$HPCbench_ROOT/STREAM-master
17 | export OMP_PROC_BIND=true
18 | export OMP_NUM_THREADS=1
19 |
20 | [APP]
21 | app_name = STREAM
22 | build_dir = $STREAM_HOME
23 | binary_dir = $STREAM_HOME
24 | case_dir = $STREAM_HOME
25 |
26 | [BUILD]
27 | cat << \EOF > Makefile
28 | CC = gcc
29 | CFLAGS = -mtune=native -march=native -O3 -mcmodel=medium -fopenmp
30 |
31 | FC = gfortran
32 | FFLAGS = -O2 -fopenmp
33 |
34 | all: stream_f.exe stream_c.exe
35 |
36 | stream_f.exe: stream.f mysecond.o
37 | $(CC) $(CFLAGS) -c mysecond.c
38 | $(FC) $(FFLAGS) -c stream.f
39 | $(FC) $(FFLAGS) stream.o mysecond.o -o stream_f.exe
40 |
41 | stream_c.exe: stream.c
42 | $(CC) $(CFLAGS) stream.c -o stream_c.exe
43 |
44 | clean:
45 | rm -f stream_f.exe stream_c.exe *.o
46 | EOF
47 | # high-throughput mode
48 | # tuned-adm profile throughput-performance
49 | # close transparent hugepage
50 | # echo never > /sys/kernel/mm/transparent_hugepage/enabled
51 | # echo never > /sys/kernel/mm/transparent_hugepage/defrag
52 | make stream_c.exe > compiler.log
53 |
54 | [CLEAN]
55 | make clean
56 |
57 | [RUN]
58 | run =
59 | binary = stream_c.exe 2>&1 >> stream.output.log
60 | nodes = 1
61 |
62 | [BATCH]
63 | for core_num in 1 2 4 8 16 32 64 128
64 | do
65 | echo 3 > /proc/sys/vm/drop_caches
66 | export OMP_NUM_THREADS=$core_num
67 | ./stream_c.exe >> stream.output.log
68 | done
69 |
--------------------------------------------------------------------------------
/templates/compute/hpcg.aarch64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 |
6 | [DEPENDENCY]
7 | export CC=`which gcc`
8 | export CXX=`which g++`
9 | export FC=`which gfortran`
10 | mkdir -p $HPCbench_ROOT/benchmark/compute
11 | mkdir -p $HPCbench_ROOT/result/compute
12 | cd $HPCbench_ROOT/benchmark/compute
13 | git config --global http.sslVerify false
14 | git clone --depth=1 https://github.com/hpcg-benchmark/hpcg.git
15 |
16 | [ENV]
17 | export CC=mpicc CXX=mpic++ FC=mpifort
18 | export HPCG_HOME=$HPCbench_ROOT/benchmark/compute/hpcg
19 | export OMPI_MCA_btl=self,tcp
20 |
21 | [APP]
22 | app_name = hpcg
23 | build_dir = $HPCG_HOME
24 | binary_dir = $HPCG_HOME/bin/
25 | case_dir = $HPCG_HOME/bin/
26 |
27 | [BUILD]
28 | cat << \EOF > setup/Make.MPI_GCC_OMP
29 | SHELL = /bin/sh
30 | CD = cd
31 | CP = cp
32 | LN_S = ln -s -f
33 | MKDIR = mkdir -p
34 | RM = /bin/rm -f
35 | TOUCH = touch
36 | TOPdir = .
37 | SRCdir = $(TOPdir)/src
38 | INCdir = $(TOPdir)/src
39 | BINdir = $(TOPdir)/bin
40 | HPCG_INCLUDES = -I$(INCdir) -I$(INCdir)/$(arch) $(MPinc)
41 | HPCG_LIBS =
42 | HPCG_OPTS =
43 | HPCG_DEFS = $(HPCG_OPTS) $(HPCG_INCLUDES)
44 | CXX = mpicxx
45 | CXXFLAGS = $(HPCG_DEFS) -O3 -ffast-math -ftree-vectorize -fopenmp
46 | LINKER = $(CXX)
47 | LINKFLAGS = $(CXXFLAGS)
48 | ARCHIVER = ar
49 | ARFLAGS = r
50 | RANLIB = echo
51 | EOF
52 | ./configure MPI_GCC_OMP
53 | make -j
54 |
55 | [CLEAN]
56 | make clean
57 |
58 | [RUN]
59 | run = mpirun -np 128
60 | binary = xhpcg --nx=104 --rt=60
61 | nodes = 1
62 |
63 | [BATCH]
64 |
65 | [JOB1]
66 | #!/bin/bash
67 | #SBATCH -J hpcg
68 | #SBATCH -N 1
69 | #SBATCH --ntasks-per-node 128
70 | #SBATCH -p {{ CPU_PARTITION }}
71 | #SBATCH --exclusive
72 | #SBATCH -o logs/hpcg.out
73 | #SBATCH -e logs/hpcg.out
74 |
75 | cd $HPCG_HOME/bin/
76 | export UCX_NET_DEVICES=mlx5_0:1
77 | mpirun ./xhpcg --nx=104 --rt=60
78 | cp $HPCbench_ROOT/benchmark/compute/hpcg/bin/HPCG-Benchmark* $HPCbench_ROOT/result/compute/hpcg.txt
79 |
--------------------------------------------------------------------------------
/templates/compute/hpcg.x86_64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 |
6 | [DEPENDENCY]
7 | export CC=`which gcc`
8 | export CXX=`which g++`
9 | export FC=`which gfortran`
10 | mkdir -p $HPCbench_ROOT/benchmark/compute
11 | cd $HPCbench_ROOT/benchmark/compute
12 | git config --global http.sslVerify false
13 | git clone --depth=1 https://github.com/hpcg-benchmark/hpcg.git
14 |
15 | [ENV]
16 | export CC=mpicc CXX=mpic++ FC=mpifort
17 | export HPCG_HOME=$HPCbench_ROOT/benchmark/compute/hpcg
18 |
19 | [APP]
20 | app_name = hpcg
21 | build_dir = $HPCG_HOME
22 | binary_dir = $HPCG_HOME/bin/
23 | case_dir = $HPCG_HOME/bin/
24 |
25 | [BUILD]
26 | cat << \EOF > setup/Make.MPI_GCC_OMP
27 | SHELL = /bin/sh
28 | CD = cd
29 | CP = cp
30 | LN_S = ln -s -f
31 | MKDIR = mkdir -p
32 | RM = /bin/rm -f
33 | TOUCH = touch
34 | TOPdir = .
35 | SRCdir = $(TOPdir)/src
36 | INCdir = $(TOPdir)/src
37 | BINdir = $(TOPdir)/bin
38 | HPCG_INCLUDES = -I$(INCdir) -I$(INCdir)/$(arch) $(MPinc)
39 | HPCG_LIBS =
40 | HPCG_OPTS =
41 | HPCG_DEFS = $(HPCG_OPTS) $(HPCG_INCLUDES)
42 | CXX = mpicxx
43 | CXXFLAGS = $(HPCG_DEFS) -O3 -ffast-math -ftree-vectorize -fopenmp
44 | LINKER = $(CXX)
45 | LINKFLAGS = $(CXXFLAGS)
46 | ARCHIVER = ar
47 | ARFLAGS = r
48 | RANLIB = echo
49 | EOF
50 | ./configure MPI_GCC_OMP
51 | make -j
52 |
53 | [CLEAN]
54 | make clean
55 |
56 | [RUN]
57 | run = mpirun -np 64
58 | binary = xhpcg
59 | nodes = 1
60 |
61 | [BATCH]
62 |
63 | [JOB1]
64 | #!/bin/bash
65 | #SBATCH -J hpcg
66 | #SBATCH -N 2
67 | #SBATCH --ntasks-per-node {{ CPU_MAX_CORES }}
68 | #SBATCH -p {{ CPU_PARTITION }}
69 | #SBATCH --exclusive
70 | #SBATCH -o logs/hpcg.out
71 | #SBATCH -e logs/hpcg.out
72 |
73 | cd $HPCG_HOME/bin/
74 | export UCX_NET_DEVICES=mlx5_0:1
75 | mpirun ./xhpcg --nx=104 --rt=60
76 | cp $HPCbench_ROOT/benchmark/compute/hpcg/bin/HPCG-Benchmark* $HPCbench_ROOT/result/compute/hpcg.txt
77 |
--------------------------------------------------------------------------------
/templates/compute/hpl.aarch64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 | hpl/2.3 https://netlib.org/benchmark/hpl/hpl-2.3.tar.gz
6 |
7 | [DEPENDENCY]
8 | export CC=`which gcc`
9 | export CXX=`which g++`
10 | export FC=`which gfortran`
11 | mkdir -p $HPCbench_ROOT/benchmark/compute
12 | mkdir -p $HPCbench_RESULT/compute
13 | tar -xzvf $HPCbench_DOWNLOAD/hpl-2.3.tar.gz -C $HPCbench_ROOT/benchmark/compute
14 |
15 | [ENV]
16 | module use ./software/moduledeps/gcc${gcc_version_number}
17 | module load openblas/0.3.18
18 | export OMPI_MCA_btl=self,tcp
19 | export HPL_HOME=$HPCbench_ROOT/benchmark/compute/hpl-2.3
20 |
21 | [APP]
22 | app_name = hpl
23 | build_dir = $HPL_HOME
24 | binary_dir = $HPL_HOME/bin/aarch64
25 | case_dir = $HPL_HOME/bin/aarch64
26 |
27 | [BUILD]
28 | cat << \EOF > Make.aarch64
29 | SHELL = /bin/sh
30 | CD = cd
31 | CP = cp
32 | LN_S = ln -s
33 | MKDIR = mkdir
34 | RM = /bin/rm -f
35 | TOUCH = touch
36 | ARCH = aarch64
37 | TOPdir = $(HPL_HOME)
38 | INCdir = $(TOPdir)/include
39 | BINdir = $(TOPdir)/bin/$(ARCH)
40 | LIBdir = $(TOPdir)/lib/$(ARCH)
41 | HPLlib = $(LIBdir)/libhpl.a
42 | LAdir = $(OPENBLAS_PATH)
43 | LAinc =
44 | LAlib = $(LAdir)/lib/libopenblas.a
45 | F2CDEFS = -DAdd__ -DF77_INTEGER=int -DStringSunStyle
46 | HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
47 | HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib)
48 | HPL_OPTS = -DHPL_DETAILED_TIMING -DHPL_PROGRESS_REPORT
49 | HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)
50 | CC = mpicc
51 | CCNOOPT = $(HPL_DEFS)
52 | CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -fopenmp -funroll-loops -W -Wall
53 | LINKER = $(CC)
54 | LINKFLAGS = $(CCFLAGS)
55 | ARCHIVER = ar
56 | ARFLAGS = r
57 | RANLIB = echo
58 | EOF
59 | make arch=aarch64 -j
60 | if [ ! -e ./bin/aarch64/xhpl ]; then
61 | echo "Build failed"
62 | exit 1
63 | fi
64 | echo "check if SVE exists"
65 | objdump -d bin/aarch64/xhpl | grep z0
66 | cd bin/aarch64
67 |
68 | # modify HPL.dat
69 | cat << \EOF > HPL.dat
70 | HPLinpack benchmark input file
71 | Innovative Computing Laboratory, University of Tennessee
72 | HPL.out output file name (if any)
73 | 6 device out (6=stdout,7=stderr,file)
74 | 1 # of problems sizes (N)
75 | 10000 Ns
76 | 1 # of NBs
77 | 256 NBs
78 | 0 PMAP process mapping (0=Row-,1=Column-major)
79 | 1 # of process grids (P x Q)
80 | 8 Ps
81 | 16 Qs
82 | 16.0 threshold
83 | 1 # of panel fact
84 | 2 1 0 PFACTs (0=left, 1=Crout, 2=Right)
85 | 1 # of recursive stopping criterium
86 | 1 NBMINs (>= 1)
87 | 1 # of panels in recursion
88 | 2 NDIVs
89 | 1 # of recursive panel fact.
90 | 0 1 2 RFACTs (0=left, 1=Crout, 2=Right)
91 | 1 # of broadcast
92 | 0 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
93 | 1 # of lookahead depth
94 | 0 DEPTHs (>=0)
95 | 0 SWAP (0=bin-exch,1=long,2=mix)
96 | 1 swapping threshold
97 | 1 L1 in (0=transposed,1=no-transposed) form
98 | 1 U in (0=transposed,1=no-transposed) form
99 | 0 Equilibration (0=no,1=yes)
100 | 8 memory alignment in double (> 0)
101 | EOF
102 |
103 | [CLEAN]
104 | make arch=aarch64 clean
105 | rm -rf bin/aarch64
106 |
107 | [RUN]
108 | run = mpirun -np 128
109 | binary = xhpl | tee $HPCbench_RESULT/compute/hpl.txt
110 | nodes = 1
111 |
112 | [JOB1]
113 | #!/bin/bash
114 | #SBATCH -J hpl
115 | #SBATCH -N 1
116 | #SBATCH --ntasks-per-node 128
117 | #SBATCH -p {{ CPU_PARTITION }}
118 | #SBATCH --exclusive
119 | #SBATCH -o logs/hpl.out
120 | #SBATCH -e logs/hpl.out
121 |
122 | cd $HPCbench_ROOT/benchmark/compute/hpl-2.3/bin/aarch64
123 |
124 | # modify HPL.dat
125 | cat << \EOF > HPL.dat
126 | HPLinpack benchmark input file
127 | Innovative Computing Laboratory, University of Tennessee
128 | HPL.out output file name (if any)
129 | 6 device out (6=stdout,7=stderr,file)
130 | 1 # of problems sizes (N)
131 | 170000 Ns
132 | 1 # of NBs
133 | 256 NBs
134 | 0 PMAP process mapping (0=Row-,1=Column-major)
135 | 1 # of process grids (P x Q)
136 | 8 Ps
137 | 16 Qs
138 | 16.0 threshold
139 | 1 # of panel fact
140 | 2 1 0 PFACTs (0=left, 1=Crout, 2=Right)
141 | 1 # of recursive stopping criterium
142 | 1 NBMINs (>= 1)
143 | 1 # of panels in recursion
144 | 2 NDIVs
145 | 1 # of recursive panel fact.
146 | 0 1 2 RFACTs (0=left, 1=Crout, 2=Right)
147 | 1 # of broadcast
148 | 0 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
149 | 1 # of lookahead depth
150 | 0 DEPTHs (>=0)
151 | 0 SWAP (0=bin-exch,1=long,2=mix)
152 | 1 swapping threshold
153 | 1 L1 in (0=transposed,1=no-transposed) form
154 | 1 U in (0=transposed,1=no-transposed) form
155 | 0 Equilibration (0=no,1=yes)
156 | 8 memory alignment in double (> 0)
157 | EOF
158 |
159 | mpirun xhpl | tee $HPCbench_RESULT/compute/hpl.txt
160 |
--------------------------------------------------------------------------------
/templates/compute/hpl.x86_64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 | hpl/2.3 https://netlib.org/benchmark/hpl/hpl-2.3.tar.gz
6 |
7 | [DEPENDENCY]
8 | export CC=`which gcc`
9 | export CXX=`which g++`
10 | export FC=`which gfortran`
11 | ./hpcbench -install openblas/0.3.18 gcc
12 | mkdir -p $HPCbench_ROOT/benchmark/compute
13 | tar -xzvf $HPCbench_DOWNLOAD/hpl-2.3.tar.gz -C $HPCbench_ROOT/benchmark/compute
14 |
15 | [ENV]
16 | module use ./software/moduledeps/gcc11.2.0/
17 | module load openblas/0.3.18
18 | export HPL_HOME=$HPCbench_ROOT/benchmark/compute/hpl-2.3
19 |
20 | [APP]
21 | app_name = hpl
22 | build_dir = $HPL_HOME
23 | binary_dir = $HPL_HOME/bin/linux64
24 | case_dir = $HPL_HOME/bin/linux64
25 |
26 | [BUILD]
27 | cat << \EOF > Make.linux64
28 | SHELL = /bin/sh
29 | CD = cd
30 | CP = cp
31 | LN_S = ln -s
32 | MKDIR = mkdir
33 | RM = /bin/rm -f
34 | TOUCH = touch
35 | ARCH = linux64
36 | TOPdir = $(HPL_HOME)
37 | INCdir = $(TOPdir)/include
38 | BINdir = $(TOPdir)/bin/$(ARCH)
39 | LIBdir = $(TOPdir)/lib/$(ARCH)
40 | HPLlib = $(LIBdir)/libhpl.a
41 | LAdir = $(OPENBLAS_PATH)
42 | LAinc =
43 | LAlib = $(LAdir)/lib/libopenblas.a
44 | F2CDEFS = -DAdd__ -DF77_INTEGER=int -DStringSunStyle
45 | HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
46 | HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib)
47 | HPL_OPTS = -DHPL_DETAILED_TIMING -DHPL_PROGRESS_REPORT
48 | HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)
49 | CC = mpicc
50 | CCNOOPT = $(HPL_DEFS)
51 | CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -fopenmp -funroll-loops -W -Wall
52 | LINKER = $(CC)
53 | LINKFLAGS = $(CCFLAGS)
54 | ARCHIVER = ar
55 | ARFLAGS = r
56 | RANLIB = echo
57 | EOF
58 | make arch=linux64 -j
59 | if [ ! -e ./bin/linux64/xhpl ]; then
60 | echo "Build failed"
61 | exit 1
62 | fi
63 | echo "check if SVE exists"
64 | objdump -d bin/linux64/xhpl | grep z0
65 |
66 | cd $HPL_HOME/bin/linux64
67 |
68 | # modify HPL.dat
69 | cat << \EOF > HPL.dat
70 | HPLinpack benchmark input file
71 | Innovative Computing Laboratory, University of Tennessee
72 | HPL.out output file name (if any)
73 | 6 device out (6=stdout,7=stderr,file)
74 | 1 # of problems sizes (N)
75 | 176640 Ns
76 | 1 # of NBs
77 | 256 NBs
78 | 0 PMAP process mapping (0=Row-,1=Column-major)
79 | 1 # of process grids (P x Q)
80 | 8 Ps
81 | 8 Qs
82 | 16.0 threshold
83 | 1 # of panel fact
84 | 2 1 0 PFACTs (0=left, 1=Crout, 2=Right)
85 | 1 # of recursive stopping criterium
86 | 1 NBMINs (>= 1)
87 | 1 # of panels in recursion
88 | 2 NDIVs
89 | 1 # of recursive panel fact.
90 | 0 1 2 RFACTs (0=left, 1=Crout, 2=Right)
91 | 1 # of broadcast
92 | 0 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
93 | 1 # of lookahead depth
94 | 0 DEPTHs (>=0)
95 | 0 SWAP (0=bin-exch,1=long,2=mix)
96 | 1 swapping threshold
97 | 1 L1 in (0=transposed,1=no-transposed) form
98 | 1 U in (0=transposed,1=no-transposed) form
99 | 0 Equilibration (0=no,1=yes)
100 | 8 memory alignment in double (> 0)
101 | EOF
102 |
103 | [CLEAN]
104 | make arch=linux64 clean
105 | rm -rf bin/linux64
106 |
107 | [RUN]
108 | run = mpirun -np 64
109 | binary = xhpl | tee $HPCbench_RESULT/compute/hpl.txt
110 | nodes = 1
111 |
112 | [JOB1]
113 | #!/bin/bash
114 | #SBATCH -J hpl
115 | #SBATCH -N 2
116 | #SBATCH --ntasks-per-node 64
117 | #SBATCH -p {{ CPU_PARTITION }}
118 | #SBATCH --exclusive
119 | #SBATCH -o logs/hpl.out
120 | #SBATCH -e logs/hpl.out
121 |
122 | cd $HPCbench_ROOT/benchmark/compute/hpl-2.3/bin/linux64
123 |
124 | # modify HPL.dat
125 | cat << \EOF > HPL.dat
126 | HPLinpack benchmark input file
127 | Innovative Computing Laboratory, University of Tennessee
128 | HPL.out output file name (if any)
129 | 6 device out (6=stdout,7=stderr,file)
130 | 1 # of problems sizes (N)
131 | 176640 Ns
132 | 1 # of NBs
133 | 256 NBs
134 | 0 PMAP process mapping (0=Row-,1=Column-major)
135 | 1 # of process grids (P x Q)
136 | 8 Ps
137 | 16 Qs
138 | 16.0 threshold
139 | 3 # of panel fact
140 | 0 1 2 PFACTs (0=left, 1=Crout, 2=Right)
141 | 2 # of recursive stopping criterium
142 | 2 4 NBMINs (>= 1)
143 | 1 # of panels in recursion
144 | 2 NDIVs
145 | 3 # of recursive panel fact.
146 | 0 1 2 RFACTs (0=left, 1=Crout, 2=Right)
147 | 1 # of broadcast
148 | 0 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
149 | 1 # of lookahead depth
150 | 0 DEPTHs (>=0)
151 | 2 SWAP (0=bin-exch,1=long,2=mix)
152 | 64 swapping threshold
153 | 0 L1 in (0=transposed,1=no-transposed) form
154 | 0 U in (0=transposed,1=no-transposed) form
155 | 1 Equilibration (0=no,1=yes)
156 | 8 memory alignment in double (> 0)
157 | EOF
158 |
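# Layout check (illustrative): 2 nodes * 64 tasks = 128 ranks, matching P*Q = 8*16.
# Ns=176640 is an exact multiple of NBs (176640 = 690*256) and needs ~8*Ns^2 bytes,
# i.e. about 250 GB in total, or roughly 125 GB per node.
# UCX_NET_DEVICES below pins MPI traffic to port 1 of the mlx5_0 HCA; change or drop
# it to match the actual fabric.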
159 | export UCX_NET_DEVICES=mlx5_0:1
160 | mpirun xhpl | tee $HPCbench_RESULT/compute/hpl.txt
161 |
--------------------------------------------------------------------------------
/templates/network/osu.aarch64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | 11.11.11.11
3 |
4 | [DOWNLOAD]
5 |
6 | [DEPENDENCY]
7 | set -e
8 | set -x
9 | export CC=`which gcc`
10 | export CXX=`which g++`
11 | export FC=`which gfortran`
12 | ./hpcbench -install osu/7.0.1 gcc
13 | mkdir -p $HPCbench_ROOT/benchmark/network
14 | mkdir -p $HPCbench_ROOT/result/network
15 |
16 | [ENV]
17 |
18 | [APP]
19 | app_name = osu
20 | build_dir = $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/
21 | binary_dir = $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/libexec/osu-micro-benchmarks/mpi/pt2pt/
22 | case_dir = $HPCbench_ROOT/benchmark/network
23 |
24 | [BUILD]
25 |
26 | [CLEAN]
27 |
28 | [RUN]
29 | run = mpirun -np 2
30 | binary = osu_bibw
31 | nodes = 1
32 |
33 | [BATCH]
34 |
35 |
36 |
37 | [JOB1]
38 | #!/bin/bash
39 | #SBATCH --job-name=osu_bibw
40 | #SBATCH --partition={{ CPU_PARTITION }}
41 | #SBATCH -n 2
42 | #SBATCH --ntasks-per-node=1
43 | #SBATCH --exclusive
44 | #SBATCH --output=logs/osu_bibw.out
45 | #SBATCH --error=logs/osu_bibw.out
46 |
47 | mpirun -np 2 $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bibw |tee $HPCbench_ROOT/result/network/osu_bibw.log
48 |
49 | [JOB2]
50 | #!/bin/bash
51 | #SBATCH --job-name=osu_latency
52 | #SBATCH --partition={{ CPU_PARTITION }}
53 | #SBATCH -n 2
54 | #SBATCH --ntasks-per-node=1
55 | #SBATCH --exclusive
56 | #SBATCH --output=logs/osu_latency.out
57 | #SBATCH --error=logs/osu_latency.out
58 |
59 | mpirun -np 2 $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency|tee $HPCbench_ROOT/result/network/osu_latency.log
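# Optional post-processing (sketch; assumes the standard two-column "size  value"
# output of the OSU benchmarks, with header lines starting with '#'):
#   peak bidirectional bandwidth (MB/s):
#     grep -E '^[0-9]' $HPCbench_ROOT/result/network/osu_bibw.log | awk '$2>max{max=$2} END{print max}'
#   minimum point-to-point latency (us):
#     grep -E '^[0-9]' $HPCbench_ROOT/result/network/osu_latency.log | awk 'NR==1||$2<min{min=$2} END{print min}'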
60 |
61 |
--------------------------------------------------------------------------------
/templates/network/osu.x86_64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | 11.11.11.11
3 |
4 | [DOWNLOAD]
5 |
6 | [DEPENDENCY]
7 | set -e
8 | set -x
9 | export CC=`which gcc`
10 | export CXX=`which g++`
11 | export FC=`which gfortran`
12 | ./hpcbench -install osu/7.0.1 gcc
13 | mkdir -p $HPCbench_ROOT/benchmark/network
14 | mkdir -p $HPCbench_ROOT/result/network
15 |
16 | [ENV]
17 |
18 | [APP]
19 | app_name = osu
20 | build_dir = $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/
21 | binary_dir = $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/libexec/osu-micro-benchmarks/mpi/pt2pt/
22 | case_dir = $HPCbench_ROOT/benchmark/network
23 |
24 | [BUILD]
25 |
26 | [CLEAN]
27 |
28 | [RUN]
29 | run = mpirun -np 2
30 | binary = osu_bibw
31 | nodes = 1
32 |
33 | [BATCH]
34 |
35 |
36 |
37 | [JOB1]
38 | #!/bin/bash
39 | #SBATCH --job-name=osu_bibw
40 | #SBATCH --partition={{ CPU_PARTITION }}
41 | #SBATCH -n 2
42 | #SBATCH --ntasks-per-node=1
43 | #SBATCH --exclusive
44 | #SBATCH --output=logs/osu_bibw.out
45 | #SBATCH --error=logs/osu_bibw.out
46 |
47 | mpirun -np 2 $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bibw |tee $HPCbench_ROOT/result/network/osu_bibw.log
48 |
49 | [JOB2]
50 | #!/bin/bash
51 | #SBATCH --job-name=osu_latency
52 | #SBATCH --partition={{ CPU_PARTITION }}
53 | #SBATCH -n 2
54 | #SBATCH --ntasks-per-node=1
55 | #SBATCH --exclusive
56 | #SBATCH --output=logs/osu_latency.out
57 | #SBATCH --error=logs/osu_latency.err
58 |
59 | mpirun -np 2 $HPCbench_ROOT/software/libs/gcc${gcc_version_number}/osu/7.0.1/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency|tee $HPCbench_ROOT/result/network/osu_latency.log
60 |
61 |
--------------------------------------------------------------------------------
/templates/storage/ior.aarch64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | localhost
3 |
4 | [DOWNLOAD]
5 |
6 | [DEPENDENCY]
7 | export CC=`which mpicc`
8 | export CXX=`which mpic++`
9 | ./hpcbench -install ior/master gcc
10 |
11 | [ENV]
12 | module use ./software/moduledeps/gcc${gcc_version_number}/
13 | module load ior/master
14 | mkdir -p $HPCbench_ROOT/benchmark/storage/ior
15 | mkdir -p $HPCbench_ROOT/result/storage/ior
16 |
17 | [APP]
18 | app_name = ior
19 | build_dir = $IOR_PATH
20 | binary_dir = $IOR_PATH/bin
21 | case_dir = $HPCbench_ROOT/benchmark/storage/ior
22 |
23 | [BUILD]
24 |
25 | [CLEAN]
26 |
27 | [RUN]
28 | binary = ior
29 |
30 | [JOB1]
31 | #!/bin/bash
32 | #SBATCH --job-name=single_client_single_fluence
33 | #SBATCH --ntasks=1
34 | #SBATCH --ntasks-per-node=1
35 | #SBATCH --output=logs/single_client_single_fluence.out
36 | #SBATCH --error=logs/single_client_single_fluence.out
37 | #SBATCH -p {{ CPU_PARTITION }}
38 | #SBATCH --exclusive
39 |
40 | # Date Stamp for benchmark
41 | DS=`date +"%F_%H:%M:%S"`
42 | SEQ=1
43 | MAXPROCS=1
44 | IOREXE=ior
45 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
46 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
47 |
48 | # Overall data set size in GiB. Must be >=MAXPROCS. Should be a power of 2.
49 | DATA_SIZE=8
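# Worked example (illustrative): with SEQ=1 and DATA_SIZE=8 the loop below runs once
# with BSZ=8g, so a single ior task writes and then reads one 8 GiB file (4 iterations,
# -i 4) in 1 MiB transfers; doubling SEQ would halve the per-task block size while
# keeping the total data volume constant.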
50 | while [ ${SEQ} -le ${MAXPROCS} ]; do
51 | NPROC=`expr ${NCT:-1} \* ${SEQ}`
52 | BSZ=`expr ${DATA_SIZE} / ${SEQ}`"g"
53 | mpirun $IOREXE -v -w -r -i 4 \
54 | -o ${BASE_DIR}/ior-test1.file \
55 | -t 1m -b ${BSZ} | tee ${RESULT_DIR}/single_client_single_fluence.txt
56 | SEQ=`expr ${SEQ} \* 2`
57 | done
58 |
59 | [JOB2]
60 | #!/bin/bash
61 | #SBATCH --job-name="single_client_multi_fluence"
62 | #SBATCH -N 1
63 | #SBATCH --ntasks-per-node=64
64 | #SBATCH --output=logs/single_client_multi_fluence.out
65 | #SBATCH --error=logs/single_client_multi_fluence.out
66 | #SBATCH -p {{ CPU_PARTITION }}
67 | #SBATCH --exclusive
68 |
69 | IOREXE=ior
70 | NCT=2
71 |
72 | # Date Stamp for benchmark
73 | DS=`date +"%F_%H:%M:%S"`
74 | SEQ=8
75 | MAXPROCS=8
76 | DATA_SIZE=16
77 |
78 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
79 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
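# Worked example (illustrative): NCT=2 and SEQ=MAXPROCS=8 give a single pass with
# NPROC = 2*8 = 16 ranks and BSZ = 16/8 = 2g, i.e. 16 per-rank files (-F) of 2 GiB
# each (~32 GiB in total), written and read in 1 MiB transfers.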
80 |
81 | while [ ${SEQ} -le ${MAXPROCS} ]; do
82 | NPROC=`expr ${NCT} \* ${SEQ}`
83 | BSZ=`expr ${DATA_SIZE} / ${SEQ}`"g"
84 | mpirun -np ${NPROC} \
85 | ior -v -w -r -i 4 -F \
86 | -o ${BASE_DIR}/ior-test2.file \
87 | -t 1m -b ${BSZ} | tee ${RESULT_DIR}/single_client_multi_fluence.txt
88 | SEQ=`expr ${SEQ} \* 2`
89 | done
90 |
91 | [JOB3]
92 | #!/bin/bash
93 | #SBATCH --job-name="aggreagate_bandwidth"
94 | #SBATCH -N 2
95 | #SBATCH --ntasks-per-node=64
96 | #SBATCH --output=logs/aggreagate_bandwidth.out
97 | #SBATCH --error=logs/aggreagate_bandwidth.out
98 | #SBATCH -p {{ CPU_PARTITION }}
99 | #SBATCH --exclusive
100 |
101 | NCT=2
102 |
103 | # Date Stamp for benchmark
104 | SEQ=64
105 | MAXPROCS=128
106 | DATA_SIZE=128
107 |
108 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
109 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
110 |
111 | NCT=2 #`grep -v ^# hfile |wc -l`
112 | DS=`date +"%F_%H:%M:%S"`
113 | # Overall data set size in GiB. Must be >=MAXPROCS. Should be a power of 2.
114 |
115 | while [ ${SEQ} -le ${MAXPROCS} ]; do
116 | NPROC=`expr ${NCT} \* ${SEQ}`
117 | BSZ=`expr ${DATA_SIZE} / ${SEQ}`"g"
118 | # Alternatively, set to a static value and let the data size increase.
119 | # BSZ="1g"
120 | # BSZ="${DATA_SIZE}"
121 | mpirun \
122 | ior -v -w -r -i 4 -F \
123 | -o ${BASE_DIR}/ior-test3.file \
124 | -t 1m -b ${BSZ} | tee ${RESULT_DIR}/aggregation_bandwidth.txt
125 | SEQ=`expr ${SEQ} \* 2`
126 | done
127 |
128 | [JOB4]
129 | #!/bin/bash
130 | #SBATCH --job-name="iops"
131 | #SBATCH -N 5
132 | #SBATCH --ntasks-per-node=64
133 | #SBATCH --output=logs/iops.out
134 | #SBATCH --error=logs/iops.out
135 | #SBATCH -p {{ CPU_PARTITION }}
136 | #SBATCH --exclusive
137 |
138 | NCT=2
139 |
140 | # Date Stamp for benchmark
141 | SEQ=320
142 | MAXPROCS=320
143 | DATA_SIZE=640
144 |
145 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
146 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
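# Illustrative note: this job measures IOPS with 4 KiB transfers (-t 4k) into an
# 8 GiB per-rank block (-b 8g), write-only (-w) and file-per-process (-F); the IOPS
# figure is roughly the reported bandwidth divided by the 4 KiB transfer size
# (e.g. 400 MB/s ~= 100k write IOPS).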
147 | mpirun --mca btl_openib_allow_ib true ior -vv -e -g -w -F\
148 | -o ${BASE_DIR}/ior-test4.file \
149 | -t 4k -b 8g | tee ${RESULT_DIR}/iops.txt
150 | SEQ=`expr ${SEQ} \* 2`
151 |
152 |
--------------------------------------------------------------------------------
/templates/storage/ior.x86_64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | 11.11.11.11
3 |
4 | [DOWNLOAD]
5 |
6 | [DEPENDENCY]
7 | export CC=`which mpicc`
8 | export CXX=`which mpic++`
9 | ./hpcbench -install ior/master gcc
10 |
11 | [ENV]
12 | module use ./software/moduledeps/gcc${gcc_version_number}/
13 | module load ior/master
14 | mkdir -p $HPCbench_ROOT/benchmark/storage/ior
15 | mkdir -p $HPCbench_ROOT/result/storage/ior
16 |
17 | [APP]
18 | app_name = ior
19 | build_dir = $IOR_PATH
20 | binary_dir = $IOR_PATH/bin
21 | case_dir = $HPCbench_ROOT/benchmark/storage/ior
22 |
23 | [BUILD]
24 |
25 | [CLEAN]
26 |
27 | [RUN]
28 | binary = ior
29 |
30 | [JOB1]
31 | #!/bin/bash
32 | #SBATCH --job-name=single_client_single_fluence
33 | #SBATCH --ntasks=1
34 | #SBATCH --ntasks-per-node=1
35 | #SBATCH --output=logs/single_client_single_fluence.out
36 | #SBATCH --error=logs/single_client_single_fluence.out
37 | #SBATCH -p {{ CPU_PARTITION }}
38 |
39 | # Date Stamp for benchmark
40 | DS=`date +"%F_%H:%M:%S"`
41 | SEQ=1
42 | MAXPROCS=1
43 | IOREXE=ior
44 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
45 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
46 |
47 | # Overall data set size in GiB. Must be >=MAXPROCS. Should be a power of 2.
48 | DATA_SIZE=8
49 | while [ ${SEQ} -le ${MAXPROCS} ]; do
50 | NPROC=`expr ${NCT:-1} \* ${SEQ}`
51 | BSZ=`expr ${DATA_SIZE} / ${SEQ}`"g"
52 | mpirun $IOREXE -v -w -r -i 4 \
53 | -o ${BASE_DIR}/ior-test1.file \
54 | -t 1m -b ${BSZ} | tee ${RESULT_DIR}/single_client_single_fluence.txt
55 | SEQ=`expr ${SEQ} \* 2`
56 | done
57 |
58 | [JOB2]
59 | #!/bin/bash
60 | #SBATCH --job-name="single_client_multi_fluence"
61 | #SBATCH -N 1
62 | #SBATCH --ntasks-per-node={{ CPU_MAX_CORES }}
63 | #SBATCH --output=logs/single_client_multi_fluence.out
64 | #SBATCH --error=logs/single_client_multi_fluence.out
65 | #SBATCH -p {{ CPU_PARTITION }}
66 |
67 | IOREXE=ior
68 | NCT=2
69 |
70 | # Date Stamp for benchmark
71 | DS=`date +"%F_%H:%M:%S"`
72 | SEQ=8
73 | MAXPROCS=8
74 | DATA_SIZE=16
75 |
76 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
77 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
78 |
79 | while [ ${SEQ} -le ${MAXPROCS} ]; do
80 | NPROC=`expr ${NCT} \* ${SEQ}`
81 | BSZ=`expr ${DATA_SIZE} / ${SEQ}`"g"
82 | mpirun -np ${NPROC} \
83 | ior -v -w -r -i 4 -F \
84 | -o ${BASE_DIR}/ior-test2.file \
85 | -t 1m -b ${BSZ} | tee ${RESULT_DIR}/single_client_multi_fluence.txt
86 | SEQ=`expr ${SEQ} \* 2`
87 | done
88 |
89 | [JOB3]
90 | #!/bin/bash
91 | #SBATCH --job-name="aggreagate_bandwidth"
92 | #SBATCH -N 2
93 | #SBATCH --ntasks-per-node={{ CPU_MAX_CORES }}
94 | #SBATCH --output=logs/aggreagate_bandwidth.out
95 | #SBATCH --error=logs/aggreagate_bandwidth.out
96 | #SBATCH -p {{ CPU_PARTITION }}
97 |
98 | NCT=2
99 |
100 | # Date Stamp for benchmark
101 | SEQ=64
102 | MAXPROCS=128
103 | DATA_SIZE=128
104 |
105 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
106 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
107 |
108 | NCT=2 #`grep -v ^# hfile |wc -l`
109 | DS=`date +"%F_%H:%M:%S"`
110 | # Overall data set size in GiB. Must be >=MAXPROCS. Should be a power of 2.
111 |
112 | while [ ${SEQ} -le ${MAXPROCS} ]; do
113 | NPROC=`expr ${NCT} \* ${SEQ}`
114 | BSZ=`expr ${DATA_SIZE} / ${SEQ}`"g"
115 | # Alternatively, set to a static value and let the data size increase.
116 | # BSZ="1g"
117 | # BSZ="${DATA_SIZE}"
118 | mpirun \
119 | ior -v -w -r -i 4 -F \
120 | -o ${BASE_DIR}/ior-test3.file \
121 | -t 1m -b ${BSZ} | tee ${RESULT_DIR}/aggregation_bandwidth.txt
122 | SEQ=`expr ${SEQ} \* 2`
123 | done
124 |
125 | [JOB4]
126 | #!/bin/bash
127 | #SBATCH --job-name="iops"
128 | #SBATCH -N 5
129 | #SBATCH --ntasks-per-node={{ CPU_MAX_CORES }}
130 | #SBATCH --output=logs/iops.out
131 | #SBATCH --error=logs/iops.out
132 | #SBATCH -p {{ CPU_PARTITION }}
133 |
134 | NCT=2
135 |
136 | # Date Stamp for benchmark
137 | SEQ=320
138 | MAXPROCS=320
139 | DATA_SIZE=640
140 |
141 | BASE_DIR=$HPCbench_ROOT/benchmark/storage/ior
142 | RESULT_DIR=$HPCbench_ROOT/result/storage/ior
143 | mpirun --mca btl_openib_allow_ib true ior -vv -e -g -w -F\
144 | -o ${BASE_DIR}/ior-test4.file \
145 | -t 4k -b 8g | tee ${RESULT_DIR}/iops.txt
146 | SEQ=`expr ${SEQ} \* 2`
147 |
148 |
--------------------------------------------------------------------------------
/templates/storage/protocol/hadoop.aarch64.config:
--------------------------------------------------------------------------------
1 | [SERVER]
2 | 11.11.11.11
3 |
4 | [DOWNLOAD]
5 | hadoop/3.3.5 https://dlcdn.apache.org/hadoop/common/hadoop-3.3.5/hadoop-3.3.5.tar.gz
6 |
7 | [DEPENDENCY]
8 | mkdir -p $HPCbench_ROOT/benchmark/storage/protocol/hadoop_data
9 | mkdir -p $HPCbench_ROOT/benchmark/storage/protocol/hadoop
10 | mkdir -p $HPCbench_ROOT/result/storage/protocol/hadoop
11 | tar -xzf $HPCbench_DOWNLOAD/hadoop-3.3.5.tar.gz -C $HPCbench_ROOT/benchmark/storage/protocol/hadoop
12 | cd $HPCbench_ROOT/benchmark/storage/protocol/hadoop/hadoop-3.3.5
13 | ## Configure the JAVA and HADOOP paths
14 | ## vim etc/hadoop/hadoop-env.sh
15 | echo "export JAVA_HOME="/usr"" >> etc/hadoop/hadoop-env.sh
16 | echo "export HADOOP_HOME="$HPCbench_ROOT/benchmark/storage/protocol/hadoop/hadoop-3.3.5"" >> etc/hadoop/hadoop-env.sh
17 | export HADOOP_HOME=$HPCbench_ROOT/benchmark/storage/protocol/hadoop/hadoop-3.3.5
18 | export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
19 |
20 | ## Configure the Hadoop access IP and the data storage directory
21 | cat > etc/hadoop/core-site.xml << \EOF
22 |
--------------------------------------------------------------------------------
/utils/report_tmp.html:
--------------------------------------------------------------------------------
<table>
  <tr><th>维度</th><th>指标</th><th>实测值</th><th>参考值</th><th>指标分数</th><th>权重</th><th>维度分</th></tr>
  <tr><td rowspan="2">计算性能</td><td>HPL双精度浮点计算性能(PFLOPS)</td><td>{{ "%.2f"|format(test.compute.HPL) }}</td><td>{{ compute.HPL[scale] }}</td><td>{{ "%.2f"|format(compute.HPL.score) }}</td><td>{{ compute.HPL.weights }}</td><td rowspan="2">{{ "%.2f"|format(compute.issue_score) }}</td></tr>
  <tr><td>HPCG双精度浮点计算性能(GFLOPS)</td><td>{{ "%.2f"|format(test.compute.HPCG) }}</td><td>{{ compute.HPCG[scale] }}</td><td>{{ "%.2f"|format(compute.HPCG.score) }}</td><td>{{ compute.HPCG.weights }}</td></tr>
  <tr><td rowspan="2">AI计算性能</td><td>图像推理任务的计算性能(Fig/s)</td><td>{{ "%.2f"|format(test.AI.infering) }}</td><td>{{ AI.infering[scale] }}</td><td>{{ "%.2f"|format(AI.infering.score) }}</td><td>{{ AI.infering.weights }}</td><td rowspan="2">{{ "%.2f"|format(AI.issue_score) }}</td></tr>
  <tr><td>图像训练任务的计算性能(Fig/s)</td><td>{{ "%.2f"|format(test.AI.training) }}</td><td>{{ AI.training[scale] }}</td><td>{{ "%.2f"|format(AI.training.score) }}</td><td>{{ AI.training.weights }}</td></tr>
  <tr><td rowspan="5">存储性能</td><td>文件系统单客户端单流带宽(GB/s)</td><td>{{ "%.2f"|format(test.storage.single_client_single_fluence) }}</td><td>{{ storage.single_client_single_fluence[scale] }}</td><td>{{ "%.2f"|format(storage.single_client_single_fluence.score) }}</td><td>{{ storage.single_client_single_fluence.weights }}</td><td rowspan="5">{{ "%.2f"|format(storage.issue_score) }}</td></tr>
  <tr><td>文件系统单客户端多流带宽(GB/s)</td><td>{{ "%.2f"|format(test.storage.single_client_multi_fluence) }}</td><td>{{ storage.single_client_multi_fluence[scale] }}</td><td>{{ "%.2f"|format(storage.single_client_multi_fluence.score) }}</td><td>{{ storage.single_client_multi_fluence.weights }}</td></tr>
  <tr><td>文件系统聚合带宽(GB/s)</td><td>{{ "%.2f"|format(test.storage.aggregation_bandwidth) }}</td><td>{{ storage.aggregation_bandwidth[scale] }}</td><td>{{ "%.2f"|format(storage.aggregation_bandwidth.score) }}</td><td>{{ storage.aggregation_bandwidth.weights }}</td></tr>
  <tr><td>文件系统聚合IO操作速率(IOPS)</td><td>{{ "%d"|format(test.storage.IO_rate) }}</td><td>{{ storage.IO_rate[scale] }}</td><td>{{ "%.2f"|format(storage.IO_rate.score) }}</td><td>{{ storage.IO_rate.weights }}</td></tr>
  <tr><td>多协议平均访问效率(%)</td><td>{{ "%.2f"|format(test.storage.multi_request) }}</td><td>{{ storage.multi_request[scale] }}</td><td>{{ "%.2f"|format(storage.multi_request.score) }}</td><td>{{ storage.multi_request.weights }}</td></tr>
  <tr><td rowspan="3">网络性能</td><td>点对点网络带宽(Gbps)</td><td>{{ "%.2f"|format(test.network.P2P_network_bandwidth) }}</td><td>{{ network.P2P_network_bandwidth[scale] }}</td><td>{{ "%.2f"|format(network.P2P_network_bandwidth.score) }}</td><td>{{ network.P2P_network_bandwidth.weights }}</td><td rowspan="3">{{ "%.2f"|format(network.issue_score) }}</td></tr>
  <tr><td>点对点消息延迟(μs)</td><td>{{ test.network.P2P_message_latency }}</td><td>{{ network.P2P_message_latency[scale] }}</td><td>{{ "%.2f"|format(network.P2P_message_latency.score) }}</td><td>{{ network.P2P_message_latency.weights }}</td></tr>
  <tr><td>网络对分带宽与注入带宽比值</td><td>{{ "%.2f"|format(test.network.ratio) }}</td><td>{{ network.ratio[scale] }}</td><td>{{ "%.2f"|format(network.ratio.score) }}</td><td>{{ network.ratio.weights }}</td></tr>
  <tr><td rowspan="2">系统能效</td><td>单位功耗的浮点计算性能(FLOPS/W)</td><td>{{ "%.2f"|format(test.system.compute_efficiency) }}</td><td>{{ system.compute_efficiency[scale] }}</td><td>{{ "%.2f"|format(system.compute_efficiency.score) }}</td><td>{{ system.compute_efficiency.weights }}</td><td rowspan="2">{{ "%.2f"|format(system.issue_score) }}</td></tr>
  <tr><td>单位功耗的文件系统聚合IO操作速率(TB/W)</td><td>{{ "%.2f"|format(test.system.IO_operation_rate) }}</td><td>{{ system.IO_operation_rate[scale] }}</td><td>{{ "%.2f"|format(system.IO_operation_rate.score) }}</td><td>{{ system.IO_operation_rate.weights }}</td></tr>
  <tr><td rowspan="5">系统平衡性</td><td>内存容量与处理器核心数的比值</td><td>{{ "%.2f"|format(test.balance.mem2cpu) }}</td><td>{{ balance.mem2cpu[scale] }}</td><td>{{ "%.2f"|format(balance.mem2cpu.score) }}</td><td>{{ balance.mem2cpu.weights }}</td><td rowspan="5">{{ "%.2f"|format(balance.issue_score) }}</td></tr>
  <tr><td>BurstBuffer与内存的容量比</td><td>{{ "%.2f"|format(test.balance.buffer2mem) }}</td><td>{{ balance.buffer2mem[scale] }}</td><td>{{ "%.2f"|format(balance.buffer2mem.score) }}</td><td>{{ balance.buffer2mem.weights }}</td></tr>
  <tr><td>并行文件系统与BurstBuffer的容量比</td><td>{{ "%.2f"|format(test.balance.file2buffer) }}</td><td>{{ balance.file2buffer[scale] }}</td><td>{{ "%.2f"|format(balance.file2buffer.score) }}</td><td>{{ balance.file2buffer.weights }}</td></tr>
  <tr><td>内存与BurstBuffer的带宽比</td><td>{{ "%.2f"|format(test.balance.mem2buffer) }}</td><td>{{ balance.mem2buffer[scale] }}</td><td>{{ "%.2f"|format(balance.mem2buffer.score) }}</td><td>{{ balance.mem2buffer.weights }}</td></tr>
  <tr><td>BurstBuffer与并行文件系统的带宽比</td><td>{{ "%.2f"|format(test.balance.buffer2file) }}</td><td>{{ balance.buffer2file[scale] }}</td><td>{{ "%.2f"|format(balance.buffer2file.score) }}</td><td>{{ balance.buffer2file.weights }}</td></tr>
</table>

• 该集群HPL性能{{ test.compute.HPL }}PF,属于{{ scale_CN }}型系统
{% if (good | length > 0) and (better | length > 0) %}
• 集群在{{ good }}方面性能较好,在{{ better }}方面有待提高
{% elif (good | length == 0) and (better | length > 0) %}
• 集群在{{ better }}方面有待提高
{% else %}
• 集群在{{ good }}方面性能较好
{% endif %}
• 集群的综合分数为{{ "%.2f"|format(sum_score) }}分