├── CONTRIBUTING.md
├── LICENSE
├── README.md
└── doRoCE.sh


/CONTRIBUTING.md:
--------------------------------------------------------------------------------
## Individual Contributor License Agreement (CLA)

**Thank you for submitting your contributions to this project.**

By signing this CLA, you agree that the following terms apply to all of your past, present and future contributions
to the project.

### License.

You hereby represent that all present, past and future contributions are governed by the
[MIT License](https://opensource.org/licenses/MIT)
copyright statement.

This entails that to the extent possible under law, you transfer all copyright and related or neighboring rights
of the code or documents you contribute to the project itself or its maintainers.
Furthermore you also represent that you have the authority to perform the above waiver
with respect to the entirety of your contributions.

### Moral Rights.

To the fullest extent permitted under applicable law, you hereby waive, and agree not to
assert, all of your “moral rights” in or relating to your contributions for the benefit of the project.

### Third Party Content.

If your Contribution includes or is based on any source code, object code, bug fixes, configuration changes, tools,
specifications, documentation, data, materials, feedback, information or other works of authorship that were not
authored by you (“Third Party Content”) or if you are aware of any third party intellectual property or proprietary
rights associated with your Contribution (“Third Party Rights”),
then you agree to include with the submission of your Contribution full details respecting such Third Party
Content and Third Party Rights, including, without limitation, identification of which aspects of your
Contribution contain Third Party Content or are associated with Third Party Rights, the owner/author of the
Third Party Content and Third Party Rights, where you obtained the Third Party Content, and any applicable
third party license terms or restrictions respecting the Third Party Content and Third Party Rights. For greater
certainty, the foregoing obligations respecting the identification of Third Party Content and Third Party Rights
do not apply to any portion of a Project that is incorporated into your Contribution to that same Project.

### Representations.

You represent that, other than the Third Party Content and Third Party Rights identified by
you in accordance with this Agreement, you are the sole author of your Contributions and are legally entitled
to grant the foregoing licenses and waivers in respect of your Contributions. If your Contributions were
created in the course of your employment with your past or present employer(s), you represent that such
employer(s) has authorized you to make your Contributions on behalf of such employer(s) or such
employer(s) has waived all of their right, title or interest in or to your Contributions.

### Disclaimer.

To the fullest extent permitted under applicable law, your Contributions are provided on an "as is"
basis, without any warranties or conditions, express or implied, including, without limitation, any implied
warranties or conditions of non-infringement, merchantability or fitness for a particular purpose. You are not
required to provide support for your Contributions, except to the extent you desire to provide support.

### No Obligation.

You acknowledge that the maintainers of this project are under no obligation to use or incorporate your contributions
into the project. The decision to use or incorporate your contributions into the project will be made at the
sole discretion of the maintainers or their authorized delegates.

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2020 NVIDIA Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# doRoCE linux

doRoCE configures NICs for optimal RoCE deployments and makes the configuration
persistent by installing itself into the driver boot process.
It supports both MLNX_OFED and upstream driver deployments.

NOTE - this script aggregates steps described in the Mellanox-NVIDIA community
pages and is provided as a reference for recipe implementation.

It is recommended for use during bring-up. For production environments,
implement only the components your deployment requires.

Use `-h` for more details.
--------------------------------------------------------------------------------
/doRoCE.sh:
--------------------------------------------------------------------------------
#!/bin/bash
# The MIT License (MIT)
#
# Copyright (c) 2020, NVIDIA CORPORATION
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of
# this software and associated documentation files (the "Software"), to deal in
# the Software without restriction, including without limitation the rights to
# use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
# the Software, and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
# FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
# COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
# IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
VERSION=0.98
# TODOs
#- multi-port

uninst=0
run_once=0
lossless=0
lossy=0
rttcc=0
selective_repeat=0
debug=0
verbose=0
device_list=()
y_n=""
tos_val=106
gix_val=3
mtu_val=""
set_default=0
trust="dscp"
trust_val="2"
inparams=$@
specific_devices_selected=0

# Ask a yes/no question; a preset answer (-y / -n) skips the prompt.
function yn_question ()
{
    text=$1
    while true; do
        if [ -z "$y_n" ] ; then read -p "$text (Yy/Nn) " yn
        else yn=$y_n
        fi
        case $yn in
            [Yy]* ) return 0;;
            [Nn]* ) return 1;;
            * ) echo "Yy/Nn";;
        esac
    done
}

# Ask whether to continue without a missing component; exit on "no".
function yn_question_cont_wo ()
{
    text="Continue without $1? (not recommended)"
    yn_question "$text"
    if [ 0 -ne $? ] ; then
        echo "Exiting"
        exit 0
    fi
}

# Run a command line as root; report failures, echo output in verbose mode.
function run_cmd ()
{
    cmd_name=$1
    cmd_line=$2
    care=`sudo bash -c "$cmd_line" 2>&1`
    err=$?
    if [ 0 -ne $err ] ; then
        echo "[E] Failed to run $cmd_name (err $err)"
        echo "[E] Failed command output:"
        echo "$cmd_line" ; echo "$care"
        return $err
    fi
    if [ 1 -eq $verbose ] ; then
        echo "-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-"
        echo "[V] Running $cmd_name:"
        echo "$cmd_line"
        echo "$care"
    fi
}

# Make sure configfs is mounted and the rdma_cm directory is available.
function mount_cm_configfs()
{
    if (! sudo cat /proc/mounts | \grep /sys/kernel/config > /dev/null) ; then
        if (! sudo mount -t configfs none /sys/kernel/config) ; then
            echo "[E] Failed to mount configfs"
            return 1
        fi
    fi

    if (sudo modinfo configfs &> /dev/null) ; then
        if (! cat /proc/modules | \grep configfs > /dev/null) ; then
            if (! sudo modprobe configfs) ; then
                echo "[E] Failed to modprobe configfs"
                return 1
            fi
        fi
    fi
    if [ ! -d /sys/kernel/config/rdma_cm ] ; then return 1 ; fi
}

# Set the RDMA-CM default TOS on every port of the given device.
function set_cm_tos()
{
    dev=$1
    dev_path="/sys/kernel/config/rdma_cm/${dev}"
    rem_after_set=0
    if [ ! -d $dev_path ] ; then
        rem_after_set=1
        run_cmd "create configfs dir for $dev" "mkdir $dev_path"
    fi
    for port in ${dev_path}/ports/* ; do
        run_cmd "set TOS for $dev, port `basename $port`" "bash -c \"echo $tos_val > ${port}/default_roce_tos\""
    done
    if [ 1 -eq $rem_after_set ] ; then
        run_cmd "remove configfs dir for $dev" "rmdir $dev_path"
    fi
}

\echo ""
\echo " DoRoCE Version $VERSION"
\echo "---------------------------"
\echo " NOTE - this script aggregates steps described in the Mellanox-NVIDIA community"
\echo " pages and is provided as a reference for recipe implementation"
\echo " "
\echo " It is recommended for use during bring-up and that you implement only"
\echo " required components for deployment in production environments"
\echo ""

for arg in "$@"
do
    case "$arg" in
        -h|--help|--h)
            \echo ""
            \echo " DoRoCE script configures Mellanox-NVIDIA NICs for RoCE deployments"
            \echo ""
            \echo " Usage: ./doRoCE.sh (options)"
            \echo ""
            \echo " Options:"
            \echo "   --run_once - don't install to driver boot process, only run configuration"
            \echo "   --uninstall - remove from boot process, don't run configuration"
            \echo "   -d - comma separated RDMA device list (for example: mlx5_0)"
            \echo "        if '-d' not provided, tool will configure all found devices"
            \echo "   -t - set TOS value (default: $tos_val) DSCP=TOS>>2, PRIO=DSCP>>3"
            \echo "   -m - set MTU value (default: don't change)"
            \echo "   -g - set NCCL conf GID-index value (default: $gix_val)"
            \echo "   -l / --lossless_opt - assume lossless configuration for performance optimizations (default: $lossless)"
            \echo "        use this option if you configured a Mellanox-NVIDIA switch with \"roce\" command"
            \echo "   -s / --lossy_buf - disable PFC, use single larger buffer for all traffic types (default: $lossy)"
            \echo "   -c / --rttcc - force the usage of ZTR-RTTCC congestion control (default: nvconfig & ECE synchronization)"
            \echo "        nvconfig parameter: ROCE_CC_LEGACY_DCQCN=False"
            \echo "        this configuration is required on both ports for dual port NIC"
            \echo "   -r / --selective_repeat - force the usage of Selective Repeat retransmission mechanism (default: nvconfig & ECE synchronization)"
            \echo "        nvconfig parameter: RDMA_SELECTIVE_REPEAT_EN"
            \echo "   -u / --debug - add debug prints"
            \echo "   -v / --verbose - print commands and outputs"
            \echo "   -y / --yes - ignore errors and proceed with what's available (default - ask)"
            \echo "   -n / --no - exit on any missing component"
            \echo "   -b / --back_to_def - restore OOB config (note - will not restore MTU, please set it manually)"
            \echo ""
            \echo " List of configurations performed:"
            \echo "   - Installs the script (with selected parameters) to driver boot process"
            \echo "   - Set trust mode to DSCP"
            \echo "   - Enable/disable PFC on priority (TOS>>5) - aligns with default DSCP-to-Priority mapping"
            \echo "   - Enable/disable lossless performance optimizations"
            \echo "   - Set /etc/nccl.conf to TOS=106"
            \echo "        note: conf files are set once, not on every boot"
            \echo "        note: UCX uses UCX_IB_TRAFFIC_CLASS=106 by default. Change through command line, as conf file isn't supported yet"
            \echo "   - Set IB VERB override to TOS=106"
            \echo "   - Set RDMA-CM default TOS"
            \echo ""
            exit 5;
            ;;

        "--uninstall") uninst=1;;
        "--run_once") run_once=1;;
        "-l"|"--lossless_opt") lossless=1;;
        "-s"|"--lossy_buf") lossy=1;;
        "-c"|"--rttcc") rttcc=1;;
        "-r"|"--selective_repeat") selective_repeat=1;;
        "-u"|"--debug") debug=1;;
        "-v"|"--verbose") verbose=1;;
        "-y"|"--yes") y_n="y";;
        "-n"|"--no") y_n="n";;
        "-b"|"--back_to_def") set_default=1;;
        # Options that take a value: remember the letter, consume the value on the next iteration
        -d) p_arg=${arg##"-"} ;;
        -t) p_arg=${arg##"-"} ;;
        -m) p_arg=${arg##"-"} ;;
        -g) p_arg=${arg##"-"} ;;
        *) case $p_arg in
               d) device_list=(${arg//,/ }) ; p_arg="" ;;
               t) tos_val="$arg" ; p_arg="" ;;
               m) mtu_val="$arg" ; p_arg="" ;;
               g) gix_val="$arg" ; p_arg="" ;;
               *) echo "[E] Unknown parameter, see help (-h/--help)" ; exit 5 ;;
           esac
    esac
done

if [ -d "/etc/infiniband" ] && [ -f /etc/init.d/openibd ] ; then
    psh_caller="openibd"
    psh_path="/etc/infiniband/post-start-hook.sh"
else
    psh_caller="rc.local"
    psh_path="/etc/rc.d/rc.local"
fi

nccl_conf_path="/etc/nccl.conf"
if [ 1 -eq $uninst ] || [ 1 -eq $set_default ] ; then
    echo "[I] Removing NCCL conf hook"
    if [ -f $nccl_conf_path ] ; then
        run_cmd "Clear NCCL conf DSCP" "\sed -i -- '/doRoCE\|NCCL_IB_TC\|NCCL_IB_GID_INDEX/I d' $nccl_conf_path"
    fi
    if [ 1 -eq $uninst ] ; then
        echo "[I] Removing script from boot process"
        if [ -f $psh_path ] ; then
            run_cmd "Clear post-start hook" "sed -i -- '/doRoCE/ d' $psh_path"
        fi
        echo "[I] Removing script from /usr/bin"
        run_cmd "Remove script from /usr/bin" "rm -f /usr/bin/doRoCE.sh"
        exit 0
    fi
fi

if [ ! -d /sys/bus/pci/drivers/mlx5_core/ ] ; then
    echo "[E] mlx5 driver is down, exiting"
    exit 6
fi

if [ 1 -eq $lossless ] && [ 1 -eq $lossy ] ; then echo "[E] Lossy and lossless can't be configured at the same time, exiting" ; exit 7 ; fi


if [ 1 -eq $set_default ] ; then
    lossless=0
    lossy=1
    rttcc=0
    selective_repeat=0
    tos_val=0
    trust="pcp"
    trust_val=1
fi

# Priority = TOS>>5; cmd mask selects the priority, set mask enables PFC on it unless lossy
pfc_cmd_mask=$((1 << ($tos_val>>5)))
pfc_set_mask=$((!$lossy << ($tos_val>>5)))

if [ 1 -eq $debug ] ; then echo -n "[D] PFC-MASK=" ; printf "0x%.2x\n" $pfc_set_mask ; fi

if [ 1 -eq $debug ] ; then echo "[D] checking for mlxreg/mstreg" ; fi
mlxreg_cmd=""
if (which mlxreg >/dev/null 2>&1) || [ -f "/usr/bin/mlxreg" ] ; then mlxreg_cmd="mlxreg"
elif (which mstreg >/dev/null 2>&1) || [ -f "/usr/bin/mstreg" ] ; then mlxreg_cmd="mstreg"
else
    echo "[E] Could not find mlxreg/mstreg tool in \$PATH"
    echo "to install: install MLNX_OFED, or:"
    echo " "
    echo "# git clone https://github.com/Mellanox/mstflint.git"
    echo "# cd mstflint"
    echo "# ./autogen.sh"
    echo "# ./configure --disable-inband --enable-adb-generic-tools"
    echo "# make"
    echo "# sudo make install"

    yn_question_cont_wo "PFC, trust layer and lossy fabric accelerations"
fi

if [ 1 -eq $debug ] ; then echo "[D] checking for mlxconfig/mstconfig" ; fi
mlxconfig_cmd=""
if (which mlxconfig >/dev/null 2>&1) || [ -f "/usr/bin/mlxconfig" ] ; then mlxconfig_cmd="mlxconfig"
elif (which mstconfig >/dev/null 2>&1) || [ -f "/usr/bin/mstconfig" ] ; then mlxconfig_cmd="mstconfig"
else
    echo "[E] Could not find mlxconfig/mstconfig tool in \$PATH"
    echo "to install: install MLNX_OFED, or:"
    echo " "
    echo "# git clone https://github.com/Mellanox/mstflint.git"
    echo "# cd mstflint"
    echo "# ./autogen.sh"
    echo "# ./configure --disable-inband --enable-adb-generic-tools"
    echo "# make"
    echo "# sudo make install"

    yn_question_cont_wo "PFC, trust layer and lossy fabric accelerations"
fi

if [ 1 -eq $debug ] ; then echo "[D] checking for RDMA-CM configfs" ; fi
cm_configfs_found=1
if (! mount_cm_configfs) ; then
    cm_configfs_found=0
    yn_question_cont_wo "setting RDMA-CM default TOS"
fi

mlnx_qos_found=0
if [ 1 -eq $debug ] ; then echo "[D] checking for mlnx_qos/lldptool" ; fi
if (which mlnx_qos >/dev/null 2>&1) || [ -f "/usr/bin/mlnx_qos" ] ; then
    mlnx_qos_found=1
fi

# Install to /usr/bin
PARENT_COMMAND=$(ps -o comm= $PPID)
if [ "$PARENT_COMMAND" = "$psh_caller" ] ; then let run_once=1 ; fi
if [ 0 -eq $run_once ] ; then
    mypath=`realpath $0`
    if [ 0 -ne $? ] ; then
        echo "[E] Could not determine current path, exiting"
        exit 5
    fi
    if [ "$mypath" != "/usr/bin/doRoCE.sh" ] ; then
        if [ 1 -eq $debug ] ; then echo "[D] Installing to /usr/bin" ; fi
        run_cmd "Copy to /usr/bin" "sudo cp -f $mypath /usr/bin/doRoCE.sh && chmod a+x /usr/bin/doRoCE.sh"
    fi

    if [ 1 -eq $debug ] ; then echo "[D] Adding to OFED post-start-hook" ; fi
    if [ -f $psh_path ] ; then
        run_cmd "Clear post-start-hook" "sed -i -- '/doRoCE/I d' $psh_path"
    fi
    run_cmd "Add post-start-hook" "echo -e \"# Added by doRoCE script:\n/usr/bin/doRoCE.sh $inparams --yes >/dev/null\" >> $psh_path"
    if [ ! -x $psh_path ] ; then run_cmd "Set post-start-hook +x" "chmod a+x $psh_path" ; fi

fi

if [ "$PARENT_COMMAND" != "$psh_caller" ] && [ 1 -ne $set_default ] ; then
    # Set nccl.conf
    if [ 1 -eq $debug ] ; then echo "[D] setting NCCL conf" ; fi
    if [ -f $nccl_conf_path ] ; then
        run_cmd "Clear NCCL conf DSCP" "\sed -i -- '/doRoCE\|NCCL_IB_TC\|NCCL_IB_GID_INDEX/I d' $nccl_conf_path"
    fi
    run_cmd "Add NCCL conf DSCP" "echo -e \"# Added by doRoCE script:\nNCCL_IB_TC=$tos_val\nNCCL_IB_GID_INDEX=$gix_val\" >> $nccl_conf_path"

    #set ucx.conf - not supported by UCX yet!
    if [ 106 -ne $tos_val ] ; then
        echo "[I] NOTE - for UCX, make sure to add to the command line: \"UCX_IB_TRAFFIC_CLASS=$tos_val\""
    fi
fi

if [ -z "$device_list" ] ; then
    for dev in `\ls /sys/class/infiniband/` ; do
        device_list+=("$dev")
    done
else
    specific_devices_selected=1
fi

if [ 1 -eq $debug ] ; then echo "[I] Device list: ${device_list[@]}" ; fi
for dev in ${device_list[@]} ; do
    if [ 1 -eq $debug ] ; then echo "[D] Starting device $dev" ; fi

    # Get device info
    dev_linktype=`\cat /sys/class/infiniband/${dev}/ports/1/link_layer`
    if [[ "Ethernet" != "$dev_linktype" ]] ; then
        echo "[I] Device $dev - link type $dev_linktype, skipping"
        continue
    fi
    bdf=`\readlink /sys/class/infiniband/${dev}/device | \xargs basename`
    netdev=`\ls /sys/class/infiniband/${dev}/device/net/ | \xargs basename`

    if [ 1 -eq $debug ] ; then echo "[D] Device $dev - bdf: $bdf, netdev: $netdev" ; fi


    if [ ! -z $mlxreg_cmd ] ; then
        # Configure PFC, trust mode
        if [ 1 -eq $mlnx_qos_found ] ; then
            # Expand the PFC bitmask into the per-priority list mlnx_qos expects (priority 0 first)
            mlnx_qos_pfc_mask=""
            for i in {0..7} ; do
                mlnx_qos_pfc_mask+="$(( ($pfc_set_mask>>$i) & 0x1 ))"
                if [ 7 -ne $i ] ; then mlnx_qos_pfc_mask+="," ; fi
            done
            run_cmd mlnx_qos "mlnx_qos -i $netdev --trust=$trust --pfc=$mlnx_qos_pfc_mask"
        else
            run_cmd "Set trust DSCP" "$mlxreg_cmd -y -d $bdf --reg_name QPTS -i \"local_port=1\" --set \"trust_state=$trust_val\""
            run_cmd "Set PFC" "$mlxreg_cmd -y -d $bdf --reg_name PFCC -i \"local_port=1,pnat=0,dcbx_operation_type=0\" --set \"prio_mask_rx=${pfc_cmd_mask},prio_mask_tx=${pfc_cmd_mask},pfctx=${pfc_set_mask},pfcrx=${pfc_set_mask},pprx=0,pptx=0\""
        fi

        # Configure lossy accelerations
        accl_val=$((1-($lossless || $set_default)))
        run_cmd "Set lossy optimizations" "$mlxreg_cmd -y -d $bdf --reg_name ROCE_ACCL --set \"roce_adp_retrans_en=$accl_val,roce_tx_window_en=$accl_val,roce_slow_restart_en=$accl_val\""

        # Get device type; Activate the following based on minimum device level; ConnectX-6 Dx onwards
        dev_fw_version=`\cat /sys/class/infiniband/${dev}/fw_ver`
        dev_type=$(echo $dev_fw_version | cut -c -2)
        ga_release_version=$(echo $dev_fw_version | cut -c 4-5)

        # Set selective repeat
        if [ 1 -eq $selective_repeat ] ; then
            if ((22 > $dev_type)) ; then
                echo "[I] Device $dev - does not support Selective Repeat, skipping"
                continue
            fi
            accl_val=$((1-$set_default))
            run_cmd "Set selective repeat" "$mlxreg_cmd -y -d $bdf --reg_name ROCE_ACCL --set \"selective_repeat_forced_en=$accl_val,adaptive_routing_forced_en=0\""
        fi

        # Set ZTR-RTTCC
        if [ 1 -eq $rttcc ] ; then
            if ((22 > $dev_type)) ; then
                echo "[I] Device $dev - does not support ZTR-RTTCC, skipping"
                continue
            fi

            if ((37 > $ga_release_version)) ; then
                echo "[I] Use a newer GA (37 and above) for the below to work properly, skipping"
                continue
            fi

            # Configure ZTR-RTTCC congestion control
            echo "[I] Configure ZTR-RTTCC congestion control"

            if [[ "22" == "$dev_type" ]] ; then
                echo "[I] Device $dev - is ConnectX-6 Dx, check if legacy DCQCN is enabled"
                # verify device is NOT in legacy DCQCN congestion control mode
                mlxconfig_out=$($mlxconfig_cmd -d $bdf q ROCE_CC_LEGACY_DCQCN | grep ROCE_CC_LEGACY_DCQCN)
                if [[ $mlxconfig_out == *"True"* ]] ; then
                    echo "[I] Device $dev - DCQCN congestion control in use, skipping"
                    echo "[I] disable ROCE_CC_LEGACY_DCQCN with mlxconfig and reset the device"
                    continue
                fi
            fi

            run_cmd "Activate ZTR-RTTCC congestion control" "$mlxreg_cmd -y -d $bdf --reg_name PPCC --set \"cmd_type=2\" --indexes \"local_port=1,pnat=0,lp_msb=0,algo_slot=15,algo_param_index=0\""
            if [ 1 -eq $specific_devices_selected ] ; then
                echo "[I] Activating ZTR-RTTCC is required on both ports for dual port NIC!"
            fi
        fi
    fi
    # Set MTU
    if [ ! -z $mtu_val ] ; then
        run_cmd "Set MTU" "ifconfig $netdev mtu $mtu_val"
    fi

    # Set verb default DSCP
    tc_filename="/sys/class/infiniband/${dev}/tc/1/traffic_class"
    if [ -f $tc_filename ] ; then
        run_cmd "Set verbs default DSCP" "echo $tos_val > ${tc_filename}"
    else
        echo "[E] Could not find $tc_filename, used to force verbs interface TCLASS"
        echo "[E] Make sure to configure TCLASS in your applications"
    fi

    # Set RDMA-CM (only when the rdma_cm configfs directory was found earlier)
    if [ 1 -eq $cm_configfs_found ] ; then
        set_cm_tos $dev
    fi

    # if back_to_def - set global pause
    if [ 1 -eq $set_default ] ; then
        care=`run_cmd "Back to default - set global pause" "ethtool -A $netdev rx on tx on"`
        # run_cmd returns the failing command's exit code, which is not necessarily 1
        if [ 0 -ne $? ] ; then echo "$care" ; fi
    fi
    echo "[I] Device $dev - done"
done


--------------------------------------------------------------------------------
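
The help text in doRoCE.sh states the relations DSCP=TOS>>2 and PRIO=DSCP>>3, and the script derives its PFC masks from TOS>>5. The sketch below works through that arithmetic for the default TOS of 106. It is only an illustration: the helper names (`tos_to_dscp`, `tos_to_prio`, `pfc_mask`, `pfc_list`) are hypothetical and do not exist in doRoCE.sh, and it assumes the default DSCP-to-priority mapping mentioned in the help text.

```shell
#!/bin/bash
# Illustrative sketch; these helpers are hypothetical and not part of doRoCE.sh.

# DSCP is the upper six bits of the TOS byte.
tos_to_dscp () { echo $(( $1 >> 2 )) ; }

# With the default DSCP-to-priority mapping, priority = DSCP >> 3 = TOS >> 5.
tos_to_prio () { echo $(( $1 >> 5 )) ; }

# PFC bitmask: one bit set at the priority, or 0 when lossy=1 (PFC disabled).
# Arguments: TOS value, lossy flag (0 or 1).
pfc_mask () {
    local tos=$1 lossy=$2
    echo $(( (lossy ? 0 : 1) << (tos >> 5) ))
}

# Comma-separated per-priority list, priority 0 first,
# matching the string the script builds for mlnx_qos --pfc.
pfc_list () {
    local mask=$1 out="" i
    for i in {0..7} ; do
        out+="$(( (mask >> i) & 1 ))"
        if [ "$i" -ne 7 ] ; then out+="," ; fi
    done
    echo "$out"
}

tos=106
printf 'TOS=%d DSCP=%d PRIO=%d PFC-MASK=0x%.2x PFC-LIST=%s\n' \
    "$tos" "$(tos_to_dscp $tos)" "$(tos_to_prio $tos)" \
    "$(pfc_mask $tos 0)" "$(pfc_list "$(pfc_mask $tos 0)")"
# prints: TOS=106 DSCP=26 PRIO=3 PFC-MASK=0x08 PFC-LIST=0,0,0,1,0,0,0,0
```

With `-s` / `--lossy_buf` the lossy flag is 1, so the mask is 0 and every entry in the list is 0, which disables PFC on all priorities.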