6 |
7 |
8 |
9 | mlag-vip cray-mlag-domain ip 192.168.255.242 /29 force
10 | no mlag shutdown
11 | mlag system-mac 00:00:5E:00:01:01
12 | interface port-channel 100 ipl 1
13 | interface vlan 4000 ipl 1 peer-address 192.168.255.253
14 | |
15 |
16 |
17 |
18 |
19 | mlag-vip cray-mlag-domain ip 192.168.255.242 /29 force
20 | no mlag shutdown
21 | mlag system-mac 00:00:5E:00:01:5D
22 | interface port-channel 100 ipl 1
23 | interface vlan 4000 ipl 1 peer-address 192.168.255.254
24 | |
25 |
26 |
27 |
28 | [Back to Index](../README.md)
29 |
--------------------------------------------------------------------------------
/operations/network/management_network/mellanox/verify-switches_are_forwarding_dhcp_traffic.md:
--------------------------------------------------------------------------------
1 | # Verify the switches are forwarding DHCP traffic
2 |
3 | If all previous checks passed and the node still cannot PXE boot, the IP helper on the switch may have stopped forwarding DHCP traffic.
4 |
5 | [Back to Index](../README.md)
6 |
--------------------------------------------------------------------------------
/operations/network/management_network/mellanox/verify_bgp.md:
--------------------------------------------------------------------------------
1 | # Verify BGP
2 |
3 | Verify that the BGP neighbors are in the `Established` state on BOTH switches.
4 |
5 | How to check Aruba BGP status:
6 |
7 | ```
8 | show bgp ipv4 unicast summary
9 |
10 | VRF : default
11 | BGP Summary
12 | -----------
13 | Local AS : 65533 BGP Router Identifier : 10.252.0.3
14 | Peers : 4 Log Neighbor Changes : No
15 | Cfg. Hold Time : 180 Cfg. Keep Alive : 60
16 | Confederation Id : 0
17 |
18 | Neighbor Remote-AS MsgRcvd MsgSent Up/Down Time State AdminStatus
19 | 10.252.0.2 65533 45052 45044 02m:02w:02d Established Up
20 | 10.252.1.7 65533 78389 90090 02m:02w:02d Established Up
21 | 10.252.1.8 65533 78384 90059 02m:02w:02d Established Up
22 | 10.252.1.9 65533 78389 90108 02m:02w:02d Established Up
23 | ```
24 |
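When checking several switches, the summary output can be screened with a script instead of by eye. The following is a minimal sketch, assuming the output has been captured as text and that neighbor rows follow the column layout shown above. The function name `neighbors_not_established` and the `Idle` row in the sample are hypothetical and used for illustration only.

```python
import re

# One neighbor row: address, remote AS, two message counters,
# uptime, state, and admin status (layout assumed from the sample above).
NEIGHBOR_ROW = re.compile(
    r"^\s*(?P<neighbor>\d+\.\d+\.\d+\.\d+)\s+\d+\s+\d+\s+\d+\s+\S+"
    r"\s+(?P<state>\S+)\s+\S+\s*$"
)


def neighbors_not_established(summary_text):
    """Return (neighbor, state) pairs for rows whose state is not Established."""
    problems = []
    for line in summary_text.splitlines():
        match = NEIGHBOR_ROW.match(line)
        if match and match.group("state") != "Established":
            problems.append((match.group("neighbor"), match.group("state")))
    return problems


# Hypothetical capture: the second row is altered to show a failing peer.
sample = """\
Neighbor Remote-AS MsgRcvd MsgSent Up/Down Time State AdminStatus
10.252.0.2 65533 45052 45044 02m:02w:02d Established Up
10.252.1.7 65533 78389 90090 02m:02w:02d Idle Up
"""

print(neighbors_not_established(sample))
```

An empty list means every parsed neighbor is `Established`. Run the check against the output from both switches.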
25 | [Back to Index](../README.md)
26 |
--------------------------------------------------------------------------------
/operations/network/management_network/mellanox/verify_route_to_tftp.md:
--------------------------------------------------------------------------------
1 | # Verify route to TFTP
2 |
3 | BOTH Aruba switches need a single route to the TFTP server. The examples below use 10.92.100.60; the address in your configuration may differ.
4 |
5 | A single route is needed because Aruba ECMP hashing causes problems with TFTP traffic.
6 |
7 | ```
8 | show ip route 10.92.100.60
9 |
10 | Displaying ipv4 routes selected for forwarding
11 |
12 | '[x/y]' denotes [distance/metric]
13 |
14 | 10.92.100.60/32, vrf default, tag 0
15 | via 10.252.1.9, [70/0], bgp
16 | ```
17 |
18 | * This route can be a static route or a BGP route that is pinned to a single worker. The 1.4.2 patch introduces the BGP pinned route.
19 | * Verify that you can ping the next hop of this route.
20 | * In the example above, ping 10.252.1.9. If it is not reachable, that is the likely cause of the problem.
21 |
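The single-route requirement can also be checked from captured output. The following is a minimal sketch, assuming the `via` lines keep the format shown above. The function name `next_hops` is hypothetical.

```python
import re

# A next-hop line such as "via 10.252.1.9, [70/0], bgp"
# (format assumed from the sample output above).
NEXT_HOP = re.compile(
    r"^\s*via\s+(?P<hop>\d+\.\d+\.\d+\.\d+),\s*\[\d+/\d+\],\s*(?P<proto>\S+)"
)


def next_hops(route_text):
    """Return (next_hop, protocol) pairs found in 'show ip route' output."""
    hops = []
    for line in route_text.splitlines():
        match = NEXT_HOP.match(line)
        if match:
            hops.append((match.group("hop"), match.group("proto")))
    return hops


sample = """\
10.92.100.60/32, vrf default, tag 0
     via 10.252.1.9, [70/0], bgp
"""

hops = next_hops(sample)
if len(hops) == 1:
    print("single route, next hop to ping:", hops[0][0])
else:
    print("expected exactly one next hop, found", len(hops))
```

More than one next hop means ECMP is still in effect for the TFTP route. No next hop means the route is missing.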
22 | [Back to Index](../README.md)
23 |
--------------------------------------------------------------------------------
/operations/network/management_network/mellanox/very_large.md:
--------------------------------------------------------------------------------
1 | # Very Large (Exascale)
2 |
3 | 
4 |
5 | [Back to index](README.md).
6 |
--------------------------------------------------------------------------------
/operations/network/management_network/mellanox/vlan_interface.md:
--------------------------------------------------------------------------------
1 | # VLAN interface
2 |
3 | The switch also supports classic L3 VLAN interfaces.
4 |
5 | Relevant Configuration
6 |
7 | Configure the VLAN
8 |
9 | ```
10 | switch (config) # vlan 6
11 | switch (config vlan 6) #
12 | ```
13 |
14 | Create and enable the VLAN interface, and assign it an IP address
15 |
16 | ```
17 | switch(config vlan 6)# ip address 10.1.0.2/16
18 | ```
19 |
20 | Show Commands to Validate Functionality
21 |
22 | ```
23 | show vlan
24 | ```
25 |
26 | Expected Results
27 |
28 | * Step 1: You can configure the VLAN
29 | * Step 2: You can enable the interface and associate it with the VLAN
30 | * Step 3: You can create an IP-enabled VLAN interface, and it is up
31 | * Step 4: You validate the configuration is correct
32 | * Step 5: You can ping from the switch to the client and from the client to the switch
33 |
34 | [Back to Index](../README.md)
35 |
--------------------------------------------------------------------------------
/operations/network/management_network/mellanox/web-ui.md:
--------------------------------------------------------------------------------
1 | # Web user interface (WebUI)
2 |
3 | A web-based management user interface provides a visual representation of a subset of the current switch configuration and states. The WebUI allows easy access from modern browsers to modify some aspects of the configuration.
4 |
5 | Relevant Configuration
6 |
7 | Enable the WebUI
8 |
9 | ```
10 | switch(config)# web enable
11 | ```
12 |
13 | Configure REST API
14 |
15 | ```
16 | switch(config)# web enable http|https
17 | ```
18 |
19 | Show Commands to Validate Functionality
20 |
21 | ```
22 | show web
23 | ```
24 |
25 | Expected Results
26 |
27 | * Step 1: You can connect the management interface to a private network
28 | * Step 2: You can enable web-management
29 | * Step 3: You can connect to the IP address from a browser and log in to the management menu
30 |
31 | [Back to Index](../README.md)
32 |
--------------------------------------------------------------------------------
/operations/network/management_network/reinstall.md:
--------------------------------------------------------------------------------
1 | # Reinstall
2 |
3 | Reinstall the same CSM version.
4 |
5 | ***Before continuing with the install***, make sure that the most current version of CANU is installed:
6 |
7 | [Install/Upgrade CANU](canu_install_update.md)
8 |
9 | > **CAUTION:** All of these steps should be done using an out-of-band connection. This process is disruptive and will require downtime.
10 |
11 | ## Procedure
12 |
13 | 1. If the switches being reinstalled are already at the correct CSM version, no configuration changes should be required.
14 |
15 | 2. Check the differences between generated configurations and the configurations on the system.
16 |
17 | Refer to [Validate switch configurations](validate_switch_configs.md).
18 |
19 | 3. Run a suite of tests against the management network switches.
20 |
21 | Refer to [Network tests](network_tests.md).
22 |
--------------------------------------------------------------------------------
/operations/network/management_network/replace_switch.md:
--------------------------------------------------------------------------------
1 | # Replace Switch
2 |
3 | > **CAUTION:** Do not plug in a switch that is not configured. This can cause unpredictable behavior and network outages.
4 |
5 | ## Prerequisites
6 |
7 | - Out-of-band access to the switches (console).
8 | - GA generated switch configuration or backed-up switch configuration exists.
9 | - [Generate Switch Configurations](generate_switch_configs.md)
10 | - [Configuration Management](config_management.md)
11 |
12 | ## Procedure
13 |
14 | The following steps are required to replace a switch.
15 |
16 | 1. Update the firmware on the new switch.
17 |
18 | See [Update Management Network Firmware](firmware/update_management_network_firmware.md).
19 |
20 | 1. Apply the configuration.
21 |
22 | See [Apply Switch Configurations](apply_switch_configurations.md).
23 |
24 | 1. Unplug all the network and power cables and remove the failed switch.
25 |
26 | 1. Plug in the network cables.
27 |
28 | 1. Plug in the power cables.
29 |
--------------------------------------------------------------------------------
/operations/node_management/Add_Remove_Replace_NCNs/Remove_Switch_Config.md:
--------------------------------------------------------------------------------
1 | # Remove Switch Configuration for NCN
2 |
3 | ## Description
4 |
5 | Update the network switches for the NCN that was removed.
6 |
7 | ## Procedure
8 |
9 | ### Update Networking to Remove NCN
10 |
11 | Details coming soon.
12 |
13 | ## Next Step
14 |
15 | Proceed to the next step to [Redeploy Services](Redeploy_Services.md) or return to the main [Add, Remove, Replace, or Move NCNs](Add_Remove_Replace_NCNs.md) page.
16 |
--------------------------------------------------------------------------------
/operations/node_management/Add_Remove_Replace_NCNs/Update_Firmware.md:
--------------------------------------------------------------------------------
1 | # Update Firmware
2 |
3 | ## Description
4 |
5 | Use FAS to update the firmware and set the BMC password.
6 |
7 | ## Procedure
8 |
9 | See [Update Firmware](../../firmware/Update_Firmware_with_FAS.md).
10 |
11 | Proceed to the next step to [Update NCN BIOS TPM State](Update_NCN_BIOS_TPM_State.md) or return to the main [Add, Remove, Replace, or Move NCNs](Add_Remove_Replace_NCNs.md) page.
12 |
--------------------------------------------------------------------------------
/operations/node_management/Find_Node_Type_and_Manufacturer.md:
--------------------------------------------------------------------------------
1 | # Find Node Type and Manufacturer
2 |
3 | Three vendors provide nodes for air-cooled cabinets: Gigabyte, Intel, and HPE. The Hardware State Manager (HSM) contains the information required to determine which type of air-cooled node is installed. The `RedfishURL` endpoint returned by the HSM command identifies the manufacturer.
4 |
5 | HPE nodes contain the /redfish/v1/Systems/1 endpoint:
6 |
7 | ```
8 | cray hsm inventory componentEndpoints describe XNAME --format json | jq '.RedfishURL'
9 | "x3000c0s18b0/redfish/v1/Systems/1"
10 | ```
11 |
12 | Gigabyte nodes contain the /redfish/v1/Systems/Self endpoint:
13 |
14 | ```
15 | cray hsm inventory componentEndpoints describe XNAME --format json | jq '.RedfishURL'
16 | "x3000c0s7b0/redfish/v1/Systems/Self"
17 | ```
18 |
19 | Intel nodes contain the /redfish/v1/Systems/SERIAL\_NUMBER endpoint:
20 |
21 | ```
22 | cray hsm inventory componentEndpoints describe XNAME --format json | jq '.RedfishURL'
23 | "x3000c0s15b0/redfish/v1/Systems/BQWT92000021"
24 | ```
25 |
26 |
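The mapping from endpoint to vendor can be sketched as a small helper. This is an illustration under stated assumptions, not part of the `cray` CLI. The function name `guess_manufacturer` is hypothetical, and any system ID other than `1` or `Self` is treated as an Intel serial number.

```python
def guess_manufacturer(redfish_url):
    """Guess the node vendor from the RedfishURL value reported by HSM."""
    marker = "/redfish/v1/Systems/"
    if marker not in redfish_url:
        return "unknown"
    # The path segment after /Systems/ distinguishes the vendors.
    system_id = redfish_url.split(marker, 1)[1].strip("/")
    if system_id == "1":
        return "HPE"
    if system_id == "Self":
        return "Gigabyte"
    if system_id:
        # Assumption: any other non-empty ID is an Intel serial number.
        return "Intel"
    return "unknown"


for url in (
    "x3000c0s18b0/redfish/v1/Systems/1",
    "x3000c0s7b0/redfish/v1/Systems/Self",
    "x3000c0s15b0/redfish/v1/Systems/BQWT92000021",
):
    print(url, "->", guess_manufacturer(url))
```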
--------------------------------------------------------------------------------
/operations/node_management/NCN_Identify_Drives_Using_ledctl.md:
--------------------------------------------------------------------------------
1 | # NCN Drive Identification
2 |
3 | Basic usage for the ledmon/ledctl software for drive identification using the drive LEDs.
4 |
5 | ## Usage
6 |
7 | Turn on the LED locator beacon
8 |
9 | ```bash
10 | ledctl locate=/dev/