├── README.md
└── rps.py

/README.md:
--------------------------------------------------------------------------------
# rps-rfs-configuration
A script for configuring Receive Packet Steering (RPS) and Receive Flow Steering (RFS) on Linux.

This script helps configure RPS and RFS on a Linux machine and works as follows.
1. Finds the number of CPU cores available on the system.
2. Finds the number of tx/rx queues available on the specified interface.
3. Finds the CPU core assigned to handle interrupts and packets for each tx/rx queue, using the `/proc/interrupts` output to locate the queue's IRQ and `/proc/irq/<irq>/smp_affinity` to read its CPU mask.
4. Configures the CPU mask for RPS based on the CPU core set for a particular queue.

   For example, say the rx-0 queue is assigned CPU core 0 and there are 8 cores available on the system.

   If the smp_affinity for the rx-0 queue specifies CPU core '0' as mask '00000001',
   then RPS for rx-0 is configured as '11111110' in binary ('fe' in hex). The interrupt is still handled by CPU core '0', but packet processing is now distributed among CPU cores 1-7. This reduces load on CPU core '0' and allows a higher PPS (packets per second).

   Without RPS, CPU core '0' handles both the hardware interrupt raised when a packet is received on the interface and the softirq processing that passes the packet to the TCP/IP stack. With RPS configured, this packet processing is offloaded to other cores.

5. RFS configuration is applied on top of the RPS configuration.

Applying these configurations on AWS EC2 instances might help in achieving more PPS, since we are limited by the number of queues available on an EC2 instance (8 queues).

## Reference Links - RPS/RFS
1. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/network-rps
2. https://balodeamit.blogspot.com/2013/10/receive-side-scaling-and-receive-packet.html
3. https://www.kernel.org/doc/Documentation/networking/scaling.txt
4. https://medium.com/@Pinterest_Engineering/building-pinterest-in-the-cloud-6c7280dcc196
5. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/network-rfs

## Sample Usage
```
# python rps.py eth0 --configure-rfs
Number of CPU(s) available.. : 16
Number of queues in interface 'eth0'.. : 8
Mapping the corresponding TX queue to same CPU(s) as RX queue
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-1/rps_cpus as fbff
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-1/xps_cpus as fbff
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-0/rps_cpus as f7ff
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-0/xps_cpus as f7ff
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-3/rps_cpus as ff7f
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-3/xps_cpus as ff7f
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-2/rps_cpus as fdff
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-2/xps_cpus as fdff
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-5/rps_cpus as ffdf
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-5/xps_cpus as ffdf
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-4/rps_cpus as feff
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-4/xps_cpus as feff
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-7/rps_cpus as bfff
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-7/xps_cpus as bfff
Setting RPS CPU mask for /sys/class/net/eth0/queues/rx-6/rps_cpus as fff7
Setting RPS CPU mask for /sys/class/net/eth0/queues/tx-6/xps_cpus as fff7
Setting /proc/sys/net/core/rps_sock_flow_entries with value 32768
Configuring /sys/class/net/eth0/queues/rx-7/rps_flow_cnt with value 2048
Configuring /sys/class/net/eth0/queues/rx-5/rps_flow_cnt with value 2048
Configuring /sys/class/net/eth0/queues/rx-3/rps_flow_cnt with value 2048
Configuring /sys/class/net/eth0/queues/rx-1/rps_flow_cnt with value 2048
Configuring /sys/class/net/eth0/queues/rx-6/rps_flow_cnt with value 2048
Configuring /sys/class/net/eth0/queues/rx-4/rps_flow_cnt with value 2048
Configuring /sys/class/net/eth0/queues/rx-2/rps_flow_cnt with value 2048
Configuring /sys/class/net/eth0/queues/rx-0/rps_flow_cnt with value 2048
#
```

_*Note* You might need to update the regex in the rps.py script that matches the interface queue in the `/proc/interrupts` output. It is used to grab the queue number and the IRQ whose CPU core handles the interrupt._

--------------------------------------------------------------------------------
/rps.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python

import argparse
import os
import re
import shlex
from subprocess import Popen, PIPE
import sys

def execute_command(cmd):
    # universal_newlines=True returns text rather than bytes on Python 3
    proc = Popen(shlex.split(cmd), stdout=PIPE, universal_newlines=True)
    return proc.communicate()[0].strip()

RPS_XPS_CONFIG_PATH = "/sys/class/net/{iface}/queues/{q}/{config}"
RPS_SOCK_FLOW_ENTRIES = "/proc/sys/net/core/rps_sock_flow_entries"
RPS_FLOW_COUNT = "/sys/class/net/{iface}/queues/{q}/rps_flow_cnt"
SMP_AFFINITY = "/proc/irq/{irq}/smp_affinity"
NUM_CPUS = None

def numcpus():
    # Cache the CPU count so `nproc` runs only once
    global NUM_CPUS
    if not NUM_CPUS:
        NUM_CPUS = int(execute_command("nproc"))

    return NUM_CPUS

def iface_queues(iface, qtype="rx"):
    # Queue directories are named rx-<n> / tx-<n>
    qs = os.listdir(
        "/sys/class/net/{iface}/queues".format(iface=iface)
    )
    return [q for q in qs if q.startswith(qtype)]

def irq_cpu_map(iface, qtype="rx"):
    # Returns {queue number: IRQ number} for the interface's queues
    with open("/proc/interrupts", "r") as fd:
        info = fd.read()

    def irq(data):
        return data[0].split(":")[0]

    def queue(data):
        # Update this pattern if the driver names its IRQs differently
        m = re.match(r"{}-tx-rx-(?P<queue>\d+)".format(re.escape(iface)),
            data[-1],
            flags=re.IGNORECASE
        )
        return m.groupdict()["queue"] if m else None

    irq_map = dict()
    for line in info.splitlines():
        line = line.lower()
        if iface.lower() in line and qtype in line:
            # sample -> 27: * eth0-Tx-Rx-0
            data = line.split()
            irqnum, q = irq(data), queue(data)
            if irqnum and q:
                irq_map[q] = irqnum

    return irq_map

def irq_smp_affinity(irq):
    # smp_affinity is a comma-separated list of 32-bit hex groups
    with open(SMP_AFFINITY.format(irq=irq), "r") as fd:
        masks = fd.read().strip().split(",")
        return [int(mask, 16) for mask in masks]

def configure_rfs(iface, rps_sock_flow_entries=32768):
    print("Setting {} with value {}".format(RPS_SOCK_FLOW_ENTRIES, rps_sock_flow_entries))

    with open(RPS_SOCK_FLOW_ENTRIES, "w") as fd:
        fd.write("{}\n".format(rps_sock_flow_entries))

    # Integer division: the kernel expects a whole number here
    rps_flow_cnt = rps_sock_flow_entries // numcpus()

    qs = iface_queues(iface, qtype="rx")
    for q in qs:
        path = RPS_FLOW_COUNT.format(iface=iface, q=q)
        print("Configuring {} with value {}".format(path, rps_flow_cnt))

        with open(path, "w") as fd:
            fd.write("{}\n".format(rps_flow_cnt))

def configure_rps(iface):
    cpus = numcpus()
    print("Number of CPU(s) available.. : {}".format(cpus))

    qs = iface_queues(iface)
    print("Number of queues in interface '{}'.. : {}".format(iface, len(qs)))

    if cpus <= len(qs):
        print("Available CPU(s) <= rx queues.. Nothing to be done here")
        sys.exit(0)

    q_cpu_map = dict()
    for q, irq in irq_cpu_map(iface).items():
        masks = irq_smp_affinity(irq)
        available_cpus = cpus

        rps_cpus = list()
        # Groups are listed most significant first, so hand out CPU bits
        # starting from the last (least significant) group.
        for mask in reversed(masks):
            cur_cpus = min(available_cpus, 32)
            irq_cpu_mask = (2 ** cur_cpus) - 1
            # XOR clears the IRQ-handling CPU and sets every other CPU
            rps_cpus.append(format(mask ^ irq_cpu_mask, 'x'))
            available_cpus -= cur_cpus

        q_cpu_map[q] = ",".join(reversed(rps_cpus))

    rps_xps_configs = [
        ("rx", "rps_cpus"),
        ("tx", "xps_cpus")
    ]

    print("Mapping the corresponding TX queue to same CPU(s) as RX queue")
    for q, mask in q_cpu_map.items():
        for qtype, config in rps_xps_configs:
            path = RPS_XPS_CONFIG_PATH.format(iface=iface, q="{}-{}".format(qtype, q), config=config)
            print("Setting RPS CPU mask for {} as {}".format(path, mask))

            with open(path, "w") as fd:
                fd.write(mask + "\n")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Script to setup RPS - Receive Packet Steering on the specified interface")
    parser.add_argument("iface", type=str, help="Network adapter name")
    parser.add_argument("--configure-rfs", action="store_true", help="Configures RFS along with RPS")
    args = parser.parse_args(sys.argv[1:])
    configure_rps(args.iface)

    if args.configure_rfs:
        configure_rfs(args.iface)
--------------------------------------------------------------------------------
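
The mask arithmetic described in the README can be tried on its own, without touching `/proc` or `/sys`. The snippet below is only a minimal sketch: the helper name `rps_mask_for` is hypothetical and not part of rps.py, and it assumes a machine with at most 32 CPUs so that the affinity mask is a single hex group.

```python
def rps_mask_for(irq_affinity_hex, num_cpus):
    """Return a hex RPS mask covering every CPU except the IRQ's CPU(s).

    Hypothetical helper for illustration; assumes num_cpus <= 32 so the
    affinity mask is a single hex group with no commas.
    """
    all_cpus = (1 << num_cpus) - 1          # e.g. 8 CPUs -> 0b11111111
    irq_mask = int(irq_affinity_hex, 16)    # CPU(s) that service the IRQ
    # XOR flips the IRQ CPU's bit off and leaves all other CPU bits on,
    # the same operation configure_rps applies to each mask group.
    return format(irq_mask ^ all_cpus, "x")


if __name__ == "__main__":
    # README example: IRQ on CPU 0 of an 8-core machine
    print(rps_mask_for("00000001", 8))   # fe (binary 11111110)
    # Sample-usage example: IRQ on CPU 10 of a 16-core machine
    print(rps_mask_for("0400", 16))      # fbff
```

The second call reproduces the `fbff` value shown for rx-1 in the sample output, on the assumption that its IRQ was pinned to CPU 10.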