├── .gitattributes
├── Collector
│   ├── README.md
│   └── collector.cpp
├── Config
│   ├── README.md
│   ├── collector.json
│   └── translator.json
├── Generator
│   ├── README.md
│   ├── dta_append_basic.py
│   ├── dta_append_basic_orig.py
│   ├── dta_keyincrement_basic.py
│   ├── dta_keywrite_basic.py
│   ├── int_paths.py
│   ├── udp_report.py
│   └── udp_report_flowcard.py
├── LICENSE
├── Manager
│   ├── Collector.py
│   ├── Generator.py
│   ├── Machine.py
│   ├── Manager.py
│   ├── README.md
│   ├── Tofino.py
│   └── common.py
├── Overview.png
├── README.md
├── Reporter
│   ├── README.md
│   ├── p4src
│   │   └── dta_reporter.p4
│   ├── pktgen.py
│   └── switch_cpu.py
├── Testbed.png
└── Translator
    ├── README.md
    ├── init_rdma_connection.py
    ├── inject_dta.py
    ├── p4src
    │   ├── README.md
    │   └── dta_translator.p4
    ├── pktgen.py
    ├── send_rdma_synthetic.py
    └── switch_cpu.py

/.gitattributes:
--------------------------------------------------------------------------------
1 | *.p4 linguist-language=P4
2 | *.py linguist-detectable=false
--------------------------------------------------------------------------------
/Collector/README.md:
--------------------------------------------------------------------------------
1 | # DTA - Collector
2 | This directory contains code for the Collector component of DTA.
3 | 
4 | ## Prerequisites
5 | - A server equipped with an RDMA-capable rNIC, supporting RoCEv2.
6 | - Installed, configured, and verified RDMA drivers, ready for workloads.
7 | - **Preferably** a direct link between the Translator switch and the rNIC.
8 | 
9 | Please verify that the RDMA setup works before proceeding with the DTA installation.
10 | This can be done, for example, by connecting two RDMA-capable NICs and using the `ib_send_bw` utility.
11 | 
12 | ## Setup
13 | 1. Compile the collector: `gcc -o collector collector.cpp -lrdmacm -libverbs -lstdc++`
14 | 2. Disable iCRC verification on the network card (contact the manufacturer for support)
15 | 3. Ensure that the network card has a direct connection to the Translator
16 | 
17 | ## Runtime
18 | The collector should start first, before launching the translator.
19 | 
20 | 1. Start the collector: `sudo ./collector`
21 | 
22 | 
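## Troubleshooting
If the collector never sees an incoming connection, re-verify the plain RDMA path first (a sketch, assuming the `perftest` tools and two connected rNICs): run `ib_send_bw -d <device>` on one machine and `ib_send_bw -d <device> <server-ip>` on the other. A completed bandwidth report indicates a working RoCEv2 path; depending on the setup, the RoCE GID index may also need to be selected with `-x <gid-index>`.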
--------------------------------------------------------------------------------
/Collector/collector.cpp:
--------------------------------------------------------------------------------
1 | #include <iostream>
2 | #include <iomanip>
3 | #include <string>
4 | #include <cstring>
5 | #include <cmath>
6 | 
7 | #include <unistd.h>
8 | #include <arpa/inet.h>
9 | #include <sys/mman.h> //To force hugepages
10 | #include <thread>
11 | #include <chrono>
12 | 
13 | #include <boost/crc.hpp> // for boost::crc_32_type
14 | 
15 | #include <rdma/rdma_cma.h>
16 | 
17 | using namespace std;
18 | 
19 | struct dataListEntry
20 | {
21 |     uint32_t data;
22 | };
23 | 
24 | struct keywriteEntry
25 | {
26 |     uint32_t checksum;
27 |     uint32_t data;
28 |     //uint32_t data2;
29 |     //uint32_t data3;
30 |     //uint32_t data4;
31 |     //uint32_t data5;
32 |     //uint32_t data6;
33 |     //uint32_t data7;
34 | 
35 |     //uint64_t data;
36 |     //uint32_t checksum;
37 |     //uint32_t offset;
38 | };
39 | 
40 | struct postcarderEntry
41 | {
42 |     uint32_t hop1_data;
43 |     uint32_t hop2_data;
44 |     uint32_t hop3_data;
45 |     uint32_t hop4_data;
46 |     uint32_t hop5_data;
47 |     //The slots have to be sized to a power of 2 for the Translator implementation, so we add 12B padding
48 |     uint32_t padding1;
49 |     uint32_t padding2;
50 |     uint32_t padding3;
51 | };
52 | 
53 | struct appendEntry
54 | {
55 |     uint32_t data;
56 | };
57 | 
58 | 
59 | //Send RDMA metadata in this format between server and client
60 | struct __attribute__((packed)) rdma_buffer_attr
61 | {
62 |     uint64_t address;
63 |     uint32_t length;
64 |     union stag
65 |     {
66 |         /* if we send, we call it local stags */
67 |         uint32_t local_stag;
68 |         /* if we receive, we call it remote stag */
69 |         uint32_t remote_stag;
70 |     }stag;
71 | };
72 | 
73 | class RDMAService
74 | {
75 | public:
76 |     struct ibv_context* context;
77 |     struct ibv_pd* protection_domain;
78 |     struct ibv_cq* completion_queue;
79 |     struct ibv_qp* queue_pair;
80 |     struct ibv_mr* memory_region;
81 |     struct rdma_event_channel *cm_channel;
82 |     struct rdma_cm_id *listen_id;
83 |     struct rdma_cm_id *cm_id;
84 |     struct sockaddr_in sin;
85 |     struct rdma_cm_event *event;
86 |     struct ibv_comp_channel *completion_channel;
87 |     struct ibv_mr* client_metadata_mr;
88 |     int initial_psn;
89 |     string name;
90 |     int rdmaCMPort;
91 |     bool isReady = false;
92 | 
93 | 
94 |     void setupConnectionManager()
95 |     {
96 |         int ret;
97 | 
98 |         cout << "Creating event channel" << endl;
99 |         cm_channel = rdma_create_event_channel();
100 |         if (!cm_channel)
101 |             cerr << "Failed to set up connection manager!" << endl;
102 | 
103 |         cout << "Creating RDMA ID" << endl;
104 |         ret = rdma_create_id(cm_channel,&listen_id, NULL, RDMA_PS_TCP);
105 |         if(ret)
106 |             cerr << "Failed to create RDMA ID! err: " << ret << endl;
107 | 
108 |         sin.sin_family = AF_INET;
109 |         sin.sin_port = htons(rdmaCMPort);
110 |         sin.sin_addr.s_addr = INADDR_ANY;
111 | 
112 |         cout << "Binding RDMA_CM to port " << rdmaCMPort << endl;
113 |         ret = rdma_bind_addr(listen_id, (struct sockaddr *) &sin);
114 |         if(ret)
115 |             cerr << "Failed to bind RDMA address! err: " << ret << endl;
116 | 
117 |         cout << "Listening for RDMA connections to " << name << endl;
118 |         ret = rdma_listen(listen_id, 1);
119 |         if(ret)
120 |             cerr << "Failed to listen to RDMA! err: " << ret << endl;
121 | 
122 |         //Wait for a connection request
123 |         do
124 |         {
125 |             cout << "Waiting for ConnectionManager event (incoming connection request) for " << name << "..." << endl;
126 |             ret = rdma_get_cm_event(cm_channel, &event);
127 |             if(ret)
128 |                 cerr << "Failed to get CM event! err: " << ret << endl;
129 |             cout << "Event detected" << endl;
130 |         }
131 |         while(event->event != RDMA_CM_EVENT_CONNECT_REQUEST); //stay here until correct event
132 | 
133 |         cm_id = event->id;
134 | 
135 |         cout << "Sending back an ack" << endl;
136 |         rdma_ack_cm_event(event);
137 |     }
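    //Connection-setup sketch (standard librdmacm semantics): create an event channel and a
    //listening rdma_cm_id, bind and listen on rdmaCMPort, then block until the translator's
    //RDMA_CM_EVENT_CONNECT_REQUEST arrives. acceptClientConnection() below then calls
    //rdma_accept() and waits for the follow-up CM event (normally RDMA_CM_EVENT_ESTABLISHED)
    //before acknowledging it.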
138 | 
139 | 
140 |     void allocProtectionDomain()
141 |     {
142 |         cout << "Allocating a protection domain..." << endl;
143 |         protection_domain = ibv_alloc_pd(cm_id->verbs);
144 |         cout << "Protection domain: " << protection_domain << endl;
145 |     }
146 | 
147 |     void createCompletionQueue(int cq_size)
148 |     {
149 |         cout << "Creating a completion channel" << endl;
150 |         completion_channel = ibv_create_comp_channel(cm_id->verbs);
151 |         cout << "Completion channel created at " << completion_channel << endl;
152 | 
153 |         cout << "Creating a completion queue of size " << cq_size << "..." << endl;
154 |         completion_queue = ibv_create_cq(cm_id->verbs, cq_size, nullptr, completion_channel, 0);
155 |         cout << "Completion queue created at " << completion_queue << endl;
156 |     }
157 | 
158 | 
159 |     //Create a queue pair
160 |     void createQueuePair()
161 |     {
162 |         cout << "Creating a queue pair for '" << name << "'..." << endl;
163 | 
164 |         struct ibv_qp_init_attr qp_attr;
165 |         int ret;
166 | 
167 |         memset(&qp_attr, 0, sizeof(qp_attr));
168 | 
169 | 
170 |         qp_attr.cap.max_send_wr = 32;
171 |         qp_attr.cap.max_send_sge = 32;
172 |         qp_attr.cap.max_recv_wr = 32;
173 |         qp_attr.cap.max_recv_sge = 32;
174 | 
175 |         qp_attr.send_cq = completion_queue; //these don't have to be the same
176 |         qp_attr.recv_cq = completion_queue;
177 |         qp_attr.qp_type = IBV_QPT_RC;
178 |         //qp_attr.sq_sig_all = 1; //All send queues will be posted to completion queue?
179 | 
180 |         ret = rdma_create_qp(cm_id, protection_domain, &qp_attr);
181 |         queue_pair = cm_id->qp;
182 | 
183 |         if(ret)
184 |             cerr << "Failed to create a queue-pair! err: " << ret << endl;
185 | 
186 |         cout << "Queue pair created at " << queue_pair << endl;
187 |     }
188 | 
189 |     void acceptClientConnection()
190 |     {
191 |         cout << "Accepting the client connection for '" << name << "'..." << endl;
192 | 
193 |         struct rdma_conn_param conn_param = { };
194 | 
195 |         int ret;
196 | 
197 |         cout << "ignoring the receive.." << endl;
198 | 
199 |         cout << "Accepting the connection" << endl;
200 |         ret = rdma_accept(cm_id, &conn_param);
201 |         if(ret)
202 |             cerr << "Failed to accept RDMA connection! err: " << ret << endl;
203 | 
204 |         ret = rdma_get_cm_event(cm_channel, &event);
205 |         if(ret)
206 |             cerr << "Failed to get RDMA event! err: " << ret << endl;
207 | 
208 |         cout << "Sending back an ack" << endl;
209 |         rdma_ack_cm_event(event);
210 |     }
211 | 
212 |     void registerMemoryRegion(void* buffer, size_t size)
213 |     {
214 |         cout << "Registering memory region " << buffer << " of size " << size << "B for '" << name << "'..." << endl;
215 | 
216 |         memory_region = ibv_reg_mr(protection_domain, buffer, size, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_ATOMIC);
217 | 
218 |         /*
219 |         //RDMA has to be recompiled with support of physical addresses first. Might improve keywrite performance at large memories
220 |         struct ibv_exp_reg_mr_in in = {0};
221 |         int my_access_flags = IBV_ACCESS_LOCAL_WRITE |\
222 |             IBV_ACCESS_REMOTE_READ |\
223 |             IBV_ACCESS_REMOTE_WRITE |\
224 |             IBV_ACCESS_REMOTE_ATOMIC |\
225 |             IBV_EXP_ACCESS_PHYSICAL_ADDR;
226 |         in.pd = protection_domain;
227 |         in.addr = buffer;
228 |         in.length = size;
229 |         in.exp_access = my_access_flags;
230 |         memory_region = ibv_exp_reg_mr(&in);
231 |         */
232 | 
233 |         cout << "Registered memory region at " << memory_region << endl;
234 |     }
235 | 
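    //Note (sketch): the region is registered with remote read/write/atomic access because
    //the translator writes into it with raw RoCEv2 verbs rather than local posts. The
    //commented-out ibv_exp_reg_mr path above is a Mellanox experimental-verbs variant for
    //physically addressed memory and is not required for normal operation.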
236 |     int getRQPSN()
237 |     {
238 |         struct ibv_qp_attr attr;
239 |         struct ibv_qp_init_attr init_attr;
240 | 
241 |         ibv_query_qp(queue_pair, &attr, IBV_QP_RQ_PSN, &init_attr);
242 | 
243 |         return attr.rq_psn;
244 |     }
245 | 
246 |     void setInitialPSN()
247 |     {
248 |         int currentPSN = getRQPSN();
249 |         cout << "Setting the initial PSN for '" << name << "' to the current value of " << currentPSN << endl;
250 |         initial_psn = currentPSN;
251 |     }
252 | 
253 |     //buf should be a pointer to the registered buffer we can use to transfer server metadata
254 |     void shareServerMetadata()
255 |     {
256 |         struct ibv_sge sge;
257 |         struct ibv_send_wr send_wr = { };
258 |         struct ibv_send_wr *bad_send_wr;
259 |         int ret;
260 | 
261 |         cout << "Waiting to share server/storage metadata with client/translator" << endl;
262 | 
263 |         //Send 16 bytes from storage
264 |         sge.addr = (uint64_t) memory_region->addr;
265 |         sge.length = (uint32_t) 16;
266 |         sge.lkey = (uint32_t)memory_region->lkey;
267 | 
268 |         cout << "Storing metadata in provided pre-mapped buffer..." << endl;
269 | 
270 |         //Store the metadata in the correct format in the provided (mapped) buffer
271 |         *((uint32_t*)memory_region->addr) = (uint32_t)(((uint64_t)memory_region->addr)&0xffffffff);
272 |         *((uint32_t*)memory_region->addr+1) = (uint32_t)(((uint64_t)memory_region->addr) >> 32);
273 |         *((uint32_t*)memory_region->addr+2) = (uint32_t)memory_region->length;
274 |         *((uint32_t*)memory_region->addr+3) = (uint32_t)memory_region->lkey;
275 | 
276 |         cout << "Advertising addr: " << sge.addr << " len: " << sge.length << " lkey: " << sge.lkey << endl;
277 | 
278 |         send_wr.opcode = IBV_WR_SEND;
279 |         send_wr.send_flags = IBV_SEND_SIGNALED;
280 |         send_wr.sg_list = &sge;
281 |         send_wr.num_sge = 1;
282 | 
283 |         cout << "Sending RDMA metadata to client" << endl;
284 |         ret = ibv_post_send(cm_id->qp, &send_wr, &bad_send_wr);
285 |         if(ret)
286 |             cerr << "Failed to send metadata!" << endl;
287 | 
288 | 
289 |         cout << "Sleeping a while to ensure the client received the metadata before resetting it" << endl;
290 |         sleep(2);
291 | 
292 |         //Reset the borrowed buffer, to not leave garbage in storage
293 |         *((uint32_t*)memory_region->addr) = (uint32_t)0;
294 |         *((uint32_t*)memory_region->addr+1) = (uint32_t)0;
295 |         *((uint32_t*)memory_region->addr+2) = (uint32_t)0;
296 |         *((uint32_t*)memory_region->addr+3) = (uint32_t)0;
297 |         *((uint32_t*)memory_region->addr+4) = (uint32_t)0;
298 |         *((uint32_t*)memory_region->addr+5) = (uint32_t)0;
299 |         *((uint32_t*)memory_region->addr+6) = (uint32_t)0;
300 |         *((uint32_t*)memory_region->addr+7) = (uint32_t)0;
301 | 
302 |         //We now set the initial PSN
303 |         setInitialPSN();
304 |     }
305 | 
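    //Wire format of the advertisement above (sketch): 16B = {addr[31:0], addr[63:32],
    //length, lkey} as host-order 32-bit words, staged at the start of the registered
    //buffer itself. The translator-side connection script is expected to parse this
    //same layout when it learns where to direct its RDMA writes.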
306 |     void printRDMAInfo()
307 |     {
308 |         int currentPSN = getRQPSN();
309 | 
310 |         cout << "Printing RDMA info for '" << name << "'..." << endl;
311 |         cout << "Local QP number: " << queue_pair->qp_num << endl;
312 |         cout << "lkey: " << memory_region->lkey << endl;
313 |         cout << "rkey: " << memory_region->rkey << endl;
314 |         cout << "rq_psn: " << currentPSN << " (diff " << currentPSN-initial_psn << " from initial PSN)" << endl;
315 |     }
316 | 
317 |     bool pollCompletion()
318 |     {
319 |         struct ibv_wc wc;
320 |         int result;
321 | 
322 |         //do
323 |         //{
324 |         // ibv_poll_cq returns the number of WCs that are newly completed,
325 |         // If it is 0, it means no new work completion is received.
326 |         // Here, the second argument specifies how many WCs the poll should check,
327 |         // however, giving more than 1 incurs stack smashing detection with g++8 compilation.
328 |         result = ibv_poll_cq(completion_queue, 1, &wc);
329 |         //} while (result == 0);
330 | 
331 |         cout << "Polling completion queue returned " << result << endl;
332 | 
333 |         if(result > 0 && wc.status == ibv_wc_status::IBV_WC_SUCCESS)
334 |         {
335 |             // success
336 |             return true;
337 |         }
338 | 
339 |         // You can identify which WR failed with wc.wr_id. (wc is only valid when result > 0)
340 |         if(result > 0) printf("Poll failed with status %s (work request ID: %lu)\n", ibv_wc_status_str(wc.status), wc.wr_id);
341 |         return false;
342 |     }
343 | 
344 |     void getAsyncEvent()
345 |     {
346 |         struct ibv_async_event event;
347 | 
348 |         cout << "Waiting for an async event" << endl;
349 |         ibv_get_async_event(cm_id->verbs, &event); //blocks until an event arrives
350 | 
351 |         cout << "Event type: " << event.event_type << endl; ibv_ack_async_event(&event); //release the event
352 |     }
353 | 
354 |     void printCompletionQueue()
355 |     {
356 |         cout << "Polling completion queue" << endl;
357 |         pollCompletion();
358 | 
359 |         //cout << "Async event" << endl;
360 |         //getAsyncEvent();
361 |     }
362 | 
363 |     void sendPSNResync()
364 |     {
365 |         int currentPSN = getRQPSN();
366 |         cout << "Sending a packet to trigger PSN resync in the translator for '" << name << "'..." << endl;
367 |         cout << "TODO" << endl;
368 |     }
369 | 
370 |     //This should be called before initiating the storage (that is inheriting RDMA)
371 |     void initiateRDMA()
372 |     {
373 |         cout << "Initiating RDMA service for " << name << endl;
374 | 
375 |         setupConnectionManager();
376 | 
377 |         allocProtectionDomain();
378 | 
379 |         createCompletionQueue(128);
380 | 
381 |         createQueuePair();
382 | 
383 |         acceptClientConnection();
384 |     }
385 | 
386 |     void allocateStorage();
387 |     void printStorage();
388 |     void analStorage();
389 |     void clearStorage();
390 |     void initiate();
391 | 
392 | 
393 |     //Constructor
394 |     RDMAService(string init_name, int init_rdmaCMPort)
395 |     {
396 |         name = init_name;
397 |         rdmaCMPort = init_rdmaCMPort;
398 |         cout << "RDMA service constructor for " << name << endl;
399 |     }
400 | };
401 | 
402 | void* allocateHugepages(uint64_t size)
403 | {
404 |     uint64_t hugepage_size = 1<<30;
405 |     uint64_t num_hugepages = ceil((double)size/(double)hugepage_size);
406 |     uint64_t mmap_alloc_size = num_hugepages * hugepage_size;
407 | 
408 |     cout << "Allocating hugepages for buffer size " << size << ". This requires " << num_hugepages << " hugepages." << endl;
409 | 
410 |     void* p = mmap(NULL, mmap_alloc_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
411 |     if(p == MAP_FAILED) cerr << "mmap of hugepages failed! Are enough 1GiB hugepages reserved?" << endl;
412 |     cout << "Buffer allocated at address " << p << endl;
413 | 
414 |     return p;
415 | }
416 | 
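//Sketch: the 1GiB hugepages requested above must be reserved by the OS beforehand, e.g.
//  echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
//(path assumes a kernel with 1GiB hugepage support enabled; size the count to the
//buffers allocated in main() below)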
417 | class KeywriteStore : public RDMAService
418 | {
419 | public:
420 |     uint64_t num_entries;
421 |     uint64_t buffer_size;
422 |     struct keywriteEntry* storage;
423 | 
424 |     void allocateStorage()
425 |     {
426 |         buffer_size = num_entries*sizeof(struct keywriteEntry); //in bytes
427 | 
428 |         cout << "Allocating keywrite storage for '" << name << "'... Entries: " << num_entries << " size(B): " << buffer_size << endl;
429 | 
430 |         storage = (struct keywriteEntry*)allocateHugepages(buffer_size);
431 | 
432 |         cout << "keywrite buffer starts at address " << storage << endl;
433 |     }
434 | 
435 |     void printStorage()
436 |     {
437 |         cout << "Printing storage of '" << name << "'..." << endl;
438 | 
439 |         int max_output = 256; //Prevent printing more entries than this
440 |         uint64_t i;
441 |         int numPerRow = 8;
442 | 
443 |         for(i = 0; i < num_entries; i++)
444 |         {
445 |             if( i > max_output )
446 |             {
447 |                 cout << "Killing print, reached limit of " << max_output << " printed entries" << endl;
448 |                 break;
449 |             }
450 | 
451 |             uint32_t checksum = ntohl(storage[i].checksum);
452 |             uint32_t data = ntohl(storage[i].data);
453 |             //uint32_t data2 = ntohl(storage[i].data2);
454 |             //uint32_t data3 = ntohl(storage[i].data3);
455 | 
456 |             if( i%numPerRow==0 )
457 |                 cout << endl << i << ":\t";
458 |             cout << "(";
459 |             cout << std::setfill('0') << std::setw(10) << checksum;
460 |             cout << ",";
461 |             cout << std::setfill('0') << std::setw(10) << data;
462 |             /*cout << ",";
463 |             cout << std::setfill('0') << std::setw(10) << data2;
464 |             cout << ",";
465 |             cout << std::setfill('0') << std::setw(10) << data3;*/
466 |             cout << ") ";
467 |         }
468 |     }
469 | 
470 |     //Query a key
471 |     uint32_t query(uint32_t key, char redundancy)
472 |     {
473 |         char buffer[5]; //This is the crc input
474 |         char buffer_key[4];
475 |         uint32_t slot; //this will store the calculated key-value slot
476 |         boost::crc_32_type result;
477 |         boost::crc_32_type result_csum;
478 | 
479 | 
480 |         memcpy( buffer, &key, 4 ); //Copy key into this buffer (same for all redundancies)
481 |         memcpy( buffer_key, &key, 4 );
482 | 
483 |         //Calculate concatenated checksum
484 |         result_csum.process_bytes(buffer_key, 4);
485 |         uint32_t checksum = result_csum.checksum();
486 |         //cout << "We are expecting checksum " << checksum << endl;
487 | 
488 |         for(char n = 0; n < redundancy; n++) //Loop through all redundancies
489 |         {
490 |             buffer[4] = n; //The redundancy slot will differ
491 | 
492 |             //Calculate the index slot for this redundancy entry
493 |             result.reset(); result.process_bytes(buffer, 5); //fresh CRC for each redundancy index
494 |             slot = result.checksum();
495 |             //cout << "pure index-selecting checksum: " << slot << endl;
496 | 
497 |             slot %= num_entries; //modulo this into actual storage
498 |             //cout << "Key " << key << " redundancy n=" << (int)n << " hashed to slot " << slot << endl;
499 |             //cout << "Data in this slot: " << storage[slot].data << "csum: " << storage[slot].checksum << endl;
500 | 
501 |             //cout << endl;
502 | 
503 |             //If the checksum is correct, stop here and return an answer
504 |             if( checksum == storage[slot].checksum )
505 |                 return storage[slot].data;
506 |         }
507 | 
508 |         //If no answer was found, return back a 0 (assuming that 0 signals return-None)
509 |         return 0;
510 |     }
511 | 
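    //Example helper (sketch, not called anywhere else): compute the slot that a given
    //(key, redundancy index) pair hashes to, mirroring the lookup logic in query().
    //Useful for manually cross-checking where the translator should have written a key.
    uint64_t expectedSlot(uint32_t key, char n)
    {
        char buffer[5];
        memcpy(buffer, &key, 4); //key bytes in host order, as in query()
        buffer[4] = n;
        boost::crc_32_type crc;
        crc.process_bytes(buffer, 5);
        return crc.checksum() % num_entries;
    }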
512 |     //Query time breakdown
513 |     uint32_t query_timeBreakdown(uint32_t key, char redundancy)
514 |     {
515 |         char buffer[5]; //This is the crc input
516 |         char buffer_key[4];
517 |         uint32_t slot; //this will store the calculated key-value slot
518 |         boost::crc_32_type result;
519 |         boost::crc_32_type result_csum;
520 | 
521 |         using std::chrono::high_resolution_clock;
522 |         using std::chrono::duration_cast;
523 |         using std::chrono::duration;
524 |         using std::chrono::milliseconds;
525 | 
526 |         duration<double, std::milli> ms_double;
527 |         double duration_timestamping;
528 |         double duration_entry;
529 |         double duration_bufferAlloc;
530 |         double duration_checksumCalc;
531 |         double duration_allEntries;
532 |         double duration_total;
533 | 
534 |         auto t_start = high_resolution_clock::now(); //start time
535 |         auto t_timestampDelay = high_resolution_clock::now(); //Time to get a timestamp
536 | 
537 |         memcpy( buffer, &key, 4 ); //Copy key into this buffer (same for all redundancies)
538 |         memcpy( buffer_key, &key, 4 );
539 | 
540 |         auto t_bufferAlloc = high_resolution_clock::now(); //start time
541 | 
542 |         //Calculate concatenated checksum
543 |         result_csum.process_bytes(buffer_key, 4);
544 |         uint32_t checksum = result_csum.checksum();
545 |         //cout << "We are expecting checksum " << checksum << endl;
546 | 
547 |         auto t_checksumCalculated = high_resolution_clock::now();
548 | 
549 |         for(char n = 0; n < redundancy; n++) //Loop through all redundancies
550 |         {
551 |             auto t_entryStart = high_resolution_clock::now();
552 | 
553 |             buffer[4] = n; //The redundancy slot will differ
554 | 
555 |             //Calculate the index slot for this redundancy entry
556 |             result.reset(); result.process_bytes(buffer, 5); //fresh CRC for each redundancy index
557 |             slot = result.checksum();
558 |             //cout << "pure index-selecting checksum: " << slot << endl;
559 | 
560 |             slot %= num_entries; //modulo this into actual storage
561 |             //cout << "Key " << key << " redundancy n=" << (int)n << " hashed to slot " << slot << endl;
562 |             //cout << "Data in this slot: " << storage[slot].data << "csum: " << storage[slot].checksum << endl;
563 | 
564 |             //cout << endl;
565 | 
566 |             auto t_entryEnd = high_resolution_clock::now();
567 |             ms_double = t_entryEnd - t_entryStart;
568 |             duration_entry = ms_double.count()/1000;
569 |             cout << "Retrieving this entry took " << duration_entry << " seconds" << endl;
570 | 
571 |             //If the checksum is correct, stop here and return an answer
572 |             if( checksum == storage[slot].checksum )
573 |                 return storage[slot].data;
574 |         }
575 |         auto t_finished = high_resolution_clock::now();
576 | 
577 |         ms_double = t_timestampDelay - t_start;
578 |         duration_timestamping = ms_double.count()/1000;
579 |         ms_double = t_bufferAlloc - t_timestampDelay;
580 |         duration_bufferAlloc = ms_double.count()/1000;
581 |         ms_double = t_checksumCalculated - t_bufferAlloc;
582 |         duration_checksumCalc = ms_double.count()/1000;
583 |         ms_double = t_finished - t_checksumCalculated;
584 |         duration_allEntries = ms_double.count()/1000;
585 |         ms_double = t_finished - t_start;
586 |         duration_total = ms_double.count()/1000;
587 | 
588 |         cout << "duration_timestamping: " << duration_timestamping << endl;
589 |         cout << "duration_bufferAlloc: " << duration_bufferAlloc << endl;
590 |         cout << "duration_checksumCalc: " << duration_checksumCalc << endl;
591 |         cout << "duration_allEntries: " << duration_allEntries << endl;
592 |         cout << "duration_total: " << duration_total << endl;
593 | 
594 |         //If no answer was found, return back a 0 (assuming that 0 signals return-None)
595 |         return 0;
596 |     }
597 | 
598 |     uint32_t benchmark_querying(uint64_t num_keys, int offset, int step, int redundancy)
599 |     {
600 |         uint32_t result = 0; //This prevents optimizing away memory retrieval
601 | 
602 |         cout << name << " starts querying " << num_keys << " keys, offset " << offset << " every " << step << "..." << endl;
603 | 
604 |         for(uint64_t key = offset; key < num_keys; key+=step)
605 |             result = query(key,redundancy); //query the key
606 | 
607 |         cout << name << " is done!" << endl;
608 | 
609 |         return result;
610 |     }
611 | 
612 |     void benchmark_querying_multithread(int num_threads, uint64_t num_queries, int redundancy)
613 |     {
614 |         cout << "Benchmarking querying in " << name << " through " << num_threads << " threads." << endl;
615 |         cout << "This will query a total of " << num_queries << " keys, shared between all threads" << endl;
616 |         thread thread_queryers[num_threads];
617 | 
618 |         using std::chrono::high_resolution_clock;
619 |         using std::chrono::duration_cast;
620 |         using std::chrono::duration;
621 |         using std::chrono::milliseconds;
622 | 
623 |         auto t1 = high_resolution_clock::now(); //start time
624 | 
625 |         //Starting the querying threads
626 |         for(int i = 0; i < num_threads; i++)
627 |             thread_queryers[i] = thread(&KeywriteStore::benchmark_querying, this, num_queries, i, num_threads, redundancy); //Start the querying thread
628 | 
629 |         //Waiting for all threads to finish
630 |         for(int i = 0; i < num_threads; i++)
631 |             thread_queryers[i].join();
632 | 
633 |         auto t2 = high_resolution_clock::now(); //end time
634 | 
635 |         cout << "All threads are now done!" << endl;
636 | 
637 |         duration<double, std::milli> ms_double = t2 - t1; //query duration
638 | 
639 |         double duration_s = ms_double.count()/1000;
640 | 
641 |         cout << "Having " << num_threads << " threads query " << num_queries << " keys total took " << duration_s << " seconds" << endl;
642 |         cout << "This equals a query rate of " << num_queries/(duration_s*1000000*num_threads) << " million queries per second each, totalling " << num_queries/(duration_s*1000000) << " million queries per second!" << endl;
643 |     }
644 | 
645 | 
646 |     void analStorage()
647 |     {
648 |         cout << "Analyzing " << name << " storage... Total slots to iterate over: " << num_entries << endl;
649 | 
650 | 
651 |         uint64_t numEmpty = 0; //The number of empty/unused memory slots
652 |         for(uint64_t i = 0; i < num_entries; i++)
653 |         {
654 |             if(i%1000000==0)
655 |                 cout << "." << flush;
656 |             if( storage[i].checksum==0 && storage[i].data==0 ) //If the slot is just zeroes, assume empty
657 |                 numEmpty++;
658 |         }
659 |         cout << endl;
660 | 
661 |         double loadFactor = (double)(num_entries-numEmpty)/(double)num_entries;
662 | 
663 |         cout << "Memory slots in use: " << num_entries-numEmpty << " / " << num_entries << endl;
664 |         cout << "Load factor: " << loadFactor*100 << "%" << endl;
665 |     }
666 | 
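    //Sanity-check sketch for the load factor printed by analStorage(): with K distinct
    //keys written at redundancy R into M slots under uniform hashing, the expected
    //fraction of occupied slots is roughly 1 - (1 - 1/M)^(K*R) (standard balls-into-bins
    //estimate; not part of the original flow).
    double expectedLoadFactor(uint64_t num_keys, int redundancy)
    {
        double M = (double)num_entries;
        return 1.0 - pow(1.0 - 1.0/M, (double)(num_keys*redundancy));
    }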
667 |     void clearStorage()
668 |     {
669 |         cout << "Clearing storage of '" << name << "'..." << endl;
670 | 
671 |         if( !storage )
672 |         {
673 |             cout << "Storage of " << name << " is not initialized! Skipping..." << endl;
674 |             return;
675 |         }
676 | 
677 |         uint64_t i;
678 |         for(i = 0; i < num_entries; i++)
679 |         {
680 |             storage[i].checksum = 0;
681 |             storage[i].data = 0;
682 |         }
683 |     }
684 | 
685 |     void initiate()
686 |     {
687 |         cout << "Initiating storage for " << name << endl;
688 | 
689 |         initiateRDMA();
690 |         allocateStorage();
691 |         registerMemoryRegion(storage, buffer_size);
692 |         shareServerMetadata();
693 | 
694 |         isReady = true;
695 |     }
696 | 
697 |     thread initiate_threaded()
698 |     {
699 |         thread t(&KeywriteStore::initiate, this);
700 | 
701 |         return t;
702 |     }
703 | 
704 |     KeywriteStore(uint64_t init_num_entries, int rdmaPort, string init_name = "KeyWriteStore"): RDMAService(init_name, rdmaPort)
705 |     {
706 |         num_entries = init_num_entries;
707 |         if(num_entries > 536870912)
708 |         {
709 |             cerr << "!!! Translator pipeline currently supports at most 536870912 entries! " << num_entries << " allocated" << endl;
710 |         }
711 |         cout << "Constructor for '" << name << "'..." << endl;
712 |     }
713 | };
714 | 
715 | 
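//Note: the stores initialize asynchronously (initiate_threaded) because initiateRDMA()
//blocks until the translator connects to the corresponding RDMA_CM port; main() below
//therefore checks isReady before touching a store.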
716 | class PostcarderStore : public RDMAService
717 | {
718 | public:
719 |     uint64_t num_entries;
720 |     uint64_t buffer_size;
721 |     struct postcarderEntry* storage;
722 | 
723 |     void allocateStorage()
724 |     {
725 |         buffer_size = num_entries*sizeof(struct postcarderEntry); //in bytes
726 | 
727 |         cout << "Allocating postcarder storage for '" << name << "'... Entries: " << num_entries << " size(B): " << buffer_size << endl;
728 | 
729 |         storage = (struct postcarderEntry*)allocateHugepages(buffer_size);
730 | 
731 |         cout << "Postcarder buffer starts at address " << storage << endl;
732 |     }
733 | 
734 |     void printStorage()
735 |     {
736 |         cout << "Printing storage of '" << name << "'..." << endl;
737 | 
738 |         int max_output = 256; //Prevent printing more entries than this
739 |         uint64_t i;
740 |         int numPerRow = 4;
741 | 
742 |         for(i = 0; i < num_entries; i++)
743 |         {
744 |             if( i > max_output )
745 |             {
746 |                 cout << "Killing print, reached limit of " << max_output << " printed entries" << endl;
747 |                 break;
748 |             }
749 | 
750 |             uint32_t hop1_data = ntohl(storage[i].hop1_data);
751 |             uint32_t hop2_data = ntohl(storage[i].hop2_data);
752 |             uint32_t hop3_data = ntohl(storage[i].hop3_data);
753 |             uint32_t hop4_data = ntohl(storage[i].hop4_data);
754 |             uint32_t hop5_data = ntohl(storage[i].hop5_data);
755 | 
756 |             if( i%numPerRow==0 )
757 |                 cout << endl << i << ":\t";
758 |             cout << "(";
759 |             cout << std::setfill('0') << std::setw(10) << hop1_data;
760 |             cout << ",";
761 |             cout << std::setfill('0') << std::setw(10) << hop2_data;
762 |             cout << ",";
763 |             cout << std::setfill('0') << std::setw(10) << hop3_data;
764 |             cout << ",";
765 |             cout << std::setfill('0') << std::setw(10) << hop4_data;
766 |             cout << ",";
767 |             cout << std::setfill('0') << std::setw(10) << hop5_data;
768 |             cout << ") ";
769 |         }
770 |     }
771 | 
772 |     void analStorage()
773 |     {
774 |         cout << "Analyzing " << name << " storage... Total slots to iterate over: " << num_entries << endl;
775 | 
776 | 
777 |         uint64_t numEmpty = 0; //The number of empty/unused memory slots
778 |         for(uint64_t i = 0; i < num_entries; i++)
779 |         {
780 |             if(i%1000000==0)
781 |                 cout << "." << flush;
782 |             if( storage[i].hop1_data==0 && storage[i].hop2_data==0 && storage[i].hop3_data==0 && storage[i].hop4_data==0 && storage[i].hop5_data==0 ) //If the slot is just zeroes, assume empty
783 |                 numEmpty++;
784 |         }
785 |         cout << endl;
786 | 
787 |         double loadFactor = (double)(num_entries-numEmpty)/(double)num_entries;
788 | 
789 |         cout << "Memory slots in use: " << num_entries-numEmpty << " / " << num_entries << endl;
790 |         cout << "Load factor: " << loadFactor*100 << "%" << endl;
791 |     }
792 | 
793 |     void clearStorage()
794 |     {
795 |         cout << "Clearing storage of '" << name << "'..." << endl;
796 | 
797 |         if( !storage )
798 |         {
799 |             cout << "Storage of " << name << " is not initialized! Skipping..." << endl;
800 |             return;
801 |         }
802 | 
803 |         uint64_t i;
804 |         for(i = 0; i < num_entries; i++)
805 |         {
806 |             storage[i].hop1_data = 0;
807 |             storage[i].hop2_data = 0;
808 |             storage[i].hop3_data = 0;
809 |             storage[i].hop4_data = 0;
810 |             storage[i].hop5_data = 0;
811 |         }
812 |     }
813 | 
814 |     void initiate()
815 |     {
816 |         cout << "Initiating storage for " << name << endl;
817 | 
818 |         initiateRDMA();
819 |         allocateStorage();
820 |         registerMemoryRegion(storage, buffer_size);
821 |         shareServerMetadata();
822 | 
823 |         isReady = true;
824 |     }
825 | 
826 |     thread initiate_threaded()
827 |     {
828 |         thread t(&PostcarderStore::initiate, this);
829 | 
830 |         return t;
831 |     }
832 | 
833 |     PostcarderStore(uint64_t init_num_entries, int rdmaPort, string init_name = "PostcarderStore"): RDMAService(init_name, rdmaPort)
834 |     {
835 |         num_entries = init_num_entries;
836 |         if(num_entries > 536870912)
837 |         {
838 |             cerr << "!!! Translator pipeline currently supports at most 536870912 entries! " << num_entries << " allocated" << endl;
839 |         }
840 |         cout << "Constructor for '" << name << "'..." << endl;
841 |     }
842 | };
843 | 
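//Design note (sketch): postcarderEntry is padded to 32B so the slot size is a power of
//two; a slot index can then be turned into a byte offset with a shift
//(offset = index << 5) instead of a multiply, which is cheap in the switch pipeline.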
844 | class DataList : public RDMAService
845 | {
846 | public:
847 |     uint64_t num_entries;
848 |     uint64_t buffer_size;
849 |     struct dataListEntry* storage;
850 |     uint64_t tail_pointer;
851 | 
852 |     void allocateStorage()
853 |     {
854 |         buffer_size = num_entries*sizeof(struct dataListEntry); //in bytes
855 | 
856 |         cout << "Allocating List storage for '" << name << "'... Entries: " << num_entries << " size(B): " << buffer_size << endl;
857 |         //storage = new struct dataListEntry[num_entries]; //Allocate on the heap (required for tons of lists when there are not enough hugepages)
858 |         storage = (struct dataListEntry*)allocateHugepages(buffer_size); //This is default, allocating on hugepages
859 |         cout << "List buffer starts at address " << storage << endl;
860 |     }
861 | 
862 |     //Retrieve a value from the list
863 |     uint32_t pull()
864 |     {
865 |         if(++tail_pointer >= num_entries)
866 |             tail_pointer = 0;
867 | 
868 |         return storage[tail_pointer].data;
869 |     }
870 | 
871 |     //Breaking down costs of pulling an entry from the list. This seems silly here :)
872 |     uint32_t pull_timeBreakdown()
873 |     {
874 |         using std::chrono::high_resolution_clock;
875 |         using std::chrono::duration_cast;
876 |         using std::chrono::duration;
877 |         using std::chrono::milliseconds;
878 | 
879 |         duration<double, std::milli> ms_double;
880 |         double duration_headpointer_s;
881 |         double duration_retrieval_s;
882 |         uint32_t result;
883 | 
884 |         auto t1 = high_resolution_clock::now();
885 | 
886 |         if(++tail_pointer >= num_entries)
887 |             tail_pointer = 0;
888 | 
889 |         auto t2 = high_resolution_clock::now();
890 | 
891 |         result = storage[tail_pointer].data;
892 | 
893 |         auto t3 = high_resolution_clock::now();
894 | 
895 |         ms_double = t2 - t1;
896 |         duration_headpointer_s = ms_double.count()/1000;
897 |         ms_double = t3 - t2;
898 |         duration_retrieval_s = ms_double.count()/1000;
899 | 
900 |         cout << "Updating the head pointer took " << duration_headpointer_s << " seconds." << endl;
901 |         cout << "Retrieving the memory slot took " << duration_retrieval_s << " seconds." << endl;
902 | 
903 |         return result;
904 |     }
905 | 
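    //Usage sketch (not called anywhere): consume the ring by following the tail until an
    //unwritten slot is reached. Assumes 0 marks "not yet written", as in the emptiness
    //checks elsewhere in this collector, and that the consumer keeps up with the
    //translator's head. Note: on break, the tail is left pointing at the empty slot.
    uint64_t drainAvailable(uint64_t max_pulls)
    {
        uint64_t num_pulled = 0;
        while(num_pulled < max_pulls)
        {
            uint32_t value = pull();
            if(value == 0) //reached data not yet written
                break;
            num_pulled++;
        }
        return num_pulled;
    }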
906 |     void printStorage()
907 |     {
908 |         cout << "Printing storage of '" << name << "'..." << endl;
909 | 
910 |         int max_output = 64; //Prevent printing more entries than this
911 |         uint64_t i;
912 |         int numPerRow = 16;
913 |         for(i = 0; i < num_entries; i++)
914 |         {
915 |             if( i > max_output )
916 |             {
917 |                 cout << "Killing print, reached limit of " << max_output << " printed entries" << endl;
918 |                 break;
919 |             }
920 |             if( i%numPerRow==0 )
921 |                 cout << endl << i << ":\t";
922 |             cout << "(";
923 |             cout << std::setfill('0') << std::setw(10) << storage[i].data;
924 |             cout << ") ";
925 |         }
926 |         cout << endl;
927 |     }
928 | 
929 |     void findDataGaps()
930 |     {
931 |         cout << "Finding data gaps in the list, where values are not sequential..." << endl;
932 | 
933 |         cout << "Following is a list of gaps in the append list (gapStart,gapEnd)" << endl;
934 | 
935 |         uint32_t lastValue = 0;
936 |         uint64_t lastMissingIndex = 0;
937 | 
938 |         for(uint64_t i = 0; i < num_entries; i++)
939 |         {
940 |             uint32_t thisValue = storage[i].data;
941 | 
942 |             //Ignore empty slots
943 |             if(thisValue == 0)
944 |                 continue;
945 | 
946 |             if( thisValue != lastValue+1 )
947 |             {
948 |                 /*
949 |                 cout << "Non-sequential value found at index " << i << "!" << endl;
950 |                 cout << "Previous value: " << lastValue << ", this value: " << thisValue << ", diff: " << (thisValue-lastValue) << endl;
951 |                 cout << "The last skipped value was back at index " << lastMissingIndex << " which was " << (i-lastMissingIndex) << " slots ago" << endl;
952 |                 cout << endl;
953 |                 */
954 |                 cout << "(" << lastValue << "," << thisValue << ")," << flush;
955 |             }
956 | 
957 |             lastValue = storage[i].data;
958 |         }
959 |         cout << endl;
960 |     }
961 | 
962 |     void findEmptySlots()
963 |     {
964 |         cout << "Finding data gaps in the list, where data was not written..." << endl;
965 | 
966 |         cout << "Following is a list of gaps in the append list (gapStart,gapEnd)" << endl;
967 | 
968 |         uint64_t lastEmptyIndex = 0;
969 |         uint64_t lastWrittenIndex = 0;
970 | 
971 |         for(uint64_t i = 0; i < num_entries; i++)
972 |         {
973 |             uint32_t thisValue = storage[i].data;
974 | 
975 |             if(thisValue == 0) //This slot is empty
976 |             {
977 |                 lastEmptyIndex = i;
978 |                 if(i == num_entries-1) //This is the last slot, and it was empty. Report a gap until the end
979 |                     cout << "(" << lastWrittenIndex << "," << i << ")," << flush;
980 |             }
981 |             else //This slot is not empty
982 |             {
983 |                 if(lastEmptyIndex == i-1) //If we just got out of a gap
984 |                     cout << "(" << lastWrittenIndex << "," << i << ")," << flush;
985 | 
986 |                 lastWrittenIndex = i;
987 |             }
988 |         }
989 | 
990 |         cout << endl;
991 |     }
992 | 
993 |     void findListFullness()
994 |     {
995 |         cout << "Finding out how full the list is..." << endl;
996 |         uint64_t numEmpty = 0; //The number of empty/unused memory slots
997 |         for(uint64_t i = 0; i < num_entries; i++)
998 |         {
999 |             if(i%1000000==0)
1000 |                 cout << "." << flush;
1001 |             if( storage[i].data==0 ) //If the slot is just zeroes, assume empty
1002 |                 numEmpty++;
1003 |         }
1004 |         cout << endl;
1005 | 
1006 |         double loadFactor = (double)(num_entries-numEmpty)/(double)num_entries;
1007 | 
1008 |         cout << "Memory slots in use: " << num_entries-numEmpty << " / " << num_entries << endl;
1009 |         cout << "Load factor: " << loadFactor*100 << "%" << endl;
1010 |     }
1011 | 
1012 |     void analStorage()
1013 |     {
1014 |         cout << "Analyzing " << name << "... " << endl;
1015 | 
1016 |         findListFullness();
1017 |         //findDataGaps();
1018 |         //findEmptySlots();
1019 | 
1020 |     }
1021 | 
1022 |     void clearStorage()
1023 |     {
1024 |         cout << "Clearing storage of '" << name << "'..." << endl;
1025 |         uint64_t i;
1026 |         for(i = 0; i < num_entries; i++)
1027 |         {
1028 |             storage[i].data = 0;
1029 |         }
1030 |     }
1031 | 
1032 |     uint32_t benchmark_querying(uint64_t num_pulls)
1033 |     {
1034 |         uint32_t pulledData = 0; //This prevents optimizing away memory retrieval
1035 | 
1036 |         cout << name << " starts pulling " << num_pulls << " list entries..." << endl;
1037 |         for(uint64_t i = 0; i < num_pulls; i++)
1038 |             pulledData = pull(); //pull one data entry
1039 |         cout << name << " is done!" << endl;
1040 | 
1041 |         return pulledData;
1042 |     }
1043 | 
1044 |     void initiate()
1045 |     {
1046 |         cout << "Initiating storage for " << name << endl;
1047 | 
1048 |         initiateRDMA();
1049 |         allocateStorage();
1050 |         registerMemoryRegion(storage, buffer_size);
1051 |         shareServerMetadata();
1052 |         clearStorage();
1053 |         isReady = true;
1054 |     }
1055 | 
1056 | 
1057 |     thread initiate_threaded()
1058 |     {
1059 |         thread t(&DataList::initiate, this);
1060 | 
1061 |         return t;
1062 |     }
1063 | 
1064 |     DataList(uint64_t init_num_entries, int rdmaPort, string init_name="DataList"): RDMAService(init_name, rdmaPort)
1065 |     {
1066 |         num_entries = init_num_entries;
1067 |         cout << "Constructor for '" << name << "'..." << endl;
1068 | 
1069 |         tail_pointer = num_entries; //This ensures that polling will start at 0
1070 |     }
1071 | };
1072 | 
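//Sizing sanity check (sketch) for the allocations in main() below: buffer bytes =
//entries * sizeof(entry). E.g. 8388608 keywrite entries * 8B = 64MiB, 16777216 list
//entries * 4B = 64MiB, and 256 postcarder entries * 32B = 8KiB, matching the comments there.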
1073 | int main()
1074 | {
1075 |     cout << "Starting collector service..." << endl;
1076 | 
1077 |     //The stores only support storage sizes that are powers of 2
1078 | 
1079 | 
1080 |     //Allocate storage services
1081 |     PostcarderStore postcarderStore(256, 1336); //256 is 8KiB
1082 | 
1083 |     KeywriteStore keywriteStore(256, 1337); //256 is 2KiB
1084 |     //KeywriteStore keywriteStore(8388608, 1337); //8388608 is 64MiB
1085 |     //KeywriteStore keywriteStore(16777216, 1337); //16777216 is 128MiB
1086 |     //KeywriteStore keywriteStore(33554432, 1337); //33554432 is 256MiB
1087 |     //KeywriteStore keywriteStore(67108864, 1337); //67108864 is 512MiB
1088 |     //KeywriteStore keywriteStore(134217728, 1337); //134217728 is 1GiB
1089 |     //KeywriteStore keywriteStore(268435456, 1337); //268435456: 2GiB
1090 |     //KeywriteStore keywriteStore(536870912, 1337); //536870912: 4GiB
1091 |     //KeywriteStore keywriteStore(1073741824, 1337); //1073741824: 8GiB
1092 |     //KeywriteStore keywriteStore(2147483648, 1337); //2147483648: 16GiB
1093 | 
1094 | 
1095 |     //DataList dataList(256, 1338); //256 is 1KiB
1096 |     //DataList dataList(16777216, 1338); //16777216 is 64MiB
1097 |     //DataList dataList(33554432, 1338); //33554432 is 128MiB
1098 |     //DataList dataList(67108864, 1338); //67108864 is 256MiB
1099 |     //DataList dataList(134217728, 1338); //134217728 is 512MiB
1100 |     //DataList dataList(268435456, 1338); //268435456 is 1GiB
1101 |     //DataList dataList(536870912, 1338); //536870912 is 2GiB
1102 |     //DataList dataList(1073741824, 1338); //1073741824 is 4GiB
1103 |     //DataList dataList(2147483648, 1338); //2147483648 is 8GiB
1104 |     //DataList dataList(4294967296, 1338); //4294967296 is 16GiB
1105 | 
1106 |     int num_lists = 4; //Number of lists
1107 |     int list_port_start = 1338;
1108 |     uint64_t slots_per_list = 256; //The size of the published lists, in number of slots 268435456
1109 | 
1110 |     //Create data lists
1111 |     DataList **dataLists;
1112 |     dataLists = new DataList*[num_lists];
1113 |     for(int i = 0; i < num_lists; i++)
1114 |     {
1115 |         cout << i << endl;
1116 |         *(dataLists+i) = new DataList(slots_per_list, list_port_start+i, "List"+to_string(i));
1117 |     }
1118 | 
1119 |     //Initiate the storages in separate threads
1120 |     thread thread_keyvalue = keywriteStore.initiate_threaded();
1121 |     thread thread_datalists[num_lists];
1122 |     for(int i = 0; i < num_lists; i++)
1123 |         thread_datalists[i] = dataLists[i]->initiate_threaded(); //ignore the returned thread handler
1124 | 
1125 |     //Stay here indefinitely, keeping collector alive
1126 |     while(1)
1127 |     {
1128 |         cout << endl;
1129 | 
1130 |         cout << "Press ENTER to analyze storage. This MIGHT impact RDMA performance, so avoid during benchmarking!";
1131 |         cin.ignore();
1132 | 
1133 |         //Print info for all lists (APPEND)
1134 |         for(int i = 0; i < num_lists; i++)
1135 |         {
1136 |             if(dataLists[i]->isReady)
1137 |             {
1138 |                 dataLists[i]->printRDMAInfo();
1139 |                 dataLists[i]->printStorage();
1140 |                 dataLists[i]->analStorage();
1141 |                 //dataLists[i]->findEmptySlots(); //0 is used to get throughput
1142 |             }
1143 |             else
1144 |                 cout << dataLists[i]->name << " is not ready yet, skipping." << endl;
1145 |         }
1146 | 
" << endl; 1145 | } 1146 | 1147 | //Print some list info 1148 | /* 1149 | if(dataLists[num_lists-1]->isReady) 1150 | { 1151 | dataLists[0]->printRDMAInfo(); 1152 | dataLists[0]->printStorage(); 1153 | dataLists[0]->analStorage(); 1154 | dataLists[0]->findEmptySlots(); //0 is used to get throughput 1155 | dataLists[1]->printRDMAInfo(); 1156 | dataLists[1]->printStorage(); 1157 | dataLists[1]->analStorage(); 1158 | dataLists[num_lists-1]->printRDMAInfo(); 1159 | dataLists[num_lists-1]->printStorage(); 1160 | dataLists[num_lists-1]->analStorage(); 1161 | } 1162 | else 1163 | cout << "All lists not yet ready" << endl; 1164 | */ 1165 | 1166 | /* 1167 | //Append querying benchmark. This under-estimates the speed due to startup costs being included 1168 | if(dataLists[num_lists-1]->isReady) 1169 | { 1170 | uint64_t num_pulls; 1171 | int num_lists_to_query; 1172 | cout << "Will benchmark querying for all lists" << endl; 1173 | cout << "Enter number of lists to query in parallel: "; 1174 | cin >> num_lists_to_query; 1175 | 1176 | if(num_lists_to_query > num_lists) 1177 | { 1178 | cout << "TOO MANY LISTS!" << endl; 1179 | continue; //skip 1180 | } 1181 | 1182 | cout << "Enter number of pulls to do for each list: "; 1183 | cin >> num_pulls; 1184 | 1185 | thread thread_queryers[num_lists]; 1186 | 1187 | using std::chrono::high_resolution_clock; 1188 | using std::chrono::duration_cast; 1189 | using std::chrono::duration; 1190 | using std::chrono::milliseconds; 1191 | 1192 | auto t1 = high_resolution_clock::now(); //start time 1193 | 1194 | //Start threads one-by-one 1195 | for(int i = 0; i < num_lists_to_query; i++) 1196 | thread_queryers[i] = thread(&DataList::benchmark_querying, dataLists[i], num_pulls); //Start the querying thread 1197 | 1198 | //wait here for all threads to complete (pulling done) 1199 | for(int i = 0; i < num_lists_to_query; i++) 1200 | thread_queryers[i].join(); 1201 | 1202 | auto t2 = high_resolution_clock::now(); //end time 1203 | 1204 | cout << "All queryers are now done!" << endl; 1205 | 1206 | duration ms_double = t2 - t1; //query duration 1207 | 1208 | double duration_s = ms_double.count()/1000; 1209 | 1210 | cout << "Having " << num_lists_to_query << " lists pull " << num_pulls << " each took " << duration_s << " seconds" << endl; 1211 | cout << "This equal a query rate of " << num_pulls/(duration_s*1000000) << " million reports each, totalling " << num_lists_to_query*num_pulls/(duration_s*1000000) << " million reports per second!" << endl; 1212 | 1213 | 1214 | cout << "Breaking down list 0 pulling work..." << endl; 1215 | dataLists[0]->pull_timeBreakdown(); 1216 | } 1217 | */ 1218 | 1219 | if(keywriteStore.isReady) 1220 | { 1221 | keywriteStore.printRDMAInfo(); 1222 | keywriteStore.printStorage(); 1223 | keywriteStore.analStorage(); 1224 | cout << endl; 1225 | 1226 | 1227 | 1228 | cout << "Completion queue of keywrite..." 
1219 |         if(keywriteStore.isReady)
1220 |         {
1221 |             keywriteStore.printRDMAInfo();
1222 |             keywriteStore.printStorage();
1223 |             keywriteStore.analStorage();
1224 |             cout << endl;
1225 | 
1226 | 
1227 | 
1228 |             cout << "Completion queue of keywrite..." << endl;
1229 |             keywriteStore.printCompletionQueue();
1230 |             cout << endl;
1231 | 
1232 |             /*
1233 |             cout << "Enter a key integer to query: " << endl;
1234 |             uint32_t key;
1235 |             cin >> key;
1236 |             cin.clear();
1237 |             cout << "Query result: " << keywriteStore.query(key, 4) << endl;
1238 |             cin.ignore();
1239 |             */
1240 | 
1241 |             /*
1242 |             cout << "Number of incrementing keys to query: ";
1243 |             int num_queries;
1244 |             cin >> num_queries;
1245 |             for(uint32_t i = 0; i < num_queries; i++)
1246 |                 keywriteStore.query(i, 2);
1247 |             cout << "Done" << endl;
1248 |             */
1249 | 
1250 |             /* QUERY benchmark
1251 |             int num_threads;
1252 |             uint64_t num_queries;
1253 |             int redundancy;
1254 | 
1255 |             cout << "Enter number of threads that will query Key-Write: ";
1256 |             cin >> num_threads;
1257 |             cout << "Enter total number of queries to run: ";
1258 |             cin >> num_queries;
1259 |             cout << "Enter level of redundancy (N) to use during queries: ";
1260 |             cin >> redundancy;
1261 | 
1262 |             if(num_threads > 32)
1263 |             {
1264 |                 cout << "Too many threads" << endl;
1265 |                 continue;
1266 |             }
1267 | 
1268 |             keywriteStore.benchmark_querying_multithread(num_threads, num_queries, redundancy);
1269 | 
1270 |             cout << "Querying key 1 at chosen redundancy, and printing query processing breakdown" << endl;
1271 |             uint32_t result = keywriteStore.query_timeBreakdown(1, redundancy);
1272 |             cout << "res: " << result << endl;
1273 |             */
1274 |         }
1275 |         else
1276 |             cout << keywriteStore.name << " is not yet ready. Please initiate a connection first" << endl;
1277 |     }
1278 | 
1279 |     cout << "Collector stopping" << endl;
1280 | 
1281 |     return 1;
1282 | }
1283 | 
--------------------------------------------------------------------------------
/Config/README.md:
--------------------------------------------------------------------------------
1 | # DTA - Config (placeholder)
2 | This directory will contain system-wide configuration files.
3 | It is currently a placeholder.
4 | 
5 | The plan is to simply clone the repository to all physical machines, and then update these configuration files to match the setup.
--------------------------------------------------------------------------------
/Config/collector.json:
--------------------------------------------------------------------------------
1 | 
2 | 
--------------------------------------------------------------------------------
/Config/translator.json:
--------------------------------------------------------------------------------
1 | 
2 | 
--------------------------------------------------------------------------------
/Generator/README.md:
--------------------------------------------------------------------------------
1 | # DTA - Generator
2 | 
3 | We used TRex for traffic generation.
4 | TRex either sends traffic towards the reporter (so that the reporter generates DTA telemetry reports from user traffic), or directly generates DTA reports destined for the translator (which statefully translates them into RDMA going to the collector).
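For example (a sketch; the exact console syntax varies across TRex versions): start the TRex server with `./t-rex-64 -i`, open `./trex-console`, and launch one of these profiles with something like `start -f dta_keywrite_basic.py -p 0 -m 10kpps -t "--redundancy 2"`, adjusting the rate (`-m`) and the profile tunables as needed.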
5 | 
6 | Download and install TRex according to their guide [here](https://trex-tgn.cisco.com/trex/doc/trex_manual.html#_download_and_installation)
7 | 
8 | In this directory, you can find the scripts used to define the DTA traffic that the generator will emit
9 | 
--------------------------------------------------------------------------------
/Generator/dta_append_basic.py:
--------------------------------------------------------------------------------
1 | from trex_stl_lib.api import *
2 | import argparse
3 | 
4 | 
5 | class dta_base(Packet):
6 |     name = "dtaBase"
7 |     fields_desc = [
8 |         XByteField("opcode", 0x01),
9 |         XByteField("seqnum", 0), #DTA sequence number
10 |         BitField("immediate", 0, 1),
11 |         BitField("retransmitable", 0, 1),
12 |         BitField("reserved", 0, 6)
13 |     ]
14 | 
15 | class dta_keyWrite(Packet):
16 |     name = "dtaKeyWrite"
17 |     fields_desc = [
18 |         ByteField("redundancy", 0x02),
19 |         IntField("key", 0),
20 |         IntField("data", 0)
21 |     ]
22 | 
23 | class dta_append(Packet):
24 |     name = "dtaAppend"
25 |     fields_desc = [
26 |         IntField("listID", 0),
27 |         IntField("data", 0)
28 |     ]
29 | 
30 | class STLS1(object):
31 | 
32 |     def create_stream (self):
33 |         base_pkt = Ether(dst="b8:ce:f6:d2:12:c7")\
34 |             /IP(src="10.0.0.200",dst="10.0.0.51")\
35 |             /UDP(sport=40041,dport=40040)\
36 |             /dta_base(opcode=0x02)\
37 |             /dta_append()
38 | 
39 |         vm = STLVM()
40 | 
41 |         num_lists = 4
42 | 
43 |         #Randomize srcIP for RSS
44 |         vm.var(name='srcIP', size=4, op='random', step=1, min_value=0, max_value=2000000000)
45 |         vm.write(fv_name='srcIP', pkt_offset='IP.src')
46 | 
47 | 
48 |         #Increment sequence number
49 |         vm.var(name='dta_seqnum', size=1, op='inc', step=1, min_value=0, max_value=255)
50 |         vm.write(fv_name='dta_seqnum', pkt_offset='dta_base.seqnum')
51 | 
52 |         #Increment the data
53 |         vm.var(name='dta_data', size=4, op='inc', step=1, min_value=1, max_value=1000000000) #Specify the data to append
54 |         vm.write(fv_name='dta_data', pkt_offset='dta_append.data')
55 | 
56 |         #Round-robin on list IDs
57 |         vm.var(name='dta_listID', size=4, op='inc', step=1, min_value=0, max_value=num_lists) #Specify which lists to append into
58 |         vm.write(fv_name='dta_listID', pkt_offset='dta_append.listID')
59 | 
60 |         return STLStream(packet = STLPktBuilder(pkt = base_pkt, vm = vm), mode = STLTXCont())
61 | 
62 |     def get_streams (self, tunables, **kwargs):
63 |         parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), formatter_class=argparse.ArgumentDefaultsHelpFormatter)
64 |         args = parser.parse_args(tunables)
65 |         # create 1 stream
66 |         return [ self.create_stream() ]
67 | 
68 | 
69 | # dynamic load - used for trex console or simulator
70 | def register():
71 |     return STLS1()
72 | 
73 | 
74 | 
75 | 
--------------------------------------------------------------------------------
/Generator/dta_append_basic_orig.py:
--------------------------------------------------------------------------------
1 | from trex_stl_lib.api import *
2 | import argparse
3 | 
4 | 
5 | class dta_base(Packet):
6 |     name = "dtaBase"
7 |     fields_desc = [
8 |         XByteField("opcode", 0x01),
9 |         XByteField("seqnum", 0), #DTA sequence number
10 |         BitField("immediate", 0, 1),
11 |         BitField("retransmitable", 0, 1),
12 |         BitField("reserved", 0, 6)
13 |     ]
14 | 
15 | class dta_keyWrite(Packet):
16 |     name = "dtaKeyWrite"
17 |     fields_desc = [
18 |         ByteField("redundancy", 0x02),
19 |         IntField("key", 0),
20 |         IntField("data", 0)
21 |     ]
22 | 
23 | class dta_append(Packet):
24 |     name = "dtaAppend"
25 |     fields_desc = [
26 |         IntField("listID", 0),
27 |         IntField("data", 0)
28 |     ]
29 | 
30 | class STLS1(object):
31 | 
32 |     def create_stream (self):
33 |         base_pkt = Ether(dst="b8:ce:f6:d2:12:c7")\
34 |             /IP(src="10.0.0.200",dst="10.0.0.51")\
35 |             /UDP(sport=40041,dport=40040)\
36 |             /dta_base(opcode=0x02)\
37 |             /dta_append()
38 | 
39 |         vm = STLVM()
40 | 
41 |         num_lists = 4
42 | 
43 |         #Increment sequence number
44 |         vm.var(name='dta_seqnum', size=1, op='inc', step=1, min_value=0, max_value=255)
45 |         vm.write(fv_name='dta_seqnum', pkt_offset='dta_base.seqnum')
46 | 
47 |         #Increment the data
48 |         vm.var(name='dta_data', size=4, op='inc', step=1, min_value=1, max_value=1000000000) #Specify the data to append
49 |         vm.write(fv_name='dta_data', pkt_offset='dta_append.data')
50 | 
51 |         #Round-robin on list IDs
52 |         vm.var(name='dta_listID', size=4, op='inc', step=1, min_value=0, max_value=num_lists) #Specify which lists to append into
53 |         vm.write(fv_name='dta_listID', pkt_offset='dta_append.listID')
54 | 
55 |         return STLStream(packet = STLPktBuilder(pkt = base_pkt, vm = vm), mode = STLTXCont())
56 | 
57 |     def get_streams (self, tunables, **kwargs):
58 |         parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), formatter_class=argparse.ArgumentDefaultsHelpFormatter)
59 |         args = parser.parse_args(tunables)
60 |         # create 1 stream
61 |         return [ self.create_stream() ]
62 | 
63 | 
64 | # dynamic load - used for trex console or simulator
65 | def register():
66 |     return STLS1()
67 | 
68 | 
69 | 
70 | 
--------------------------------------------------------------------------------
/Generator/dta_keyincrement_basic.py:
--------------------------------------------------------------------------------
1 | from trex_stl_lib.api import *
2 | import argparse
3 | 
4 | 
5 | class dta_base(Packet):
6 |     name = "dtaBase"
7 |     fields_desc = [
8 |         XByteField("opcode", 0x01),
9 |         XByteField("seqnum", 0), #DTA sequence number
10 |         BitField("immediate", 0, 1),
11 |         BitField("retransmitable", 0, 1),
12 |         BitField("reserved", 0, 6)
13 |     ]
14 | 
15 | class dta_keyIncrement(Packet):
16 |     name = "dtaKeyIncrement"
17 |     fields_desc = [
18 |         ByteField("redundancy", 0),
19 |         IntField("key", 0),
20 |         LongField("counter", 0)
21 |     ]
22 | 
23 | class STLS1(object):
24 | 
25 |     def create_stream (self, redundancy=4, key=1, counter=1, seqnum=0):
26 |         #redundancy = 4 #modify this for different redundancy tests
27 |         #key = 1
28 |         #data = 1
29 |         #seqnum = 0
30 | 
31 |         base_pkt = Ether(dst="b8:ce:f6:d2:12:c7")\
32 |             /IP(src="10.0.0.200",dst="10.0.0.51")\
33 |             /UDP(sport=40041,dport=40040)\
34 |             /dta_base(opcode=0x03,seqnum=seqnum)\
35 |             /dta_keyIncrement(redundancy=redundancy, key=key, counter=counter)#, data2=2, data3=3)
36 | 
37 |         vm = STLVM()
38 | 
39 |         #Increment sequence number
40 |         vm.var(name='dta_seqnum', size=1, op='inc', step=1, min_value=0, max_value=255)
41 |         vm.write(fv_name='dta_seqnum', pkt_offset='dta_base.seqnum')
42 | 
43 |         #Increment key
44 |         vm.var(name='dta_key', size=4, op='inc', step=1, min_value=0, max_value=1000000000)
45 |         vm.write(fv_name='dta_key', pkt_offset='dta_keyIncrement.key')
46 | 
47 |         #Increment data
48 |         vm.var(name='dta_counter', size=8, op='inc', step=1, min_value=0, max_value=1000000000)
49 |         vm.write(fv_name='dta_counter', pkt_offset='dta_keyIncrement.counter')
50 | 
51 |         return STLStream(packet = STLPktBuilder(pkt = base_pkt, vm = vm), mode = STLTXCont())
52 | 
53 |     def get_streams (self, tunables, **kwargs):
54 |         parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), formatter_class=argparse.ArgumentDefaultsHelpFormatter)
55 |         parser.add_argument('--redundancy', type=int, default=4, help='The KeyWrite redundancy')
56 |         args = parser.parse_args(tunables)
57 |         print(args)
58 |         # create 1 stream
59 |         return [ self.create_stream(redundancy=int(args.redundancy)) ]
60 | 
61 | 
62 | # dynamic load - used for trex console or simulator
63 | def register():
64 |     return STLS1()
65 | 
66 | 
67 | 
68 | 
--------------------------------------------------------------------------------
/Generator/dta_keywrite_basic.py:
--------------------------------------------------------------------------------
1 | from trex_stl_lib.api import *
2 | import argparse
3 | 
4 | 
5 | class dta_base(Packet):
6 |     name = "dtaBase"
7 |     fields_desc = [
8 |         XByteField("opcode", 0x01),
9 |         XByteField("seqnum", 0), #DTA sequence number
10 |         BitField("immediate", 0, 1),
11 |         BitField("retransmitable", 0, 1),
12 |         BitField("reserved", 0, 6)
13 |     ]
14 | 
15 | class dta_keyWrite(Packet):
16 |     name = "dtaKeyWrite"
17 |     fields_desc = [
18 |         ByteField("redundancy", 0),
19 |         IntField("key", 0),
20 |         IntField("data", 0),
21 |         #IntField("data2", 0),
22 |         #IntField("data3", 0),
23 |         #IntField("data4", 0),
24 |         #IntField("data5", 0),
25 |         #IntField("data6", 0),
26 |         #IntField("data7", 0)
27 |     ]
28 | 
29 | class dta_append(Packet):
30 |     name = "dtaAppend"
31 |     fields_desc = [
32 |         IntField("listID", 0),
33 |         IntField("data", 0)
34 |     ]
35 | 
36 | class STLS1(object):
37 | 
38 |     def create_stream (self, redundancy=4, key=1, data=1, seqnum=0):
39 |         #redundancy = 4 #modify this for different redundancy tests
40 |         #key = 1
41 |         #data = 1
42 |         #seqnum = 0
43 | 
44 |         base_pkt = Ether(dst="b8:ce:f6:d2:12:c7")\
45 |             /IP(src="10.0.0.200",dst="10.0.0.51")\
46 |             /UDP(sport=40041,dport=40040)\
47 |             /dta_base(opcode=0x01,seqnum=seqnum)\
48 |             /dta_keyWrite(redundancy=redundancy, key=key, data=data)#, data2=2, data3=3)
49 | 
50 |         vm = STLVM()
51 | 
52 |         #Increment sequence number
53 |         vm.var(name='dta_seqnum', size=1, op='inc', step=1, min_value=0, max_value=255)
54 |         vm.write(fv_name='dta_seqnum', pkt_offset='dta_base.seqnum')
55 | 
56 |         #Increment key
57 |         vm.var(name='dta_key', size=4, op='inc', step=1, min_value=0, max_value=1000000000)
58 |         vm.write(fv_name='dta_key', pkt_offset='dta_keyWrite.key')
59 | 
60 |         #Increment data
61 |         vm.var(name='dta_data', size=4, op='inc', step=1, min_value=0, max_value=1000000000)
62 |         vm.write(fv_name='dta_data', pkt_offset='dta_keyWrite.data')
63 | 
64 |         return STLStream(packet = STLPktBuilder(pkt = base_pkt, vm = vm), mode = STLTXCont())
65 | 
66 |     def get_streams (self, tunables, **kwargs):
67 |         parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), formatter_class=argparse.ArgumentDefaultsHelpFormatter)
68 |         parser.add_argument('--redundancy', type=int, default=4, help='The KeyWrite redundancy')
69 |         args = parser.parse_args(tunables)
70 |         print(args)
71 |         # create 1 stream
72 |         return [ self.create_stream(redundancy=int(args.redundancy)) ]
73 | 
74 | 
75 | # dynamic load - used for trex console or simulator
76 | def register():
77 |     return STLS1()
78 | 
79 | 
80 | 
81 | 
--------------------------------------------------------------------------------
/Generator/int_paths.py:
--------------------------------------------------------------------------------
1 | from trex_stl_lib.api import *
2 | import argparse
3 | 
4 | 
5 | class int_flow(Packet):
6 |     name = "intFlow"
7 |     fields_desc = [
8 |         IntField("srcIP", 1),
9 |         IntField("dstIP", 1),
10 |         XByteField("proto", 0x06),
11 |         ShortField("srcPort", 1),
12 |         ShortField("dstPort", 1)
13 |     ]
14 | class int_path(Packet):
15 |     name = "intPath"
16 |     fields_desc = [
17 |         IntField("s1", 1),
18 |         IntField("s2", 1),
19 |         IntField("s3", 1),
20 |         IntField("s4", 1),
21 |         IntField("s5", 1)
22 |     ]
23 | 
24 | 
25 | 
26 | class STLS1(object):
27 | 
28 |     def create_stream (self):
29 | 
30 |         base_pkt = Ether(dst="b4:96:91:b3:ac:e8")\
31 |             /IP(src="10.0.0.51",dst="10.0.0.200")\
32 |             /UDP(sport=5000,dport=5002)\
33 |             /int_flow()\
34 |             /int_path()
35 | 
36 |         vm = STLVM()
37 | 
38 |         #Generate random flow tuples
39 |         vm.var(name='srcIP', min_value=1, max_value=10000000, size=4, op='random')
40 |         vm.var(name='dstIP', min_value=1, max_value=10000000, size=4, op='random')
41 |         vm.var(name='srcPort', min_value=1, max_value=65535, size=2, op='random')
42 |         vm.var(name='dstPort', min_value=1, max_value=65535, size=2, op='random')
43 |         vm.var(name='proto', min_value=1, max_value=2, size=1, op='random')
44 |         vm.write(fv_name='srcIP', pkt_offset='int_flow.srcIP')
45 |         vm.write(fv_name='dstIP', pkt_offset='int_flow.dstIP')
46 |         vm.write(fv_name='proto', pkt_offset='int_flow.proto')
47 |         vm.write(fv_name='srcPort', pkt_offset='int_flow.srcPort')
48 |         vm.write(fv_name='dstPort', pkt_offset='int_flow.dstPort')
49 | 
50 |         #Set some random hops
51 |         vm.var(name='s1', min_value=1, max_value=10000, size=4, op='random')
52 |         vm.var(name='s2', min_value=1, max_value=10000, size=4, op='random')
53 |         vm.var(name='s3', min_value=1, max_value=10000, size=4, op='random')
54 |         vm.var(name='s4', min_value=1, max_value=10000, size=4, op='random')
55 |         vm.var(name='s5', min_value=1, max_value=10000, size=4, op='random')
56 |         vm.write(fv_name='s1', pkt_offset='int_path.s1')
57 |         vm.write(fv_name='s2', pkt_offset='int_path.s2')
58 |         vm.write(fv_name='s3', pkt_offset='int_path.s3')
59 |         vm.write(fv_name='s4', pkt_offset='int_path.s4')
60 |         vm.write(fv_name='s5', pkt_offset='int_path.s5')
61 | 
62 |         return STLStream(packet = STLPktBuilder(pkt = base_pkt, vm = vm), mode = STLTXCont())
63 | 
64 |     def get_streams (self, tunables, **kwargs):
65 |         parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), formatter_class=argparse.ArgumentDefaultsHelpFormatter)
66 |         args = parser.parse_args(tunables)
67 |         # create 1 stream
68 |         return [ self.create_stream() ]
69 | 
70 | 
71 | # dynamic load - used for trex console or simulator
72 | def register():
73 |     return STLS1()
74 | 
75 | 
76 | 
77 | 
78 | 
min_value=1, max_value=1000000000) #Specify the data to append 33 | vm.write(fv_name='teledat', pkt_offset='teledata.data') 34 | 35 | return STLStream(packet = STLPktBuilder(pkt = base_pkt, vm = vm), mode = STLTXCont()) 36 | 37 | def get_streams (self, tunables, **kwargs): 38 | parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), formatter_class=argparse.ArgumentDefaultsHelpFormatter) 39 | args = parser.parse_args(tunables) 40 | # create 1 stream 41 | return [ self.create_stream() ] 42 | 43 | 44 | # dynamic load - used for trex console or simulator 45 | def register(): 46 | return STLS1() 47 | 48 | 49 | 50 | -------------------------------------------------------------------------------- /Generator/udp_report_flowcard.py: -------------------------------------------------------------------------------- 1 | #This is a 32-bit postcard with a 5-tuple key 2 | from trex_stl_lib.api import * 3 | import argparse 4 | 5 | 6 | class teledata(Packet): 7 | name = "teledata" 8 | fields_desc = [ 9 | IntField("srcIP",1), 10 | IntField("dstIP",1), 11 | ShortField("srcPort",1), 12 | ShortField("dstPort",1), 13 | XByteField("proto",0x06), 14 | IntField("data",0) 15 | ] 16 | 17 | class STLS1(object): 18 | 19 | def create_stream (self): 20 | base_pkt = Ether(dst="b8:ce:f6:d2:12:c7")\ 21 | /IP(src="10.0.0.200",dst="10.0.0.51")\ 22 | /UDP(sport=40041,dport=1337)\ 23 | /teledata()\ 24 | 25 | vm = STLVM() 26 | 27 | #Randomize srcIP for RSS 28 | vm.var(name='srcIP', size=4, op='random', step=1, min_value=0, max_value=2000000000) 29 | vm.write(fv_name='srcIP', pkt_offset='IP.src') 30 | 31 | #Generate random flow tuples 32 | vm.var(name='tele_srcIP', min_value=1, max_value=10000000, size=4, op='random') 33 | vm.var(name='tele_dstIP', min_value=1, max_value=10000000, size=4, op='random') 34 | vm.var(name='tele_srcPort', min_value=1, max_value=65535, size=2, op='random') 35 | vm.var(name='tele_dstPort', min_value=1, max_value=65535, size=2, op='random') 36 | vm.var(name='tele_proto', min_value=1, max_value=2, size=1, op='random') 37 | vm.write(fv_name='tele_srcIP', pkt_offset='teledata.srcIP') 38 | vm.write(fv_name='tele_dstIP', pkt_offset='teledata.dstIP') 39 | vm.write(fv_name='tele_proto', pkt_offset='teledata.proto') 40 | vm.write(fv_name='tele_srcPort', pkt_offset='teledata.srcPort') 41 | vm.write(fv_name='tele_dstPort', pkt_offset='teledata.dstPort') 42 | 43 | #Increment the data 44 | vm.var(name='teledat', size=4, op='inc', step=1, min_value=1, max_value=1000000000) #Specify the data to append 45 | vm.write(fv_name='teledat', pkt_offset='teledata.data') 46 | 47 | return STLStream(packet = STLPktBuilder(pkt = base_pkt, vm = vm), mode = STLTXCont()) 48 | 49 | def get_streams (self, tunables, **kwargs): 50 | parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), formatter_class=argparse.ArgumentDefaultsHelpFormatter) 51 | args = parser.parse_args(tunables) 52 | # create 1 stream 53 | return [ self.create_stream() ] 54 | 55 | 56 | # dynamic load - used for trex console or simulator 57 | def register(): 58 | return STLS1() 59 | 60 | 61 | 62 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Jonatan Langlet 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"),
to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Manager/Collector.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | #This script handles communication with the collector 3 | 4 | import pexpect 5 | import time 6 | import datetime 7 | 8 | from Machine import Machine 9 | 10 | class Collector(Machine): 11 | ssh_collector = None 12 | interface = None 13 | ip = None 14 | 15 | def __init__(self, host, name="Collector"): 16 | self.host = host 17 | self.name = name 18 | self.log("Initiating %s at %s..." %(self.name, host)) 19 | 20 | assert self.testConnection(), "Connection to the Collector does not work!" 21 | 22 | def configureNetworking(self): 23 | self.log("Configuring networking...") 24 | 25 | ssh = self.init_ssh() 26 | 27 | time.sleep(0.5) 28 | 29 | ssh.sendline("./network_setup.sh") 30 | i = ssh.expect(["$", pexpect.TIMEOUT], timeout=10) 31 | assert i == 0, "Timeout while running network setup script!" 32 | 33 | #Check IP assignment 34 | ssh.sendline("ifconfig ens1f1np1") 35 | i = ssh.expect(["10.0.0.51", pexpect.TIMEOUT], timeout=2) 36 | assert i == 0, "Network interface failed to configure!" 37 | 38 | #Check one of the ARP rules 39 | ssh.sendline("arp 10.0.0.101") 40 | i = ssh.expect(["84:c7:8f:00:6d:b3", pexpect.TIMEOUT], timeout=2) 41 | assert i == 0, "ARP rules failed to update!" 42 | 43 | time.sleep(2) 44 | 45 | self.debug("Networking is set up.") 46 | 47 | def disableICRCVerification(self): 48 | self.log("Disabling iCRC verification on the NIC...") 49 | 50 | ssh = self.init_ssh() 51 | 52 | ssh.sendline("./disable-icrc.sh") 53 | i = ssh.expect(["WARNING: this script assumes", pexpect.TIMEOUT], timeout=2) 54 | assert i == 0, "Failed to start the disable-icrc.sh script!" 55 | 56 | i = ssh.expect(["$", pexpect.TIMEOUT], timeout=10) 57 | assert i == 0, "Timeout while running icrc disabling script!" 58 | 59 | time.sleep(5) 60 | 61 | self.debug("iCRC verification is now disabled") 62 | 63 | def setupRDMA(self): 64 | self.log("Setting up RDMA...") 65 | 66 | ssh = self.init_ssh() 67 | 68 | ssh.sendline("./rdma/setup_rdma.sh") 69 | i = ssh.expect(["Removing old modules", pexpect.TIMEOUT], timeout=2) 70 | assert i == 0, "Failed to start setup_rdma.sh!" 71 | 72 | i = ssh.expect(["INFO System info file", pexpect.TIMEOUT], timeout=20) 73 | assert i == 0, "Timeout while setting up RDMA!"
74 | 75 | time.sleep(2) 76 | 77 | self.debug("RDMA is now set up and configured!") 78 | 79 | def recompileRDMA(self): 80 | self.log("Recompiling RDMA...") 81 | 82 | ssh = self.init_ssh() 83 | 84 | ssh.sendline("cd rdma/rdma-core") 85 | ssh.expect("$", timeout=2) 86 | 87 | ssh.sendline("sudo ./build.sh") 88 | i = ssh.expect(["Build files have been written to", pexpect.TIMEOUT], timeout=30) 89 | assert i == 0, "Failed to build rdma-core!" 90 | 91 | self.debug("RDMA-core is now built") 92 | 93 | ssh.sendline("cd ~/rdma/mlnx_ofed/MLNX_OFED_SRC-5.5-1.0.3.2") 94 | ssh.expect("$", timeout=2) 95 | 96 | ssh.sendline("sudo ./install.pl") 97 | i = ssh.expect(["Checking SW Requirements", pexpect.TIMEOUT], timeout=30) 98 | assert i == 0, "Failed to start the mlnx_ofed installation!" 99 | self.debug("mlnx_ofed is now installing prerequisites...") 100 | 101 | i = ssh.expect(["This program will install the OFED package on your machine.", pexpect.TIMEOUT], timeout=180) 102 | assert i == 0, "Stuck at installing prerequisites!" 103 | 104 | i = ssh.expect(["Uninstalling the previous version of OFED", pexpect.TIMEOUT], timeout=180) 105 | assert i == 0, "Something went wrong!" 106 | self.debug("Uninstalling the previous version of OFED...") 107 | 108 | i = ssh.expect(["Building packages", pexpect.TIMEOUT], timeout=300) 109 | assert i == 0, "Stuck at uninstalling old OFED!" 110 | self.debug("Building new OFED (this will take a while)...") 111 | 112 | i = ssh.expect(["Installation passed successfully", pexpect.TIMEOUT], timeout=1800) 113 | assert i == 0, "Installation failed or timed out!" 114 | 115 | self.debug("OFED is now reinstalled!") 116 | 117 | 118 | 119 | def compileCollector(self): 120 | self.log("Compiling the collector service...") 121 | 122 | ssh = self.init_ssh() 123 | 124 | ssh.sendline("cd ./rdma/playground") 125 | ssh.expect("$", timeout=2) 126 | 127 | ssh.sendline("mv ./collector_new ./collector_backup_new") 128 | ssh.expect("$", timeout=2) 129 | 130 | ssh.sendline("./compile.sh") 131 | i = ssh.expect(["Compiling DTA Collector...", pexpect.TIMEOUT], timeout=10) 132 | assert i == 0, "Compilation did not start!" 133 | 134 | i = ssh.expect(["Compilation done", pexpect.TIMEOUT], timeout=20) 135 | assert i == 0, "Timeout while compiling collector service!" 136 | 137 | ssh.sendline("ls -l") 138 | i = ssh.expect(["collector_new", pexpect.TIMEOUT], timeout=5) 139 | assert i == 0, "Failed to compile collector service!" 140 | 141 | self.debug("Compilation finished") 142 | 143 | def killOldCollector(self): 144 | self.debug("Killing old collectors, if any are running") 145 | ssh = self.init_ssh() 146 | ssh.sendline("sudo killall collector_new") 147 | i = ssh.expect(["$", pexpect.TIMEOUT], timeout=10) 148 | assert i == 0, "Timeout while killing old collector service(s)!" 149 | 150 | time.sleep(2) 151 | 152 | def setupCollector(self): 153 | self.log("Setting up the collector...") 154 | 155 | self.killOldCollector() 156 | 157 | self.setupRDMA() 158 | #self.compileCollector() 159 | self.configureNetworking() 160 | self.disableICRCVerification() 161 | self.setupRDMA() 162 | 163 | def startCollector(self): 164 | self.log("Starting the DTA collector service...") 165 | 166 | self.ssh_collector = self.init_ssh() 167 | 168 | self.debug("Starting the service...") 169 | self.ssh_collector.sendline("sudo ./rdma/playground/collector_new") 170 | i = self.ssh_collector.expect(["Press ENTER to analyze storage.", pexpect.TIMEOUT], timeout=10) 171 | assert i == 0, "Failed to start the DTA collector!"
172 | 173 | 174 | time.sleep(3) #Give time for various primitive threads to complete 175 | 176 | i = self.ssh_collector.expect(["Segmentation fault", pexpect.TIMEOUT], timeout=2) 177 | assert i == 1, "The collector returned a segfault during startup!" 178 | 179 | time.sleep(2) 180 | 181 | self.log("DTA collector service is now running") 182 | 183 | def verifyRDMAConnections(self): 184 | self.log("Verifying RDMA connections from the translator") 185 | 186 | 187 | self.ssh_collector.sendline("") #Send an enter to collector service 188 | numStructures = 0 189 | while True: 190 | i = self.ssh_collector.expect(["Printing RDMA info for ", pexpect.TIMEOUT], timeout=5) 191 | if i == 0: 192 | numStructures += 1 193 | self.debug("Found output for DTA structure. Total %i" %numStructures) 194 | else: 195 | break 196 | 197 | self.log("There seems to be %i active DTA structures" %numStructures) 198 | assert numStructures > 0, "No RDMA connections were detected at the collector!" 199 | 200 | def ui_menu(self): 201 | self.log("Entering menu") 202 | 203 | while True: 204 | print("1: \t(B)ack") 205 | print("2: \t(R)eboot") 206 | print("3: \tRe(c)ompile DTA collector") 207 | print("4: \t(S)tart collector service") 208 | print("5: \tS(e)tup RDMA") 209 | print("6: \tConfigure (n)etworking") 210 | print("7: \t(D)isable iCRC verification") 211 | 212 | option = input("Selection: ").lower() 213 | 214 | if option in ["b", "1"]: 215 | break 216 | 217 | if option in ["r","2"]: 218 | self.reboot() 219 | 220 | if option in ["c","3"]: 221 | self.compileCollector() 222 | 223 | if option in ["s","4"]: 224 | self.startCollector() 225 | 226 | if option in ["e","5"]: 227 | self.setupRDMA() 228 | 229 | if option in ["n","6"]: 230 | self.configureNetworking() 231 | 232 | if option in ["d","7"]: 233 | self.disableICRCVerification() 234 | 235 | 236 | return 237 | -------------------------------------------------------------------------------- /Manager/Generator.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | #This script handles communication with the generator 3 | 4 | import pexpect 5 | import time 6 | import datetime 7 | 8 | from common import strToPktrate 9 | from Machine import Machine 10 | 11 | class Generator(Machine): 12 | interface = None 13 | ip = None 14 | 15 | ssh_trex = None 16 | ssh_trexConsole = None 17 | 18 | def __init__(self, host, name="Generator"): 19 | self.host = host 20 | self.name = name 21 | self.log("Initiating %s at %s..." %(self.name, host)) 22 | 23 | assert self.testConnection(), "Connection to the Generator does not work!" 24 | 25 | def configureNetworking(self): 26 | self.log("Configuring networking...") 27 | 28 | ssh = self.init_ssh() 29 | 30 | ssh.sendline("./network_setup.sh") 31 | i = ssh.expect(["$", pexpect.TIMEOUT], timeout=10) 32 | assert i == 0, "Timeout while running network setup script!" 33 | 34 | #Check IP assignment (disabled, dpdk will remove this interface) 35 | #ssh.sendline("ifconfig ens2f0") 36 | #i = ssh.expect(["10.0.0.200", pexpect.TIMEOUT], timeout=2) 37 | #assert i == 0, "Network interface failed to configure!" 38 | 39 | #Check one of the ARP rules 40 | #ssh.sendline("arp 10.0.0.51") 41 | #i = ssh.expect(["b8:ce:f6:d2:12:c7", pexpect.TIMEOUT], timeout=2) 42 | #assert i == 0, "ARP rules failed to update!"
43 | 44 | self.log("Networking is set up.") 45 | 46 | def startTrex(self): 47 | self.log("Starting TReX...") 48 | 49 | self.ssh_trex = self.init_ssh() 50 | self.ssh_trex.sendline("cd ./generator/trex") 51 | i = self.ssh_trex.expect("$", timeout=5) 52 | 53 | self.log("Launching trex service") 54 | self.ssh_trex.sendline("sudo ./t-rex-64 -i -c 16") 55 | i = self.ssh_trex.expect(["Starting Scapy server", pexpect.TIMEOUT], timeout=10) 56 | assert i == 0, "Trex does not respond!" 57 | 58 | i = self.ssh_trex.expect(["Global stats enabled", pexpect.TIMEOUT], timeout=30) 59 | assert i == 0, "Trex start timed out!" 60 | 61 | 62 | 63 | 64 | def startTrexConsole(self): 65 | self.log("Starting TReX Console...") 66 | self.ssh_trexConsole = self.init_ssh() 67 | 68 | self.ssh_trexConsole.sendline("cd ./generator/trex") 69 | i = self.ssh_trexConsole.expect("$", timeout=2) 70 | 71 | self.ssh_trexConsole.sendline("./trex-console") 72 | i = self.ssh_trexConsole.expect(["Server Info", pexpect.TIMEOUT], timeout=5) 73 | assert i == 0, "Console does not launch!" 74 | 75 | self.ssh_trexConsole.sendline("./trex-console") 76 | i = self.ssh_trexConsole.expect(["trex>", pexpect.TIMEOUT], timeout=10) 77 | assert i == 0, "Console timed out!" 78 | 79 | self.log("TReX Console is running!") 80 | 81 | time.sleep(2) 82 | 83 | def setup(self): 84 | self.log("Setting up the generator") 85 | 86 | self.configureNetworking() 87 | self.startTrex() 88 | self.startTrexConsole() 89 | 90 | #TODO: make this check Tofino rate-show! 91 | def findCurrentRate(self): 92 | trafficFlowing = False 93 | for i in range(10): 94 | 95 | time.sleep(2) 96 | 97 | #Clear the output buffer 98 | self.ssh_trex.read_nonblocking(1000000000, timeout = 1) 99 | i = self.ssh_trex.expect(["Total-PPS : 0.00 pps", pexpect.TIMEOUT], timeout=1) 100 | 101 | if i == 1: 102 | trafficFlowing = True 103 | break 104 | else: 105 | self.log("No traffic yet...") 106 | 107 | if not trafficFlowing: 108 | return 0 109 | 110 | #assert trafficFlowing, "TReX does not actually generate traffic!" 111 | 112 | #Retrieve reported packet rate 113 | self.ssh_trex.expect("Total-PPS", timeout=3) 114 | rate_str = self.ssh_trex.readline() 115 | rate_str = rate_str.decode('ascii').replace(" ", "").replace(":", "").replace("\r\n", "") 116 | self.debug("The reported rate is: %s" %rate_str) 117 | 118 | return strToPktrate(rate_str) 119 | 120 | def waitForSpeed(self, speed_target, error_target=0.1): 121 | #Check in trex daemon that traffic is actually generating 122 | self.log("Checking in TReX daemon that traffic is flowing...") 123 | 124 | #Wait for the traffic to be in the correct range 125 | rate_target = strToPktrate(speed_target) 126 | for i in range(10): 127 | time.sleep(2) 128 | rate = self.findCurrentRate() 129 | 130 | error = 1 - rate/rate_target 131 | 132 | self.debug("We generate %s pps, the target is %s pps" %(str(rate), str(rate_target))) 133 | 134 | if error < error_target: 135 | self.debug("The speed error is acceptable: %s" %str(error)) 136 | break 137 | 138 | self.debug("The speed error is too large: %s" %str(error)) 139 | 140 | assert error < error_target, "The traffic rate error is too large!" 
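#Illustration of the rate check above (comment only, no new behavior): with speed_target="1mpps" the target is 1,000,000 pps; a measured rate of 950,000 pps gives error = 1 - 950000/1000000 = 0.05, which is below the default error_target of 0.1, so the rate is accepted.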
141 | 142 | self.log("Traffic is flowing correctly!") 143 | 144 | #Start STL-based replays 145 | def startTraffic_script(self, script="stl/dta_keywrite_basic.py", speed="1kpps", tuneables=""): 146 | self.log("Starting traffic generation script %s at speed %s" %(script, speed)) 147 | 148 | cmd = "start -f %s -m %s -t %s" %(script, speed, tuneables) 149 | print(cmd) 150 | self.ssh_trexConsole.sendline(cmd) 151 | i = self.ssh_trexConsole.expect(["Starting traffic on port", "are active - please stop them or specify", pexpect.TIMEOUT], timeout=10) 152 | 153 | if i == 1: 154 | self.error("Can't start traffic, already running!") 155 | 156 | assert i != 2, "Traffic generation start timed out!" 157 | 158 | self.waitForSpeed(speed) #Wait until the target rate is achieved 159 | 160 | #Push Marple PCAP 161 | def startTraffic_pcap(self, pcap, speed="1kpps"): 162 | self.log("Replaying pcap %s at speed %s" %(pcap, speed)) 163 | 164 | rate = strToPktrate(speed) 165 | print("rate", rate) 166 | 167 | ipg = 1000000/rate 168 | print("ipg", ipg) 169 | 170 | cmd = "push --force -f %s -p 0 -i %f -c 0 --dst-mac-pcap" %(pcap, ipg) 171 | 172 | print(cmd) 173 | self.ssh_trexConsole.sendline(cmd) 174 | i = self.ssh_trexConsole.expect(["Starting traffic on port", "are active - please stop them or specify", pexpect.TIMEOUT], timeout=10) 175 | 176 | if i == 1: 177 | self.error("Can't start traffic, already running!") 178 | 179 | assert i != 2, "Traffic generation start timed out!" 180 | 181 | self.waitForSpeed(speed) #Wait until the target rate is achieved 182 | 183 | def startTraffic_keywrite(self, speed="1kpps", redundancy=4): 184 | self.log("Replaying KeyWrite traffic at redundancy %i and speed %s" %(redundancy, speed)) 185 | 186 | tuneables = "--redundancy %i" %(redundancy) 187 | self.startTraffic_script(script="stl/dta_keywrite_basic.py", speed=speed, tuneables=tuneables) 188 | 189 | def startTraffic_keyincrement(self, speed="1kpps", redundancy=4): 190 | self.log("Replaying KeyIncrement traffic at redundancy %i and speed %s" %(redundancy, speed)) 191 | 192 | tuneables = "--redundancy %i" %(redundancy) 193 | self.startTraffic_script(script="stl/dta_keyincrement_basic.py", speed=speed, tuneables=tuneables) 194 | 195 | def startTraffic_append(self, speed="1kpps"): 196 | self.log("Replaying Append traffic at speed %s" %(speed)) 197 | 198 | tuneables = "" 199 | self.startTraffic_script(script="stl/dta_append_basic.py", speed=speed, tuneables=tuneables) 200 | 201 | def startTraffic_marple(self): 202 | speed = input("Speed (e.g., 1mpps): ") 203 | pcap = "/home/jlanglet/generator/marple_dta.pcap" 204 | 205 | self.startTraffic_pcap(pcap, speed) 206 | 207 | def stopTraffic(self): 208 | self.log("Stopping traffic generation") 209 | 210 | self.ssh_trexConsole.sendline("stop") 211 | i = self.ssh_trexConsole.expect(["Stopping traffic on port", "no active ports", pexpect.TIMEOUT], timeout=5) 212 | if i == 1: 213 | self.error("No traffic is playing! Nothing to stop") 214 | 215 | assert i != 2, "Traffic stop timed out!" 
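#For reference, the traffic-control methods above drive the TReX console with its own commands: startTraffic_script sends "start -f <script> -m <speed> -t <tunables>", startTraffic_pcap sends "push --force -f <pcap> -p 0 -i <ipg> -c 0 --dst-mac-pcap", and stopTraffic sends "stop".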
216 | 217 | def ui_startTraffic(self): 218 | speed = input("Speed (e.g., 1mpps): ") 219 | 220 | #Configure and run primitive 221 | while True: 222 | 223 | primitive = input("Primitive (keywrite, append, keyincrement): ") 224 | 225 | if primitive == "keywrite": 226 | redundancy = int(input("Redundancy: ")) 227 | self.startTraffic_keywrite(speed=speed, redundancy=redundancy) 228 | 229 | elif primitive == "append": 230 | self.startTraffic_append(speed=speed) 231 | 232 | elif primitive == "keyincrement": 233 | redundancy = int(input("Redundancy: ")) 234 | self.startTraffic_keyincrement(speed=speed, redundancy=redundancy) 235 | 236 | 237 | else: 238 | print("Invalid choice") 239 | continue 240 | 241 | print("Started") 242 | 243 | break 244 | 245 | def ui_menu(self): 246 | self.log("Entering menu") 247 | 248 | while True: 249 | print("1: \t(B)ack") 250 | print("2: \t(S)tart traffic (script)") 251 | print("3: \tStart (M)arple traffic") 252 | print("4: \tS(t)op traffic") 253 | print("5: \t(K)ill console") 254 | print("6: \t(R)eboot") 255 | 256 | option = input("Selection: ").lower() 257 | 258 | if option in ["b", "1"]: 259 | break 260 | 261 | if option in ["s","2"]: 262 | self.ui_startTraffic() 263 | 264 | if option in ["m","3"]: 265 | self.startTraffic_marple() 266 | 267 | if option in ["t","4"]: 268 | self.stopTraffic() 269 | 270 | if option in ["k","5"]: 271 | self.log("Removing reference of TReX console!") 272 | self.ssh_trexConsole = None 273 | 274 | if option in ["r","6"]: 275 | self.reboot() 276 | 277 | -------------------------------------------------------------------------------- /Manager/Machine.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | #This script contains the Machine class, with various functions shared between the components 3 | import time 4 | import pexpect 5 | 6 | from common import log 7 | 8 | class Machine: 9 | host = None 10 | name = None 11 | 12 | def log(self, text): 13 | log("%s: \t%s" %(self.name, text)) 14 | 15 | #High verbosity output 16 | def debug(self, text): 17 | self.log(" Debug: %s" %(text)) 18 | 19 | def error(self, text): 20 | self.log("ERROR: %s" %(text)) 21 | 22 | def reboot(self): 23 | self.log("Rebooting %s at %s" %(self.name, self.host)) 24 | ssh = self.init_ssh() 25 | ssh.sendline("sudo reboot") 26 | i = ssh.expect([pexpect.EOF, pexpect.TIMEOUT], timeout=10) 27 | assert i == 0, "Failed to detect reboot!" 28 | 29 | time.sleep(1) 30 | 31 | def testConnection(self): 32 | self.debug("Testing connection to %s at %s..." %(self.name, self.host)) 33 | 34 | p = pexpect.spawn("ssh %s" %self.host) 35 | i = p.expect(["Welcome to Ubuntu", "Connection refused", pexpect.TIMEOUT, pexpect.EOF], timeout=5) 36 | if i == 0: 37 | self.debug("Logged into %s" %self.host) 38 | elif i == 1: 39 | self.debug("Connection refused!") 40 | return False 41 | elif i == 2: 42 | self.debug("Connection timeout!") 43 | return False 44 | elif i == 3: 45 | self.debug("SSH terminated!") 46 | return False 47 | 48 | self.debug("Verifying command capability...") 49 | content = "Testing" 50 | p.sendline("echo \"%s\" > ssh_works" %content) 51 | p.expect("$") 52 | p.sendline("cat ssh_works") 53 | i = p.expect([content, pexpect.TIMEOUT], timeout=2) 54 | if i != 0: 55 | self.error("Did not find expected output!") 56 | return False 57 | 58 | self.debug("Commands work!
Resetting and logging out...") 59 | 60 | p.sendline("rm ssh_works") 61 | p.expect("$") 62 | p.sendline("exit") 63 | time.sleep(1) 64 | 65 | p.close() 66 | 67 | return True 68 | 69 | def init_ssh(self): 70 | self.debug("Logging into %s at %s..." %(self.name, self.host)) 71 | p = pexpect.spawn("ssh %s" %self.host) 72 | i = p.expect(["$", pexpect.TIMEOUT], timeout=5) 73 | 74 | if i != 0: 75 | self.error("Timeout!") 76 | return None 77 | 78 | self.debug("SSH to %s is initiated" %self.host) 79 | 80 | return p 81 | -------------------------------------------------------------------------------- /Manager/Manager.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | #This script manages the entire DTA system. This is the script you want to run. 3 | 4 | import time 5 | 6 | #from common import log, debug, strToPktrate 7 | from Tofino import Tofino 8 | from Collector import Collector 9 | from Generator import Generator 10 | 11 | host_tofino = "jonatan@138.37.32.13" #Point to the Tofino 12 | host_collector = "jlanglet@138.37.32.24" #Point to the collector 13 | host_generator = "jlanglet@138.37.32.28" #Point to the traffic generator 14 | 15 | 16 | def setup(do_reboot=False, manual_collector=True): 17 | #Reboot the machines and wait for them to come back online 18 | if do_reboot: 19 | systems = [tofino, collector, generator] 20 | 21 | #Reboot all 22 | for system in systems: 23 | system.reboot() 24 | 25 | #Wait for all to come online 26 | for system in systems: 27 | while not system.testConnection(): 28 | print("%s is offline..." %system.name) 29 | time.sleep(20) 30 | 31 | 32 | tofino.flashPipeline() 33 | tofino.confPorts() 34 | 35 | tofino.configureNetworking() 36 | collector.setupCollector() 37 | 38 | if manual_collector: 39 | print("sudo /home/jlanglet/rdma/playground/collector_new") 40 | input("Start the DTA collector and press ENTER") 41 | else: 42 | collector.startCollector() 43 | 44 | 45 | tofino.startController() 46 | 47 | if not manual_collector: 48 | collector.verifyRDMAConnections() #Manually disabled 49 | 50 | generator.setup() 51 | 52 | def Menu(): 53 | print("1: \t(S)tart up DTA environment") 54 | print("2: \t(T)ofino menu") 55 | print("3: \t(C)ollector menu") 56 | print("4: \t(G)enerator menu") 57 | 58 | option = input("Selection: ").lower() 59 | 60 | if option in ["s","1"]: #Setup 61 | resp = input("Reboot machines? (y/N): ") 62 | do_reboot = resp == "y" 63 | resp = input("Start collector manually? (y/N): ") 64 | manual_collector = resp == "y" 65 | 66 | setup(do_reboot=do_reboot, manual_collector=manual_collector) 67 | 68 | if option in ["t","2"]: #Tofino 69 | tofino.ui_menu() 70 | 71 | if option in ["c","3"]: #Collector 72 | collector.ui_menu() 73 | 74 | if option in ["g","4"]: #Generator 75 | generator.ui_menu() 76 | 77 | 78 | #Set up connection to machines 79 | tofino = Tofino(host=host_tofino, pipeline="dta_translator") 80 | collector = Collector(host=host_collector) 81 | generator = Generator(host=host_generator) 82 | 83 | #Loop the menu 84 | while True: 85 | Menu() 86 | -------------------------------------------------------------------------------- /Manager/README.md: -------------------------------------------------------------------------------- 1 | # DTA - Manager 2 | This directory contains code for the Direct Telemetry Access manager. 3 | 4 | The manager is a central computer that is in charge of setting up, configuring, and running the tests.
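For example, the testbed endpoints are hard-coded at the top of [Manager.py](Manager.py) (the values below are from our testbed; replace them with your own hosts, as described under Configuration below):

```
host_tofino = "jonatan@138.37.32.13" #Point to the Tofino
host_collector = "jlanglet@138.37.32.24" #Point to the collector
host_generator = "jlanglet@138.37.32.28" #Point to the traffic generator
```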
5 | This is meant to automate and simplify running DTA tests, and can be used as a guide on how the components are set up to work together. 6 | 7 | ## Configuration 8 | At the moment, the management scripts contain a lot of hard-coded hostnames, paths, etc. 9 | Please go through the scripts and update these to match your testbed. 10 | 11 | TODO: simplify by making this into a configuration file. 12 | 13 | ## Usage 14 | You interact with the manager through a simple CLI menu. 15 | Please just launch the manager on a machine (with connectivity to the machines in the testbed) through `./Manager.py`, and use the menu to set up and test DTA. 16 | 17 | TODO: write a guide with example menu actions and expected outputs. Also explain how users can manually start the collector if they want to inspect the data structures. 18 | -------------------------------------------------------------------------------- /Manager/Tofino.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | #This script prepares and launches a DTA pipeline on the Tofino switch, and prepares RDMA states 3 | #Assuming our setup, and SDE 9.7.0 4 | 5 | 6 | 7 | import pexpect 8 | import time 9 | import datetime 10 | 11 | from Machine import Machine 12 | 13 | #This is currently forced to only be the Translator Tofino 14 | class Tofino(Machine): 15 | pipeline = None 16 | port_config = None 17 | essential_ports = None 18 | 19 | ssh_switchd = None 20 | ssh_controller = None 21 | 22 | def __init__(self, host, pipeline, name="Tofino"): 23 | self.host = host 24 | self.name = name 25 | self.pipeline = pipeline 26 | 27 | self.log("Initiating %s at %s..." %(self.name, host)) 28 | 29 | #TODO: Move these into a centralized config file 30 | self.port_config = [ 31 | "pm port-del -/-", 32 | "pm port-add 49/0 100G rs", 33 | "pm port-add 55/0 100G rs", 34 | "pm port-add 57/0 10G none", 35 | "pm port-add 57/1 10G none", 36 | "pm an-set 49/0 1", 37 | "pm an-set 55/0 1", 38 | "pm an-set 57/0 1", 39 | "pm an-set 57/1 1", 40 | "pm port-enb -/-", 41 | "bf_pltfm led", 42 | "led-task-cfg -r 1", 43 | "..", 44 | ".." 45 | ] 46 | self.essential_ports = [ 47 | "49/0", 48 | "55/0", 49 | "57/0", 50 | "57/1", 51 | ] 52 | 53 | assert self.testConnection(), "Connection to the Tofino does not work!" 54 | 55 | #This is currently forced to just compile the Translator pipeline 56 | def compilePipeline(self, enable_nack_tracking=True, num_tracked_nacks=65536, append_batch_size=4, resync_grace_period=100000, max_supported_qps=256): 57 | 58 | project = "dta_translator" 59 | p4_file = "~/projects/dta/translator/p4src/dta_translator.p4" 60 | 61 | self.log("Compiling project %s from source %s..."
%(project, p4_file)) 62 | 63 | assert append_batch_size in [1,2,4,8,16], "Unsupported Append batch size" 64 | 65 | 66 | ssh = self.init_ssh() 67 | 68 | # 69 | # Generate compilation command 70 | # 71 | preprocessor_directives = "" 72 | #Nack tracking/retransmission 73 | if enable_nack_tracking: 74 | preprocessor_directives += " -DDO_NACK_TRACKING" 75 | preprocessor_directives += " -DNUM_TRACKED_NACKS=%i" %num_tracked_nacks 76 | 77 | #Append batch size (num batched entries) 78 | preprocessor_directives += " -DAPPEND_BATCH_SIZE=%i" %append_batch_size 79 | preprocessor_directives += " -DNUM_APPEND_ENTRIES_IN_REGISTERS=%i" %(append_batch_size-1) 80 | preprocessor_directives += " -DAPPEND_RDMA_PAYLOAD_SIZE=%i" %(append_batch_size*4) 81 | 82 | #Grace period 83 | preprocessor_directives += " -DQP_RESYNC_PACKET_DROP_NUM=%i" %(resync_grace_period) 84 | 85 | #Max supported queue pairs 86 | preprocessor_directives += " -DMAX_SUPPORTED_QPS=%i" %(max_supported_qps) 87 | 88 | #Build actual compilation command out of components 89 | command = "bf-p4c --target tofino --arch tna --std p4-16 -g -o $P4_BUILD_DIR/%s/ %s %s && echo Compilation\ finished" %(project, preprocessor_directives, p4_file) 90 | 91 | self.debug("Executing '%s'..." %command) 92 | 93 | ssh.sendline(command) 94 | i = ssh.expect(["Compilation finished", "error:", pexpect.TIMEOUT], timeout=180) 95 | 96 | if i == 1: 97 | self.error("Compilation error!") 98 | elif i == 2: 99 | self.error("Compilation timeout!") 100 | 101 | assert i == 0, "Pipeline compilation failed!" 102 | 103 | self.debug("Compilation done!") 104 | 105 | 106 | def flashPipeline(self): 107 | self.ssh_switchd = self.init_ssh() 108 | 109 | #Killing old process (if one is running) 110 | self.ssh_switchd.sendline("sudo killall bf_switchd") 111 | self.ssh_switchd.expect("$", timeout=4) 112 | 113 | 114 | #Flash the pipeline 115 | self.log("Flashing pipeline %s at %s" %(self.pipeline, self.host)) 116 | self.ssh_switchd.sendline("./start_p4.sh %s" %self.pipeline) 117 | 118 | i = self.ssh_switchd.expect(["Using SDE_INSTALL", pexpect.TIMEOUT], timeout=5) 119 | assert i == 0, "Failed to initiate pipeline startup!" 120 | 121 | self.debug("Pipeline is flashing...") 122 | 123 | i = self.ssh_switchd.expect(["WARNING: Authorised Access Only", pexpect.TIMEOUT], timeout=10) 124 | assert i == 0, "Failed to flash the pipeline!" 125 | self.debug("Pipeline '%s' is now running on host %s!" %(self.pipeline, self.host)) 126 | 127 | def initBFshell(self): 128 | ssh = self.init_ssh() 129 | 130 | self.debug("Entering bfshell...") 131 | ssh.sendline("bfshell") 132 | i = ssh.expect(["WARNING: Authorised Access Only", pexpect.TIMEOUT], timeout=10) 133 | assert i == 0, "Failed to enter bfshell!" 134 | self.debug("bfshell established!") 135 | 136 | return ssh 137 | 138 | def initUCLI(self): 139 | ssh = self.initBFshell() 140 | 141 | self.debug("Entering ucli...") 142 | ssh.sendline("ucli") 143 | i = ssh.expect(["bf-sde", pexpect.TIMEOUT], timeout=5) 144 | assert i == 0, "Failed to enter ucli!" 145 | 146 | return ssh 147 | 148 | #This assumes that ports are already configured 149 | def verifyPorts(self): 150 | self.log("Verifying that ports are online...") 151 | 152 | ssh_ucli = self.initUCLI() 153 | 154 | ssh_ucli.sendline("pm") 155 | ssh_ucli.expect("bf-sde.pm>", timeout=2) 156 | 157 | for port in self.essential_ports: 158 | self.debug("Checking port %s..."
%port) 159 | 160 | portUp = False 161 | for i in range(10): 162 | ssh_ucli.sendline("show %s" %port) 163 | i = ssh_ucli.expect([port, pexpect.TIMEOUT], timeout=10) 164 | assert i == 0, "Port %s was not configured!" %port 165 | 166 | i = ssh_ucli.expect(["UP", "DWN", pexpect.TIMEOUT], timeout=10) 167 | 168 | assert i != 2, "Timeout when checking port status!" 169 | if i == 1: 170 | self.debug("Port %s is down..." %port) 171 | time.sleep(5) 172 | continue 173 | elif i == 0: 174 | self.debug("Port %s is up!" %port) 175 | portUp = True 176 | break 177 | assert portUp, "Port %s did not come alive! Is the host connected and online?" %port 178 | 179 | 180 | self.debug("Ports are configured and ready for action!") 181 | 182 | #This assumes that a pipeline is already flashed, and a switchd session is running 183 | def confPorts(self): 184 | self.log("Configuring Tofino ports on %s..." %self.host) 185 | 186 | ssh_ucli = self.initUCLI() 187 | 188 | for cmd in self.port_config: 189 | self.debug(" > %s" %cmd) 190 | ssh_ucli.sendline(cmd) 191 | time.sleep(0.1) 192 | 193 | i = ssh_ucli.expect(["bf-sde.bf_pltfm.led", pexpect.TIMEOUT], timeout=5) 194 | assert i == 0, "Failed to enter port config commands!" 195 | 196 | time.sleep(1) #Give configuration time to trigger 197 | self.debug("Ports are now configured.") 198 | 199 | self.verifyPorts() 200 | 201 | def configureNetworking(self): 202 | self.log("Configuring networking...") 203 | 204 | ssh = self.init_ssh() 205 | 206 | ssh.sendline("./network_setup.sh") 207 | i = ssh.expect(["$", pexpect.TIMEOUT], timeout=10) 208 | assert i == 0, "Timeout while running network setup script!" 209 | 210 | #Check an IP assignment 211 | ssh.sendline("ifconfig enp4s0f0") 212 | i = ssh.expect(["10.0.0.101", pexpect.TIMEOUT], timeout=2) 213 | assert i == 0, "Network interface failed to configure!" 214 | 215 | #Check one of the ARP rules 216 | ssh.sendline("arp 10.0.0.51") 217 | i = ssh.expect(["b8:ce:f6:d2:12:c7", pexpect.TIMEOUT], timeout=2) 218 | assert i == 0, "ARP rules failed to update!" 219 | 220 | self.debug("Networking is set up.") 221 | 222 | def startController(self): 223 | self.log("Starting the controller...") 224 | 225 | #TODO: Make this into a parameter or dynamic depending on pipeline 226 | #file_script = "/home/jonatan/projects/dta/translator/switch_cpu.py" 227 | 228 | self.ssh_controller = self.init_ssh() 229 | 230 | self.ssh_controller.sendline("$SDE/run_bfshell.sh -b /home/jonatan/projects/dta/translator/switch_cpu.py -i") 231 | 232 | i = self.ssh_controller.expect(["Using SDE_INSTALL", pexpect.TIMEOUT], timeout=5) 233 | assert i == 0, "bfshell failed to start!" 234 | 235 | i = self.ssh_controller.expect(["DigProc: Starting", pexpect.TIMEOUT], timeout=5) 236 | assert i == 0, "Controller script failed to start!" 237 | self.debug("Controller script is starting...") 238 | 239 | #TODO: add checks that we hear back from the collector RDMA NIC here! 240 | 241 | i = self.ssh_controller.expect(["Inserting KeyWrite rules...", pexpect.TIMEOUT], timeout=5) 242 | assert i == 0, "Timeout waiting for keywrite preparation!" 243 | self.debug("Controller is configuring KeyWrite...") 244 | 245 | 246 | numConnections = 0 247 | while True: 248 | i = self.ssh_controller.expect(["DigProc: Setting up a new RDMA connection from virtual client...", pexpect.TIMEOUT], timeout=10) 249 | if i == 0: 250 | numConnections += 1 251 | self.debug("An RDMA connection is establishing at translator. 
Total %i" %numConnections) 252 | else: 253 | break 254 | 255 | self.log("There seems to be %i RDMA connections established at the translator" %numConnections) 256 | assert numConnections > 0, "No RDMA connections were detected at the translator!" 257 | 258 | i = self.ssh_controller.expect(["DigProc: Bootstrap complete", pexpect.TIMEOUT], timeout=60) 259 | assert i == 0, "Timeout waiting for controller to finish!" 260 | self.log("Controller bootstrap finished!") 261 | 262 | def ui_compilePipeline(self): 263 | self.log("Menu for compiling Translator pipeline.") 264 | 265 | #enable_nack_tracking 266 | resp = input("Enable NACK tracking? (Y/n): ") 267 | enable_nack_tracking = resp != "n" 268 | 269 | #num_tracked_nacks 270 | num_tracked_nacks = 0 271 | if enable_nack_tracking: 272 | resp = input("Num tracked NACKs? (Def:65536): ") 273 | if resp == "": 274 | num_tracked_nacks = 65536 275 | else: 276 | num_tracked_nacks = int(resp) 277 | 278 | #append_batch_size 279 | resp = input("Size of Append batches? (Def:4): ") 280 | if resp == "": 281 | append_batch_size = 4 282 | else: 283 | append_batch_size = int(resp) 284 | 285 | #resync_grace_period 286 | resp = input("Resync grace period? (Def:100000): ") 287 | if resp == "": 288 | resync_grace_period = 100000 289 | else: 290 | resync_grace_period = int(resp) 291 | 292 | #max_supported_qps 293 | resp = input("Number of supported QPs? (Def:256): ") 294 | if resp == "": 295 | max_supported_qps = 256 296 | else: 297 | max_supported_qps = int(resp) 298 | 299 | self.compilePipeline(enable_nack_tracking=enable_nack_tracking, num_tracked_nacks=num_tracked_nacks, append_batch_size=append_batch_size, resync_grace_period=resync_grace_period, max_supported_qps=max_supported_qps) 300 | 301 | def ui_menu(self): 302 | self.log("Entering menu") 303 | 304 | while True: 305 | print("1: \t(B)ack") 306 | print("2: \t(C)ompile pipeline") 307 | print("3: \t(F)lash pipeline") 308 | print("4: \tConfigure (p)orts") 309 | print("5: \tConfigure (n)etworking") 310 | print("6: \t(S)tart controller") 311 | print("7: \t(R)eboot") 312 | 313 | option = input("Selection: ").lower() 314 | 315 | if option in ["b", "1"]: 316 | break 317 | 318 | if option in ["c","2"]: 319 | self.ui_compilePipeline() 320 | 321 | if option in ["f","3"]: 322 | self.flashPipeline() 323 | 324 | if option in ["p","4"]: 325 | self.confPorts() 326 | 327 | if option in ["n","5"]: 328 | self.configureNetworking() 329 | 330 | if option in ["s","6"]: 331 | self.startController() 332 | 333 | if option in ["r","7"]: 334 | self.reboot() 335 | 336 | -------------------------------------------------------------------------------- /Manager/common.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | #This contains various helper functions 3 | 4 | import pexpect 5 | import time 6 | import datetime 7 | import re 8 | 9 | def getTime(): 10 | return datetime.datetime.now() 11 | 12 | def log(text): 13 | timestamp_str = getTime() 14 | fulltext = "%s\t %s" %(timestamp_str, text) 15 | 16 | print(fulltext) 17 | 18 | #Converting inputs like 966.95kpps or 1MPPS to float with raw PPS 19 | def strToPktrate(rate_str): 20 | rate_str = rate_str.lower() 21 | 22 | number = float(re.findall("[0-9]+\.[0-9]+|[0-9]+", rate_str)[0]) 23 | 24 | order = str(re.findall("[km]?pps", rate_str)[0]) 25 | 26 | #print(number) 27 | #print(order) 28 | #print() 29 | 30 | if order == "pps": 31 | return number 32 | elif order == "kpps": 33 | return number*1000 34 | elif order == "mpps": 35 | return 
number*1000000 36 | 37 | 38 | return None 39 | -------------------------------------------------------------------------------- /Overview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jonlanglet/DTA/6da6cb5e13b53bd6155ee67617827d2b499ac906/Overview.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Direct Telemetry Access (DTA) 2 | ![Overview](Overview.png) 3 | 4 | This repository contains the code for Direct Telemetry Access. 5 | 6 | Direct Telemetry Access is a peer-reviewed system for high-speed telemetry collection, capable of line-rate report ingestion. 7 | 8 | The paper is available here: [ACM SIGCOMM](https://dl.acm.org/doi/10.1145/3603269.3604827) / [arXiv](https://arxiv.org/abs/2202.02270) / [langlet.io](https://langlet.io/assets/papers/DTA_SIGCOMM.pdf). 9 | 10 | ## Overview of Components 11 | DTA is a system consisting of several components, each in their own directories. 12 | 13 | ### Reporter 14 | [Reporter/](Reporter/) is a DTA reporter switch. 15 | This switch can generate telemetry reports through DTA. 16 | 17 | ### Translator 18 | [Translator/](Translator/) is a DTA translator switch. 19 | This switch will intercept DTA reports and convert these into RDMA traffic. 20 | It is in charge of establishing and managing RDMA queue-pairs with the collector server. 21 | 22 | ### Collector 23 | [Collector/](Collector/) contains files for the DTA collector. 24 | This component will reside on the collector server, and will host the in-memory data aggregation structures that the translator will write telemetry reports into. 25 | 26 | ### Generator 27 | [Generator/](Generator/) contains files for the TReX traffic generator. 28 | 29 | ### Manager 30 | [Manager/](Manager/) is a set of automation scripts for DTA that handles testbed setup and configuration by connecting to and running commands on the various DTA components. 31 | While the manager is not essential for DTA, it greatly simplifies tests while also indirectly acting as documentation for how to use the DTA system in this repository. 32 | 33 | 34 | ## Requirements 35 | 1. A fully installed and functional Tofino switch. 36 | 2. A server equipped with a RoCEv2-capable RDMA NIC, configured and ready for RDMA workloads. 37 | 3. Optional: one additional server to act as a traffic generator. 38 | 4. Cabling between the devices. 39 | 40 | ### NICs 41 | DTA likely works with most RoCEv2-capable rNICs where you can disable iCRC verification. 42 | 43 | It is so far confirmed to work with the following rNICs: 44 | - ConnectX-6 45 | - Bluefield-2 DPU (we used this) 46 | 47 | Please let me know if you have tried other NICs, and I will update the list. 48 | 49 | ### Testbed 50 | Our development/evaluation testbed was set up as follows: 51 | 52 | ![Testbed](Testbed.png) 53 | 54 | If you change the cabling, update the [Translator](Translator/) accordingly. 55 | 56 | ## Installation 57 | **The installation is complex. Make sure that you understand the components and workflow.** 58 | 59 | As previously mentioned, DTA consists of several components. A working translator and collector are the base essentials. 60 | Please refer to the individual component directories for installation guides and tips. 61 | 62 | 1. Install the DTA [Collector](Collector/) **Essential** 63 | 2. Install the DTA [Translator](Translator/) **Essential** 64 | 3.
Set up the traffic [Generator](Generator/) (Optional) 65 | 4. Set up the DTA [Manager](Manager/) (Optional) 66 | 5. Compile and install the DTA [Reporter](Reporter/) (Optional) 67 | 68 | ### Tofino setup 69 | Our DTA prototype is written for the Tofino-1 ASIC, specifically running SDE version 9.7. 70 | Newer SDE versions are likely to work just as well (possibly with minor tweaks to the translator code). 71 | 72 | 1. Install the SDE and BSP according to official documentation from Intel and the board manufacturer. 73 | 2. Verify that you can compile and launch P4 pipelines on the Tofino ASIC, and that you can successfully process network traffic. 74 | 3. Modify the translator P4 code to generate RDMA packets with correct MAC addresses for the NIC (function `ControlCraftRDMA` in file [dta_translator.p4](Translator/p4src/dta_translator.p4)) 75 | 4. **This step could prove difficult.** Modify the initial RDMA packets generated from the Translator CPU to be compatible with your network card (in file [init_rdma_connection.py](Translator/init_rdma_connection.py)), so that it can successfully establish new RDMA connections. I recommend establishing an RDMA connection to the collector NIC through normal means (using another machine) and dumping the first few packets to use as a template on how to establish an RDMA queue-pair. The current packets establish a queue-pair with our specific Mellanox Bluefield-2 DPU. 76 | 5. Update the `--dir` value in init_rdma_connection.py and `metadata_dir` in switch.py to point to the same directory. This is where the RDMA metadata values (parsed from responses during the RDMA connection phase) are written. These values are later used to populate P4 M/A tables, required for generation of connection-specific RDMA packets from within the data plane. 77 | 78 | See [Translator/](Translator/) for more information. 79 | 80 | ## Running DTA 81 | Once the DTA testbed is successfully set up, running it is relatively straightforward. We provide a set of automation scripts that could be useful, as well as a brief guide on how to do it manually. 82 | 83 | ### Using the DTA manager (automated) 84 | The DTA manager automates starting DTA and performing simple tests. 85 | Follow the guide in [Manager/](Manager/). 86 | 87 | ### Running DTA manually 88 | Basically, you can manually do the tasks that the manager does automatically. If you get stuck, please refer to the manager scripts for hints. 89 | 1. Start the [Collector](Collector/) 90 | 2. Start the [Translator](Translator/) 91 | 3. Replay DTA traffic to the translator (for example using a [traffic generator](Generator/)) 92 | 4. Analyze and print out the data structures at the collector (you should see how they are populated according to the DTA traffic intercepted by the translator). 93 | 94 | ## Integrating DTA into your telemetry system 95 | Integrate DTA into your telemetry data flows to benefit from improved collection performance. 96 | 97 | You need to update the telemetry-generating devices (reporters) to generate their telemetry reports with DTA headers (see [Reporter/](Reporter/) for an example). 98 | Additionally, you need to update your centralized collector(s) to register the telemetry-storing data structures with RDMA to allow the translator(s) to access these regions (see [Collector/](Collector/) for an example). 99 | 100 | It is also possible to craft new DTA primitives to better fit the specifics of your telemetry system; the sketch below shows the existing KeyWrite report layout as a starting point.
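The following is a sketch reconstructed from the Scapy/TReX definitions in [Generator/dta_keywrite_basic.py](Generator/dta_keywrite_basic.py); it is a reference only, and the addresses and ports are the values from our testbed, to be adapted to yours:

```
from scapy.all import Ether, IP, UDP, Packet, XByteField, ByteField, BitField, IntField

class DTABase(Packet): #Shared DTA base header
    name = "dtaBase"
    fields_desc = [
        XByteField("opcode", 0x01), #0x01 = KeyWrite
        XByteField("seqnum", 0), #DTA sequence number
        BitField("immediate", 0, 1),
        BitField("retransmitable", 0, 1),
        BitField("reserved", 0, 6)
    ]

class DTAKeyWrite(Packet): #KeyWrite-specific fields
    name = "dtaKeyWrite"
    fields_desc = [
        ByteField("redundancy", 0),
        IntField("key", 0),
        IntField("data", 0)
    ]

#One KeyWrite report storing data=1 under key=1, with redundancy 4
report = Ether()/IP(src="10.0.0.200", dst="10.0.0.51")\
    /UDP(sport=40041, dport=40040)\
    /DTABase(opcode=0x01, seqnum=0)\
    /DTAKeyWrite(redundancy=4, key=1, data=1)
```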
101 | This could be a challenging process, but you can use our already implemented primitives as a reference on how to do this. 102 | 103 | 104 | ## Cite As 105 | Please cite our work as follows: 106 | 107 | ``` 108 | @inproceedings{langlet2023DTA, 109 | author = {Langlet, Jonatan and Ben Basat, Ran and Oliaro, Gabriele and Mitzenmacher, Michael and Yu, Minlan and Antichi, Gianni}, 110 | title = {Direct Telemetry Access}, 111 | year = {2023}, 112 | isbn = {9798400702365}, 113 | publisher = {Association for Computing Machinery}, 114 | address = {New York, NY, USA}, 115 | url = {https://doi.org/10.1145/3603269.3604827}, 116 | doi = {10.1145/3603269.3604827}, 117 | booktitle = {Proceedings of the ACM SIGCOMM 2023 Conference}, 118 | pages = {832–849}, 119 | numpages = {18}, 120 | keywords = {monitoring, telemetry collection, remote direct memory access}, 121 | location = {New York, NY, USA}, 122 | series = {ACM SIGCOMM '23} 123 | } 124 | ``` 125 | 126 | ## Need Help? 127 | This repository is a prototype to demonstrate the capabilities and feasibility of DTA. 128 | However, the installation is not streamlined. 129 | 130 | If you get stuck, please reach out to [Jonatan Langlet](https://langlet.io/) at `jonatan at langlet.io` and I will help out as best I can. 131 | 132 | I am also open to collaborations on DTA-adjacent research. 133 | -------------------------------------------------------------------------------- /Reporter/README.md: -------------------------------------------------------------------------------- 1 | # DTA - Reporter 2 | This is an example telemetry-generating switch with DTA support. 3 | 4 | The reporter presented here generates telemetry postcards containing placeholder data, queryable in a key/value store using the source IP address as the key. 5 | 6 | Report generation is triggered by change detection, and also for randomly sampled packets even when no change is detected. 7 | 8 | Ingress performs change detection and triggers generation of a new packet (to be used for report creation). 9 | Egress transforms these report packets into DTA reports. 10 | 11 | ## Prerequisites 12 | You need a Tofino switch, fully installed and operational. 13 | We used the Barefoot SDE 9.7 during development and evaluation. 14 | 15 | ## Setup 16 | 1. Compile the Reporter pipeline [here](p4src/dta_reporter.p4) 17 | 18 | ## Runtime 19 | 1. Launch the compiled pipeline on the Tofino ASIC 20 | 2. Configure the ports 21 | 3. Launch the on-switch CPU component `$SDE/run_bfshell.sh -b