├── README.md ├── analyzer ├── Analyzer.java ├── Makefile ├── analyzer.sh ├── bottleneck.py ├── change-jstack.py ├── knot.sh ├── merge.py ├── short.py └── waitfor.csv ├── annotation ├── c │ ├── main.c │ └── thread.c └── java │ ├── Makefile │ ├── edu │ └── osu │ │ └── cse │ │ └── ops │ │ ├── UDS.class │ │ ├── UDS.java │ │ ├── UDSResult.class │ │ └── UDSResult.java │ ├── edu_osu_cse_ops_UDS.h │ ├── kerntool.c │ └── kerntool.h ├── module ├── 1prepare-java.sh ├── 1prepare.sh ├── Makefile ├── ioctl_perf.c └── ioctl_perf.h └── recorder ├── Makefile ├── cpufreq.sh ├── gdb.script ├── record-c.sh ├── record-java.sh ├── record.sh ├── recorder.c └── recorder.h /README.md: -------------------------------------------------------------------------------- 1 | ## wPerf: Generic Off-CPU Analysis to Identify Bottleneck Waiting Events 2 | 3 | This repository is the implementation of wPerf. 4 | There are several modules in this repo: 5 | - /annotation - Code that users can use to annotate waiting events which do not go through the kernel. 6 | - /analyzer - wPerf's analyzer, which generates the wait-for graph. 7 | - /module - wPerf's kernel module, which facilitates recording low-level waiting events in the kernel. 8 | - /recorder - wPerf's recorder. 9 | 10 | 11 | wPerf is designed to identify bottlenecks caused by all kinds of waiting events. 12 | To identify waiting events that limit the application’s throughput, wPerf uses cascaded re-distribution to compute 13 | the local impact of a waiting event and uses a wait-for graph to compute whether such impact can reach other threads. 14 | 15 | 16 | Check our paper for more details: [wPerf: Generic Off-CPU Analysis to Identify Bottleneck Waiting Events](https://www.usenix.org/system/files/osdi18-zhou.pdf), OSDI 2018. 17 | 18 | ## Requirements: 19 | - [Kernel](http://www.kernel.org/): version 4.4 or newer. You also need to enable the KProbe and CONFIG_SCHEDSTATS features. 
20 | 21 | We tested wPerf on several kernel versions (4.4, 4.7, 4.9, and 4.13) and it worked well. For kernel versions below 4.4, it should work once the mentioned features are enabled. 22 | - [Python 2](http://www.python.org/) 23 | 24 | You also need to install two Python libraries: [PrettyTable](https://pypi.org/project/PrettyTable/) and [NetworkX](https://networkx.github.io/). 25 | - perf: it should support DWARF call-graph profiling. 26 | - iostat: version >= 11.2.0, used to record disk IOPS. 27 | - ifstat: version >= 1.1, used to record NIC bandwidth. 28 | - gdb: used to connect thread names to thread IDs for C programs. 29 | - jstack: used to connect thread names to thread IDs for Java programs. 30 | - [perf-map-agent](https://github.com/jvm-profiling-tools/perf-map-agent): used to record call stacks of Java applications. Copy this directory into wPerf/ 31 | - wPerf uses /tmp/ to store some intermediate results. Make sure that this directory exists. 32 | 33 | ## Compile your application: 34 | - Annotate if necessary: If there are any waiting events, such as RDMA operations and spinlocks, that do not go through the kernel, you have to annotate them yourself. 35 | 36 | For C programs, copy the code in wPerf/annotation/c/main.c into the main function and the code in wPerf/annotation/c/thread.c into the code where you want to annotate. 37 | 38 | For Java programs, run make in wPerf/annotation/java/ and include the annotation library in your code. 39 | 40 | Then, for both kinds of programs, use uds_add(&address, event_type) to record such waiting events. 41 | 42 | - C program: compile your program with the "-g" option. 43 | - Java program: start the JVM with the option "-XX:+PreserveFramePointer", which is available since JDK8u60. 44 | 45 | ## Compile wPerf kernel module and recorder: 46 | Run make in the repo's root directory. 47 | 48 | ## Test your application with wPerf: 49 | After you start your application, run wPerf/recorder/record.sh and follow the instructions. 
50 | 51 | ## Generate wait-for graph: 52 | Once the test is finished, you can run wPerf/analyzer/analyzer.sh to generate the wait-for graph. 53 | Then use wPerf/analyzer/knot.sh to start the python program identifying the bottlenecks in your application. 54 | Use 'g' with the bottleneck component number to generate a csv file, which contains the bottleneck. 55 | 56 | ## Show time: 57 | Use our [online graph explorer](https://osusyslab.github.io/wperf/) with your result file and check out the bottleneck in that knot. 58 | 59 | As one can see from the screenshot, the left part shows the graph information and the right part shows the wait-for graph. 60 | ### Screenshot: 61 | ![Screenshot of wPerf](https://github.com/OSUSysLab/OSUSysLab.github.io/blob/master/wperf/wPerf_screenshot.png) 62 | 63 | ## Contact 64 | If you have any questions, please contact Fang at zhou.1250@osu.edu, Yifan at gan.101@osu.edu, Sixiang at ma.1189@osu.edu, and Yang at wang.7564@osu.edu 65 | -------------------------------------------------------------------------------- /analyzer/Analyzer.java: -------------------------------------------------------------------------------- 1 | import java.io.*; 2 | import java.util.*; 3 | import java.nio.*; 4 | import java.nio.file.*; 5 | import java.nio.charset.*; 6 | import java.util.concurrent.ConcurrentHashMap; 7 | import java.util.concurrent.ConcurrentSkipListMap; 8 | 9 | class Analyzer { 10 | 11 | static Map overAllRes = new ConcurrentHashMap(); 12 | static Map> segmentMap = new ConcurrentHashMap>(); 13 | static Map> softMap = new ConcurrentHashMap>(); 14 | static Map resultMap = new ConcurrentHashMap(); 15 | static ArrayList tidList = new ArrayList(); 16 | static Map prevList = new HashMap(); 17 | static ArrayList> totalResult = new ArrayList>(); 18 | static Map> udsList = new ConcurrentHashMap>(); 19 | static Map> dpList = new ConcurrentHashMap>(); 20 | static Map> udswList = new ConcurrentHashMap>(); 21 | static Map> udsCheck = new 
ConcurrentHashMap>(); 22 | static Map> udsQuick = new ConcurrentHashMap>(); 23 | static Map createTimeList = new HashMap(); 24 | static Map killTimeList = new HashMap(); 25 | static ArrayList elist = new ArrayList(); 26 | static ArrayList spinlist = new ArrayList(); 27 | 28 | 29 | static int isUDS = 0; 30 | 31 | // Remove duplicated function stacks 32 | static Map finalPerf = new HashMap(); 33 | static Map> compPerf = new HashMap>(); 34 | 35 | static long starttime = 0; 36 | static long endtime = 0; 37 | 38 | //static double freq = 2.4e9; 39 | static double freq = 0.0; 40 | 41 | //Added for test 42 | //static long waitTime=0; 43 | //static long totalWaitTime = 0; 44 | 45 | public void Fangtest() { 46 | ConcurrentSkipListMap sm1 = new ConcurrentSkipListMap(); 47 | sm1.put((long)1, new Segment(111, 2, 1, 10, 222)); 48 | segmentMap.put(111, sm1); 49 | ConcurrentSkipListMap sm2 = new ConcurrentSkipListMap(); 50 | sm2.put((long)1, new Segment(222, 2, 1, 10, -4)); 51 | segmentMap.put(222, sm2); 52 | 53 | Map ret1 = getBreakdown(111); 54 | Map ret2 = getBreakdown(222); 55 | totalResult.add(ret1); 56 | totalResult.add(ret2); 57 | } 58 | 59 | public static void main(String args[]) throws IOException { 60 | Analyzer t = new Analyzer(); 61 | 62 | if (args[8].equals("fake")) { 63 | isUDS = 1; 64 | } 65 | else { 66 | isUDS = 2; 67 | } 68 | 69 | int numT = Integer.parseInt(args[9]); 70 | 71 | long stime = 0; 72 | long etime = 0; 73 | 74 | stime = System.currentTimeMillis(); 75 | t.readCPUFreq(args[5]); 76 | etime = System.currentTimeMillis(); 77 | System.out.printf("[Info] readCPUFreqency spends %f second\n" , (etime - stime)*1.0/1e3); 78 | 79 | stime = System.currentTimeMillis(); 80 | t.readThread(args[3]); 81 | etime = System.currentTimeMillis(); 82 | System.out.printf("[Info] readThread spends %f second\n" , (etime - stime)*1.0/1e3); 83 | 84 | stime = System.currentTimeMillis(); 85 | t.readFile(args[0], elist); 86 | etime = System.currentTimeMillis(); 87 | 
System.out.printf("[Info] readFile spends %f second\n" , (etime - stime)*1.0/1e3); 88 | 89 | stime = System.currentTimeMillis(); 90 | //t.readPerf(args[4]); 91 | etime = System.currentTimeMillis(); 92 | System.out.printf("[Info] readPerf spends %f second\n" , (etime - stime)*1.0/1e3); 93 | 94 | stime = System.currentTimeMillis(); 95 | t.readSoft(args[1]); 96 | etime = System.currentTimeMillis(); 97 | System.out.printf("[Info] readSoft spends %f second\n" , (etime - stime)*1.0/1e3); 98 | //t.readFutex(args[2]); 99 | 100 | stime = System.currentTimeMillis(); 101 | if (isUDS == 1) { 102 | t.readFake(args[2]); 103 | } 104 | else if (isUDS == 2) { 105 | t.readFakeSpin(args[2]); 106 | } 107 | //System.exit(-1); 108 | etime = System.currentTimeMillis(); 109 | System.out.printf("[Info] readFake spends %f second\n" , (etime - stime)*1.0/1e3); 110 | //System.exit(0); 111 | 112 | // System.out.println(System.currentTimeMillis() + " Finish the load log file."); 113 | //stime = System.currentTimeMillis(); 114 | //t.runThread(elist, 0); 115 | //etime = System.currentTimeMillis(); 116 | //System.out.printf("[Info] Basic segment spends %f second\n" , (etime - stime)*1.0/1e3); 117 | 118 | ArrayList dpTidList = null; 119 | ArrayList segTidList = null; 120 | 121 | segTidList = new ArrayList(); 122 | for (int id : prevList.keySet()) { 123 | segTidList.add(id); 124 | } 125 | 126 | Index index = t.createIndex(dpTidList, segmentMap); 127 | ArrayList workerList = new ArrayList(); 128 | t.startWorker(workerList, index, numT); 129 | 130 | // /* Start for busy-wait 131 | if (isUDS == 1) { 132 | // /* For fake wake up 133 | dpTidList = new ArrayList(); 134 | for (int id :dpList.keySet()) { 135 | ConcurrentSkipListMap tmpList = dpList.get(id); 136 | //if (id == 31234 || id == 31235) { 137 | // for (DPEvent m : tmpList.values()) { 138 | // m.printOut(id); 139 | // } 140 | //} 141 | if (dpList.get(id).size() > 0) { 142 | dpTidList.add(id); 143 | udsCheck.put(id, new ConcurrentSkipListMap()); 144 
| udsQuick.put(id, new ConcurrentSkipListMap()); 145 | } 146 | } 147 | 148 | //segTidList = new ArrayList(); 149 | //for (int id : prevList.keySet()) { 150 | // segTidList.add(id); 151 | //} 152 | 153 | //index = t.createIndex(dpTidList, segmentMap); 154 | 155 | index.reset(); 156 | index.tidList = segTidList; 157 | 158 | stime = System.currentTimeMillis(); 159 | for ( Worker w : workerList) { 160 | synchronized(w.cond) { 161 | w.op = 5; 162 | w.cond.notifyAll(); 163 | } 164 | } 165 | 166 | synchronized(index) { 167 | while(index.finishNum < numT) { 168 | try { 169 | index.wait(); 170 | } 171 | catch (InterruptedException e) { 172 | } 173 | } 174 | } 175 | 176 | etime = System.currentTimeMillis(); 177 | System.out.printf("[Info] Basic segment spends %f second\n" , (etime - stime)*1.0/1e3); 178 | 179 | 180 | index.reset(); 181 | index.tidList = dpTidList; 182 | 183 | stime = System.currentTimeMillis(); 184 | for ( Worker w : workerList) { 185 | synchronized(w.cond) { 186 | w.op = 1; 187 | w.cond.notifyAll(); 188 | } 189 | } 190 | 191 | synchronized(index) { 192 | while(index.finishNum < numT) { 193 | try { 194 | index.wait(); 195 | } 196 | catch (InterruptedException e) { 197 | } 198 | } 199 | } 200 | 201 | etime = System.currentTimeMillis(); 202 | System.out.printf("[Info] FakePrep spends %f second\n" , (etime - stime)*1.0/1e3); 203 | 204 | stime = System.currentTimeMillis(); 205 | index.reset(); 206 | 207 | for ( Worker w : workerList) { 208 | synchronized(w.cond) { 209 | w.op = 2; 210 | w.cond.notifyAll(); 211 | } 212 | } 213 | 214 | synchronized(index) { 215 | while(index.finishNum < numT) { 216 | try { 217 | index.wait(); 218 | } 219 | catch (InterruptedException e) { 220 | } 221 | } 222 | } 223 | 224 | etime = System.currentTimeMillis(); 225 | System.out.printf("[Info] FakeClean spends %f second\n" , (etime - stime)*1.0/1e3); 226 | // END fake Wakeup */ 227 | } 228 | else if (isUDS == 2) { 229 | segTidList = new ArrayList(); 230 | for (int id : 
prevList.keySet()) { 231 | segTidList.add(id); 232 | } 233 | 234 | dpTidList = new ArrayList(); 235 | for (int id :udsCheck.keySet()) { 236 | dpTidList.add(id); 237 | //for (Segment s : udsCheck.get(id).values()) { 238 | // s.printOut(); 239 | //} 240 | } 241 | 242 | index = t.createIndex(dpTidList, segmentMap); 243 | 244 | workerList = new ArrayList(); 245 | t.startWorker(workerList, index, numT); 246 | 247 | index.reset(); 248 | index.tidList = segTidList; 249 | 250 | stime = System.currentTimeMillis(); 251 | for ( Worker w : workerList) { 252 | synchronized(w.cond) { 253 | w.op = 5; 254 | w.cond.notifyAll(); 255 | } 256 | } 257 | 258 | synchronized(index) { 259 | while(index.finishNum < numT) { 260 | try { 261 | index.wait(); 262 | } 263 | catch (InterruptedException e) { 264 | } 265 | } 266 | } 267 | 268 | etime = System.currentTimeMillis(); 269 | System.out.printf("[Info] Basic segment spends %f second\n" , (etime - stime)*1.0/1e3); 270 | 271 | 272 | stime = System.currentTimeMillis(); 273 | 274 | index.reset(); 275 | index.tidList = dpTidList; 276 | 277 | for ( Worker w : workerList) { 278 | synchronized(w.cond) { 279 | w.op = 4; 280 | w.cond.notifyAll(); 281 | } 282 | } 283 | 284 | synchronized(index) { 285 | while(index.finishNum < numT) { 286 | try { 287 | index.wait(); 288 | } 289 | catch (InterruptedException e) { 290 | } 291 | } 292 | } 293 | 294 | etime = System.currentTimeMillis(); 295 | System.out.printf("[Info] FakeClean spends %f second\n" , (etime - stime)*1.0/1e3); 296 | // End for busy-wait */ 297 | } 298 | 299 | stime = System.currentTimeMillis(); 300 | 301 | index.reset(); 302 | index.tidList = segTidList; 303 | 304 | for ( Worker w : workerList) { 305 | synchronized(w.cond) { 306 | w.op = 3; 307 | w.cond.notifyAll(); 308 | } 309 | } 310 | 311 | synchronized(index) { 312 | while(index.finishNum < numT) { 313 | try { 314 | index.wait(); 315 | } 316 | catch (InterruptedException e) { 317 | } 318 | } 319 | } 320 | 321 | etime = 
System.currentTimeMillis(); 322 | System.out.printf("[Info] Cascading spends %f second\n" , (etime - stime)*1.0/1e3); 323 | 324 | stime = System.currentTimeMillis(); 325 | for (Worker w : workerList) { 326 | totalResult.add(w.hm); 327 | } 328 | 329 | for (Map tmp : totalResult) { 330 | if (tmp == null || tmp.size()==0) { 331 | //System.out.println("Empty result!\n"); 332 | continue; 333 | } 334 | for (String str : tmp.keySet()) { 335 | if (resultMap.containsKey(str)) { 336 | resultMap.put(str, resultMap.get(str) + tmp.get(str)); 337 | } 338 | else { 339 | resultMap.put(str, tmp.get(str)); 340 | } 341 | } 342 | } 343 | 344 | System.out.println(System.currentTimeMillis() + " Finish the final edge weight."); 345 | PrintWriter fwait = new PrintWriter(args[6], "ASCII"); 346 | 347 | for (String s : resultMap.keySet()) { 348 | if (s.contains("-99")) continue; 349 | //System.out.printf("%s%f\n", s,resultMap.get(s)/100000000.0); 350 | fwait.printf("%s%f\n", s,resultMap.get(s)/freq); 351 | } 352 | 353 | PrintWriter fbreak = new PrintWriter(args[7], "ASCII"); 354 | fbreak.printf("ThreadID Running Runnable Wait HardIRQ SoftIRQ Network Disk Other Unknown Period\n"); 355 | for (Integer tid : prevList.keySet()) { 356 | if (prevList.get(tid).state==-1) { 357 | continue; 358 | } 359 | overAllRes.get(tid).printOut(tid,fbreak); 360 | } 361 | 362 | fbreak.close(); 363 | fwait.close(); 364 | 365 | etime = System.currentTimeMillis(); 366 | System.out.printf("[Info] Collection spends %f second\n" , (etime - stime)*1.0/1e3); 367 | 368 | 369 | long infiniteTotal = 0L; 370 | long processTotal = 0L; 371 | for (Worker w : workerList) { 372 | infiniteTotal += w.infiniteTime; 373 | processTotal += w.processTime; 374 | } 375 | System.out.printf("[Info] Infinite Loop ignores cycle = %f, total processed cycles = %f, percent = %f\n" , infiniteTotal/freq, processTotal/freq, infiniteTotal*100.0/processTotal); 376 | System.exit(0); 377 | } 378 | 379 | public void runThread(ArrayList elist, int op) { 
380 | ArrayList tmpList = new ArrayList(); 381 | if (op == 0) { 382 | //for (Integer tid : tidList) { 383 | for (Integer tid : prevList.keySet()) { 384 | doSegment dos = new doSegment(tid, elist, op); 385 | tmpList.add(dos); 386 | } 387 | } 388 | else { 389 | for (Integer tid : prevList.keySet()) { 390 | if (prevList.get(tid).state==-1) { 391 | continue; 392 | } 393 | doSegment dos = new doSegment(tid, elist, op); 394 | tmpList.add(dos); 395 | } 396 | } 397 | 398 | //if (op == 0) { 399 | for (doSegment dos : tmpList) { 400 | dos.start(); 401 | } 402 | 403 | for (doSegment dos : tmpList) { 404 | dos.join(); 405 | } 406 | //} 407 | //else { 408 | // int j = 0; 409 | // while (j < tmpList.size()) { 410 | // int i = 0; 411 | // for (i=0;i<10;i++) { 412 | // if ((j+i) elist; 448 | public doSegment(int tid, ArrayList elist, int op) { 449 | this.tid = tid; 450 | this.elist = elist; 451 | this.op = op; 452 | } 453 | 454 | public void run() { 455 | if (op == 0) { 456 | directBreakdown(elist, tid); 457 | System.out.println("[directBreakdown] Thread " + tid + " has been finished"); 458 | } 459 | else if (op == 1) { 460 | //removeFakeWakeup(tid); 461 | System.out.println("[removeFakeWakeup] Thread " + tid + " has been finished"); 462 | } 463 | else { 464 | Map ret = getBreakdown(tid); 465 | totalResult.add(ret); 466 | System.out.println("[getBreakdown] Thread " + tid + " has been finished"); 467 | } 468 | } 469 | 470 | public void join() { 471 | try { 472 | t.join(); 473 | } 474 | catch (InterruptedException e) { 475 | } 476 | } 477 | 478 | public void start () { 479 | t = new Thread (this); 480 | t.start(); 481 | } 482 | } 483 | 484 | public class UDSResult implements Comparable { 485 | long ts; 486 | int tid; 487 | long lock; 488 | short type; 489 | 490 | public UDSResult(ByteBuffer buf) throws IOException { 491 | this.ts = buf.getLong(); 492 | this.tid = buf.getInt(); 493 | this.lock = buf.getLong(); 494 | this.type = buf.getShort(); 495 | } 496 | 497 | @Override 498 | 
public int compareTo(UDSResult e) 499 | { 500 | if (this.ts > e.ts) return 1; 501 | else if (this.ts < e.ts) return -1; 502 | else return 0; 503 | } 504 | 505 | public void printOut() { 506 | System.out.printf("%d %d %d %d\n", this.ts, this.tid, this.lock, this.type); 507 | } 508 | } 509 | 510 | // public void removeFakeWakeup(int tid) { 511 | // TreeSet tmpList = udsList.get(tid); 512 | // ConcurrentSkipListMap tm = segmentMap.get(tid); 513 | // long lastTime = tm.lastKey(); 514 | // long firstTime = tm.firstKey(); 515 | // for (UDSResult uds: tmpList) { 516 | // if (lastTime <= uds.ts && firstTime >= uds.ts) continue; 517 | // 518 | // Segment sTmp = null; 519 | // Segment tmp = null; 520 | // 521 | // sTmp = (Segment) tm.lowerEntry(uds.ts).getValue(); 522 | // uds.printOut(); 523 | // sTmp.printOut(); 524 | // while(true) { 525 | // if (sTmp.state == 2) { 526 | // break; 527 | // } 528 | // if (tm.lowerEntry(sTmp.startTime) == null) { 529 | // sTmp = (Segment) tm.firstEntry().getValue(); 530 | // break; 531 | // } 532 | // sTmp = (Segment) tm.lowerEntry(sTmp.startTime).getValue(); 533 | // } 534 | // 535 | // long sTime = sTmp.startTime; 536 | // long startTime = sTmp.startTime; 537 | // //long eTime = 0; 538 | // Segment newSeg; 539 | // 540 | // while(true) { 541 | // tm.remove(sTime); 542 | // if (tm.higherEntry(sTime) == null) { 543 | // newSeg = new Segment(tid, 2, sTime, lastTime, -99); 544 | // break; 545 | // } 546 | // tmp = (Segment) tm.higherEntry(sTime).getValue(); 547 | // if (tmp.state == 2) { 548 | // newSeg = new Segment(tid, 2, startTime, tmp.endTime, tmp.waitFor); 549 | // tm.remove(tmp.startTime); 550 | // break; 551 | // } 552 | // sTime = tmp.startTime; 553 | // } 554 | // tm.put(newSeg.startTime, newSeg); 555 | // } 556 | // } 557 | 558 | public void directBreakdown(ArrayList elist, int tid) { 559 | //int WAITSTART = 1; 560 | //int IOSTART = 2; 561 | //double traceTime = elist.get(0).time; 562 | //double time = 0; 563 | 564 | prevState ps = 
prevList.get(tid); 565 | ConcurrentSkipListMap tm = segmentMap.get(tid); 566 | GeneralRes gr = overAllRes.get(tid); 567 | 568 | for (Event e : elist) { 569 | // switch_to 570 | long prevtime = ps.time; 571 | if (e.type== 0) { 572 | if (e.pid1== tid) { 573 | if (e.pid1state == 0) { 574 | //Update previous state and time 575 | ps.state = 1; 576 | ps.time = e.time; 577 | } 578 | else { 579 | ps.state = 2; 580 | ps.time = e.time; 581 | } 582 | 583 | try { 584 | SoftEvent se = softMap.get(e.core).higherEntry(prevtime).getValue(); 585 | 586 | //tm.put(prevtime, new Segment(tid, 0, prevtime, e.time, 0)); 587 | 588 | // For global info 589 | gr.running += e.time - prevtime; 590 | 591 | if (( se.stime < e.time) && (se.etime < e.time)) { 592 | tm.put(prevtime, new Segment(tid, 0, prevtime, se.stime, 0)); 593 | tm.put(se.stime, new Segment(tid, 1, se.stime, se.etime, 0)); 594 | //tm.put(se.stime, new Segment(tid, 2, se.stime, se.etime, -16)); 595 | tm.put(se.etime, new Segment(tid, 0, se.etime, e.time, 0)); 596 | 597 | // For global info 598 | gr.running += se.stime-prevtime; 599 | gr.softirq += se.etime - se.stime; 600 | gr.running += e.time - se.etime; 601 | } 602 | else { 603 | tm.put(prevtime, new Segment(tid, 0, prevtime, e.time, 0)); 604 | 605 | // For global info 606 | gr.running += e.time - prevtime; 607 | } 608 | } 609 | catch (Exception error) { 610 | tm.put(prevtime, new Segment(tid, 0, prevtime, e.time, 0)); 611 | // For global info 612 | gr.running += e.time - prevtime; 613 | } 614 | } 615 | else if (e.pid2== tid) { 616 | //In __switch_to, the switch-in thread state is absolutely 0. 
617 | tm.put(prevtime, new Segment(tid, 1, prevtime, e.time, 0)); 618 | ps.state = 0; 619 | ps.time = e.time; 620 | 621 | // For global info 622 | gr.runnable += e.time - prevtime; 623 | } 624 | } 625 | // try_to_wake_up 626 | else { 627 | if (e.pid2== tid) { 628 | if (ps.state != 2) continue; 629 | 630 | //// Added for test 631 | //synchronized((Object) totalWaitTime) { 632 | //totalWaitTime += e.time - prevtime; 633 | //} 634 | 635 | if (e.irq== 0) { 636 | tm.put(prevtime, new Segment(tid, 2, prevtime, e.time, e.pid1)); 637 | ps.state = 1; 638 | ps.time = e.time; 639 | 640 | // For global info 641 | gr.wait += e.time - prevtime; 642 | } 643 | else { 644 | tm.put(prevtime, new Segment(tid, 2, prevtime, e.time, e.irq)); 645 | ps.state = 1; 646 | ps.time = e.time; 647 | 648 | // For global info 649 | if (e.irq == -5) gr.disk += e.time - prevtime; 650 | else if (e.irq == -4) gr.network += e.time - prevtime; 651 | else if (e.irq == -15) gr.hardirq += e.time - prevtime; 652 | else if (e.irq == -16) gr.softirq += e.time - prevtime; 653 | else gr.other += e.time - prevtime; 654 | } 655 | } 656 | } 657 | } 658 | 659 | // Final round 660 | ps = prevList.get(tid); 661 | if (ps.state == 0) { 662 | tm.put(ps.time, new Segment(tid, 0, ps.time, endtime, 0)); 663 | 664 | // For global info 665 | gr.running += endtime - ps.time; 666 | } 667 | else if (ps.state == 1) { 668 | tm.put(ps.time, new Segment(tid, 1, ps.time, endtime, 0)); 669 | 670 | // For global info 671 | gr.runnable += endtime - ps.time; 672 | } 673 | else { 674 | if (killTimeList.containsKey(tid)) { 675 | // Solve the kill time 676 | long killTime = killTimeList.get(tid); 677 | tm.lowerEntry(killTime).getValue().endTime = killTime; 678 | } 679 | else { 680 | tm.put(ps.time, new Segment(tid, 2, ps.time, endtime, -99)); 681 | } 682 | 683 | // For global info 684 | gr.unknown += endtime - ps.time; 685 | } 686 | // time = elist.get(elist.size()-1).time; 687 | // if (time > timeTable.get(tid).time) { 688 | // timeState 
tstmp = timeTable.get(tid); 689 | // ConcurrentSkipListMap tm = segmentMap.get(tid); 690 | // if (tstmp.state == 0) { 691 | // tm.put(tstmp.time, new Segment(tid, 0, tstmp.time, time, 0)); 692 | // } 693 | // else if (tstmp.state == 1) { 694 | // tm.put(tstmp.time, new Segment(tid, 1, tstmp.time, time, tid)); 695 | // } 696 | // else if (tstmp.state == 2) { 697 | // tm.put(tstmp.time, new Segment(tid, 2, tstmp.time, time, 0)); 698 | // } 699 | // } 700 | } 701 | 702 | public void readCPUFreq(String fname) throws IOException { 703 | Path fpath = Paths.get(fname); 704 | List lines = Files.readAllLines( fpath, StandardCharsets.US_ASCII); 705 | double ghz = 0.0; 706 | for (String line : lines) { 707 | ghz = Double.valueOf(line.trim()); 708 | } 709 | freq = ghz * 1e9; 710 | } 711 | 712 | public void readThread(String fname) throws IOException { 713 | Path fpath = Paths.get(fname); 714 | List lines = Files.readAllLines( fpath, StandardCharsets.US_ASCII); 715 | for (String line : lines) { 716 | //System.out.println(line); 717 | String[] tmp = line.split(" "); 718 | int pid = Integer.parseInt(tmp[0]); 719 | if (tidList.contains(pid)) { 720 | continue; 721 | } 722 | tidList.add(pid); 723 | 724 | segmentMap.put(pid, new ConcurrentSkipListMap()); 725 | overAllRes.put(pid, new GeneralRes()); 726 | compPerf.put(pid, new TreeSet()); 727 | //udsList.put(pid, new TreeSet()); 728 | //udswList.put(pid, new ConcurrentSkipListMap()); 729 | //dpList.put(pid, new ConcurrentSkipListMap()); 730 | } 731 | } 732 | 733 | public class Segment { 734 | int pid; 735 | int state; // 0 means running, 1 means runnable, 2 means wait, 3 means unknown wait 736 | long startTime; 737 | long endTime; 738 | int waitFor; // Which thread the current thread waits for 739 | int core; 740 | int parent; 741 | long ftime; 742 | ArrayList overlap; 743 | 744 | public Segment(int pid, int state, long startTime, long endTime, int waitFor) { 745 | this.pid = pid; 746 | this.state = state; 747 | this.startTime = 
startTime; 748 | this.endTime = endTime; 749 | this.waitFor = waitFor; 750 | this.ftime = 0; 751 | this.overlap = new ArrayList(); 752 | } 753 | 754 | public Segment(int pid, int state, long startTime, long endTime, int waitFor, int parent) { 755 | this.pid = pid; 756 | this.state = state; 757 | this.startTime = startTime; 758 | this.endTime = endTime; 759 | this.waitFor = waitFor; 760 | this.ftime = 0; 761 | this.parent = parent; 762 | this.overlap = new ArrayList(); 763 | } 764 | 765 | public void printOut() { 766 | System.out.printf("Tid %d State %d StartTime %d EndTime %d ftime %d waitFor %d parent %d\n", this.pid, this.state, this.startTime, this.endTime, this.ftime, this.waitFor, this.parent); 767 | } 768 | } 769 | 770 | public class timeState { 771 | int state; 772 | double time; 773 | public timeState(int state, double time) { 774 | this.state = state; 775 | this.time = time; 776 | } 777 | } 778 | 779 | public void addResult(HashMap resultMap, int pid, int wid, long start, long end) { 780 | String strtmp = String.valueOf(pid) + " " + String.valueOf(wid) + " "; 781 | //FFang Adddd 782 | //System.out.println(strtmp); 783 | if(resultMap.containsKey(strtmp)) { 784 | resultMap.put(strtmp, resultMap.get(strtmp)+end-start); 785 | } 786 | else { 787 | resultMap.put(strtmp, end-start); 788 | } 789 | } 790 | 791 | //Fang new 792 | public boolean isFakeWakeNew(int tid, Long time) { 793 | ConcurrentSkipListMap udsMap = udsCheck.get(tid); 794 | if (udsMap == null) return false; 795 | 796 | if (udsMap.lowerEntry(time) != null) { 797 | Segment tmp = udsMap.lowerEntry(time).getValue(); 798 | if (tmp.startTime <= time && tmp.endTime >= time) { 799 | return true; 800 | } 801 | } 802 | return false; 803 | } 804 | // Stats for result 805 | //int total=0; 806 | //int ftotal=0; 807 | 808 | // Check if a wait segment is caused by a fake wake-up event 809 | public boolean isFakeWake(int tid, Segment seg) { 810 | TreeSet ts = udsList.get(tid); 811 | ConcurrentSkipListMap currDP = 
dpList.get(tid); 812 | ConcurrentSkipListMap tm = segmentMap.get(tid); 813 | Segment runSeg1 = null; 814 | 815 | long stime = seg.endTime; 816 | long etime = 0; 817 | long tmptime = stime; 818 | 819 | //total++; 820 | 821 | // No need to check hardware 822 | if (tid < 0) return false; 823 | 824 | // Find out the next several running or runnable segments 825 | try { 826 | while(true) { 827 | // Find the first segment based on the prior segment's endTime 828 | runSeg1 = tm.higherEntry(tmptime).getValue(); 829 | if (runSeg1.state != 2) { 830 | etime = runSeg1.endTime; 831 | } 832 | else { 833 | break; 834 | } 835 | tmptime = runSeg1.startTime; 836 | } 837 | } 838 | catch (Exception e) { 839 | return false; 840 | } 841 | 842 | // Prepare the ArrayList of DPEvent 843 | ArrayList dplist = new ArrayList(); 844 | try { 845 | tmptime = stime; 846 | while(true) { 847 | DPEvent tmpDP = currDP.higherEntry(tmptime).getValue(); 848 | if (tmpDP.time > etime) { 849 | break; 850 | } 851 | dplist.add(tmpDP); 852 | tmptime = tmpDP.time; 853 | } 854 | } 855 | catch (Exception e) { 856 | } 857 | 858 | DPEvent preDP = null; 859 | try { 860 | preDP = currDP.lowerEntry(tmptime).getValue(); 861 | } 862 | catch (Exception e) { 863 | } 864 | 865 | // Nothing in DPList: 866 | if (dplist.size()==0) 867 | return true; 868 | // Only pending DP1, no DP2 in this period 869 | if (preDP != null && preDP.type==15) { 870 | int find = 0; 871 | DPEvent dp = dplist.get(0); 872 | if (dp.type != 16) { 873 | return true; 874 | } 875 | //for (DPEvent dp : dplist) { 876 | // if (dp.type == 16) { 877 | // find = 1; 878 | // break; 879 | // } 880 | //} 881 | //if (find == 0) { 882 | // //ftotal++; 883 | // return true; 884 | //} 885 | } 886 | 887 | // DP2, DP3, and DP1 in this period 888 | if (dplist.size()>=3) { 889 | if (dplist.get(0).type == 16 && dplist.get(1).type == 0 && dplist.get(2).type == 15) { 890 | //ftotal++; 891 | return true; 892 | } 893 | } 894 | return false; 895 | } 896 | 897 | public Segment 
getRealWake(Segment seg, ConcurrentSkipListMap tm) { 898 | Segment res = new Segment(seg.pid, seg.state, seg.startTime, seg.endTime, seg.waitFor); 899 | Segment tmp = res; 900 | long tmpTime = res.startTime; 901 | 902 | do { 903 | if (tmp.state == 2 && !isFakeWake(tmp.pid, tmp)) { 904 | res.endTime = tmp.endTime; 905 | res.waitFor = tmp.waitFor; 906 | break; 907 | } 908 | tmpTime = tmp.startTime; 909 | try { 910 | tmp = tm.higherEntry(tmpTime).getValue(); 911 | } 912 | catch (Exception e) { 913 | break; 914 | } 915 | } while(true); 916 | //if (res.waitFor != seg.waitFor || res.endTime != seg.endTime) { 917 | // seg.printOut(); 918 | // res.printOut(); 919 | //} 920 | return res; 921 | } 922 | 923 | public HashMap getBreakdown(int pid) { 924 | HashMap hm = new HashMap(); 925 | ConcurrentSkipListMap tm = segmentMap.get(pid); 926 | Segment tmpSeg = null; 927 | int reportInterval = 0; 928 | long lastSeg = -1; 929 | if (tm == null) { 930 | return hm; 931 | } 932 | 933 | //int[] sync = {18046,18047,18048,18049,18050}; 934 | //int[] st = {18551, 18562, 18581, 18591, 18608, 18618}; 935 | //int[] rp = {18554, 18565, 18584, 18594, 18611, 18621}; 936 | //int[] handle = {17993,17994,17995,17996,17997,17998,17999,18000,18001,18002}; 937 | 938 | for (Segment seg : new ArrayList(tm.values())) { 939 | //FFang for test 940 | //System.out.println("pid="+pid+", state="+seg.state); 941 | 942 | if (seg.state == 0 || seg.state == 1) { 943 | } 944 | else if (seg.state == 2) { 945 | // Avoid multiple calculations 946 | if (seg.startTime < lastSeg) { 947 | continue; 948 | } 949 | 950 | // First find out the real wake up period and update to get a new segment. 
951 | Segment newSeg = getRealWake(seg, tm); 952 | lastSeg = newSeg.endTime; 953 | 954 | //newSeg.printOut(); 955 | 956 | ArrayList tmpList = new ArrayList(); 957 | tmpList.add(newSeg); 958 | //newSeg.printOut(); 959 | //System.out.println("StartFindAll,pid="+pid+", startTime="+seg.startTime+", endTime="+seg.endTime+", waitFor="+seg.waitFor); 960 | while(!tmpList.isEmpty()) { 961 | Segment segTmp = tmpList.remove(0); 962 | // If the temp segment's time range is wrong 963 | if (segTmp.startTime > segTmp.endTime) 964 | continue; 965 | if (segTmp.state == 2) { 966 | if (segTmp.startTime < newSeg.startTime) { 967 | System.out.println("[Start Warning]"); 968 | newSeg.printOut(); 969 | segTmp.printOut(); 970 | System.out.println("[Finish Warning]"); 971 | } 972 | //segTmp.printOut(); 973 | reportInterval++; 974 | if (reportInterval % 1000000 == 0) { 975 | reportInterval = 0; 976 | System.out.printf("[INFO] "); 977 | segTmp.printOut(); 978 | } 979 | // Check if the wake-up event is a fake one 980 | ConcurrentSkipListMap tmWake = segmentMap.get(segTmp.waitFor); 981 | long wakeTime = segTmp.endTime; 982 | Segment sTmp = null; 983 | try { 984 | while(tmWake.lowerEntry(wakeTime)!=null) { 985 | sTmp = tmWake.lowerEntry(wakeTime).getValue(); 986 | if (sTmp.state == 2) { 987 | break; 988 | } 989 | wakeTime = sTmp.startTime; 990 | } 991 | } 992 | // This means we cannot make sure if the waitFor thread is caused by a real or fake wake up. 
993 | catch (Exception e) { 994 | //continue; 995 | } 996 | 997 | if (sTmp == null || !isFakeWake(sTmp.pid, sTmp)) 998 | { 999 | addResult(hm, segTmp.pid, segTmp.waitFor, segTmp.startTime, segTmp.endTime); 1000 | if (segTmp.waitFor <= 0 || segTmp.waitFor == segTmp.pid) { 1001 | continue; 1002 | } 1003 | else { 1004 | //tmpList.addAll(findAll(segTmp.waitFor, segTmp.startTime, segTmp.endTime, segTmp.parent, segTmp, this)); 1005 | } 1006 | } 1007 | else { 1008 | try { 1009 | tmpList.add(new Segment(segTmp.pid, 2, segTmp.startTime, sTmp.endTime, sTmp.waitFor, segTmp.waitFor)); 1010 | } 1011 | catch (Exception e) { 1012 | continue; 1013 | } 1014 | } 1015 | // If not, we'll directly add this segment info into final result 1016 | } 1017 | } 1018 | tmpList.clear(); 1019 | tmpList = null; 1020 | } 1021 | } 1022 | return hm; 1023 | } 1024 | 1025 | // Once I find unqueue_me, I will change the state to 0. 1026 | public ArrayList findAll(int pid, long start, long end, int parent, Segment segT, Worker w) { 1027 | ArrayList ret = new ArrayList(); 1028 | if (start>end) { 1029 | return ret; 1030 | } 1031 | //else { 1032 | // System.out.println("pid="+pid+",start="+start+",end="+end+",parent="+parent); 1033 | //} 1034 | long key = 0; 1035 | //Find the first breakdown for thread pid 1036 | ConcurrentSkipListMap tm = segmentMap.get(pid); 1037 | if (tm == null) { 1038 | //System.out.println("pid " + pid + " and no tm"); 1039 | return ret; 1040 | } 1041 | try { 1042 | key = tm.floorKey(start); 1043 | } catch (Exception e) { 1044 | //System.out.println("[floorKey] pid = " + pid + ", start time = " + start + ", parent = " + parent); 1045 | return ret; 1046 | } 1047 | Segment seg = tm.get(key); 1048 | 1049 | while(true) { 1050 | Segment tmp = new Segment(pid, seg.state, Math.max(start, seg.startTime), Math.min(end, seg.endTime), seg.waitFor, parent); 1051 | //start for spinlock 1052 | tmp.overlap = new ArrayList(segT.overlap); 1053 | if (tmp.overlap.contains(pid)) { 1054 | w.infiniteTime 
+= 1.0 * (tmp.endTime - tmp.startTime) * tmp.overlap.size(); 1055 | } 1056 | else { 1057 | tmp.overlap.add(pid); 1058 | ret.add(tmp); 1059 | } 1060 | //end for spinlock 1061 | 1062 | if (seg.endTime >= end) 1063 | break; 1064 | try { 1065 | seg = tm.get(tm.higherKey(seg.startTime)); 1066 | } catch (Exception e) { 1067 | System.out.println("[higherKey] pid = " + pid + ", start time = " + start + ", parent = " + parent); 1068 | return ret; 1069 | } 1070 | if (seg.startTime >= end) break; 1071 | } 1072 | return ret; 1073 | } 1074 | 1075 | public void readSoft(String fname) throws IOException { 1076 | int nRead = 0; 1077 | int numEvent = 10000000; 1078 | int sum = 0; 1079 | 1080 | byte[] buffer = new byte[19*10000000]; 1081 | DataInputStream is = new DataInputStream(new FileInputStream(fname)); 1082 | 1083 | long statSTime = System.currentTimeMillis(); 1084 | while( nRead != -1 ) 1085 | { 1086 | int i = 0; 1087 | nRead = is.read(buffer); 1088 | ByteBuffer buf = ByteBuffer.wrap(buffer); 1089 | buf.order(ByteOrder.LITTLE_ENDIAN); 1090 | for (i = 0; i < nRead/19; i++) { 1091 | SoftEvent e = new SoftEvent(buf); 1092 | //e.printOut(); 1093 | if (!softMap.containsKey(e.core)) 1094 | softMap.put(e.core, new ConcurrentSkipListMap()); 1095 | softMap.get(e.core).put(e.stime,e); 1096 | } 1097 | sum += i; 1098 | if (i != 0) { 1099 | System.out.println(System.currentTimeMillis() + " Finish loading " + sum + " softirq events."); 1100 | } 1101 | //FFang Adddd 1102 | //if (sum > nRead/26*10) break; 1103 | } 1104 | 1105 | long statETime = System.currentTimeMillis(); 1106 | System.out.printf("[INFO] Finish loading softirq events, spending %f second\n" , (statETime - statSTime)*1.0/1e3); 1107 | } 1108 | 1109 | public class FEvent { 1110 | int pid; 1111 | long time; 1112 | public FEvent(ByteBuffer buf) throws IOException { 1113 | this.pid = buf.getInt(); 1114 | this.time = buf.getLong(); 1115 | } 1116 | public void printOut() { 1117 | System.out.printf("%d %d\n", this.pid, this.time); 
        }
    }

    public class SPINResult implements Comparable<SPINResult> {
        long ts;
        int pid;
        long lock;
        short type;

        // One on-disk record is 22 bytes: long ts, int pid, long lock, short type.
        public SPINResult(ByteBuffer buf) throws IOException {
            this.ts = buf.getLong();
            this.pid = buf.getInt();
            this.lock = buf.getLong();
            this.type = buf.getShort();
        }

        @Override
        public int compareTo(SPINResult e)
        {
            if (this.ts > e.ts) return 1;
            else if (this.ts < e.ts) return -1;
            else return 0;
        }

        public void printOut() {
            System.out.printf("%d %d %d %d\n", this.ts, this.pid, this.lock, this.type);
        }
    }

    public void readFakeSpin(String fname) throws IOException {
        int nRead = 0;
        int numEvent = 10000000;
        int sum = 0;

        int s0 = 0;
        int s1 = 0;
        int s2 = 0;
        int s3 = 0;
        int s4 = 0;
        int s5 = 0;
        int slock = 0;
        int sother = 0;

        byte[] buffer = new byte[22*10000000];
        DataInputStream is = new DataInputStream(new FileInputStream(fname));

        // lock address -> pid of the thread currently holding it
        HashMap<Long, Integer> hm = new HashMap<Long, Integer>();
        // pid -> time the thread started waiting (0 means it is not waiting)
        HashMap<Integer, Long> thm = new HashMap<Integer, Long>();
        // pid -> pid of the holder the thread is waiting for
        HashMap<Integer, Integer> thm1 = new HashMap<Integer, Integer>();
        HashMap<Integer, Integer> ths = new HashMap<Integer, Integer>();

        long statSTime = System.currentTimeMillis();
        while( nRead != -1 )
        {
            int i = 0;
            nRead = is.read(buffer);
            ByteBuffer buf = ByteBuffer.wrap(buffer);
            buf.order(ByteOrder.LITTLE_ENDIAN);
            for (i = 0; i < nRead/22; i++) {
                SPINResult spinRes = new SPINResult(buf);
                //spinRes.printOut();
                spinlist.add(spinRes);
            }
            sum += i;
            if (i != 0) {
                System.out.println(System.currentTimeMillis() + " Finish loading " + sum + " events.");
            }
        }
        is.close();

        long statETime = System.currentTimeMillis();
        System.out.printf("[INFO] Finish loading fake wakeup events, spending %f second\n" , (statETime - statSTime)*1.0/1e3);
        // Restart the timer only after reporting the load time, so that the
        // analysis phase below is measured separately.
        statSTime = statETime;

        Collections.sort(spinlist);

        for (SPINResult spinRes : spinlist) {
            if (spinRes.ts < starttime || spinRes.ts > endtime)
                continue;

            //if (spinRes.type != -1)
            //spinRes.printOut();

            // type == -1 or -2 closes a pending wait on the latch;
            // only type == -1 records the thread as the new holder.
            if (spinRes.type == -1 || spinRes.type == -2) {
                if (thm.get(spinRes.pid) == null || thm.get(spinRes.pid) == 0 || thm1.get(spinRes.pid) == null) {
                }
                else {
                    ConcurrentSkipListMap<Long, Segment> csm = null;
                    if (!udsCheck.containsKey(spinRes.pid)) {
                        udsCheck.put(spinRes.pid, new ConcurrentSkipListMap<Long, Segment>());
                    }
                    csm = udsCheck.get(spinRes.pid);
                    if (hm.get(spinRes.lock) != null) {
                        //csm.put(thm.get(spinRes.pid), new Segment(spinRes.pid, 2, thm.get(spinRes.pid), spinRes.ts, hm.get(spinRes.lock), 0));
                        csm.put(thm.get(spinRes.pid), new Segment(spinRes.pid, 2, thm.get(spinRes.pid), spinRes.ts, thm1.get(spinRes.pid), -99));
                    }
                }

                if (spinRes.type == -1) {
                    hm.put(spinRes.lock, spinRes.pid);
                }
                thm.put(spinRes.pid, 0L);
                //ths.put(spinRes.pid, 0);
            }
            // type >= 0 means the thread starts to wait on a latch
            else if (spinRes.type >= 0) {
                // Avoid duplicated updating on the start time of waiting
                //if (ths.get(spinRes.pid) == null || ths.get(spinRes.pid) == 0) {
                thm.put(spinRes.pid, spinRes.ts);
                thm1.put(spinRes.pid, hm.get(spinRes.lock));
                //}
                //ths.put(spinRes.pid, 1);
            }
        }

        spinlist.clear();
        statETime = System.currentTimeMillis();
        System.out.printf("[INFO] Finish analyzing fake wakeup events, spending %f second\n" , (statETime - statSTime)*1.0/1e3);
    }


    // New for fake wakeup running time
    public void readFake(String fname) throws IOException {
        int nRead = 0;
        int numEvent = 
10000000; 1243 | int sum = 0; 1244 | 1245 | byte[] buffer = new byte[22*10000000]; 1246 | DataInputStream is = new DataInputStream(new FileInputStream(fname)); 1247 | 1248 | HashMap tmpStart = new HashMap(); 1249 | long wstime = 0; 1250 | long wetime = 0; 1251 | 1252 | long statSTime = System.currentTimeMillis(); 1253 | while( nRead != -1 ) 1254 | { 1255 | int i = 0; 1256 | nRead = is.read(buffer); 1257 | ByteBuffer buf = ByteBuffer.wrap(buffer); 1258 | buf.order(ByteOrder.LITTLE_ENDIAN); 1259 | for (i = 0; i < nRead/22; i++) { 1260 | UDSResult udsr = new UDSResult(buf); 1261 | if (udsList.get(udsr.tid)==null) { 1262 | tidList.add(udsr.tid); 1263 | prevList.put(udsr.tid, new prevState(-1, starttime)); 1264 | segmentMap.put(udsr.tid, new ConcurrentSkipListMap()); 1265 | overAllRes.put(udsr.tid, new GeneralRes()); 1266 | udsList.put(udsr.tid, new TreeSet()); 1267 | udswList.put(udsr.tid, new ConcurrentSkipListMap()); 1268 | dpList.put(udsr.tid, new ConcurrentSkipListMap()); 1269 | } 1270 | 1271 | // dpList generation 1272 | if (udsr.type == 0 || udsr.type == 15 || udsr.type == 16) { 1273 | dpList.get(udsr.tid).put(udsr.ts, new DPEvent(udsr.ts, udsr.type)); 1274 | } 1275 | 1276 | if (udsr.type == 0) { 1277 | udsList.get(udsr.tid).add(udsr.ts); 1278 | // Fang Test 1279 | // udsr.printOut(); 1280 | } 1281 | else if (udsr.type == 15) { 1282 | tmpStart.put(udsr.tid, udsr.ts); 1283 | } 1284 | else if (udsr.type == 16) { 1285 | if (tmpStart.get(udsr.tid)==null) { 1286 | continue; 1287 | } 1288 | else { 1289 | wstime = tmpStart.get(udsr.tid); 1290 | wetime = udsr.ts; 1291 | udswList.get(udsr.tid).put(wstime, new WaitRange(wstime, wetime)); 1292 | //System.out.printf("pid = %d, wstime = %d, wetime = %d\n", udsr.tid, wstime, wetime); 1293 | } 1294 | } 1295 | 1296 | //udsr.printOut(); 1297 | //if (udsr.type == 0) { 1298 | //} 1299 | //e.printOut(); 1300 | //elist.add(e); 1301 | } 1302 | //System.exit(-1); 1303 | sum += i; 1304 | if (i != 0) { 1305 | 
                System.out.println(System.currentTimeMillis() + " Finish loading " + sum + " events.");
            }
            //FFang Adddd
            //break;
            //if (sum > nRead/26*10) break;
        }
        is.close();

        long statETime = System.currentTimeMillis();
        System.out.printf("[INFO] Finish loading fake wakeup events, spending %f second\n" , (statETime - statSTime)*1.0/1e3);

        // Sort
        //Collections.sort(elist);
        //starttime = elist.get(0).time;
        //endtime = elist.get(elist.size()-1).time;
        //System.out.println("Start time = " + starttime + ", End time = " + endtime);
    }

    // New for futex running time
    public void readFutex(String fname) throws IOException {
        int nRead = 0;
        int numEvent = 10000000;
        int sum = 0;

        // One futex record is 12 bytes: int pid, long time.
        byte[] buffer = new byte[12*10000000];
        DataInputStream is = new DataInputStream(new FileInputStream(fname));

        while( nRead != -1 )
        {
            int i = 0;
            nRead = is.read(buffer);
            ByteBuffer buf = ByteBuffer.wrap(buffer);
            buf.order(ByteOrder.LITTLE_ENDIAN);
            for (i = 0; i < nRead/12; i++) {
                FEvent e = new FEvent(buf);
                e.printOut();
            }
            sum += i;
            if (i != 0) {
                System.out.println(System.currentTimeMillis() + " Finish loading " + sum + " events.");
            }
        }
        is.close();
    }

    public void readPerf(String fname) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader(new File(fname)));

        int pid = 0;
        long time = 0L;
        long rtime = 0L;
        String key = null;
        String value = null;
        String line = null;
        // A sample is stored only when the next "pid=" header line is seen,
        // so the input must end with a trailing "pid=" line to flush the last sample.
        while ((line = br.readLine()) != null) {
            if (line.contains("pid=")) {
                if (pid != 0) {
                    //System.out.println(pid + "," + rtime);
                    try {
                        rtime = compPerf.get(pid).higher(time);
                    }
                    catch (Exception e) {
                        rtime = time;
                    }
                    key = Integer.toString(pid) + 
                            "-" + Long.toString(rtime);
                    finalPerf.put(key, value);
                }
                pid = Integer.valueOf(line.replaceAll(" +"," ").split(" ")[1]);
                time = (long)(Double.valueOf(line.replaceAll(" +"," ").split(" ")[3].replace(":","")) * 1e9);
                value = line;
            }
            else {
                value += "\n" + line;
            }
        }
        br.close();

        PrintWriter fperf = new PrintWriter("presult.final", "ASCII");
        for (String k : finalPerf.keySet()) {
            //System.out.println(k);
            // Use print, not printf: the call stack text is data and may contain '%'.
            fperf.print(finalPerf.get(k) + "\n");
        }
        fperf.close();
        //System.exit(0);
    }

    public void readFile(String fname, ArrayList<Event> elist) throws IOException {
        int nRead = 0;
        int numEvent = 10000000;
        int sum = 0;
        long statSTime = System.currentTimeMillis();

        // One switch record is 34 bytes (see the Event constructor).
        byte[] buffer = new byte[34*10000000];
        DataInputStream is = new DataInputStream(new FileInputStream(fname));

        while( nRead != -1 )
        {
            int i = 0;
            nRead = is.read(buffer);
            ByteBuffer buf = ByteBuffer.wrap(buffer);
            buf.order(ByteOrder.LITTLE_ENDIAN);
            for (i = 0; i < nRead/34; i++) {
                Event e = new Event(buf);
                //e.printOut();
                // Type 2 and type 3 events additionally update the create-time and kill-time maps
                if (e.type == 2) {
                    createTimeList.put(e.pid2, e.time);
                }
                else if (e.type == 3) {
                    killTimeList.put(e.pid1, e.time);
                }
                elist.add(e);
                if (prevList.containsKey(e.pid1)) {
                    compPerf.get(e.pid1).add(e.perftime);
                }
                //if (e.pid1 == 11308 || e.pid2 == 11308) {
                    //e.printOut();
                //}
            }
            sum += i;
            if (i != 0) {
                System.out.println(System.currentTimeMillis() + " Finish loading " + sum + " events.");
            }
            //FFang Adddd
            //break;
            //if (sum > nRead/26*10) break;
        }
        is.close();

        // Sort
        Collections.sort(elist);
        // Event test
        //for (Event e:elist) {
        // 
e.printOut(); 1435 | //} 1436 | //System.exit(0); 1437 | 1438 | starttime = elist.get(0).time; 1439 | endtime = elist.get(elist.size()-1).time; 1440 | long statETime = System.currentTimeMillis(); 1441 | System.out.println("Start time = " + starttime + ", End time = " + endtime); 1442 | 1443 | for (int pid : tidList) { 1444 | if (createTimeList.containsKey(pid)) { 1445 | prevList.put(pid, new prevState(-1, createTimeList.get(pid))); 1446 | } 1447 | else { 1448 | prevList.put(pid, new prevState(-1, starttime)); 1449 | } 1450 | 1451 | long stime = starttime; 1452 | long etime = endtime; 1453 | if (createTimeList.containsKey(pid)) { 1454 | stime = createTimeList.get(pid); 1455 | } 1456 | if (killTimeList.containsKey(pid)) { 1457 | etime = killTimeList.get(pid); 1458 | } 1459 | overAllRes.get(pid).total = etime - stime; 1460 | } 1461 | 1462 | } 1463 | 1464 | public class Event implements Comparable{ 1465 | long time; 1466 | short type; 1467 | short core; 1468 | int pid1; 1469 | int pid2; 1470 | short irq; 1471 | short pid1state; 1472 | short pid2state; 1473 | long perftime; 1474 | 1475 | public Event(ByteBuffer buf) throws IOException { 1476 | this.type = buf.getShort(); 1477 | this.time = buf.getLong(); 1478 | this.core = buf.getShort(); 1479 | this.pid1 = buf.getInt(); 1480 | this.pid2 = buf.getInt(); 1481 | this.irq = buf.getShort(); 1482 | this.pid1state = buf.getShort(); 1483 | this.pid2state = buf.getShort(); 1484 | this.perftime = buf.getLong(); 1485 | //this.printOut(); 1486 | //buf.position(buf.position()+20); 1487 | //this.f1 = buf.getLong(); 1488 | //this.f2 = buf.getInt(); 1489 | //this.f3 = buf.getInt(); 1490 | //this.f4 = buf.getInt(); 1491 | } 1492 | 1493 | @Override 1494 | public int compareTo(Event e) 1495 | { 1496 | if (this.time > e.time) return 1; 1497 | else if (this.time < e.time) return -1; 1498 | else return 0; 1499 | } 1500 | 1501 | public void printOut() { 1502 | System.out.printf("%d %d %d %d %d %d %d %d %d\n", this.time, this.type, this.core, 
this.pid1, this.pid2, this.irq, this.pid1state, this.pid2state, this.perftime); 1503 | //System.out.printf("%f %d %d %d %d %d %d %d\n", this.time, this.cmd, this.op, this.pid, this.f1, this.f2, this.f3, this.f4); 1504 | } 1505 | } 1506 | 1507 | public class SoftEvent implements Comparable{ 1508 | byte type; 1509 | long stime; 1510 | long etime; 1511 | short core; 1512 | 1513 | public SoftEvent(ByteBuffer buf) throws IOException { 1514 | this.type = buf.get(); 1515 | this.stime = buf.getLong(); 1516 | this.etime = buf.getLong(); 1517 | this.core = buf.getShort(); 1518 | } 1519 | 1520 | @Override 1521 | public int compareTo(SoftEvent e) 1522 | { 1523 | if (this.stime > e.stime) return 1; 1524 | else if (this.stime < e.stime) return -1; 1525 | else return 0; 1526 | } 1527 | 1528 | public void printOut() { 1529 | System.out.printf("%d %d %d %d\n", this.type, this.stime, this.etime, this.core); 1530 | //System.out.printf("%f %d %d %d %d %d %d %d\n", this.time, this.cmd, this.op, this.pid, this.f1, this.f2, this.f3, this.f4); 1531 | } 1532 | } 1533 | 1534 | public class GeneralRes { 1535 | long running; 1536 | long runnable; 1537 | long wait; 1538 | long hardirq; 1539 | long softirq; 1540 | long network; 1541 | long disk; 1542 | long other; 1543 | long unknown; 1544 | long total; 1545 | public GeneralRes() { 1546 | this.running = 0; 1547 | this.runnable = 0; 1548 | this.wait = 0; 1549 | this.hardirq = 0; 1550 | this.softirq = 0; 1551 | this.network = 0; 1552 | this.disk = 0; 1553 | this.other = 0; 1554 | this.unknown = 0; 1555 | this.total = 0; 1556 | } 1557 | 1558 | public void printOut(int pid) { 1559 | System.out.printf("%d %f %f %f %f %f %f %f %f %f\n", pid, this.running/freq, this.runnable/freq, this.wait/freq, this.hardirq/freq, this.softirq/freq, this.network/freq, this.disk/freq, this.other/freq, this.unknown/freq); 1560 | //System.out.printf("%d,%d,%d,%d,%d,%d,%d,%d,%d,%d\n", pid, this.running, this.runnable, this.wait, this.hardirq, this.softirq, this.network, 
this.disk, this.other, this.unknown); 1561 | } 1562 | 1563 | public void printOut(int pid, PrintWriter fbreak) { 1564 | fbreak.printf("%d %f %f %f %f %f %f %f %f %f %f\n", pid, this.running/freq, this.runnable/freq, this.wait/freq, this.hardirq/freq, this.softirq/freq, this.network/freq, this.disk/freq, this.other/freq, this.unknown/freq, this.total/freq); 1565 | //System.out.printf("%d,%d,%d,%d,%d,%d,%d,%d,%d,%d\n", pid, this.running, this.runnable, this.wait, this.hardirq, this.softirq, this.network, this.disk, this.other, this.unknown); 1566 | } 1567 | } 1568 | 1569 | class WaitRange implements Comparable { 1570 | long stime = 0; 1571 | long etime = 0; 1572 | 1573 | public WaitRange(long stime, long etime) { 1574 | this.stime = stime; 1575 | this.etime = etime; 1576 | } 1577 | 1578 | @Override 1579 | public int compareTo(WaitRange e) 1580 | { 1581 | if (this.stime > e.stime) return 1; 1582 | else if (this.stime < e.stime) return -1; 1583 | else return 0; 1584 | } 1585 | } 1586 | 1587 | class DPEvent implements Comparable { 1588 | long time = 0; 1589 | int type = 0; 1590 | 1591 | public DPEvent(long time, int type) { 1592 | this.time = time; 1593 | this.type = type; 1594 | } 1595 | 1596 | @Override 1597 | public int compareTo(DPEvent e) 1598 | { 1599 | if (this.time > e.time) return 1; 1600 | else if (this.time < e.time) return -1; 1601 | else return 0; 1602 | } 1603 | 1604 | public void printOut(int tid) { 1605 | System.out.printf("tid = %d, time = %d, type = %d\n", tid, time, type); 1606 | } 1607 | } 1608 | 1609 | class Worker extends Thread { 1610 | int id; 1611 | Object cond = new Object(); 1612 | ArrayList tidList = null; 1613 | Index index = null; 1614 | Segment seg = null; 1615 | int op = 0; 1616 | int isRunning = 1; 1617 | 1618 | // For stats collection 1619 | long infiniteTime = 0L; 1620 | long processTime = 0L; 1621 | 1622 | ConcurrentSkipListMap tm = null; 1623 | HashMap hm = new HashMap(); 1624 | 1625 | public Worker(int id, int op, Index index) { 
            this.id = id;
            this.op = op;
            this.index = index;
        }

        public void run() {
            IndexResult irlist = new IndexResult();
            IndexResult ir = new IndexResult();
            while(isRunning == 1) {
                // Sleep until another thread assigns a non-zero op and notifies cond.
                synchronized(cond) {
                    while(op == 0) {
                        try{
                            System.out.printf("Thread %d starts to wait\n", id);
                            cond.wait();
                        }
                        catch (InterruptedException e) {
                        }
                    }
                    System.out.printf("Thread %d starts op %d\n", id, op);
                }
                // op == 1: find out the real wake-up for each waiting segment
                if (op == 1) {
                    while(true) {
                        if (index.getNext(ir) == null) {
                            System.out.printf("Thread %d finishes op %d\n", id, op);
                            op = 0;
                            synchronized(index) {
                                index.finishNum++;
                                index.notifyAll();
                            }
                            break;
                        }
                        if (ir.seg.state == 2) {
                            Segment newSeg = getRealWake(ir.seg, ir.tm);
                            if (newSeg.endTime != ir.seg.endTime) {
                                // Record new segment
                                udsCheck.get(newSeg.pid).put(newSeg.startTime, newSeg);
                                udsQuick.get(newSeg.pid).put(ir.seg.startTime, ir.seg);
                            }
                        }
                    }
                }
                // op == 2: drop recorded segments that end at the same time as an earlier one
                else if (op == 2) {
                    int tid = 0;
                    while(true) {
                        tid = index.getNextTID();
                        if (tid == -1) {
                            // Log before resetting op; otherwise the message always reports op 0.
                            System.out.printf("Thread %d finishes op %d\n", id, op);
                            op = 0;
                            synchronized(index) {
                                index.finishNum++;
                                index.notifyAll();
                            }
                            break;
                        }
                        System.out.printf("Thread %d starts tid %d for op %d\n", id, tid, op);
                        ConcurrentSkipListMap<Long, Segment> cm = udsCheck.get(tid);
                        if (cm.lastEntry() == null) {
                            System.out.println("[Info] tid = " + tid);
                        }
                        else {
                            seg = cm.lastEntry().getValue();
                            Segment segPrev = null;
                            while (cm.lowerEntry(seg.startTime) != null) {
                                segPrev = cm.lowerEntry(seg.startTime).getValue();
                                if (seg.endTime == segPrev.endTime) {
                                    cm.remove(seg.startTime);
                                }
                                seg = 
segPrev; 1695 | } 1696 | } 1697 | System.out.printf("Thread %d finishes tid %d for op %d\n", id, tid, op); 1698 | } 1699 | } 1700 | else if (op == 3) { 1701 | LinkedList tmpList = new LinkedList(); 1702 | ArrayList before = new ArrayList(); 1703 | while(true) { 1704 | if (index.getNextUDS(irlist) == null) { 1705 | System.out.printf("Thread %d finishes for op %d\n", id, op); 1706 | op = 0; 1707 | synchronized(index) { 1708 | index.finishNum++; 1709 | index.notifyAll(); 1710 | } 1711 | break; 1712 | } 1713 | 1714 | 1715 | if (isUDS == 1) { 1716 | // For fake wake-up 1717 | for (Segment newSeg : irlist.list) { 1718 | //System.out.printf("list size = %d\n", irlist.list.size()); 1719 | if (newSeg.state != 2) continue; 1720 | ConcurrentSkipListMap udsMap = udsCheck.get(newSeg.pid); 1721 | if (udsMap == null || udsMap.lowerEntry(newSeg.startTime) == null) { 1722 | tmpList.add(newSeg); 1723 | } 1724 | else { 1725 | //Segment tmp = udsMap.lowerEntry(newSeg.startTime).getValue(); 1726 | Segment tmp = udsMap.floorEntry(newSeg.startTime).getValue(); 1727 | if (tmp.endTime < newSeg.startTime || tmp.startTime == newSeg.startTime) { 1728 | if (tmp.startTime == newSeg.startTime) { 1729 | tmpList.add(tmp); 1730 | } 1731 | else { 1732 | tmpList.add(newSeg); 1733 | } 1734 | } 1735 | //if (tmp.endTime < newSeg.startTime) { 1736 | // tmpList.add(newSeg); 1737 | //} 1738 | //else if (tmp.startTime == newSeg.startTime) { 1739 | // tmpList.add(tmp); 1740 | //} 1741 | else { 1742 | } 1743 | } 1744 | } 1745 | } 1746 | else { 1747 | // For spin-lock, also works for other cases. 
1748 | for (Segment newSeg : irlist.list) { 1749 | if (newSeg.state == 2) { 1750 | newSeg.overlap.add(newSeg.pid); 1751 | tmpList.add(newSeg); 1752 | } 1753 | } 1754 | // Finish spin-lock 1755 | } 1756 | 1757 | before.clear(); 1758 | while(!tmpList.isEmpty()) { 1759 | Segment segTmp = tmpList.remove(0); 1760 | 1761 | //if (isUDS == 2) { 1762 | // // spinlock 1763 | // processTime += segTmp.endTime - segTmp.startTime; 1764 | // if (segTmp.pid > 0 && before.contains(segTmp.pid)) { 1765 | // infiniteTime += segTmp.endTime - segTmp.startTime; 1766 | // continue; 1767 | // } 1768 | // else if (segTmp.pid > 0) { 1769 | // before.add(segTmp.pid); 1770 | // } 1771 | // // end for spinlock 1772 | //} 1773 | 1774 | // If the temp segment's time range is wrong 1775 | if (segTmp.startTime > segTmp.endTime) 1776 | continue; 1777 | 1778 | if (segTmp.state == 2) { 1779 | Segment sTmp = null; 1780 | if (isUDS == 1) { 1781 | // Check if the wake-up event is a fake one 1782 | if (udsCheck.containsKey(segTmp.waitFor)) { 1783 | try { 1784 | if (!isFakeWakeNew(segTmp.waitFor, segTmp.endTime)) { 1785 | sTmp = null; 1786 | } 1787 | else { 1788 | sTmp = udsQuick.get(segTmp.waitFor).lowerEntry(segTmp.endTime).getValue(); 1789 | } 1790 | } 1791 | catch (Exception e) { 1792 | sTmp = null; 1793 | //continue; 1794 | } 1795 | } 1796 | } 1797 | 1798 | // This means we cannot make sure if the waitFor thread is caused by a real or fake wake up. 
1799 | 1800 | if (sTmp == null ) 1801 | { 1802 | //if (segTmp.pid == 31234) { 1803 | // segTmp.printOut(); 1804 | //} 1805 | addResult(hm, segTmp.pid, segTmp.waitFor, segTmp.startTime, segTmp.endTime); 1806 | if (segTmp.waitFor <= 0 || segTmp.waitFor == segTmp.pid) { 1807 | continue; 1808 | } 1809 | else { 1810 | tmpList.addAll(findAll(segTmp.waitFor, segTmp.startTime, segTmp.endTime, segTmp.parent, segTmp, this)); 1811 | } 1812 | } 1813 | else { 1814 | try { 1815 | tmpList.add(new Segment(segTmp.pid, 2, segTmp.startTime, sTmp.endTime, sTmp.waitFor, segTmp.waitFor)); 1816 | } 1817 | catch (Exception e) { 1818 | System.out.println(e.getMessage()); 1819 | continue; 1820 | } 1821 | } 1822 | } 1823 | } 1824 | irlist.list.clear(); 1825 | } 1826 | } 1827 | else if (op == 4) { 1828 | int tid = 0; 1829 | while(true) { 1830 | tid = index.getNextTID(); 1831 | if (tid == -1) { 1832 | op = 0; 1833 | synchronized(index) { 1834 | index.finishNum++; 1835 | index.notifyAll(); 1836 | } 1837 | System.out.printf("Thread %d finishes op %d\n", id, op); 1838 | break; 1839 | } 1840 | 1841 | System.out.printf("Thread %d starts tid %d for op %d\n", id, tid, op); 1842 | 1843 | ConcurrentSkipListMap cm = udsCheck.get(tid); 1844 | ConcurrentSkipListMap tm = segmentMap.get(tid); 1845 | 1846 | //for (Segment s : cm.values()) { 1847 | // addResult(hm, s.pid, s.waitFor, s.startTime, s.endTime); 1848 | //} 1849 | //System.out.printf("Total cm size for thread %d is %d\n", tid, cm.size()); 1850 | 1851 | //int tmpa = 0; 1852 | for (Segment s : cm.values()) { 1853 | //tmpa++; 1854 | //if (tmpa % 100 == 0) { 1855 | // System.out.printf("op4 has worked %d for thread %d\n", tid, tmpa); 1856 | //} 1857 | // Prepare for new segment timestamp 1858 | //System.out.printf("time = %d, tid = %d, wait pid = %d, start time = %d, end time = %d\n", System.currentTimeMillis(), s.pid, s.waitFor, s.startTime, s.endTime); 1859 | if (tm.lowerEntry(s.startTime) == null) continue; 1860 | Segment s1 = 
tm.lowerEntry(s.startTime).getValue(); 1861 | Segment s2 = null; 1862 | Segment tmp = null; 1863 | 1864 | //if (s.parent == -99) { 1865 | // long tmpTime = s.endTime; 1866 | // do { 1867 | // tmp = tm.lowerEntry(tmpTime).getValue(); 1868 | // tmpTime = tmp.startTime; 1869 | // } while(tmp.state != 2); 1870 | // s2 = tmp; 1871 | // //s.waitFor = s2.waitFor; 1872 | // s.endTime = s2.endTime; 1873 | //} 1874 | //else { 1875 | s2 = tm.lowerEntry(s.endTime).getValue(); 1876 | //} 1877 | long nstime = s1.startTime; 1878 | long netime = s2.endTime; 1879 | 1880 | 1881 | //s.printOut(); 1882 | if (s1.startTime == s2.startTime) { 1883 | tm.remove(s1.startTime); 1884 | } 1885 | else { 1886 | try { 1887 | for (Segment stmp : tm.subMap(s1.startTime, true, s2.startTime, true).values()) { 1888 | tm.remove(stmp.startTime); 1889 | } 1890 | } 1891 | catch (Exception e) { 1892 | System.out.println("op4 exception"); 1893 | continue; 1894 | } 1895 | } 1896 | 1897 | //if (s.parent == -99) { 1898 | // s.endTime = tmp.endTime; 1899 | // // Need add one more runnable state 1900 | // tm.put(nstime, new Segment(tid, 0, nstime, s.startTime, 0, 0)); 1901 | // tm.put(s.startTime, s); 1902 | //} 1903 | //else { 1904 | tm.put(nstime, new Segment(tid, 0, nstime, s.startTime, 0, 0)); 1905 | tm.put(s.startTime, s); 1906 | tm.put(s.endTime, new Segment(tid, 0, s.endTime, netime, 0, 0)); 1907 | //} 1908 | } 1909 | } 1910 | System.out.printf("Thread %d finishes tid %d for op %d\n", id, tid, op); 1911 | } 1912 | else if (op == 5) { 1913 | int tid = 0; 1914 | while(true) { 1915 | tid = index.getNextTID(); 1916 | if (tid == -1) { 1917 | op = 0; 1918 | synchronized(index) { 1919 | index.finishNum++; 1920 | index.notifyAll(); 1921 | } 1922 | System.out.printf("Thread %d finishes op %d\n", id, op); 1923 | break; 1924 | } 1925 | System.out.printf("Thread %d starts tid %d for op %d\n", id, tid, op); 1926 | directBreakdown(elist, tid); 1927 | System.out.printf("Thread %d finishes tid %d for op %d\n", id, tid, 
op); 1928 | } 1929 | } 1930 | } 1931 | } 1932 | } 1933 | 1934 | class Index { 1935 | int tid = 0; 1936 | int tidpos = 0; 1937 | long pos = 0; 1938 | 1939 | ArrayList tidList = null; 1940 | Map> segmentMap = null; 1941 | ConcurrentSkipListMap tm = null; 1942 | ConcurrentSkipListMap udsMap = null; 1943 | Iterator> it = null; 1944 | int nexttid = 0; 1945 | int finishNum = 0; 1946 | 1947 | public void reset() { 1948 | tid = 0; 1949 | tidpos = 0; 1950 | pos = 0; 1951 | finishNum = 0; 1952 | nexttid = 0; 1953 | tm = null; 1954 | } 1955 | 1956 | public Index() { 1957 | } 1958 | 1959 | public Index(ArrayList tidList, Map> segmentMap) { 1960 | this.tidList = tidList; 1961 | this.segmentMap = segmentMap; 1962 | this.tid = 0; 1963 | this.tidpos = 0; 1964 | this.pos = 0; 1965 | } 1966 | 1967 | synchronized IndexResult getNextUDS(IndexResult ir) { 1968 | int size = 1; 1969 | // First run! 1970 | if (tid == 0 || !it.hasNext()) { 1971 | //System.out.printf("Worker %d finishes cascading on Thread %d, #rest = %d\n", Thread.currentThread().getId(), tid, tidList.size() - tidpos); 1972 | if (tidpos >= tidList.size()) { 1973 | return null; 1974 | } 1975 | else { 1976 | tid = tidList.get(tidpos); 1977 | tm = segmentMap.get(tid); 1978 | it = tm.entrySet().iterator(); 1979 | tidpos++; 1980 | System.out.printf("Worker %d starts cascading on Thread %d, #rest = %d\n", Thread.currentThread().getId(), tid, tidList.size() - tidpos); 1981 | } 1982 | } 1983 | while (it.hasNext()) { 1984 | ir.list.add(it.next().getValue()); 1985 | break; 1986 | //if (ir.list.size() == size) { 1987 | // break; 1988 | //} 1989 | } 1990 | return ir; 1991 | } 1992 | 1993 | synchronized IndexResult getNext(IndexResult ir) { 1994 | // First run! 
1995 | if (tid == 0 || !it.hasNext()) { 1996 | System.out.printf("Worker %d finishes fakePre on Thread %d, #rest = %d\n", Thread.currentThread().getId(), tid, tidList.size() - tidpos); 1997 | if (tidpos >= tidList.size()) { 1998 | return null; 1999 | } 2000 | else { 2001 | tid = tidList.get(tidpos); 2002 | tm = segmentMap.get(tid); 2003 | it = tm.entrySet().iterator(); 2004 | tidpos++; 2005 | System.out.printf("Worker %d starts fakePre on Thread %d, #rest = %d\n", Thread.currentThread().getId(), tid, tidList.size() - tidpos); 2006 | } 2007 | } 2008 | while (it.hasNext()) { 2009 | ir.tm = tm; 2010 | ir.seg = it.next().getValue(); 2011 | break; 2012 | } 2013 | return ir; 2014 | } 2015 | 2016 | synchronized int getNextTID() { 2017 | if (nexttid == tidList.size()) 2018 | return -1; 2019 | return tidList.get(nexttid++); 2020 | } 2021 | } 2022 | 2023 | public Index createIndex(ArrayList tidList, Map> segmentMap) { 2024 | return new Index(tidList, segmentMap); 2025 | } 2026 | 2027 | class IndexResult { 2028 | Segment seg = null; 2029 | ArrayList list = new ArrayList(); 2030 | ConcurrentSkipListMap tm = null; 2031 | 2032 | public IndexResult() { 2033 | } 2034 | } 2035 | 2036 | void startWorker(ArrayList workerList, Index index, int numWorker) { 2037 | for (int i = 0; i < numWorker; i++) { 2038 | workerList.add(new Worker(i, 0, index)); 2039 | } 2040 | 2041 | for (int i = 0; i < numWorker; i++) { 2042 | workerList.get(i).start(); 2043 | } 2044 | } 2045 | 2046 | public class SpinVector { 2047 | int pid1; 2048 | int pid2; 2049 | long stime; 2050 | long etime; 2051 | 2052 | public SpinVector() { 2053 | } 2054 | 2055 | public void clear() { 2056 | pid1 = 0; 2057 | pid2 = 0; 2058 | stime = 0; 2059 | etime = 0; 2060 | } 2061 | } 2062 | } 2063 | 2064 | -------------------------------------------------------------------------------- /analyzer/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | javac -cp ".:/tmp/uds.jar" 
Analyzer.java 3 | clean: 4 | rm *.class 5 | -------------------------------------------------------------------------------- /analyzer/analyzer.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | echo "wPerf analyzer:" 3 | echo -n "Data directory:" 4 | read -r dir 5 | echo -n "Programming language[c or java, default c]:" 6 | read -r lang 7 | echo -n "User annotation[fake(false wakeup), busy(spinlock), or no, default no]:" 8 | read -r uanno 9 | echo -n "Analyzer worker number[default 32]:" 10 | read -r nworker 11 | echo -n "Merge similarity for long term threads[default 0.45]:" 12 | read -r lsim 13 | 14 | if [ "$dir" == "" ]; then 15 | echo "No input directory." 16 | exit 17 | fi 18 | 19 | if [ "$lang" == "" ]; then 20 | lang="c" 21 | fi 22 | 23 | if [ "$uanno" == "" ]; then 24 | uanno="no" 25 | fi 26 | 27 | if [ "$nworker" == "" ]; then 28 | nworker=32 29 | fi 30 | 31 | if [ "$lsim" == "" ]; then 32 | lsim=0.45 33 | fi 34 | 35 | rm -f $dir/waitfor $dir/waitfor.bak 36 | 37 | echo '0. Back up old target.tinfo' 38 | cp $dir/target.tinfo $dir/target.tinfo.bak 39 | grep creates $dir/kmesg | awk '{if (NF>9) {print $8} else {print $7}}' | sort | uniq >> $dir/target.tinfo 40 | 41 | echo '1. Generate perf function stacks' 42 | cd $dir 43 | #sudo cp perf-*.map /tmp/ 44 | sudo perf script -i presult > presult.out 45 | echo "1 2 pid= 3" >> presult.out 46 | cd - 47 | 48 | echo '2. Generate breakdown file and wait-for info' 49 | make 50 | wMem=$(free -mh | grep Mem | awk '{print tolower($2)}') 51 | time java -cp ".:/tmp/uds.jar" -Xms${wMem} Analyzer $dir/result_switch $dir/result_softirq $dir/result_fake $dir/target.tinfo $dir/presult.out $dir/cpufreq $dir/waitfor.bak $dir/breakdown $uanno $nworker 52 | echo "1 2 pid= 3" >> presult.final 53 | mv presult.final $dir/ 54 | cp $dir/waitfor.bak $dir/waitfor 55 | 56 | echo '3.
Generate the merge info' 57 | if [ "$lang" == "java" ] 58 | then 59 | python change-jstack.py $dir/gresult > $dir-gresult 60 | mv $dir-gresult $dir/gresult 61 | fi 62 | 63 | python ./merge.py $dir/presult.out $lsim > $dir/merge.tinfo 64 | 65 | echo '4. Start processing short-term processes' 66 | cp short.py $dir/ 67 | cd $dir 68 | python short.py > waitfor-1 69 | python short.py 1 >> merge.tinfo 70 | mv waitfor-1 waitfor 71 | cp target.tinfo.bak target.tinfo 72 | cd - 73 | -------------------------------------------------------------------------------- /analyzer/bottleneck.py: -------------------------------------------------------------------------------- 1 | import networkx as nx 2 | import operator 3 | import sys 4 | import os 5 | from prettytable import PrettyTable 6 | 7 | filenet = open(sys.argv[1]+'/waitfor','r') 8 | filetgroup = open(sys.argv[1]+'/merge.tinfo','r') 9 | filesoftirq = open(sys.argv[1]+'/soft.tinfo','r') 10 | fileallthreads = open(sys.argv[1]+'/target.tinfo','r') 11 | filebreak = open(sys.argv[1]+'/breakdown','r') 12 | filekmesg = open(sys.argv[1]+'/kmesg','r') 13 | filefs = open(sys.argv[1]+'/gresult','r') 14 | fileperf = open(sys.argv[1]+'/presult.out','r') 15 | filecpu = open(sys.argv[1]+'/cpufreq','r') 16 | 17 | tdir = sys.argv[1] 18 | tlang = sys.argv[2] 19 | term = float(sys.argv[3]) 20 | removeThreshold = float(sys.argv[4]) 21 | 22 | btable = None 23 | btag = 0 24 | tmprank = {} 25 | 26 | softirq=[] 27 | allthreads=[] 28 | #Create softirq group 29 | for l in filesoftirq: 30 | l=l.split() 31 | for e in l: 32 | softirq.append(int(e)) 33 | 34 | for l in fileallthreads: 35 | l = l.split() 36 | for e in l: 37 | if int(e) not in softirq: 38 | allthreads.append(int(e)) 39 | 40 | #''' 41 | for l in filebreak: 42 | l=l.split() 43 | if btag == 0: 44 | btable = PrettyTable(l) 45 | btag = 1 46 | else: 47 | total = 0.0 48 | tmp = [] 49 | # Don't show softirq 50 | #if int(l[0]) in softirq: 51 | # continue 52 | if float(l[-1]) == 0: 53 | for e in l[1:-1]:
54 | tmp.append('-'); 55 | continue 56 | for e in l[1:]: 57 | total+=float(e) 58 | tmp.append(l[0]) 59 | for e in l[1:-1]: 60 | tmp.append(format(float(e)/float(l[-1])*100,'.2f')+'%') 61 | tmp.append(l[-1]) 62 | tmprank[int(l[0])]=tmp 63 | for k in sorted(tmprank.keys()): 64 | btable.add_row(tmprank[k]) 65 | #''' 66 | 67 | # For memcached 68 | #netrate = 0.436 69 | #diskrate = 0 70 | 71 | # For mysql memory 8G 72 | #netrate = 34.4/94.3 73 | # For mysql memory 128M 74 | #netrate = 25.5/94.3 75 | #netrate = 0.01 76 | #diskrate = 0.934 77 | 78 | # For mysql disk 79 | #netrate = 0.01 80 | #diskrate = 0.98 81 | stime = 0.0 82 | etime = 0.0 83 | dtime = 0.0 84 | ntime = 0.0 85 | 86 | freq = 0 87 | for l in filecpu: 88 | freq = float(l.strip())*1e9 89 | 90 | send = {} 91 | sendTotal = 0 92 | for k in filekmesg: 93 | l = k[0]+'00'+k[3:] 94 | if 'Inside open' in l: 95 | stime = float(l.split()[-1])/freq 96 | elif 'Inside close' in l: 97 | etime = float(l.split()[-1])/freq 98 | elif 'Disk' in l and 'work time' in l: 99 | dtime = float(l.split()[-1])/1000.0 100 | elif 'Network:' in l: 101 | ntime = float(l.split()[-1])/1000.0 102 | elif 'sends' in l: 103 | tmp = l.strip().split() 104 | pid = int(tmp[2]) 105 | byte = int(tmp[-1]) 106 | send[pid] = byte 107 | sendTotal += byte 108 | 109 | totaltime = etime - stime 110 | #diskrate = dtime / totaltime 111 | #netrate = sendTotal / (125000000 * totaltime) 112 | threshold = totaltime * removeThreshold 113 | #threshold = (etime - stime) * 0.005 114 | 115 | tgroup={} 116 | nodes=[] 117 | originG={} 118 | startG={} 119 | finalG=[] 120 | G=nx.DiGraph() 121 | 122 | #Create a thread group 123 | index = 0 124 | for l in filetgroup: 125 | l=l.split() 126 | tgroup[int(l[0])] = [] 127 | for e in l: 128 | tgroup[int(l[0])].append(int(e)) 129 | 130 | javaBack=[] 131 | 132 | threadName = {} 133 | 134 | def tName(tid): 135 | if tid in threadName.keys(): 136 | return threadName[tid] 137 | #return 
threadName[tid].split(':')[0].split('.')[0].split('(')[0] 138 | return tid 139 | 140 | fs={} 141 | wt={} 142 | wk={} 143 | 144 | pid = 0 145 | pid1 = 0 146 | fstype = 0 147 | fstr = '' 148 | for l in fileperf: 149 | if l == '\n': 150 | continue 151 | if 'pid=' in l or 'sched:sched_switch'in l: 152 | if pid != 0: 153 | if fstype == 0: 154 | for gkey in tgroup.keys(): 155 | if pid in tgroup[gkey]: 156 | pid = gkey 157 | break 158 | if pid not in wt.keys(): 159 | wt[pid] = {} 160 | if fstr not in wt[pid].keys(): 161 | wt[pid][fstr] = 0 162 | wt[pid][fstr] += 1 163 | else: 164 | for gkey in tgroup.keys(): 165 | if pid in tgroup[gkey]: 166 | pid = gkey 167 | break 168 | 169 | for gkey in tgroup.keys(): 170 | if pid1 in tgroup[gkey]: 171 | pid1 = gkey 172 | break 173 | 174 | if (pid,pid1) not in wk.keys(): 175 | wk[(pid,pid1)] = {} 176 | if fstr not in wk[(pid,pid1)].keys(): 177 | wk[(pid,pid1)][fstr] = 0 178 | wk[(pid,pid1)][fstr] += 1 179 | pid = int(l.split()[1]) 180 | if 'sched_switch' in l: 181 | fstype = 0 182 | else: 183 | fstype = 1 184 | pid1 = int(l.split('pid=')[1].split()[0]) 185 | fstr = '' 186 | else: 187 | fstr += l.strip()+'\n' 188 | 189 | # Get the thread name 190 | fstr = '' 191 | tid = 0 192 | if tlang == 'java': 193 | for l in filefs: 194 | if '(Parallel GC Threads)' in l or '(ParallelGC)' in l: 195 | tid = int(l.split('nid=')[1].split()[0]) 196 | threadName[str(tid)] = 'ParallelGC' 197 | elif '(Parallel CMS Threads)' in l: 198 | tid = int(l.split('nid=')[1].split()[0]) 199 | threadName[str(tid)] = 'ParallelCMS' 200 | elif 'VM Thread' in l: 201 | tid = int(l.split('nid=')[1].split()[0]) 202 | threadName[str(tid)] = 'VMThread' 203 | elif 'at ' in l or 'nid=' in l: 204 | if 'nid=' in l: 205 | if tid != 0: 206 | for gkey in tgroup.keys(): 207 | if tid in tgroup[gkey]: 208 | tid = gkey 209 | break 210 | if tid not in fs.keys(): 211 | fs[tid] = {} 212 | if fstr not in fs[tid].keys(): 213 | fs[tid][fstr] = 0 214 | fs[tid][fstr] += 1 215 | tid = 
int(l.split('nid=')[1].split()[0]) 216 | fstr = '' 217 | nameList = l.split('"')[1].replace('"','').split() 218 | name = '' 219 | for tmp in nameList: 220 | name += tmp + ' ' 221 | if tmp.isdigit(): 222 | break 223 | threadName[str(tid)] = name + '(' + str(tid)+')' 224 | elif 'at ' in l: 225 | fstr += l.strip() + '\n' 226 | else: 227 | pid = 0 228 | fstr = '' 229 | laststr='' 230 | thisstr='' 231 | for l in filefs: 232 | if '#' in l: 233 | laststr = thisstr 234 | thisstr = l 235 | if 'start_thread' in l: 236 | if '0x' in laststr: 237 | threadName[str(pid)] = laststr.split()[3] + '(' + str(pid) + ')' 238 | else: 239 | threadName[str(pid)] = laststr.split()[1] + '(' + str(pid) + ')' 240 | if 'Total stack' in l or 'Thread: ' in l: 241 | if pid != 0: 242 | if pid not in fs.keys(): 243 | fs[pid] = {} 244 | if fstr not in fs[pid].keys(): 245 | fs[pid][fstr] = 0 246 | fs[pid][fstr] += 1 247 | if 'Total stack' in l: 248 | break 249 | pid = int(l.split()[1]) 250 | for gkey in tgroup.keys(): 251 | if pid in tgroup[gkey]: 252 | pid = gkey 253 | break 254 | fstr = '' 255 | else: 256 | fstr += l.strip()+'\n' 257 | #print threadName 258 | 259 | #Create the originG network 260 | 261 | for l in filenet: 262 | l=l.split() 263 | nodeA=int(l[0]) 264 | nodeB=int(l[1]) 265 | edge=float(format(float(l[2]),'.2f')) 266 | # Skip edges to unknown (-99), scheduler (0), softirq, hardirq/timer, and softirq kernel threads 267 | #if (nodeB==-99) or (nodeB==0): 268 | if (nodeB==-99) or (nodeB==0) or (nodeB==-16) or (nodeB==-15) or (nodeB == -2) or (nodeA in softirq): 269 | continue 270 | startG[(nodeA,nodeB)]=edge 271 | 272 | for gkey in tgroup.keys(): 273 | if nodeA in tgroup[gkey]: 274 | nodeA = gkey 275 | break 276 | for gkey in tgroup.keys(): 277 | if nodeB in tgroup[gkey]: 278 | nodeB = gkey 279 | break 280 | if nodeA not in nodes: 281 | nodes.append(nodeA) 282 | if nodeB not in nodes: 283 | nodes.append(nodeB) 284 | 285 | #Add edge from network to device 286 | netlist={} 287 | total = 0.0 288 |
289 | for nodeA in nodes: 290 | for nodeB in nodes: 291 | if nodeA == nodeB: 292 | continue 293 | if (nodeA, nodeB) not in startG.keys(): 294 | continue 295 | if nodeA in tgroup.keys(): 296 | group1 = tgroup[nodeA] 297 | else: 298 | group1 = [nodeA] 299 | if nodeB in tgroup.keys(): 300 | group2 = tgroup[nodeB] 301 | else: 302 | group2 = [nodeB] 303 | 304 | weight = 0.0 305 | for m in group2: 306 | tmpweight = 0.0 307 | count = 0 308 | for n in group1: 309 | if (n,m) in startG.keys(): 310 | #count+=1 311 | tmpweight+=startG[(n,m)] 312 | weight+=tmpweight/len(group1) 313 | if weight != 0.0: 314 | originG[(nodeA, nodeB)] = weight 315 | 316 | #if nodeA == 27196 and nodeB == 27198: 317 | # print nodeA, nodeB, weight 318 | # raw_input() 319 | 320 | 321 | for k in originG.keys(): 322 | # Filter: drop edges whose weight is below the removal threshold (removeThreshold * totaltime) 323 | if originG[k] < threshold: 324 | continue 325 | finalG.append((k[0],k[1],originG[k])) 326 | 327 | def findName(tlist): 328 | res='' 329 | if type(tlist) is list: 330 | for value in tlist: 331 | if value == -4: 332 | res+='NIC, ' 333 | elif value == -5: 334 | res+='Disk, ' 335 | elif value == -2: 336 | res+='Timer, ' 337 | elif value == -15: 338 | res+='HardIRQ(Timer), ' 339 | elif value == -99: 340 | res+='Unknown, ' 341 | else: 342 | res+=str(tName(str(value)))+', ' 343 | #res+=str(value)+', ' 344 | return res[:-2] 345 | else: 346 | value = tlist 347 | if value == -4: 348 | res+='NIC' 349 | elif value == -5: 350 | res+='Disk' 351 | elif value == -2: 352 | res+='Timer' 353 | elif value == -15: 354 | res+='HardIRQ(Timer)' 355 | elif value == -99: 356 | res+='Unknown' 357 | else: 358 | res+=str(tName(str(value))) 359 | # res+=str(value) 360 | return res 361 | 362 | def scc_graph(G): 363 | scc_result = [] 364 | result=[list(c) for c in sorted(nx.strongly_connected_components(G), key=len, reverse=True)] 365 | for i in range(0, len(result)): 366 | waitfor = '' 367 | outgoing = 0.0 368 | for j in range(0, len(result)): 369 | if i==j: 370 |
continue 371 | for m in range(0, len(result[i])): 372 | for n in range(0, len(result[j])): 373 | if G.has_edge(result[i][m], result[j][n]): 374 | outgoing += G.get_edge_data(result[i][m], result[j][n])['weight'] 375 | if outgoing == 0: 376 | memo = 'Bottleneck' 377 | else: 378 | memo = 'Normal SCC' 379 | 380 | for j in range(0, len(result)): 381 | weight = 0.0 382 | if i == j: 383 | continue 384 | for m in range(0, len(result[i])): 385 | for n in range(0, len(result[j])): 386 | if G.has_edge(result[j][n],result[i][m]): 387 | weight += G.get_edge_data(result[j][n],result[i][m])['weight'] 388 | if weight != 0: 389 | waitfor+=str(j)+'->'+str(weight)+'|' 390 | #waitfor+='SCC '+str(j)+':'+str(tmpweight)+'|' 391 | scc_result.append((result[i], weight, memo, waitfor[:-1])) 392 | return scc_result 393 | 394 | #print result 395 | 396 | def show_weight(vlist): 397 | wresult={} 398 | for i in range(0, len(vlist)): 399 | for j in (range(0, len(vlist))): 400 | if i==j: 401 | continue 402 | else: 403 | for e in finalG: 404 | if e[0] == vlist[i] and e[1] == vlist[j]: 405 | wresult[(e[0],e[1])] = e[2] 406 | wkey = sorted(wresult, key=wresult.get, reverse=True) 407 | res={} 408 | for i in range(0,len(wkey)): 409 | res[(wkey[i][0],wkey[i][1])] = wresult[wkey[i]] 410 | return res 411 | 412 | def addColor(string): 413 | return '\x1b[6;31;40m'+string+'\x1b[0m' 414 | 415 | def print_scc(scc_result): 416 | #print 'SCC Components:' 417 | x = PrettyTable(["SCC ID", "SCC Members", "Type", "Intra-Wait", "Inter-Wait", "NumOfEdges", "CPU Usage"]) 418 | #x = PrettyTable(["SCC ID", "SCC Members", "Type", "Intra-Wait", "Inter-Wait"]) 419 | x.max_width['SCC Members']=40 420 | #print 'SCC Components:' 421 | #print '%s\t%s\t%s\t%s' % ('Number'.center(5),'Vertex Group'.center(40),'Weight'.center(10), 'Memo'.center(10)) 422 | for i in range(0,len(scc_result)): 423 | intraWait = 0.0 424 | if len(scc_result[i][0])==1 and scc_result[i][0][0] in tgroup.keys(): 425 | tmpnode = scc_result[i][0][0] 426 | for 
m in tgroup[tmpnode]: 427 | for n in tgroup[tmpnode]: 428 | if m == n: 429 | continue 430 | else: 431 | if (m,n) in startG.keys(): 432 | intraWait+=startG[(m,n)] 433 | intraWait /= (len(tgroup[scc_result[i][0][0]])) 434 | else: 435 | numEdge = 0 436 | for m in scc_result[i][0]: 437 | for n in scc_result[i][0]: 438 | if m==n: 439 | continue 440 | else: 441 | for k in finalG: 442 | if k[0]==m and k[1]==n: 443 | intraWait+=k[2] 444 | break 445 | 446 | for spid in scc_result[i][0]: 447 | for epid in scc_result[i][0]: 448 | if G.has_edge(spid,epid): 449 | numEdge += 1 450 | 451 | cpuutil = 0.0 452 | 453 | #print tmprank 454 | for spid in scc_result[i][0]: 455 | if spid in tmprank.keys(): 456 | cpuutil += float(tmprank[spid][1].replace('%','')) 457 | cpuutil=str(cpuutil)+'%' 458 | 459 | thstr = findName(scc_result[i][0]) 460 | if len(thstr.split(',')) > 5: 461 | thtmp = '' 462 | for k in thstr.split(',')[:5]: 463 | thtmp+=k+',' 464 | thtmp+='...' 465 | thstr = thtmp 466 | if scc_result[i][2]=='Bottleneck': 467 | #x.add_row([addColor(str(i)), addColor(findName(scc_result[i][0])),addColor(str(scc_result[i][1])), addColor(scc_result[i][2]), addColor(str(intraWait)), addColor(str(scc_result[i][3]))]) 468 | x.add_row([addColor(str(i)), addColor(thstr), addColor(scc_result[i][2]), addColor(str(intraWait)), addColor(str(scc_result[i][3])), numEdge, cpuutil]) 469 | else: 470 | #x.add_row([str(i), findName(scc_result[i][0]),str(scc_result[i][1]), scc_result[i][2], intraWait, str(scc_result[i][3])]) 471 | x.add_row([str(i), findName(thstr), scc_result[i][2], intraWait, str(scc_result[i][3]), numEdge, cpuutil]) 472 | #print '%s\t%s\t%s\t%s' % (str(i).center(5), str(scc_result[i][0]).center(40), str(scc_result[i][1]).center(10), scc_result[i][2].center(10)) 473 | j = len(scc_result) 474 | if (len(previous)==0): 475 | for n in allthreads: 476 | ifshow=1 477 | for i in range(0,len(scc_result)): 478 | for m in scc_result[i][0]: 479 | if m in tgroup.keys(): 480 | if n in tgroup[m]: 481 
| ifshow = 0 482 | break 483 | else: 484 | if n == m: 485 | ifshow = 0 486 | break 487 | if ifshow == 1: 488 | for thead in tgroup.keys(): 489 | if int(n) == thead: 490 | break 491 | if int(n) in tgroup[thead]: 492 | ifshow = 0 493 | break 494 | #if ifshow==1: 495 | # x.add_row([str(j), n, 'Normal SCC', 0, '']) 496 | # j+=1 497 | print x 498 | 499 | def create_new(vlist, G, op): 500 | new_G=nx.DiGraph() 501 | new_list = [] 502 | for i in range(0,len(vlist)): 503 | for j in range(0,len(vlist)): 504 | if i==j: 505 | continue 506 | else: 507 | for e in finalG: 508 | if G.has_edge(e[0], e[1]) and e[0] == vlist[i] and e[1] == vlist[j]: 509 | new_list.append((e[0],e[1],e[2])) 510 | 511 | tmpmin = sys.float_info.max 512 | if (op == 0): 513 | for i in new_list: 514 | if i[2] < tmpmin: 515 | tmpmin = i[2] 516 | if tmpmin > term * totaltime: 517 | print '\x1b[6;31;40m'+'[Warn] Termination condition is met, you can stop here.'+'\x1b[0m' 518 | for i in range(0,len(new_list)): 519 | e = new_list[i] 520 | if e[2] == tmpmin: 521 | del new_list[i] 522 | print '\x1b[6;31;40m'+'[INFO] Remove edge from %s to %s with weight %f' % (findName(e[0]), findName(e[1]), e[2]) + '\x1b[0m' 523 | break 524 | new_G.add_weighted_edges_from(new_list) 525 | return new_G 526 | 527 | #Create the network based on originG 528 | G.add_weighted_edges_from(finalG) 529 | 530 | previous=[] 531 | 532 | while(1): 533 | try: 534 | scc_result = scc_graph(G) 535 | if len(scc_result) == 1 and len(scc_result[0][0])!=1: 536 | print '\x1b[6;31;40m[INFO] Only single SCC in current graph.
Need to remove the shortest weights in the graph until there\'re more than one SCC.\x1b[0m' 537 | #print '\x1b[6;31;40m[INFO] The current graph info is shown below.\x1b[0m' 538 | #einfo={} 539 | #x = PrettyTable(["Vertex A", "Vertex B", "Weight"]) 540 | #for i in scc_result[0][0]: 541 | # for j in scc_result[0][0]: 542 | # if i==j: 543 | # continue 544 | # if G.has_edge(i,j): 545 | # einfo[(i,j)] = G.get_edge_data(i,j)['weight'] 546 | # #x.add_row([findName(i), findName(j), str(G.get_edge_data(i,j)['weight'])]) 547 | #einfo=sorted(einfo.items(), key=operator.itemgetter(1), reverse=True) 548 | #for e in einfo: 549 | # x.add_row([findName(e[0][0]),findName(e[0][1]),str(e[1])]) 550 | #print x 551 | print '\x1b[6;31;40m[INFO] Press any key to continue...\x1b[0m', 552 | raw_input() 553 | G = create_new(scc_result[0][0], G, 0) 554 | continue 555 | print 'Termination condition: ', term*totaltime 556 | print_scc(scc_result); 557 | keypress = raw_input('Command (h for help):') 558 | #keypress = sys.stdin.readline().strip() 559 | #keypress = keypress.replace('\n','').replace('\r','') 560 | if keypress=='q': 561 | os._exit(1) 562 | elif 'g' in keypress[0] and 'w' in keypress[1]: 563 | pos=int(keypress.split()[1]) 564 | tmpList={} 565 | 566 | if pos == -1: 567 | f1=open('waitfor.csv','w') 568 | for k in finalG: 569 | f1.write('%s,%s,%s\n' % (k[0],k[1],k[2])) 570 | f1.close() 571 | elif len(scc_result[pos][0]) == 1: 572 | if scc_result[pos][0][0] not in tgroup.keys(): 573 | print '\x1b[6;31;40m'+'[ERROR] Cannot unfold this SCC any more'+'\x1b[0m' 574 | else: 575 | einfo={} 576 | tmpid = scc_result[pos][0][0] 577 | x = 'source,target,value\n' 578 | for m in tgroup[tmpid]: 579 | for n in tgroup[tmpid]: 580 | if (m,n) in startG.keys(): 581 | einfo[(m,n)] = str(startG[(m,n)]) 582 | #x.add_row([findName(m),findName(n),str(startG[(m,n)])]) 583 | einfo=sorted(einfo.items(), key=operator.itemgetter(1), reverse=True) 584 | for e in einfo: 585 | x += '%s,%s,%.2f\n' % 
(tName(findName(e[0][0]))+'-'+str(tgroup[tmpid].index(e[0][0])),tName(findName(e[0][1]))+'-'+str(tgroup[tmpid].index(e[0][1])),e[1]) 586 | print x 587 | else: 588 | einfo={} 589 | for m in scc_result[pos][0]: 590 | for n in scc_result[pos][0]: 591 | if G.has_edge(m,n): 592 | einfo[(m,n)] = G.get_edge_data(m,n)['weight'] 593 | for tmpid in scc_result[pos][0]: 594 | tmpsize=0 595 | if tmpid not in tgroup.keys(): 596 | continue 597 | for m in tgroup[tmpid]: 598 | for n in tgroup[tmpid]: 599 | if (m,n) in startG.keys(): 600 | tmpsize+=startG[(m,n)] 601 | einfo[(tmpid,tmpid)] = tmpsize 602 | for tpos in range(0,len(scc_result)): 603 | if tpos == pos: 604 | continue 605 | for m in scc_result[tpos][0]: 606 | for n in scc_result[pos][0]: 607 | if G.has_edge(m,n): 608 | einfo[(m,n)] = G.get_edge_data(m,n)['weight'] 609 | einfo=sorted(einfo.items(), key=operator.itemgetter(1), reverse=True) 610 | x = 'source,target,value\n' 611 | for e in einfo: 612 | #x += '%s,%s,%s\n' % (findName(e[0][0]),findName(e[0][1]),str(e[1])) 613 | x += '%s,%s,%.2f\n' % (tName(findName(e[0][0])),tName(findName(e[0][1])),e[1]) 614 | f1=open('waitfor.csv','w') 615 | f1.write(x) 616 | f1.close() 617 | print x 618 | raw_input('Press any key to continue...') 619 | elif 'g' in keypress[0]: 620 | pos=int(keypress.split()[1]) 621 | tmpList={} 622 | 623 | if pos == -1: 624 | f1=open('waitfor.csv','w') 625 | for k in finalG: 626 | f1.write('%s,%s,%s\n' % (k[0],k[1],k[2])) 627 | f1.close() 628 | elif len(scc_result[pos][0]) == 1: 629 | if scc_result[pos][0][0] not in tgroup.keys(): 630 | print '\x1b[6;31;40m'+'[ERROR] Cannot unfold this SCC any more'+'\x1b[0m' 631 | else: 632 | einfo={} 633 | tmpid = scc_result[pos][0][0] 634 | x = 'source,target,value\n' 635 | for m in tgroup[tmpid]: 636 | for n in tgroup[tmpid]: 637 | if (m,n) in startG.keys(): 638 | einfo[(m,n)] = str(startG[(m,n)]) 639 | #x.add_row([findName(m),findName(n),str(startG[(m,n)])]) 640 | einfo=sorted(einfo.items(), key=operator.itemgetter(1), 
reverse=True) 641 | for e in einfo: 642 | x += '%s,%s,%.2f\n' % (tName(findName(e[0][0]))+'-'+str(tgroup[tmpid].index(e[0][0])),tName(findName(e[0][1]))+'-'+str(tgroup[tmpid].index(e[0][1])),e[1]) 643 | print x 644 | else: 645 | einfo={} 646 | for m in scc_result[pos][0]: 647 | for n in scc_result[pos][0]: 648 | if G.has_edge(m,n): 649 | einfo[(m,n)] = G.get_edge_data(m,n)['weight'] 650 | for tmpid in scc_result[pos][0]: 651 | tmpsize=0 652 | if tmpid not in tgroup.keys(): 653 | continue 654 | for m in tgroup[tmpid]: 655 | for n in tgroup[tmpid]: 656 | if (m,n) in startG.keys(): 657 | tmpsize+=startG[(m,n)] 658 | einfo[(tmpid,tmpid)] = tmpsize 659 | einfo=sorted(einfo.items(), key=operator.itemgetter(1), reverse=True) 660 | x = 'source,target,value\n' 661 | for e in einfo: 662 | #x += '%s,%s,%s\n' % (findName(e[0][0]),findName(e[0][1]),str(e[1])) 663 | x += '%s,%s,%.2f\n' % (tName(findName(e[0][0])),tName(findName(e[0][1])),e[1]) 664 | f1=open('waitfor.csv','w') 665 | f1.write(x) 666 | f1.close() 667 | print x 668 | raw_input('Press any key to continue...') 669 | elif 's' in keypress[0]: 670 | pos=int(keypress.split()[1]) 671 | tmpList={} 672 | if len(scc_result[pos][0]) == 1: 673 | if scc_result[pos][0][0] not in tgroup.keys(): 674 | print '\x1b[6;31;40m'+'[ERROR] Cannot unfold this SCC any more'+'\x1b[0m' 675 | else: 676 | einfo={} 677 | tmpid = scc_result[pos][0][0] 678 | x = PrettyTable(["Vertex A", "Vertex B", "Weight"]) 679 | for m in tgroup[tmpid]: 680 | for n in tgroup[tmpid]: 681 | if (m,n) in startG.keys(): 682 | einfo[(m,n)] = str(startG[(m,n)]) 683 | #x.add_row([findName(m),findName(n),str(startG[(m,n)])]) 684 | 685 | einfo=sorted(einfo.items(), key=operator.itemgetter(1), reverse=True) 686 | for e in einfo: 687 | x.add_row([findName(e[0][0]),findName(e[0][1]),str(e[1])]) 688 | print x 689 | else: 690 | einfo={} 691 | for m in scc_result[pos][0]: 692 | for n in scc_result[pos][0]: 693 | if G.has_edge(m,n): 694 | einfo[(m,n)] = 
G.get_edge_data(m,n)['weight'] 695 | einfo=sorted(einfo.items(), key=operator.itemgetter(1), reverse=True) 696 | print 'Sub-graph Information:' 697 | x = PrettyTable(["Vertex A", "Vertex B", "Weight"]) 698 | for e in einfo: 699 | x.add_row([findName(e[0][0]),findName(e[0][1]),str(e[1])]) 700 | print x 701 | elif keypress=='a': 702 | print 'All Edges Information:' 703 | x = PrettyTable(["Vertex A", "Vertex B", "Weight"]) 704 | for i in range(0,len(finalG)): 705 | #if finalG[i][0] in javaBack or finalG[i][1] in javaBack: 706 | # continue 707 | x.add_row([findName(finalG[i][0]), findName(finalG[i][1]), str(finalG[i][2])]) 708 | print x 709 | elif keypress=='b': 710 | if (len(previous) == 0): 711 | print '\x1b[6;31;40m'+'[ERROR] Cannot go back any further'+'\x1b[0m' 712 | else: 713 | G = previous.pop() 714 | elif keypress=='o': 715 | print btable 716 | elif 'edge' in keypress: 717 | pid1 = int(keypress.split()[1]) 718 | pid2 = int(keypress.split()[2]) 719 | for i in scc_result[pid1][0]: 720 | for j in scc_result[pid2][0]: 721 | #print i,j 722 | for k in finalG: 723 | if k[0] == i and k[1] == j: 724 | print k[0],k[1],k[2] 725 | #print scc_result[pid1] 726 | #print scc_result[pid2] 727 | elif 'wt' in keypress: 728 | pid1 = int(keypress.split()[1]) 729 | if len(keypress.split()) == 2: 730 | size = 3 731 | else: 732 | size = int(keypress.split()[2]) 733 | total = 0 734 | for l in wt[pid1].values(): 735 | total+=l 736 | x = PrettyTable(["TID", "Function Stack", "NumOfEvents", "Percent"]) 737 | x.align['Function Stack']='l' 738 | tmplist = sorted(wt[pid1], key=wt[pid1].get, reverse=True) 739 | tmpi = 0 740 | for k in tmplist: 741 | print str(pid1), k[:-1], str(wt[pid1][k]), format(wt[pid1][k]*100.0/total,'2.2f')+'%' 742 | x.add_row([str(pid1), k[:-1], str(wt[pid1][k]), format(wt[pid1][k]*100.0/total,'2.2f')+'%']) 743 | tmpi+=1 744 | if tmpi == size: 745 | break 746 | print x 747 | raw_input('Press any key to continue...') 748 | elif 'wk' in keypress: 749 | pid1 =
int(keypress.split()[1]) 750 | pid2 = int(keypress.split()[2]) 751 | if len(keypress.split()) == 3: 752 | size = 3 753 | else: 754 | size = int(keypress.split()[3]) 755 | total = 0 756 | for l in wk[(pid1,pid2)].values(): 757 | total+=l 758 | x = PrettyTable(["TID1", "TID2", "Function Stack", "NumOfEvents", "Percent"]) 759 | x.align['Function Stack']='l' 760 | tmplist = sorted(wk[(pid1,pid2)], key=wk[(pid1,pid2)].get, reverse=True) 761 | tmpi = 0 762 | for k in tmplist: 763 | x.add_row([str(pid1), str(pid2), k[:-1], str(wk[(pid1,pid2)][k]), format(wk[(pid1,pid2)][k]*100.0/total,'2.2f')+'%']) 764 | tmpi+=1 765 | if tmpi == size: 766 | break 767 | print x 768 | raw_input('Press any key to continue...') 769 | #elif keypress == 'u': 770 | # x = PrettyTable(["Disk Utilization", "Network Utilization"]) 771 | # x.add_row([str(diskrate*100)+'%', str(netrate*100)+'%']) 772 | # print x 773 | elif 'fs' in keypress.split()[0]: 774 | pid = int(keypress.split()[1]) 775 | if len(keypress.split()) == 2: 776 | size = 3 777 | else: 778 | size = int(keypress.split()[2]) 779 | total = 0 780 | for l in fs[pid].values(): 781 | total+=l 782 | x = PrettyTable(["Thread ID", "Function Stack", "Percent"]) 783 | x.align['Function Stack']='l' 784 | tmplist = sorted(fs[pid], key=fs[pid].get, reverse=True) 785 | tmpi = 0 786 | for k in tmplist: 787 | x.add_row([str(pid), k[:-1], format(fs[pid][k]*100.0/total,'2.2f')+'%']) 788 | tmpi+=1 789 | if tmpi == size: 790 | break 791 | print x 792 | raw_input('Press any key to continue...') 793 | elif int(keypress) >= 0 and int(keypress) < len(scc_result): 794 | if len(scc_result[int(keypress)][0]) == 1: 795 | print '\x1b[6;31;40m'+'[ERROR] Cannot unfold this SCC any more'+'\x1b[0m' 796 | else: 797 | previous.append(G); 798 | G = create_new(scc_result[int(keypress)][0], G, 1) 799 | #sw = show_weight(scc_result[int(keypress)][0]); 800 | #print '%s\t%s\t%s' % ('Starting point'.center(10),'End point'.center(10),'Weight'.center(10)) 801 | #for k in sw.keys(): 
802 | # print '%s\t%s\t%s' % (str(k[0]).center(10),str(k[1]).center(10),str(sw[k]).center(10)) 803 | else: 804 | print '\x1b[6;31;40m'+'[ERROR] Wrong command'+'\x1b[0m' 805 | #print 'Wrong command!' 806 | except: 807 | raw_input('\x1b[6;31;40m'+'Exception! Press any key to continue...'+'\x1b[0m') 808 | os.system('clear') # For Linux/OS X 809 | -------------------------------------------------------------------------------- /analyzer/change-jstack.py: -------------------------------------------------------------------------------- 1 | import sys 2 | 3 | f=open(sys.argv[1],'r') 4 | 5 | for l in f: 6 | if 'nid=0x' in l: 7 | tmp = int(l.split('nid=')[1].split()[0], 16) 8 | l=l.split() 9 | for i in l: 10 | if 'nid=0x' in i: 11 | print 'nid='+str(tmp), 12 | else: 13 | print i, 14 | else: 15 | print l, 16 | -------------------------------------------------------------------------------- /analyzer/knot.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | echo "wPerf knot presenter:" 4 | echo -n "Data directory:" 5 | read -r dir 6 | echo -n "Programming language[c or java, default c]:" 7 | read -r lang 8 | echo -n "Termination condition[default 0.2 (20% of total testing time)]:" 9 | read -r tc 10 | echo -n "Remove threshold[default 0.01 (1% of total testing time)]:" 11 | read -r re 12 | 13 | if [ "$dir" == "" ]; then 14 | echo "No input directory." 
15 | exit 16 | fi 17 | 18 | if [ "$lang" == "" ]; then 19 | lang="c" 20 | fi 21 | 22 | if [ "$tc" == "" ]; then 23 | tc=0.2 24 | fi 25 | 26 | if [ "$re" == "" ]; then 27 | re=0.01 28 | fi 29 | 30 | python bottleneck.py $dir $lang $tc $re 31 | -------------------------------------------------------------------------------- /analyzer/merge.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import networkx as nx 3 | import operator 4 | 5 | res = {} 6 | wakeup = {} 7 | pid = 0 8 | pid1 = 0 9 | lsim=float(sys.argv[2]) 10 | f=open(sys.argv[1],'r') 11 | stack='' 12 | for l in f: 13 | if 'sched:sched_switch:' in l or 'probe:try_to_wake_up:' in l: 14 | if pid != 0 and pid1 == 0: 15 | if pid not in res.keys(): 16 | res[pid] = {} 17 | if stack not in res[pid].keys(): 18 | res[pid][stack] = 0 19 | res[pid][stack] += 1 20 | elif pid !=0 and pid1 != 0: 21 | if (pid,pid1) not in wakeup.keys(): 22 | wakeup[(pid,pid1)] = {} 23 | if stack not in wakeup[(pid,pid1)].keys(): 24 | wakeup[(pid, pid1)][stack] = 0 25 | wakeup[(pid, pid1)][stack] += 1 26 | pid = int(l.split()[1]) 27 | if 'probe:try_to_wake_up:' in l and 'pid=' in l: 28 | pid1 = int(l.split('pid=')[1].split(' ')[0]) 29 | else: 30 | pid1 = 0 31 | stack = '' 32 | elif l != '\n' and 'unknown' not in l: 33 | stack += l 34 | 35 | merge={} 36 | for p1 in res.keys(): 37 | for p2 in res.keys(): 38 | if p1 == p2: 39 | continue 40 | else: 41 | keys1 = set(res[p1].keys()) 42 | keys2 = set(res[p2].keys()) 43 | keys = keys1 | keys2 44 | if len(keys) == 0: 45 | continue 46 | inter = 0 47 | union = 0 48 | for k in keys: 49 | a = 0 50 | b = 0 51 | if k not in res[p1].keys(): 52 | a = 0 53 | else: 54 | a = res[p1][k] 55 | #a = 1 56 | if k not in res[p2].keys(): 57 | b = 0 58 | else: 59 | b = res[p2][k] 60 | #b = 1 61 | inter += 2 * min(a, b) 62 | union += a + b 63 | merge[(p1,p2)] = inter * 1.0 / union 64 | 65 | edge=[] 66 | 67 | for k in merge.keys(): 68 | if merge[k] > lsim: 69 | 
edge.append(k) 70 | 71 | G = nx.Graph() 72 | G.add_edges_from(edge) 73 | result=sorted(nx.connected_components(G), key=len, reverse=True) 74 | for i in result: 75 | for e in i: 76 | print e, 77 | print '' 78 | -------------------------------------------------------------------------------- /analyzer/short.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import networkx as nx 3 | 4 | # Assumes a network bandwidth of 125 MB/s (1 Gbit/s) 5 | nband = 125000000 6 | 7 | class node: 8 | parent = 0 9 | pid = 0 10 | stime = 0 11 | etime = 0 12 | 13 | def printOut(self): 14 | print self.pid, self.parent, self.stime, self.etime 15 | 16 | cpuFreq = 0 17 | fcpu = open('cpufreq','r') 18 | for l in fcpu: 19 | cpuFreq = float(l.strip())*1e9 20 | fcpu.close() 21 | 22 | f=open('kmesg') 23 | 24 | nodes = {} 25 | 26 | starttime = 0 27 | endtime = 0 28 | 29 | dtime = 0.0 30 | ntime = 0.0 31 | 32 | send = {} 33 | sendTotal = 0 34 | 35 | for l in f: 36 | l = l[0] + '0000' + l[5:] 37 | if 'Create' in l: 38 | tmpstr = l.split('Thread')[1].split(' ') 39 | pid1 = int(tmpstr[1]) 40 | pid2 = int(tmpstr[4]) 41 | stime = int(tmpstr[-1]) 42 | nodes[pid2] = node() 43 | nodes[pid2].pid = pid2 44 | nodes[pid2].parent = pid1 45 | nodes[pid2].stime = stime 46 | elif 'Finish' in l: 47 | tmpstr = l.split('Thread')[1].split(' ') 48 | pid1 = int(tmpstr[1]) 49 | etime = int(tmpstr[-1]) 50 | if pid1 not in nodes.keys(): 51 | continue 52 | else: 53 | nodes[pid1].etime = etime 54 | elif 'Inside open' in l: 55 | starttime = int(l.split()[-1]) 56 | elif 'Inside close' in l: 57 | endtime = int(l.split()[-1]) 58 | elif 'Disk work time' in l: 59 | dtime = float(l.split()[-1])/1000.0 60 | elif 'sends' in l: 61 | tmp = l.strip().split() 62 | pid = int(tmp[2]) 63 | byte = int(tmp[-1]) 64 | send[pid] = byte 65 | sendTotal += byte 66 | 67 | # Network backtrace can be done here. 68 | # Disk backtrace should still be done in scc-pypy.
Because the information is still in the original waitfor file. 69 | totaltime = (endtime - starttime)/cpuFreq 70 | netrest = (1 - sendTotal/(nband * totaltime)) * totaltime 71 | diskrest = (1 - dtime/totaltime) * totaltime 72 | 73 | #parent and child's set 74 | pcset = {} 75 | 76 | for node in nodes.values(): 77 | if node.stime == 0 or node.etime == 0: 78 | continue 79 | if node.parent not in pcset.keys(): 80 | pcset[node.parent] = [] 81 | pcset[node.parent].append(node.pid) 82 | 83 | #print pcset 84 | #parent and child's final set 85 | pcfset = {} 86 | for k in pcset.keys(): 87 | edges = [] 88 | if len(pcset[k]) == 1: 89 | pcfset[k] = [] 90 | pcfset[k].append(list(pcset[k])) 91 | continue 92 | G = nx.Graph() 93 | for i in pcset[k]: 94 | for j in pcset[k]: 95 | if i == j: 96 | continue 97 | if nodes[i].stime > nodes[j].stime and nodes[i].stime < nodes[j].etime - 100000000: 98 | edges.append((i,j)) 99 | if len(edges) > 0: 100 | G.add_edges_from(edges) 101 | result=sorted(nx.connected_components(G), key=len, reverse=True) 102 | pcfset[k] = [] 103 | if (len(result) >= 1): 104 | for m in result: 105 | pcfset[k].append(list(sorted(m))) 106 | else: 107 | pcfset[k] = [] 108 | for kk in sorted(pcset[k]): 109 | tlist = [] 110 | tlist.append(kk) 111 | pcfset[k].append(tlist) 112 | 113 | #long_term threads 114 | lt_thread = [] 115 | 116 | for k in sorted(pcfset.keys()): 117 | islt = 0 118 | for ltotal in pcfset.values(): 119 | for l in ltotal: 120 | if k in l: 121 | islt = 1 122 | break 123 | if islt == 0: 124 | lt_thread.append(k) 125 | #print k, pcfset[k] 126 | 127 | #print lt_thread 128 | 129 | merge={} 130 | mresult={} 131 | 132 | def recursive_merge(k): 133 | maxlen = 0 134 | merge[k] = [] 135 | for v in pcfset[k]: 136 | maxlen = len(v) if len(v) > maxlen else maxlen 137 | for i in range(0,maxlen): 138 | merge[k].append([]) 139 | for i in pcfset[k]: 140 | for j in range(0, maxlen): 141 | if len(i) < j+1: 142 | continue 143 | else: 144 | merge[k][j].append(i[j]) 145 | 
mresult[k] = [] 146 | lenset = 0 147 | for i in range(0,maxlen): 148 | lenset = len(merge[k][i]) if merge[k][i] > lenset else lenset 149 | for i in range(0,lenset): 150 | mresult[k].append([]) 151 | for v in merge[k]: 152 | for i in range(0,lenset): 153 | mresult[k][i].append(v[i]) 154 | # only consider one level 155 | if v[0] not in pcfset.keys(): 156 | return 157 | vlist = [] 158 | vkey = v[0] 159 | merge[vkey] = [] 160 | for v in mresult[k]: 161 | for vv in v: 162 | for vvv in pcfset[vv]: 163 | for vvvv in vvv: 164 | vlist.append(vvvv) 165 | merge[vkey].append(vlist) 166 | 167 | for k in lt_thread: 168 | recursive_merge(k) 169 | #for k in merge.keys(): 170 | # print k, merge[k] 171 | 172 | #for k in sorted(pcfset.keys()): 173 | # finalk = k 174 | # for m in pcfset.keys(): 175 | # for n in pcfset[m]: 176 | # if k in n: 177 | # finalk = sorted(n)[0] 178 | # if finalk not in merge.keys(): 179 | # merge[finalk] = [] 180 | # for v in pcfset[k]: 181 | # if finalk not in nodes.keys(): 182 | # merge[finalk].append(v) 183 | # else: 184 | # merge[finalk].extend(v) 185 | 186 | tmpresult = {} 187 | 188 | for m in merge.keys(): 189 | for n in merge[m]: 190 | tmplist = [] 191 | tmplist.append((0,starttime)) 192 | for k in n: 193 | tmplist.append((nodes[k].stime,nodes[k].etime)) 194 | tmplist.append((endtime,0)) 195 | tmpvalue = 0 196 | for i in range(0,len(tmplist)-1): 197 | tmpvalue += tmplist[i+1][0] - tmplist[i][1] 198 | if (n[0],m) not in tmpresult.keys(): 199 | tmpresult[(n[0],m)] = 0 200 | tmpresult[(n[0],m)] += tmpvalue/cpuFreq 201 | 202 | mlist = {} 203 | 204 | for k in merge.keys(): 205 | for v in merge[k]: 206 | mlist[v[0]] =v 207 | 208 | # Add network backtrace 209 | for k in send.keys(): 210 | kpid = k 211 | for m in mlist.keys(): 212 | if kpid in mlist[m]: 213 | kpid = m 214 | break 215 | if (-4,kpid) not in tmpresult.keys(): 216 | tmpresult[(-4,kpid)] = 0 217 | tmpresult[(-4,kpid)] += send[k] * 1.0 / sendTotal * netrest 218 | 219 | # Add disk backtrace 220 | 
dthread={} 221 | diskTotal = 0.0 222 | f2=open('waitfor', 'r') 223 | for l in f2: 224 | l=l.split() 225 | if int(l[1]) == -5: 226 | dthread[int(l[0])] = float(l[2]) 227 | diskTotal+=float(l[2]) 228 | 229 | for k in dthread.keys(): 230 | kpid = k 231 | for m in mlist.keys(): 232 | if kpid in mlist[m]: 233 | kpid = m 234 | break 235 | if (-5,kpid) not in tmpresult.keys(): 236 | tmpresult[(-5,kpid)] = 0 237 | tmpresult[(-5,kpid)] += dthread[k] * 1.0 / diskTotal * diskrest 238 | 239 | f2.seek(0,0) 240 | for l in f2: 241 | l=l.strip().split(' ') 242 | t1 = int(l[0]) 243 | t2 = int(l[1]) 244 | val = float(l[2]) 245 | 246 | for m in mlist.keys(): 247 | if t1 in mlist[m]: 248 | t1 = m 249 | if t2 in mlist[m]: 250 | t2 = m 251 | if (t1,t2) not in tmpresult.keys(): 252 | tmpresult[(t1,t2)] = 0.0 253 | tmpresult[(t1,t2)] += val 254 | 255 | if len(sys.argv) > 1 and sys.argv[1] == "1": 256 | for k in merge.keys(): 257 | if len(merge[k]) == 1: 258 | continue 259 | else: 260 | for v in merge[k]: 261 | print v[0], 262 | print "" 263 | else: 264 | for k in tmpresult.keys(): 265 | print k[0], k[1], tmpresult[k] 266 | 267 | f2.close() 268 | f.close() 269 | -------------------------------------------------------------------------------- /analyzer/waitfor.csv: -------------------------------------------------------------------------------- 1 | source,target,value 2 | 1344,1344,43.09 3 | NIC,1344,9.60 4 | 1344,NIC,0.21 5 | -------------------------------------------------------------------------------- /annotation/c/main.c: -------------------------------------------------------------------------------- 1 | pthread_key_create(&fdkey, NULL); 2 | pthread_key_create(&bufferkey, NULL); 3 | pthread_key_create(&poskey, NULL); 4 | pthread_key_create(&tidkey, NULL); 5 | pthread_key_create(&lasttimekey, NULL); 6 | 7 | pthread_key_create(&fdkey, NULL); 8 | pthread_key_create(&bufferkey, NULL); 9 | pthread_key_create(&poskey, NULL); 10 | pthread_key_create(&tidkey, NULL); 11 | 
pthread_key_create(&lasttimekey, NULL); 12 | -------------------------------------------------------------------------------- /annotation/c/thread.c: -------------------------------------------------------------------------------- 1 | pthread_key_t fdkey; 2 | pthread_key_t bufferkey; 3 | pthread_key_t poskey; 4 | pthread_key_t lasttimekey; 5 | pthread_key_t tidkey; 6 | 7 | int pos = 0; 8 | char *tmp; 9 | unsigned long lastsync = 0; 10 | 11 | int fpid[300]; 12 | char* fbuffer[300]; 13 | 14 | typedef struct udsResult { 15 | unsigned long ts; 16 | int pid; 17 | long address; 18 | short type; 19 | } __attribute__((packed)) uds_res; 20 | 21 | static __inline__ unsigned long long tsctime(void) 22 | { 23 | unsigned hi, lo; 24 | __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi)); 25 | return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 ); 26 | } 27 | 28 | void uds_add(void* address, int type) { 29 | uds_res ur; 30 | long prev; 31 | char fname[30]; 32 | char *dest = NULL; 33 | char *tmp = NULL; 34 | 35 | int* fd; 36 | char *buffer = NULL; 37 | int* pos; 38 | int* tid; 39 | long* lasttime; 40 | 41 | fd = (int*)pthread_getspecific(fdkey); 42 | buffer = (char*)pthread_getspecific(bufferkey); 43 | pos = (int*)pthread_getspecific(poskey); 44 | tid = (int*)pthread_getspecific(tidkey); 45 | lasttime = (long*)pthread_getspecific(lasttimekey); 46 | if (fd == NULL) { 47 | tid = (int*)malloc(sizeof(int)); 48 | *tid = (int)syscall(SYS_gettid); 49 | sprintf(fname, "/tmp/wperf-%d", *tid); 50 | 51 | fd = (int*)malloc(sizeof(int)); 52 | *fd = open(fname, O_WRONLY|O_CREAT|O_TRUNC,0666); 53 | 54 | pos = (int*)malloc(sizeof(int)); 55 | *pos = 0; 56 | 57 | buffer = (char*)malloc(50*1024*1024*sizeof(char)); 58 | 59 | lasttime = (long*)malloc(sizeof(long)); 60 | *lasttime = 0; 61 | 62 | pthread_setspecific(fdkey, fd); 63 | pthread_setspecific(bufferkey, buffer); 64 | pthread_setspecific(poskey, pos); 65 | pthread_setspecific(lasttimekey, lasttime); 66 | pthread_setspecific(tidkey, 
tid); 67 | } 68 | 69 | ur.address = (long)address; 70 | ur.type = type; 71 | ur.ts = tsctime(); 72 | ur.pid = *tid; 73 | 74 | tmp = buffer + *pos; 75 | memcpy(tmp, &ur, sizeof(ur)); 76 | *pos+=sizeof(ur); 77 | 78 | if (*pos > 45*1024*1024 || ur.ts - *lasttime > 1e9) { 79 | write(*fd, buffer, *pos); 80 | *pos = 0; 81 | *lasttime = ur.ts; 82 | } 83 | 84 | return; 85 | } 86 | -------------------------------------------------------------------------------- /annotation/java/Makefile: -------------------------------------------------------------------------------- 1 | jdir := $(shell echo ${JAVA_HOME}) 2 | all: 3 | javac edu/osu/cse/ops/UDS.java 4 | javah edu.osu.cse.ops.UDS 5 | gcc -c kerntool.c -lpthread -I"${jdir}/include" -I"${jdir}/include/linux" -o kerntool.o -fPIC 6 | gcc -shared -lpthread -o libkerntool.so kerntool.o -fPIC 7 | jar -cfve uds.jar edu.osu.cse.ops.UDS edu/osu/cse/ops/*.class 8 | cp *.so /tmp/ 9 | cp *.jar /tmp/ 10 | clean: 11 | rm *.jar 12 | rm *.so 13 | rm *.o 14 | -------------------------------------------------------------------------------- /annotation/java/edu/osu/cse/ops/UDS.class: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/OSUSysLab/wPerf/d34d8f7f56c8edb0d89c102e86350e2f0f2703cd/annotation/java/edu/osu/cse/ops/UDS.class -------------------------------------------------------------------------------- /annotation/java/edu/osu/cse/ops/UDS.java: -------------------------------------------------------------------------------- 1 | package edu.osu.cse.ops; 2 | 3 | public class UDS { 4 | static { 5 | System.load("/tmp/libkerntool.so"); 6 | init(); 7 | } 8 | 9 | public native static void add(long address, int type); 10 | public native static void init(); 11 | 12 | public UDS() { 13 | } 14 | 15 | } 16 | -------------------------------------------------------------------------------- /annotation/java/edu/osu/cse/ops/UDSResult.class: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/OSUSysLab/wPerf/d34d8f7f56c8edb0d89c102e86350e2f0f2703cd/annotation/java/edu/osu/cse/ops/UDSResult.class -------------------------------------------------------------------------------- /annotation/java/edu/osu/cse/ops/UDSResult.java: -------------------------------------------------------------------------------- 1 | package edu.osu.cse.ops; 2 | 3 | import java.io.Serializable; 4 | 5 | public class UDSResult implements Serializable{ 6 | long ts; 7 | int pid; 8 | int type; 9 | public UDSResult(long ts, int pid, int type) { 10 | this.ts = ts; 11 | this.pid = pid; 12 | this.type = type; 13 | } 14 | 15 | public void printOut() { 16 | System.out.printf("[Fang-UDS] Time %d ThreadID %d Type %d\n", ts, pid, type); 17 | } 18 | 19 | public long getTs() { 20 | return ts; 21 | } 22 | 23 | public void setTs(long ts) { 24 | this.ts = ts; 25 | } 26 | 27 | public int getPid() { 28 | return pid; 29 | } 30 | 31 | public void setPid(int pid) { 32 | this.pid = pid; 33 | } 34 | 35 | public int getType() { 36 | return type; 37 | } 38 | 39 | public void setType(int type) { 40 | this.type = type; 41 | } 42 | } 43 | -------------------------------------------------------------------------------- /annotation/java/edu_osu_cse_ops_UDS.h: -------------------------------------------------------------------------------- 1 | /* DO NOT EDIT THIS FILE - it is machine generated */ 2 | #include <jni.h> 3 | /* Header for class edu_osu_cse_ops_UDS */ 4 | 5 | #ifndef _Included_edu_osu_cse_ops_UDS 6 | #define _Included_edu_osu_cse_ops_UDS 7 | #ifdef __cplusplus 8 | extern "C" { 9 | #endif 10 | /* 11 | * Class: edu_osu_cse_ops_UDS 12 | * Method: add 13 | * Signature: (JI)V 14 | */ 15 | JNIEXPORT void JNICALL Java_edu_osu_cse_ops_UDS_add 16 | (JNIEnv *, jclass, jlong, jint); 17 | 18 | /* 19 | * Class: edu_osu_cse_ops_UDS 20 | * Method: init 21 | * Signature: ()V 22 | */ 23 | JNIEXPORT void JNICALL 
Java_edu_osu_cse_ops_UDS_init 24 | (JNIEnv *, jclass); 25 | 26 | #ifdef __cplusplus 27 | } 28 | #endif 29 | #endif 30 | -------------------------------------------------------------------------------- /annotation/java/kerntool.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | 15 | #include 16 | 17 | #include "edu_osu_cse_ops_UDS.h" 18 | #include "kerntool.h" 19 | 20 | 21 | int fd; 22 | pthread_key_t fdkey; 23 | pthread_key_t bufferkey; 24 | pthread_key_t poskey; 25 | pthread_key_t lasttimekey; 26 | pthread_key_t tidkey; 27 | 28 | static __inline__ unsigned long long fangtime(void) 29 | { 30 | unsigned hi, lo; 31 | __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi)); 32 | return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 ); 33 | } 34 | 35 | JNIEXPORT void JNICALL Java_edu_osu_cse_ops_UDS_init (JNIEnv *env, jclass cls) { 36 | pthread_key_create(&fdkey, NULL); 37 | pthread_key_create(&bufferkey, NULL); 38 | pthread_key_create(&poskey, NULL); 39 | pthread_key_create(&tidkey, NULL); 40 | pthread_key_create(&lasttimekey, NULL); 41 | } 42 | 43 | 44 | JNIEXPORT void JNICALL Java_edu_osu_cse_ops_UDS_add (JNIEnv *env, jclass cls, jlong address, jint type) { 45 | uds_res ur; 46 | long prev; 47 | char fname[30]; 48 | char *dest = NULL; 49 | char *tmp = NULL; 50 | 51 | int* fd; 52 | char *buffer = NULL; 53 | int* pos; 54 | int* tid; 55 | long* lasttime; 56 | 57 | fd = (int*)pthread_getspecific(fdkey); 58 | buffer = (char*)pthread_getspecific(bufferkey); 59 | pos = (int*)pthread_getspecific(poskey); 60 | tid = (int*)pthread_getspecific(tidkey); 61 | lasttime = (long*)pthread_getspecific(lasttimekey); 62 | if (fd == NULL) { 63 | tid = (int*)malloc(sizeof(int)); 64 | *tid = (int)syscall(SYS_gettid); 65 | sprintf(fname, "/tmp/wperf-%d", *tid); 66 | 67 | fd = 
(int*)malloc(sizeof(int)); 68 | *fd = open(fname, O_WRONLY|O_CREAT|O_TRUNC,0666); 69 | 70 | pos = (int*)malloc(sizeof(int)); 71 | *pos = 0; 72 | 73 | buffer = (char*)malloc(50*1024*1024*sizeof(char)); 74 | 75 | lasttime = (long*)malloc(sizeof(long)); 76 | *lasttime = 0; 77 | 78 | pthread_setspecific(fdkey, fd); 79 | pthread_setspecific(bufferkey, buffer); 80 | pthread_setspecific(poskey, pos); 81 | pthread_setspecific(lasttimekey, lasttime); 82 | pthread_setspecific(tidkey, tid); 83 | } 84 | 85 | ur.address = (long)address; 86 | ur.type = type; 87 | ur.ts = fangtime(); 88 | ur.pid = *tid; 89 | 90 | write(*fd, &ur, sizeof(uds_res)); 91 | 92 | return; 93 | } 94 | -------------------------------------------------------------------------------- /annotation/java/kerntool.h: -------------------------------------------------------------------------------- 1 | #define IOC_MAGIC 'f' 2 | #define IOCTL_ADDUDS _IOWR(IOC_MAGIC, 20, unsigned long) 3 | 4 | typedef struct udsResult { 5 | unsigned long ts; 6 | int pid; 7 | long address; 8 | short type; 9 | } __attribute__((packed)) uds_res; 10 | -------------------------------------------------------------------------------- /module/1prepare-java.sh: -------------------------------------------------------------------------------- 1 | rm /tmp/target 2 | rm /tmp/perf_target 3 | 4 | pid=$(sudo jps | grep $1 | awk '{print $1}') 5 | ps -T -p ${pid} | awk '{if (NR>1) print $2}' > /tmp/target 6 | echo ${pid} >> /tmp/perf_target 7 | 8 | #echo $pid > /tmp/target 9 | 10 | for i in $2 $3 $4 $5 $6 $7 11 | do 12 | pgrep $i >> /tmp/target 13 | pgrep $i >> /tmp/perf_target 14 | done 15 | 16 | pgrep ksoftirq >> /tmp/target 17 | pgrep kworker >> /tmp/target 18 | 19 | pgrep ksoftirq >> /tmp/perf_target 20 | pgrep kworker >> /tmp/perf_target 21 | 22 | make 23 | exist=$(sudo lsmod | grep ioctl_perf | awk '{print $1}') 24 | if [ ! "$exist" ] 25 | then 26 | sudo insmod ioctl_perf.ko 27 | if [ ! 
-e /dev/wperf ] 28 | then 29 | sudo mknod /dev/wperf c 239 0 30 | fi 31 | sudo chmod 666 /dev/wperf 32 | else 33 | echo "Module has been installed." 34 | fi 35 | -------------------------------------------------------------------------------- /module/1prepare.sh: -------------------------------------------------------------------------------- 1 | sudo rm /tmp/target 2 | sudo rm /tmp/perf_target 3 | 4 | for i in $1 $2 $3 $4 $5 5 | do 6 | pgrep $i -w >> /tmp/target 7 | pgrep $i >> /tmp/perf_target 8 | done 9 | 10 | pgrep ksoftirq >> /tmp/target 11 | pgrep kworker >> /tmp/target 12 | 13 | pgrep ksoftirq >> /tmp/perf_target 14 | pgrep kworker >> /tmp/perf_target 15 | 16 | exist=$(sudo lsmod | grep ioctl_perf | awk '{print $1}') 17 | if [ ! "$exist" ] 18 | then 19 | make 20 | sudo insmod ioctl_perf.ko 21 | sudo mknod /dev/wperf c 239 0 22 | sudo chmod 666 /dev/wperf 23 | else 24 | echo "Module has been installed." 25 | fi 26 | -------------------------------------------------------------------------------- /module/Makefile: -------------------------------------------------------------------------------- 1 | obj-m = ioctl_perf.o 2 | 3 | NEWKERNEL := $(shell uname -r | awk -F"." 
'{if (int($$1) > 4 || (int($$1) == 4 && int($$2) > 10)) print "1"}') 4 | KVERSION = $(shell uname -r) 5 | 6 | ifdef NEWKERNEL 7 | ccflags-y := -DNEWKERNEL -Wno-unused-function -Wno-unused-variable -Wframe-larger-than=16782 8 | else 9 | ccflags-y := -Wno-unused-function -Wno-unused-variable -Wframe-larger-than=16782 10 | endif 11 | 12 | all: 13 | make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules 14 | clean: 15 | make -C /lib/modules/$(KVERSION)/build M=$(PWD) clean 16 | -------------------------------------------------------------------------------- /module/ioctl_perf.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include /* for totalram_pages */ 5 | #include 6 | #include 7 | #include 8 | 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | #include // required for various structures related to files, like fops. 18 | #include 19 | #include 20 | #include 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include "ioctl_perf.h" 26 | 27 | #define BMAXBYTE 500 28 | #define BSIZEBYTE (BMAXBYTE-50) 29 | #define BMAX (BMAXBYTE * 1024 * 1024) 30 | #define BSIZE (BSIZEBYTE * 1024 * 1024) 31 | 32 | static int Major; 33 | volatile int tag = -1; 34 | volatile int step = -1; 35 | 36 | struct task_struct *p = NULL; 37 | 38 | unsigned long pid = -1; 39 | unsigned long gdbpid = -1; 40 | int ret; 41 | 42 | struct task_struct *ptmp; 43 | struct task_struct *ttmp; 44 | struct task_struct *t; 45 | 46 | char *switch_result; 47 | char *switch_result_bak; 48 | char *switch_result_tmp; 49 | char *futex_result; 50 | char *futex_result_bak; 51 | char *futex_result_tmp; 52 | char *state_result; 53 | char *state_result_bak; 54 | char *state_result_tmp; 55 | 56 | char *wait_result; 57 | char *wait_result_bak; 58 | char *wait_result_tmp; 59 | 60 | //Newly added to handle missing events 61 | volatile int ipnum = 0; 62 | volatile int fnum = 0; 63 | volatile 
int wnum = 0; 64 | long switch_pos = 0; 65 | long state_pos = 0; 66 | long futex_pos = 0; 67 | long wait_pos = 0; 68 | int state_total = 0; 69 | int futex_total = 0; 70 | long tmpip; 71 | 72 | //For softirq events 73 | u64 stime[32]; 74 | 75 | spinlock_t switch_lock; 76 | spinlock_t state_lock; 77 | spinlock_t futex_lock; 78 | spinlock_t wait_lock; 79 | 80 | struct wperf_struct *perf_struct; 81 | 82 | struct task_struct *(*fang_curr_task)(int); 83 | int (*fang_get_futex_key)(u32 __user *, int, union futex_key *, int); 84 | struct futex_hash_bucket *(*fang_hash_futex)(union futex_key *); 85 | static const struct futex_q futex_q_init = { 86 | /* list gets initialized in queue_me()*/ 87 | .key = FUTEX_KEY_INIT, 88 | .bitset = FUTEX_BITSET_MATCH_ANY 89 | }; 90 | int (*fang_futex_wait_setup)(u32 __user *, u32, unsigned int, struct futex_q *, struct futex_hash_bucket **); 91 | long (fang_futex_wait_restart)(struct restart_block *); 92 | //For kernel version larger than 4.8 93 | //#ifdef NEWKERNEL 94 | //u64 (*local_clock)(void); 95 | //#endif 96 | 97 | long firstsec, firstusec, lastsec, lastusec; 98 | 99 | //Fang newly added 100 | volatile int softirq[NR_CPUS]; 101 | volatile int softirq1[NR_CPUS]; 102 | volatile short hardirq[NR_CPUS][600]; 103 | volatile short hardirq_pos[NR_CPUS]; 104 | volatile int irq[NR_CPUS]; 105 | volatile int target[PIDNUM]; 106 | volatile int aiotag[NR_CPUS]; 107 | long netsend[65535]; 108 | //long futextag[65535]; 109 | //long futexsum[65535]; 110 | //long futexstart[65535]; 111 | struct fang_result spin_result[NR_CPUS]; 112 | 113 | //Only for test 114 | //volatile short dump_check[100000]; 115 | //volatile short dump_cpu_check[32]; 116 | 117 | volatile long futex_to = 0; 118 | volatile long futex_noto = 0; 119 | 120 | //static long disk_work = 0; 121 | 122 | int dnum; 123 | char dname[20][20]; 124 | long disk_work[20]; 125 | 126 | static __inline__ unsigned long long tsc_now(void) 127 | { 128 | unsigned hi, lo; 129 | __asm__ __volatile__ 
("rdtsc" : "=a"(lo), "=d"(hi)); 130 | return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 ); 131 | } 132 | 133 | inline u64 fang_clock(void) { 134 | struct timespec64 now; 135 | u64 ret = 0; 136 | 137 | //getnstimeofday64(&now); 138 | //get_monotonic_boottime(&now); 139 | 140 | //ret = now.tv_sec*1000000000+now.tv_nsec; 141 | 142 | ret = tsc_now(); 143 | //printk(KERN_INFO "current time = %lu\n", ret); 144 | return ret; 145 | } 146 | 147 | inline u64 wperfclock(void) { 148 | //struct timespec64 now; 149 | //u64 ret = 0; 150 | 151 | //getrawmonotonic(&now); 152 | //ret = now.tv_sec*1000000000+now.tv_nsec; 153 | 154 | //return ret; 155 | return local_clock(); 156 | } 157 | 158 | //We'll check target ID & jbd thread & ksoftirq threads 159 | int containOfPID(int core, int pid) { 160 | int i = 0; 161 | for (i = 0; i < PIDNUM; i++) { 162 | if (target[i] == 0) break; 163 | if (target[i] == pid) 164 | return 1; 165 | } 166 | //if (target[0] == pid) { 167 | // return 1; 168 | //} 169 | //if (target[1] == pid) { 170 | // return 1; 171 | //} 172 | //if (target[2+core] == pid) { 173 | // return 1; 174 | //} 175 | return 0; 176 | } 177 | 178 | __visible __notrace_funcgraph struct task_struct * j___switch_to(struct task_struct *prev_p, struct task_struct *next_p) { 179 | struct timeval start; 180 | struct fang_result fresult; 181 | struct futex_result furesult; 182 | char *chartmp; 183 | int cpu = 0; 184 | u64 ts; 185 | 186 | if (step == 1) { 187 | cpu = get_cpu(); 188 | if (containOfPID(cpu, prev_p->tgid)||containOfPID(cpu, next_p->tgid)) { 189 | ts = fang_clock(); 190 | fresult.type = 0; 191 | fresult.ts = ts; 192 | fresult.perfts = wperfclock(); 193 | fresult.core = cpu; 194 | fresult.pid1 = prev_p->pid; 195 | fresult.pid2 = next_p->pid; 196 | 197 | if (in_irq()) { 198 | //fresult.irq = hardirq[cpu][hardirq_pos[cpu]]; 199 | fresult.irq = HARDIRQ; 200 | } 201 | else if (in_serving_softirq()) { 202 | fresult.irq = softirq[cpu]; 203 | } 204 | else { 205 | fresult.irq 
= 0; 206 | } 207 | fresult.pid1state = prev_p->state; 208 | fresult.pid2state = next_p->state; 209 | 210 | //if (prev_p->pid == 18183) { 211 | // printk(KERN_INFO "pid1 = %d, pid2 = %d, pid1state = %d, pid2state = %d\n", prev_p->pid, next_p->pid, prev_p->state, next_p->state); 212 | //} 213 | 214 | spin_lock(&switch_lock); 215 | chartmp = (char *)(switch_result + switch_pos); 216 | memcpy(chartmp, &fresult, sizeof(fresult)); 217 | switch_pos+=sizeof(fresult); 218 | spin_unlock(&switch_lock); 219 | 220 | //if (futextag[next_p->pid] == 1) { 221 | // futexstart[next_p->pid] = ts; 222 | // //printk(KERN_INFO "pid=%d, ts=%ld\n", next_p->pid, futexstart[prev_p->pid]); 223 | //} 224 | //if (futextag[prev_p->pid] == 1 && futexstart[prev_p->pid] > 0) { 225 | // futexsum[prev_p->pid] += ts - futexstart[prev_p->pid]; 226 | 227 | // if (prev_p->state>0) { 228 | // spin_lock(&state_lock); 229 | // chartmp = (char *)(state_result + state_pos); 230 | // furesult.pid = prev_p->pid; 231 | // furesult.time = futexsum[prev_p->pid]; 232 | // memcpy(chartmp, &furesult, sizeof(furesult)); 233 | // state_pos+=sizeof(furesult); 234 | // futexsum[prev_p->pid] = 0; 235 | // futextag[prev_p->pid] = 0; 236 | // futexstart[prev_p->pid] = 0; 237 | // spin_unlock(&state_lock); 238 | // } 239 | //} 240 | } 241 | } 242 | jprobe_return(); 243 | return NULL; 244 | } 245 | 246 | /* We need this jprobe by both steps 247 | * First, we need to record the time entry in the first step. 248 | * Second, we need to record the last state for GDB stack. 
249 | * */ 250 | 251 | static int j_try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags) { 252 | char tmp[200]; 253 | struct timeval start; 254 | struct fang_result fresult; 255 | char *chartmp = NULL; 256 | int cpu = 0; 257 | int i = 0; 258 | u64 ts; 259 | 260 | if (step == 1) { 261 | //if (p->pid==p->tgid+8) { 262 | // printk(KERN_INFO "Thread %d wakes up IO thread\n", current->pid); 263 | // dump_stack(); 264 | // if (aiotag[cpu] == 1) { 265 | // aiotag = 0; 266 | // printk(KERN_INFO "Thread %d is waken up by aio_complete\n", p->pid); 267 | // } 268 | //} 269 | //if (p->tgid == target[1]) { 270 | // if (in_serving_softirq()) { 271 | // printk(KERN_INFO "It's in softirq context\n"); 272 | // } 273 | // else { 274 | // printk(KERN_INFO "It's NOT in softirq context\n"); 275 | // dump_stack(); 276 | // printk(KERN_INFO "------------------------\n"); 277 | // } 278 | //} 279 | //if (current->pid == 4145 && p->pid == 4144) dump_stack(); 280 | //if (p->pid == 4145) dump_stack(); 281 | cpu = get_cpu(); 282 | //if (containOfPID(cpu, p->tgid) ) { 283 | //if (current->pid == 2381) dump_stack(); 284 | if (containOfPID(cpu, p->tgid)) { 285 | // Check if in hardirq context 286 | //if (in_irq() && hardirq_pos[cpu]<=0) { 287 | // printk(KERN_INFO "[cpu %d][hardirq_pos = %d] hardirq missing, %d wakes up %d\n", cpu, hardirq_pos[cpu], current->pid, p->pid); 288 | // dump_stack(); 289 | //} 290 | //else if (in_serving_softirq() && softirq1[cpu] == 0) { 291 | // printk(KERN_INFO "softirq missing, %d wakes up %d\n", current->pid, p->pid); 292 | // dump_stack(); 293 | //} 294 | 295 | //if (current->pid == target[cpu+2] && p->pid == 376) { 296 | // dump_stack(); 297 | //} 298 | 299 | if ((p->state & state)) { 300 | //if (current->pid != 0 && current->tgid != target[0] && p->pid == 1935) { 301 | //printk(KERN_INFO "%d [parent %d] wakes up %d\n", current->pid, current->tgid, p->pid); 302 | //} 303 | //if (aiotag[cpu] == 1) { 304 | // aiotag[p->pid] = 1; 305 | // 
aiotag[cpu] = 0; 306 | //} 307 | 308 | ts = fang_clock(); 309 | 310 | fresult.type = 1; 311 | fresult.ts = ts; 312 | fresult.perfts = wperfclock(); 313 | fresult.core = cpu; 314 | fresult.pid1 = current->pid; 315 | fresult.pid2 = p->pid; 316 | if (in_irq()) { 317 | fresult.irq = HARDIRQ; 318 | //dump_check[p->pid] ++; 319 | //if (dump_check[p->pid] == 10) { 320 | // dump_cpu_check[cpu] = p->pid; 321 | // dump_check[p->pid]=0; 322 | //} 323 | } 324 | else if (in_serving_softirq()) { 325 | fresult.irq = softirq[cpu]; 326 | //if (softirq[cpu] == -2 && p->pid == 4145) dump_stack(); 327 | } 328 | else { 329 | fresult.irq = 0; 330 | // Added for futex Result 331 | //futextag[p->pid] = 1; 332 | } 333 | 334 | fresult.pid1state = current->state; 335 | fresult.pid2state = p->state; 336 | 337 | spin_lock(&switch_lock); 338 | chartmp = (char *)(switch_result + switch_pos); 339 | memcpy(chartmp, &fresult, sizeof(fresult)); 340 | switch_pos+=sizeof(fresult); 341 | spin_unlock(&switch_lock); 342 | } 343 | } 344 | } 345 | 346 | jprobe_return(); 347 | return 0; 348 | } 349 | 350 | static void j_tasklet_hi_action(struct softirq_action *a) { 351 | int cpu = 0; 352 | if (step == 1) { 353 | cpu = get_cpu(); 354 | softirq[cpu] = TASKLET_HI; 355 | } 356 | jprobe_return(); 357 | } 358 | 359 | static void j_run_timer_softirq(struct softirq_action *a) { 360 | int cpu = 0; 361 | if (step == 1) { 362 | cpu = get_cpu(); 363 | softirq[cpu] = TIMER; 364 | } 365 | jprobe_return(); 366 | } 367 | 368 | static void j_net_tx_action(struct softirq_action *a) { 369 | int cpu = 0; 370 | if (step == 1) { 371 | cpu = get_cpu(); 372 | softirq[cpu] = NET_TX; 373 | } 374 | jprobe_return(); 375 | } 376 | 377 | static void j_net_rx_action(struct softirq_action *a) { 378 | int cpu = 0; 379 | if (step == 1) { 380 | cpu = get_cpu(); 381 | softirq[cpu] = NET_RX; 382 | //if (cpu == 7) { 383 | // printk(KERN_INFO "[%d]softirq[cpu] = %d\n", cpu, softirq[cpu]); 384 | //} 385 | } 386 | jprobe_return(); 387 | } 388 | 
389 | static void j_blk_done_softirq(struct softirq_action *a) { 390 | int cpu = 0; 391 | if (step == 1) { 392 | cpu = get_cpu(); 393 | softirq[cpu] = BLK_DONE; 394 | } 395 | jprobe_return(); 396 | } 397 | 398 | static void j_blk_iopoll_softirq(struct softirq_action *a) { 399 | int cpu = 0; 400 | if (step == 1) { 401 | cpu = get_cpu(); 402 | softirq[cpu] = BLK_IOPOLL; 403 | } 404 | jprobe_return(); 405 | } 406 | 407 | static void j_tasklet_action(struct softirq_action *a) { 408 | int cpu = 0; 409 | if (step == 1) { 410 | cpu = get_cpu(); 411 | softirq[cpu] = TASKLET; 412 | } 413 | jprobe_return(); 414 | } 415 | 416 | static void j_run_rebalance_domains(struct softirq_action *a) { 417 | int cpu = 0; 418 | if (step == 1) { 419 | cpu = get_cpu(); 420 | softirq[cpu] = SCHED; 421 | } 422 | jprobe_return(); 423 | } 424 | 425 | static void j_run_hrtimer_softirq(struct softirq_action *a) { 426 | int cpu = 0; 427 | if (step == 1) { 428 | cpu = get_cpu(); 429 | softirq[cpu] = HRTIMER; 430 | } 431 | jprobe_return(); 432 | } 433 | 434 | static void j_rcu_process_callbacks(struct softirq_action *a) { 435 | int cpu = 0; 436 | if (step == 1) { 437 | cpu = get_cpu(); 438 | softirq[cpu] = RCU; 439 | } 440 | jprobe_return(); 441 | } 442 | 443 | //void j_hrtimer_interrupt(void) { 444 | void j_local_apic_timer_interrupt(void) { 445 | int cpu = 0; 446 | int vector = 0; 447 | if (step == 1) { 448 | // Get information for do_IRQ 449 | vector = APICTIMER; 450 | cpu = get_cpu(); 451 | 452 | hardirq[cpu][hardirq_pos[cpu]] = vector; 453 | hardirq_pos[cpu]+=1; 454 | //printk(KERN_INFO "[%d] local_apic_timer starts\n", cpu); 455 | //printk(KERN_INFO "hardirq[%d] = %d, hardirq_pos[%d] = %d\n", cpu, hardirq[cpu][hardirq_pos[cpu]], cpu, hardirq_pos[cpu]); 456 | } 457 | jprobe_return(); 458 | } 459 | 460 | unsigned int j_do_IRQ(struct pt_regs *regs) { 461 | int cpu = 0; 462 | int vector = 0; 463 | if (step == 1) { 464 | // Get information for do_IRQ 465 | vector = ~regs->orig_ax; 466 | cpu = 
get_cpu(); 467 | 468 | hardirq[cpu][hardirq_pos[cpu]] = vector; 469 | hardirq_pos[cpu]+=1; 470 | //printk(KERN_INFO "[%d] local_apic_timer starts\n", cpu); 471 | //printk(KERN_INFO "hardirq[%d] = %d, hardirq_pos[%d] = %d\n", cpu, hardirq[cpu][hardirq_pos[cpu]], cpu, hardirq_pos[cpu]); 472 | } 473 | jprobe_return(); 474 | return 0; 475 | } 476 | 477 | static void j_aio_complete(struct kiocb *kiocb, long res, long res2) { 478 | int cpu = get_cpu(); 479 | aiotag[cpu] = 1; 480 | jprobe_return(); 481 | return; 482 | } 483 | 484 | static void j_wakeup_softirqd(void) { 485 | int cpu = 0; 486 | if (step == 1) { 487 | cpu = get_cpu(); 488 | softirq[cpu] = KSOFTIRQ; 489 | } 490 | jprobe_return(); 491 | } 492 | 493 | static struct jprobe jp1 = { 494 | .entry = j___switch_to, 495 | .kp = { 496 | .symbol_name = "__switch_to", 497 | }, 498 | }; 499 | 500 | static struct jprobe jp2 = { 501 | .entry = j_try_to_wake_up, 502 | .kp = { 503 | .symbol_name = "try_to_wake_up", 504 | }, 505 | }; 506 | 507 | static struct jprobe jp3 = { 508 | .entry = j_tasklet_hi_action, 509 | .kp = { 510 | .symbol_name = "tasklet_hi_action", 511 | }, 512 | }; 513 | 514 | static struct jprobe jp4 = { 515 | .entry = j_run_timer_softirq, 516 | .kp = { 517 | .symbol_name = "run_timer_softirq", 518 | }, 519 | }; 520 | 521 | static struct jprobe jp5 = { 522 | .entry = j_net_tx_action, 523 | .kp = { 524 | .symbol_name = "net_tx_action", 525 | }, 526 | }; 527 | 528 | static struct jprobe jp6 = { 529 | .entry = j_net_rx_action, 530 | .kp = { 531 | .symbol_name = "net_rx_action", 532 | }, 533 | }; 534 | 535 | static struct jprobe jp7 = { 536 | .entry = j_blk_done_softirq, 537 | .kp = { 538 | .symbol_name = "blk_done_softirq", 539 | }, 540 | }; 541 | 542 | static struct jprobe jp8 = { 543 | .entry = j_blk_iopoll_softirq, 544 | .kp = { 545 | .symbol_name = "irq_poll_softirq", 546 | }, 547 | }; 548 | 549 | static struct jprobe jp9 = { 550 | .entry = j_tasklet_action, 551 | .kp = { 552 | .symbol_name = 
"tasklet_action", 553 | }, 554 | }; 555 | 556 | static struct jprobe jp10 = { 557 | .entry = j_run_rebalance_domains, 558 | .kp = { 559 | .symbol_name = "run_rebalance_domains", 560 | }, 561 | }; 562 | 563 | static struct jprobe jp11 = { 564 | .entry = j_run_hrtimer_softirq, 565 | .kp = { 566 | .symbol_name = "run_hrtimer_softirq", 567 | }, 568 | }; 569 | 570 | static struct jprobe jp12 = { 571 | .entry = j_rcu_process_callbacks, 572 | .kp = { 573 | .symbol_name = "rcu_process_callbacks", 574 | }, 575 | }; 576 | 577 | // For Local APIC Timer Interrupt 578 | static struct jprobe jp13 = { 579 | .entry = j_local_apic_timer_interrupt, 580 | .kp = { 581 | .symbol_name = "local_apic_timer_interrupt", 582 | }, 583 | }; 584 | 585 | // For HardIRQ 586 | static struct jprobe jp14 = { 587 | .entry = j_do_IRQ, 588 | .kp = { 589 | .symbol_name = "do_IRQ", 590 | }, 591 | }; 592 | 593 | //static struct jprobe jp15 = { 594 | // .entry = j_aio_complete, 595 | // .kp = { 596 | // .symbol_name = "aio_complete", 597 | // }, 598 | //}; 599 | // For ksoftirq waking up 600 | //static struct jprobe jp15 = { 601 | // .entry = j_wakeup_softirqd, 602 | // .kp = { 603 | // .symbol_name = "wakeup_softirqd", 604 | // }, 605 | //}; 606 | 607 | 608 | // Add it temporarily for do_softirq 609 | void j___do_softirq(void) { 610 | int cpu = 0; 611 | u64 ts; 612 | char *chartmp; 613 | struct softirq_result sr; 614 | if (step == 1) { 615 | cpu = get_cpu(); 616 | ts = fang_clock(); 617 | spin_lock(&futex_lock); 618 | stime[cpu] = ts; 619 | spin_unlock(&futex_lock); 620 | } 621 | jprobe_return(); 622 | } 623 | 624 | static struct jprobe jp16 = { 625 | .entry = j___do_softirq, 626 | .kp = { 627 | .symbol_name = "__do_softirq", 628 | }, 629 | }; 630 | 631 | void j_add_interrupt_randomness(int irq, int irq_flags) { 632 | int cpu = 0; 633 | int vector = 0; 634 | if (step == 1) { 635 | cpu = get_cpu(); 636 | 637 | hardirq[cpu][hardirq_pos[cpu]] = RANDOM; 638 | hardirq_pos[cpu]+=1; 639 | //printk(KERN_INFO 
"[%d] local_apic_timer starts\n", cpu); 640 | //printk(KERN_INFO "hardirq[%d] = %d, hardirq_pos[%d] = %d\n", cpu, hardirq[cpu][hardirq_pos[cpu]], cpu, hardirq_pos[cpu]); 641 | } 642 | jprobe_return(); 643 | return; 644 | } 645 | 646 | // Special case for add_interrupt_randomness 647 | static struct jprobe jp17 = { 648 | .entry = j_add_interrupt_randomness, 649 | .kp = { 650 | .symbol_name = "add_interrupt_randomness", 651 | }, 652 | }; 653 | 654 | 655 | inline int compareDisk(const char *dname_tmp) { 656 | int i = 0; 657 | for (; i < dnum; i++) { 658 | if (strcmp(dname_tmp, dname[i]) == 0) { 659 | return i; 660 | } 661 | } 662 | return -1; 663 | } 664 | 665 | #ifdef NEWKERNEL 666 | static void j_part_round_stats(struct request_queue *q, int cpu, struct hd_struct *part) { 667 | jprobe_return(); 668 | } 669 | #else 670 | static void old_part_round_stats_single(struct hd_struct *part, unsigned long now) { 671 | int inflight; 672 | const char *dname_tmp = NULL; 673 | struct device *ddev = NULL; 674 | int pos = 0; 675 | 676 | ddev = part_to_dev(part); 677 | dname_tmp = dev_name(ddev); 678 | 679 | pos = compareDisk(dname_tmp); 680 | 681 | if (pos >= 0) { 682 | if (now == part->stamp) { 683 | } 684 | else { 685 | inflight = part_in_flight(part); 686 | if (inflight) { 687 | disk_work[pos] += now - part->stamp; 688 | } 689 | } 690 | } 691 | } 692 | 693 | static void j_part_round_stats(int cpu, struct hd_struct *part) { 694 | unsigned long now = jiffies; 695 | 696 | if (step == 1) { 697 | if (part->partno) { 698 | old_part_round_stats_single(&part_to_disk(part)->part0, now); 699 | } 700 | old_part_round_stats_single(part, now); 701 | } 702 | jprobe_return(); 703 | } 704 | #endif 705 | /* 706 | static void j_part_round_stats_single(int cpu, struct hd_struct *part, unsigned long now) { 707 | int inflight; 708 | const char *dname_tmp = NULL; 709 | struct device *ddev = NULL; 710 | int pos = 0; 711 | 712 | if (step == 1) { 713 | ddev = part_to_dev(part); 714 | dname_tmp = 
dev_name(ddev); 715 | 716 | pos = compareDisk(dname_tmp); 717 | if (pos >= 0) { 718 | if (now == part->stamp) { 719 | } 720 | else { 721 | #ifndef NEWKERNEL 722 | inflight = part_in_flight(part); 723 | #endif 724 | if (inflight) { 725 | disk_work[pos] += now - part->stamp; 726 | //spin_lock(&futex_lock); 727 | //sr.type = 1; 728 | //sr.stime = part->stamp; 729 | //sr.etime = now; 730 | //sr.core = cpu; 731 | //chartmp = (char *)(futex_result + futex_pos); 732 | //memcpy(chartmp, &sr, sizeof(sr)); 733 | //futex_pos+=sizeof(sr); 734 | //spin_unlock(&futex_lock); 735 | } 736 | } 737 | } 738 | } 739 | jprobe_return(); 740 | } 741 | */ 742 | static struct jprobe jp18 = { 743 | .entry = j_part_round_stats, 744 | .kp = { 745 | .symbol_name = "part_round_stats", 746 | }, 747 | }; 748 | 749 | static void j_futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q *q, 750 | struct hrtimer_sleeper *timeout) { 751 | struct fang_uds fuds; 752 | char *chartmp = NULL; 753 | if (step == 1) { 754 | if (current->tgid==target[0]) { 755 | fuds.ts = fang_clock(); 756 | fuds.pid = current->pid; 757 | fuds.type = 1; 758 | 759 | spin_lock(&wait_lock); 760 | chartmp = (char *)(wait_result + wait_pos); 761 | memcpy(chartmp, &fuds, sizeof(fuds)); 762 | wait_pos+=sizeof(fuds); 763 | spin_unlock(&wait_lock); 764 | } 765 | } 766 | jprobe_return(); 767 | } 768 | 769 | static struct jprobe jp19 = { 770 | .entry = j_futex_wait_queue_me, 771 | .kp = { 772 | .symbol_name = "futex_wait_queue_me", 773 | }, 774 | }; 775 | 776 | static void j_journal_end_buffer_io_sync(struct buffer_head *bh, int uptodate) { 777 | printk(KERN_INFO "I/O sync is happening in softirq context? 
%s", in_serving_softirq()>0?"True":"False"); 778 | jprobe_return(); 779 | } 780 | 781 | static struct jprobe jp20 = { 782 | .entry = j_journal_end_buffer_io_sync, 783 | .kp = { 784 | .symbol_name = "journal_end_buffer_io_sync", 785 | }, 786 | }; 787 | 788 | void j_wake_up_new_task(struct task_struct *p) { 789 | if (step == 1) { 790 | if (p->tgid == target[0]) { 791 | u64 ts = fang_clock(); 792 | int cpu = get_cpu(); 793 | struct fang_result fresult; 794 | char *chartmp = NULL; 795 | fresult.type = 2; 796 | fresult.ts = ts; 797 | fresult.perfts = wperfclock(); 798 | fresult.core = cpu; 799 | fresult.pid1 = current->pid; 800 | fresult.pid2 = p->pid; 801 | fresult.pid1state = current->state; 802 | fresult.pid2state = p->state; 803 | 804 | spin_lock(&switch_lock); 805 | chartmp = (char *)(switch_result + switch_pos); 806 | memcpy(chartmp, &fresult, sizeof(fresult)); 807 | switch_pos+=sizeof(fresult); 808 | spin_unlock(&switch_lock); 809 | 810 | printk(KERN_INFO "[Create] Thread %d creates thread %d time %lld\n", current->pid, p->pid, ts); 811 | } 812 | } 813 | jprobe_return(); 814 | } 815 | 816 | static struct jprobe jp21 = { 817 | .entry = j_wake_up_new_task, 818 | .kp = { 819 | .symbol_name = "wake_up_new_task", 820 | }, 821 | }; 822 | 823 | void j_do_exit(long code) { 824 | if (step == 1) { 825 | if (current->tgid == target[0]) { 826 | u64 ts = fang_clock(); 827 | int cpu = get_cpu(); 828 | char *chartmp = NULL; 829 | struct fang_result fresult; 830 | fresult.type = 3; 831 | fresult.ts = ts; 832 | fresult.perfts = wperfclock(); 833 | fresult.core = cpu; 834 | fresult.pid1 = current->pid; 835 | fresult.pid2 = 0; 836 | fresult.pid1state = current->state; 837 | fresult.pid2state = 0; 838 | 839 | spin_lock(&switch_lock); 840 | chartmp = (char *)(switch_result + switch_pos); 841 | memcpy(chartmp, &fresult, sizeof(fresult)); 842 | switch_pos+=sizeof(fresult); 843 | spin_unlock(&switch_lock); 844 | 845 | printk(KERN_INFO "[Finish] Thread %d finishes time %lld\n", 
current->pid, ts); 846 | } 847 | } 848 | jprobe_return(); 849 | } 850 | 851 | static struct jprobe jp22 = { 852 | .entry = j_do_exit, 853 | .kp = { 854 | .symbol_name = "do_exit", 855 | }, 856 | }; 857 | 858 | int j_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) { 859 | if (step == 1) { 860 | if (containOfPID(0, current->tgid)) { 861 | netsend[current->pid] += (&msg->msg_iter)->count; 862 | } 863 | } 864 | jprobe_return(); 865 | return 0; 866 | } 867 | 868 | static struct jprobe jp26 = { 869 | .entry = j_tcp_sendmsg, 870 | .kp = { 871 | .symbol_name = "tcp_sendmsg", 872 | }, 873 | }; 874 | 875 | 876 | int j_udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) { 877 | if (step == 1) { 878 | if (containOfPID(0, current->tgid)) { 879 | netsend[current->pid] += (&msg->msg_iter)->count; 880 | } 881 | } 882 | jprobe_return(); 883 | return 0; 884 | } 885 | 886 | static struct jprobe jp27 = { 887 | .entry = j_udp_sendmsg, 888 | .kp = { 889 | .symbol_name = "udp_sendmsg", 890 | }, 891 | }; 892 | 893 | int j_tcp_sendpage(struct sock *sk, struct page *page, int offset, 894 | size_t size, int flags) { 895 | if (step == 1) { 896 | if (containOfPID(0, current->tgid)) { 897 | netsend[current->pid] += size; 898 | } 899 | } 900 | jprobe_return(); 901 | return 0; 902 | } 903 | 904 | static struct jprobe jp28 = { 905 | .entry = j_tcp_sendpage, 906 | .kp = { 907 | .symbol_name = "tcp_sendpage", 908 | }, 909 | }; 910 | 911 | int j_udp_sendpage(struct sock *sk, struct page *page, int offset, 912 | size_t size, int flags) { 913 | if (step == 1) { 914 | if (containOfPID(0, current->tgid)) { 915 | netsend[current->pid] += size; 916 | } 917 | } 918 | jprobe_return(); 919 | return 0; 920 | } 921 | 922 | static struct jprobe jp29 = { 923 | .entry = j_udp_sendpage, 924 | .kp = { 925 | .symbol_name = "udp_sendpage", 926 | }, 927 | }; 928 | 929 | 930 | int j_sock_sendmsg(struct socket *sock, struct msghdr *msg) { 931 | //struct fang_uds fuds; 932 | //char *chartmp = NULL; 
933 | if (step == 1) { 934 | if (containOfPID(0, current->tgid)) { 935 | netsend[current->pid] += (&msg->msg_iter)->count; 936 | 937 | //fuds.ts = fang_clock(); 938 | //fuds.pid = current->pid; 939 | //fuds.type = 2; 940 | 941 | //spin_lock(&wait_lock); 942 | //chartmp = (char *)(wait_result + wait_pos); 943 | //memcpy(chartmp, &fuds, sizeof(fuds)); 944 | //wait_pos+=sizeof(fuds); 945 | //spin_unlock(&wait_lock); 946 | } 947 | } 948 | jprobe_return(); 949 | return 0; 950 | } 951 | 952 | static struct jprobe jp23 = { 953 | .entry = j_sock_sendmsg, 954 | .kp = { 955 | .symbol_name = "sock_sendmsg", 956 | }, 957 | }; 958 | 959 | static long j_futex_wait_restart(struct restart_block *restart) { 960 | if (step == 1) { 961 | printk(KERN_INFO "Core %d Thread %d futex_wait_restart Time %llu\n", get_cpu(), current->pid, fang_clock()); 962 | } 963 | jprobe_return(); 964 | return 0; 965 | } 966 | 967 | long j_do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, 968 | u32 __user *uaddr2, u32 val2, u32 val3) { 969 | //static int j_futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset) { 970 | struct fang_uds fuds; 971 | char *chartmp = NULL; 972 | if (step == 1) { 973 | if (current->tgid==target[0]) { 974 | fuds.ts = fang_clock(); 975 | fuds.pid = current->pid; 976 | fuds.type = 1; 977 | 978 | spin_lock(&wait_lock); 979 | chartmp = (char *)(wait_result + wait_pos); 980 | memcpy(chartmp, &fuds, sizeof(fuds)); 981 | wait_pos+=sizeof(fuds); 982 | spin_unlock(&wait_lock); 983 | } 984 | } 985 | jprobe_return(); 986 | return 0; 987 | } 988 | 989 | static struct jprobe jp24 = { 990 | //.entry = j_futex_wake, 991 | .entry = j_do_futex, 992 | .kp = { 993 | .symbol_name = "do_futex", 994 | //.addr = (kprobe_opcode_t *) kallsyms_lookup_name("futex_wait_restart"), 995 | }, 996 | }; 997 | 998 | static void j___lock_sock(struct sock *sk) { 999 | struct fang_uds fuds; 1000 | char *chartmp = NULL; 1001 | if (step == 1) { 1002 | if (current->tgid==target[0]) { 1003 
| fuds.ts = fang_clock(); 1004 | fuds.pid = current->pid; 1005 | fuds.type = 3; 1006 | 1007 | spin_lock(&wait_lock); 1008 | chartmp = (char *)(wait_result + wait_pos); 1009 | memcpy(chartmp, &fuds, sizeof(fuds)); 1010 | wait_pos+=sizeof(fuds); 1011 | spin_unlock(&wait_lock); 1012 | } 1013 | } 1014 | jprobe_return(); 1015 | } 1016 | static struct jprobe jp25 = { 1017 | //.entry = j_futex_wake, 1018 | .entry = j___lock_sock, 1019 | .kp = { 1020 | .symbol_name = "__lock_sock", 1021 | //.addr = (kprobe_opcode_t *) kallsyms_lookup_name("futex_wait_restart"), 1022 | }, 1023 | }; 1024 | 1025 | 1026 | 1027 | 1028 | int open(struct inode *inode, struct file *filp) 1029 | { 1030 | printk(KERN_INFO "Inside open %llu\n", fang_clock()); 1031 | return 0; 1032 | } 1033 | 1034 | int release(struct inode *inode, struct file *filp) { 1035 | int i = 0; 1036 | printk (KERN_INFO "Inside close %llu\n", fang_clock()); 1037 | for (i = 0; i < dnum; i++) { 1038 | printk(KERN_INFO "Disk %s work time: %ld\n", dname[i], disk_work[i]); 1039 | } 1040 | for (i = 0; i < 65535; i++) { 1041 | if (netsend[i] != 0) { 1042 | printk(KERN_INFO "Thread %d sends bytes: %ld\n",i, netsend[i]); 1043 | } 1044 | } 1045 | 1046 | // printk(KERN_INFO "futex with timeout: %ld\n", futex_to); 1047 | // printk(KERN_INFO "futex without timeout: %ld\n", futex_noto); 1048 | 1049 | return 0; 1050 | } 1051 | 1052 | static int tasklet_hi_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1053 | { 1054 | int cpu = 0; 1055 | if (step == 1) { 1056 | cpu = get_cpu(); 1057 | softirq[cpu] = 0; 1058 | } 1059 | return 0; 1060 | } 1061 | 1062 | static struct kretprobe kret3 = { 1063 | .handler = tasklet_hi_return, 1064 | .maxactive = NR_CPUS, 1065 | //.maxactive = MAX_CPU_NR, 1066 | }; 1067 | 1068 | static int timer_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1069 | { 1070 | int cpu = 0; 1071 | if (step == 1) { 1072 | cpu = get_cpu(); 1073 | softirq[cpu] = 0; 1074 | } 1075 | return 0; 1076 | } 1077 | 1078 | 
static struct kretprobe kret4 = { 1079 | .handler = timer_return, 1080 | .maxactive = NR_CPUS, 1081 | //.maxactive = MAX_CPU_NR, 1082 | }; 1083 | 1084 | static int net_tx_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1085 | { 1086 | int cpu = 0; 1087 | if (step == 1) { 1088 | cpu = get_cpu(); 1089 | softirq[cpu] = 0; 1090 | } 1091 | return 0; 1092 | } 1093 | 1094 | static struct kretprobe kret5 = { 1095 | .handler = net_tx_return, 1096 | .maxactive = NR_CPUS, 1097 | //.maxactive = MAX_CPU_NR, 1098 | }; 1099 | 1100 | static int net_rx_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1101 | { 1102 | int cpu = 0; 1103 | if (step == 1) { 1104 | cpu = get_cpu(); 1105 | softirq[cpu] = 0; 1106 | //if (cpu == 7) { 1107 | // printk(KERN_INFO "[%d]softirq[cpu] = %d\n", cpu, softirq[cpu]); 1108 | //} 1109 | } 1110 | return 0; 1111 | } 1112 | 1113 | static struct kretprobe kret6 = { 1114 | .handler = net_rx_return, 1115 | .maxactive = NR_CPUS, 1116 | //.maxactive = MAX_CPU_NR, 1117 | }; 1118 | 1119 | static int blk_done_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1120 | { 1121 | int cpu = 0; 1122 | if (step == 1) { 1123 | cpu = get_cpu(); 1124 | softirq[cpu] = 0; 1125 | } 1126 | return 0; 1127 | } 1128 | 1129 | static struct kretprobe kret7 = { 1130 | .handler = blk_done_return, 1131 | .maxactive = NR_CPUS, 1132 | //.maxactive = MAX_CPU_NR, 1133 | }; 1134 | 1135 | static int blk_iopoll_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1136 | { 1137 | int cpu = 0; 1138 | if (step == 1) { 1139 | cpu = get_cpu(); 1140 | softirq[cpu] = 0; 1141 | } 1142 | return 0; 1143 | } 1144 | 1145 | static struct kretprobe kret8 = { 1146 | .handler = blk_iopoll_return, 1147 | .maxactive = NR_CPUS, 1148 | //.maxactive = MAX_CPU_NR, 1149 | }; 1150 | 1151 | static int tasklet_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1152 | { 1153 | int cpu = 0; 1154 | if (step == 1) { 1155 | cpu = get_cpu(); 1156 | softirq[cpu] = 0; 1157 | } 
1158 | return 0; 1159 | } 1160 | 1161 | static struct kretprobe kret9 = { 1162 | .handler = tasklet_return, 1163 | .maxactive = NR_CPUS, 1164 | //.maxactive = MAX_CPU_NR, 1165 | }; 1166 | 1167 | static int sched_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1168 | { 1169 | int cpu = 0; 1170 | if (step == 1) { 1171 | cpu = get_cpu(); 1172 | softirq[cpu] = 0; 1173 | } 1174 | return 0; 1175 | } 1176 | 1177 | static struct kretprobe kret10 = { 1178 | .handler = sched_return, 1179 | .maxactive = NR_CPUS, 1180 | //.maxactive = MAX_CPU_NR, 1181 | }; 1182 | 1183 | static int hrtimer_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1184 | { 1185 | int cpu = 0; 1186 | if (step == 1) { 1187 | cpu = get_cpu(); 1188 | softirq[cpu] = 0; 1189 | } 1190 | return 0; 1191 | } 1192 | 1193 | static struct kretprobe kret11 = { 1194 | .handler = hrtimer_return, 1195 | .maxactive = NR_CPUS, 1196 | //.maxactive = MAX_CPU_NR, 1197 | }; 1198 | 1199 | static int rcu_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1200 | { 1201 | int cpu = 0; 1202 | if (step == 1) { 1203 | cpu = get_cpu(); 1204 | softirq[cpu] = 0; 1205 | } 1206 | return 0; 1207 | } 1208 | 1209 | static struct kretprobe kret12 = { 1210 | .handler = rcu_return, 1211 | .maxactive = NR_CPUS, 1212 | //.maxactive = MAX_CPU_NR, 1213 | }; 1214 | 1215 | static int apictimer_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1216 | { 1217 | int cpu = 0; 1218 | if (step == 1) { 1219 | cpu = get_cpu(); 1220 | if (hardirq_pos[cpu] != 0) hardirq_pos[cpu]-=1; 1221 | //printk(KERN_INFO "[%d] local_apic_timer ends\n", cpu); 1222 | //printk(KERN_INFO "hardirq_pos[%d] = %d\n", cpu, hardirq_pos[cpu]); 1223 | } 1224 | return 0; 1225 | } 1226 | 1227 | static struct kretprobe kret13 = { 1228 | .handler = apictimer_return, 1229 | .maxactive = NR_CPUS, 1230 | //.maxactive = 1024, 1231 | }; 1232 | 1233 | static int IRQ_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1234 | { 1235 | int cpu = 0; 
1236 | if (step == 1) { 1237 | cpu = get_cpu(); 1238 | if (hardirq_pos[cpu] != 0) hardirq_pos[cpu]-=1; 1239 | //printk(KERN_INFO "[%d] local_apic_timer ends\n", cpu); 1240 | //printk(KERN_INFO "hardirq_pos[%d] = %d\n", cpu, hardirq_pos[cpu]); 1241 | } 1242 | return 0; 1243 | } 1244 | 1245 | static struct kretprobe kret14 = { 1246 | .handler = IRQ_return, 1247 | .maxactive = NR_CPUS, 1248 | //.maxactive = 1024, 1249 | }; 1250 | 1251 | static int KSOFTIRQ_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1252 | { 1253 | int cpu = 0; 1254 | if (step == 1) { 1255 | cpu = get_cpu(); 1256 | softirq[cpu] = 0; 1257 | } 1258 | return 0; 1259 | } 1260 | 1261 | //static struct kretprobe kret15 = { 1262 | // .handler = KSOFTIRQ_return, 1263 | // .maxactive = NR_CPUS, 1264 | // //.maxactive = MAX_CPU_NR, 1265 | //}; 1266 | 1267 | static int SOFTIRQ_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1268 | { 1269 | int cpu = 0; 1270 | u64 ts; 1271 | char *chartmp; 1272 | struct softirq_result sr; 1273 | if (step == 1) { 1274 | cpu = get_cpu(); 1275 | ts = fang_clock(); 1276 | spin_lock(&futex_lock); 1277 | sr.type = 0; 1278 | sr.stime = stime[cpu]; 1279 | sr.etime = ts; 1280 | sr.core = cpu; 1281 | chartmp = (char *)(futex_result + futex_pos); 1282 | memcpy(chartmp, &sr, sizeof(sr)); 1283 | futex_pos+=sizeof(sr); 1284 | spin_unlock(&futex_lock); 1285 | } 1286 | return 0; 1287 | } 1288 | 1289 | static struct kretprobe kret16 = { 1290 | .handler = SOFTIRQ_return, 1291 | .maxactive = NR_CPUS, 1292 | //.maxactive = MAX_CPU_NR, 1293 | }; 1294 | 1295 | // Special for random generator in do_IRQ 1296 | static int RANDOM_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1297 | { 1298 | int cpu = 0; 1299 | if (step == 1) { 1300 | cpu = get_cpu(); 1301 | if (hardirq_pos[cpu] != 0) hardirq_pos[cpu]-=1; 1302 | } 1303 | return 0; 1304 | } 1305 | 1306 | static struct kretprobe kret17 = { 1307 | .handler = RANDOM_return, 1308 | .maxactive = NR_CPUS, 1309 | 
//.maxactive = MAX_CPU_NR, 1310 | }; 1311 | 1312 | static int switch_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1313 | { 1314 | int cpu = 0; 1315 | if (step == 1) { 1316 | cpu=get_cpu(); 1317 | //if (dump_cpu_check[cpu]>0) { 1318 | //printk(KERN_INFO "threadID = %d\n", dump_cpu_check[cpu]); 1319 | //dump_stack(); 1320 | //dump_cpu_check[cpu]=0; 1321 | //} 1322 | } 1323 | return 0; 1324 | } 1325 | 1326 | static struct kretprobe kret18 = { 1327 | .handler = switch_return, 1328 | .maxactive = NR_CPUS, 1329 | //.maxactive = MAX_CPU_NR, 1330 | }; 1331 | 1332 | static int futex_wait_return(struct kretprobe_instance *ri, struct pt_regs *regs) 1333 | { 1334 | struct fang_uds fuds; 1335 | char *chartmp = NULL; 1336 | if (step == 1) { 1337 | //int retval = regs_return_value(regs); 1338 | //int cpu = get_cpu(); 1339 | if (current->tgid==target[0]) { 1340 | fuds.ts = fang_clock(); 1341 | fuds.pid = current->pid; 1342 | fuds.type = 4; 1343 | 1344 | spin_lock(&wait_lock); 1345 | chartmp = (char *)(wait_result + wait_pos); 1346 | memcpy(chartmp, &fuds, sizeof(fuds)); 1347 | wait_pos+=sizeof(fuds); 1348 | spin_unlock(&wait_lock); 1349 | } 1350 | 1351 | /* 1352 | if (containOfPID(cpu, current->tgid) && retval < 0) { 1353 | 1354 | //printk(KERN_INFO "Core %d Thread %d RetValue %d Time %llu\n", get_cpu(), current->pid, retval, fang_clock()); 1355 | fuds.ts = fang_clock(); 1356 | fuds.pid = current->pid; 1357 | fuds.type = (short)retval; 1358 | 1359 | spin_lock(&wait_lock); 1360 | chartmp = (char *)(wait_result + wait_pos); 1361 | memcpy(chartmp, &fuds, sizeof(fuds)); 1362 | wait_pos+=sizeof(fuds); 1363 | spin_unlock(&wait_lock); 1364 | } 1365 | */ 1366 | } 1367 | return 0; 1368 | } 1369 | 1370 | static struct kretprobe kret19 = { 1371 | .handler = futex_wait_return, 1372 | .maxactive = NR_CPUS, 1373 | //.maxactive = MAX_CPU_NR, 1374 | }; 1375 | 1376 | long ioctl_funcs(struct file *filp,unsigned int cmd, unsigned long arg) 1377 | { 1378 | 1379 | unsigned long ret = 0; 
1380 | int tres; 1381 | char tmp[200]; 1382 | char *chartmp = NULL; 1383 | struct timeval start; 1384 | struct fang_uds fuds; 1385 | struct fang_spin_uds spin_uds; 1386 | struct uds_spin_res spin_res; 1387 | int i = 0, cpu = 0; 1388 | 1389 | switch(cmd) { 1390 | case IOCTL_ADDWAIT: 1391 | if (step == 1) { 1392 | fuds.ts = fang_clock(); 1393 | fuds.pid = current->pid; 1394 | fuds.type = (short)arg; 1395 | spin_lock(&wait_lock); 1396 | chartmp = (char *)(wait_result + wait_pos); 1397 | memcpy(chartmp, &fuds, sizeof(fuds)); 1398 | wait_pos+=sizeof(fuds); 1399 | spin_unlock(&wait_lock); 1400 | } 1401 | break; 1402 | case IOCTL_ADDUDS: 1403 | if (step == 1) { 1404 | spin_uds.ts = fang_clock(); 1405 | spin_uds.pid = current->pid; 1406 | copy_from_user(&spin_res, (void*)arg, 12); 1407 | spin_uds.lock = spin_res.addr; 1408 | spin_uds.type = spin_res.type; 1409 | //fuds.ts = fang_clock(); 1410 | //fuds.pid = current->pid; 1411 | //fuds.type = (short)arg; 1412 | spin_lock(&state_lock); 1413 | //chartmp = (char *)(state_result + state_pos); 1414 | chartmp = (char *)(state_result + state_pos); 1415 | memcpy(chartmp, &spin_uds, sizeof(spin_uds)); 1416 | state_pos+=sizeof(spin_uds); 1417 | spin_unlock(&state_lock); 1418 | } 1419 | break; 1420 | case IOCTL_PID: 1421 | //printk(KERN_INFO "Setting PID = %d\n", (int)pid); 1422 | //for_each_process(p) { 1423 | // if (task_pid_nr(p)==pid) break; 1424 | //} 1425 | //target[0] = pid; 1426 | switch_pos = 0; 1427 | futex_pos = 0; 1428 | state_pos = 0; 1429 | wait_pos = 0; 1430 | 1431 | for (i=0;i<20;i++){ 1432 | disk_work[i] = 0; 1433 | } 1434 | 1435 | for (i=0; i<65535; i++) { 1436 | netsend[i] = 0; 1437 | //futextag[i] = 0; 1438 | //futexstart[i] = 0; 1439 | //futexsum[i] = 0; 1440 | } 1441 | 1442 | for (i = 0; i < PIDNUM; i++) { 1443 | target[i] = 0; 1444 | } 1445 | 1446 | dnum = 0; 1447 | 1448 | futex_to = 0; 1449 | futex_noto = 0; 1450 | 1451 | readTarget(); 1452 | 1453 | step = 1; 1454 | break; 1455 | 1456 | case IOCTL_INIT: 1457 | 
step = -1; 1458 | break; 1459 | 1460 | case IOCTL_COPYSWITCH: 1461 | spin_lock(&switch_lock); 1462 | ret = switch_pos; 1463 | switch_pos = 0; 1464 | switch_result_tmp = switch_result; 1465 | switch_result = switch_result_bak; 1466 | spin_unlock(&switch_lock); 1467 | switch_result_bak = switch_result_tmp; 1468 | tres = copy_to_user((void*)arg, switch_result_bak, ret); 1469 | break; 1470 | 1471 | case IOCTL_COPYWAIT: 1472 | spin_lock(&wait_lock); 1473 | ret = wait_pos; 1474 | wait_pos = 0; 1475 | wait_result_tmp = wait_result; 1476 | wait_result = wait_result_bak; 1477 | spin_unlock(&wait_lock); 1478 | wait_result_bak = wait_result_tmp; 1479 | tres = copy_to_user((void*)arg, wait_result_bak, ret); 1480 | break; 1481 | 1482 | case IOCTL_STATE_BEGIN: 1483 | sprintf(tmp, "Sample %d\n", state_total); 1484 | spin_lock(&switch_lock); 1485 | chartmp = (char *)(switch_result + switch_pos); 1486 | memcpy(chartmp, tmp, strlen(tmp)); 1487 | switch_pos+=strlen(tmp); 1488 | spin_unlock(&switch_lock); 1489 | step = 2; 1490 | state_total++; 1491 | break; 1492 | 1493 | case IOCTL_STATE_END: 1494 | step = -1; 1495 | 1496 | if (switch_pos > BSIZE) { 1497 | ret = 1; 1498 | } 1499 | else { 1500 | ret = 0; 1501 | } 1502 | //ret = 1; 1503 | break; 1504 | case IOCTL_FUTEX: 1505 | spin_lock(&futex_lock); 1506 | tag = 0; 1507 | fnum = 0; 1508 | wnum = 0; 1509 | chartmp = (char *)(futex_result + futex_pos); 1510 | memcpy(chartmp, tmp, strlen(tmp)); 1511 | futex_pos+=strlen(tmp); 1512 | futex_total++; 1513 | spin_unlock(&futex_lock); 1514 | break; 1515 | 1516 | case IOCTL_JPROBE: 1517 | break; 1518 | 1519 | case IOCTL_UNJPROBE: 1520 | break; 1521 | case IOCTL_JSWITCH: 1522 | break; 1523 | case IOCTL_COPYSTATE: 1524 | spin_lock(&state_lock); 1525 | ret = state_pos; 1526 | state_pos = 0; 1527 | state_result_tmp = state_result; 1528 | state_result = state_result_bak; 1529 | spin_unlock(&state_lock); 1530 | state_result_bak = state_result_tmp; 1531 | tres = copy_to_user((void*)arg, 
state_result_bak, ret); 1532 | //printk(KERN_INFO "state_pos = %d\n", state_pos); 1533 | break; 1534 | case IOCTL_COPYFUTEX: 1535 | spin_lock(&futex_lock); 1536 | ret = futex_pos; 1537 | futex_pos = 0; 1538 | futex_result_tmp = futex_result; 1539 | futex_result = futex_result_bak; 1540 | spin_unlock(&futex_lock); 1541 | futex_result_bak = futex_result_tmp; 1542 | tres = copy_to_user((void*)arg, futex_result_bak, ret); 1543 | break; 1544 | case IOCTL_COPYBUFFER: 1545 | break; 1546 | case IOCTL_GETENTRY: 1547 | spin_lock(&switch_lock); 1548 | ret = copy_to_user((void*)arg, switch_result, switch_pos); 1549 | ret = switch_pos; 1550 | switch_pos = 0; 1551 | spin_unlock(&switch_lock); 1552 | break; 1553 | case IOCTL_STEP1_BEGIN: 1554 | switch_pos = 0; 1555 | state_pos = 0; 1556 | futex_pos = 0; 1557 | wait_pos = 0; 1558 | spin_lock(&switch_lock); 1559 | t = p; 1560 | ptmp = p; 1561 | do { 1562 | do_gettimeofday(&start); 1563 | } while_each_thread(ptmp, t); 1564 | 1565 | printk(KERN_INFO "start time: sec = %ld usec = %ld\n", start.tv_sec, start.tv_usec); 1566 | //step = 1; 1567 | spin_unlock(&switch_lock); 1568 | break; 1569 | case IOCTL_STEP1_END: 1570 | spin_lock(&switch_lock); 1571 | do_gettimeofday(&start); 1572 | lastsec = start.tv_sec; 1573 | lastusec = start.tv_usec; 1574 | printk(KERN_INFO "end time: sec = %ld usec = %ld\n", lastsec, lastusec); 1575 | step = -1; 1576 | spin_unlock(&switch_lock); 1577 | break; 1578 | case IOCTL_USER_STACK: 1579 | //for_each_process(p) { 1580 | // if (task_pid_nr(p)==pid) break; 1581 | //} 1582 | //struct pt_regs *regs = task_pt_regs(p); 1583 | //printk(KERN_INFO "IP=%ld, BP=%ld\n", regs->ip, regs->bp); 1584 | //long currbp = regs->bp; 1585 | //int currpos = 0; 1586 | //while(currbp!=0 || currpos > 10) { 1587 | // copy_from_user(&currbp, currbp, 8); 1588 | // printk(KERN_INFO "Next BP = %ld", currbp); 1589 | //} 1590 | break; 1591 | case IOCTL_DNAME: 1592 | tres = copy_from_user(&dname[dnum], (void*)arg, 20); 1593 | printk(KERN_INFO 
"Monitored Disk: %s\n", dname[dnum]); 1594 | dnum++; 1595 | break; 1596 | case IOCTL_SPINLOCK: 1597 | cpu = get_cpu(); 1598 | spin_result[cpu].type = (short)arg; 1599 | spin_result[cpu].core = cpu; 1600 | spin_result[cpu].ts = fang_clock(); 1601 | spin_result[cpu].pid1 = current->pid; 1602 | spin_lock(&switch_lock); 1603 | chartmp = (char *)(switch_result + switch_pos); 1604 | memcpy(chartmp, &spin_result[cpu], sizeof(spin_result[cpu])); 1605 | switch_pos+=sizeof(spin_result[cpu]); 1606 | spin_unlock(&switch_lock); put_cpu(); /* pairs with get_cpu() above */ 1607 | break; 1608 | case IOCTL_TIME: 1609 | printk(KERN_INFO "now = %llu\n", tsc_now()); 1610 | break; 1611 | } 1612 | return ret; 1613 | } 1614 | 1615 | struct file_operations fops = { 1616 | open: open, 1617 | unlocked_ioctl: ioctl_funcs, 1618 | release: release 1619 | }; 1620 | 1621 | struct cdev *kernel_cdev; 1622 | dev_t dev_no, dev; 1623 | 1624 | //Fang added 1625 | int readTarget(void) { 1626 | mm_segment_t fs; 1627 | struct file *filp = NULL; 1628 | char buf[10240]; 1629 | int tmppos= 0; 1630 | int pidpos = 0; 1631 | char buftmp[64]; 1632 | char* buftmp1; 1633 | int i = 0; 1634 | 1635 | fs = get_fs(); 1636 | set_fs(KERNEL_DS); 1637 | 1638 | filp = filp_open("/tmp/target", O_RDONLY, 0); 1639 | if (IS_ERR(filp)) { /* filp_open() returns an ERR_PTR on failure, never NULL */ 1640 | printk(KERN_ERR "Error: cannot open the pid target file /tmp/target!\n"); 1641 | set_fs(fs); return -1; 1642 | } 1643 | #ifdef NEWKERNEL 1644 | kernel_read(filp, buf, 10240, &filp->f_pos); 1645 | #else 1646 | vfs_read(filp, buf, 10240, &filp->f_pos); 1647 | #endif 1648 | //filp->f_op->read(filp, buf, 1024, &filp->f_pos); 1649 | filp_close(filp,NULL); 1650 | set_fs(fs); 1651 | 1652 | i=0; 1653 | tmppos= 0; 1654 | 1655 | while(buf[i]) { 1656 | if (buf[i]=='\n') { 1657 | buftmp[tmppos]='\0'; 1658 | //printk(KERN_INFO "%d\n", simple_strtol(buftmp, &buftmp1, 10)); 1659 | target[pidpos] = simple_strtol(buftmp, &buftmp1, 10); 1660 | pidpos++; 1661 | tmppos = 0; 1662 | } 1663 | else { 1664 | buftmp[tmppos] = buf[i]; 1665 | tmppos++; 1666 | } 1667 | i++; 
1668 | } 1669 | return 0; 1670 | } 1671 | 1672 | int char_arr_init (void) { 1673 | int ret = 0; 1674 | int i = 0; 1675 | 1676 | for (i= 0;iops = &fops; 1722 | kernel_cdev->owner = THIS_MODULE; 1723 | printk(KERN_INFO "Inside init module\n"); 1724 | 1725 | // Newly added 1726 | dev = MKDEV(239, 0); 1727 | ret = register_chrdev_region(dev, (unsigned int)1, "wperf"); 1728 | 1729 | //ret = alloc_chrdev_region( &dev_no , 0, 1,"wperf"); 1730 | if (ret < 0) { 1731 | printk(KERN_ERR "Major number allocation failed\n"); 1732 | return ret; 1733 | } 1734 | 1735 | Major = MAJOR(dev); 1736 | //dev = MKDEV(Major,0); 1737 | printk(KERN_INFO "The major number for your device is %d\n", Major); 1738 | ret = cdev_add( kernel_cdev,dev,1); 1739 | 1740 | if(ret < 0 ) 1741 | { 1742 | printk(KERN_ERR "Unable to add cdev\n"); 1743 | return ret; 1744 | } 1745 | 1746 | state_result = (char*)vmalloc(BMAX*sizeof(char)); 1747 | state_result_bak = (char*)vmalloc(BMAX*sizeof(char)); 1748 | futex_result = (char*)vmalloc(BMAX*sizeof(char)); 1749 | futex_result_bak = (char*)vmalloc(BMAX*sizeof(char)); 1750 | switch_result = (char*)vmalloc(BMAX*sizeof(char)); 1751 | switch_result_bak = (char*)vmalloc(BMAX*sizeof(char)); 1752 | 1753 | wait_result = (char*)vmalloc(BMAX*sizeof(char)); 1754 | wait_result_bak = (char*)vmalloc(BMAX*sizeof(char)); 1755 | 1756 | spin_lock_init(&switch_lock); 1757 | spin_lock_init(&state_lock); 1758 | spin_lock_init(&futex_lock); 1759 | spin_lock_init(&wait_lock); 1760 | 1761 | ret = register_jprobe(&jp1); 1762 | if (ret < 0) { 1763 | printk(KERN_INFO "register_jprobe for __switch_to failed, returned %d\n", ret); 1764 | //return ret; 1765 | } 1766 | ret = register_jprobe(&jp2); 1767 | if (ret < 0) { 1768 | printk(KERN_INFO "register_jprobe for try_to_wake_up failed, returned %d\n", ret); 1769 | //return ret; 1770 | } 1771 | register_jprobe(&jp3); 1772 | register_jprobe(&jp4); 1773 | register_jprobe(&jp5); 1774 | if (register_jprobe(&jp6)) { 1775 | printk(KERN_INFO "register net_rx jprobe failed!\n"); 1776 | } 1777 | 
register_jprobe(&jp7); 1778 | register_jprobe(&jp8); 1779 | register_jprobe(&jp9); 1780 | register_jprobe(&jp10); 1781 | //register_jprobe(&jp11); 1782 | register_jprobe(&jp12); 1783 | //if (register_jprobe(&jp13)) { 1784 | // printk(KERN_INFO "register apic jprobe failed!%d\n"); 1785 | //} 1786 | 1787 | //register_jprobe(&jp14); 1788 | //register_jprobe(&jp15); 1789 | register_jprobe(&jp16); 1790 | //register_jprobe(&jp17); 1791 | register_jprobe(&jp18); 1792 | //register_jprobe(&jp19); 1793 | //register_jprobe(&jp20); 1794 | register_jprobe(&jp21); 1795 | register_jprobe(&jp22); 1796 | //register_jprobe(&jp23); 1797 | //jp24.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("futex_wait_restart"), 1798 | //register_jprobe(&jp24); 1799 | //register_jprobe(&jp25); 1800 | register_jprobe(&jp26); 1801 | register_jprobe(&jp27); 1802 | register_jprobe(&jp28); 1803 | register_jprobe(&jp29); 1804 | 1805 | 1806 | kret3.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("tasklet_hi_action"); 1807 | kret4.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("run_timer_softirq"); 1808 | kret5.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("net_tx_action"); 1809 | kret6.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("net_rx_action"); 1810 | kret7.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("blk_done_softirq"); 1811 | kret8.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("irq_poll_softirq"); 1812 | kret9.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("tasklet_action"); 1813 | kret10.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("run_rebalance_domains"); 1814 | kret11.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("run_hrtimer_softirq"); 1815 | kret12.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("rcu_process_callbacks"); 1816 | kret13.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("local_apic_timer_interrupt"); 1817 | kret14.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("do_IRQ"); 1818 | // kret15.kp.addr = (kprobe_opcode_t *) 
kallsyms_lookup_name("wakeup_softirqd"); 1819 | kret16.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("__do_softirq"); 1820 | kret17.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("add_interrupt_randomness"); 1821 | //kret18.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("finish_task_switch"); 1822 | //kret19.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("futex_wait"); 1823 | //kret19.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("__lock_sock"); 1824 | 1825 | register_kretprobe(&kret3); 1826 | register_kretprobe(&kret4); 1827 | register_kretprobe(&kret5); 1828 | register_kretprobe(&kret6); 1829 | register_kretprobe(&kret7); 1830 | register_kretprobe(&kret8); 1831 | register_kretprobe(&kret9); 1832 | register_kretprobe(&kret10); 1833 | //register_kretprobe(&kret11); 1834 | register_kretprobe(&kret12); 1835 | //register_kretprobe(&kret13); 1836 | //register_kretprobe(&kret14); 1837 | register_kretprobe(&kret16); 1838 | //register_kretprobe(&kret17); 1839 | //register_kretprobe(&kret18); 1840 | //register_kretprobe(&kret19); 1841 | 1842 | return 0; 1843 | } 1844 | 1845 | void char_arr_cleanup(void) { 1846 | cdev_del(kernel_cdev); 1847 | 1848 | unregister_jprobe(&jp1); 1849 | 1850 | unregister_jprobe(&jp2); 1851 | unregister_jprobe(&jp3); 1852 | unregister_jprobe(&jp4); 1853 | unregister_jprobe(&jp5); 1854 | unregister_jprobe(&jp6); 1855 | unregister_jprobe(&jp7); 1856 | unregister_jprobe(&jp8); 1857 | unregister_jprobe(&jp9); 1858 | unregister_jprobe(&jp10); 1859 | //unregister_jprobe(&jp11); 1860 | unregister_jprobe(&jp12); 1861 | //unregister_jprobe(&jp13); 1862 | //unregister_jprobe(&jp14); 1863 | //unregister_jprobe(&jp15); 1864 | unregister_jprobe(&jp16); 1865 | //unregister_jprobe(&jp17); 1866 | unregister_jprobe(&jp18); 1867 | //unregister_jprobe(&jp19); 1868 | //unregister_jprobe(&jp20); 1869 | unregister_jprobe(&jp21); 1870 | unregister_jprobe(&jp22); 1871 | //unregister_jprobe(&jp23); 1872 | //unregister_jprobe(&jp24); 1873 | 
//unregister_jprobe(&jp25); 1874 | unregister_jprobe(&jp26); 1875 | unregister_jprobe(&jp27); 1876 | unregister_jprobe(&jp28); 1877 | unregister_jprobe(&jp29); 1878 | 1879 | 1880 | //unregister_kretprobe(&futex_return); 1881 | //unregister_kretprobe(&sleep_return); 1882 | //unregister_kretprobe(&signal_return); 1883 | //unregister_kretprobe(&aio_return); 1884 | 1885 | unregister_kretprobe(&kret3); 1886 | unregister_kretprobe(&kret4); 1887 | unregister_kretprobe(&kret5); 1888 | unregister_kretprobe(&kret6); 1889 | unregister_kretprobe(&kret7); 1890 | unregister_kretprobe(&kret8); 1891 | unregister_kretprobe(&kret9); 1892 | unregister_kretprobe(&kret10); 1893 | //unregister_kretprobe(&kret11); 1894 | unregister_kretprobe(&kret12); 1895 | //unregister_kretprobe(&kret13); 1896 | //unregister_kretprobe(&kret14); 1897 | unregister_kretprobe(&kret16); 1898 | //unregister_kretprobe(&kret17); 1899 | //unregister_kretprobe(&kret18); 1900 | //unregister_kretprobe(&kret19); 1901 | 1902 | /* nmissed > 0 suggests that maxactive was set too low. 
*/ 1903 | //printk("Missed probing %d instances of %s\n", 1904 | // futex_return.nmissed, "do_futex return"); 1905 | 1906 | /* cdev_del(kernel_cdev) already ran at the top of this function; a second call would release the cdev twice. */ 1907 | unregister_chrdev_region(dev, 1); 1908 | vfree(switch_result); 1909 | vfree(switch_result_bak); 1910 | vfree(state_result); 1911 | vfree(state_result_bak); 1912 | vfree(futex_result); 1913 | vfree(futex_result_bak); 1914 | 1915 | vfree(wait_result); 1916 | vfree(wait_result_bak); 1917 | 1918 | printk(KERN_INFO " Inside cleanup_module\n"); 1919 | } 1920 | MODULE_LICENSE("GPL"); 1921 | module_init(char_arr_init); 1922 | module_exit(char_arr_cleanup); 1923 | -------------------------------------------------------------------------------- /module/ioctl_perf.h: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #define IOC_MAGIC 'f' 4 | #define IOCTL_STATE_BEGIN _IOWR(IOC_MAGIC,0, unsigned long) 5 | #define IOCTL_PID _IOWR(IOC_MAGIC, 1, unsigned long) 6 | #define IOCTL_JPROBE _IOWR(IOC_MAGIC, 2, unsigned long) 7 | #define IOCTL_UNJPROBE _IOWR(IOC_MAGIC, 3, unsigned long) 8 | #define IOCTL_COPYSWITCH _IOWR(IOC_MAGIC, 4, unsigned long) 9 | #define IOCTL_COPYSTATE _IOWR(IOC_MAGIC, 5, unsigned long) 10 | #define IOCTL_COPYFUTEX _IOWR(IOC_MAGIC, 6, unsigned long) 11 | #define IOCTL_JSWITCH _IOWR(IOC_MAGIC, 7, unsigned long) 12 | #define IOCTL_INIT _IOWR(IOC_MAGIC, 8, unsigned long) 13 | #define IOCTL_GDBPID _IOWR(IOC_MAGIC, 9, unsigned long) 14 | #define IOCTL_STATE_END _IOWR(IOC_MAGIC, 10, unsigned long) 15 | #define IOCTL_FUTEX _IOWR(IOC_MAGIC, 11, unsigned long) 16 | #define IOCTL_COPYBUFFER _IOWR(IOC_MAGIC, 12, unsigned long) 17 | #define IOCTL_MEMINIT _IOWR(IOC_MAGIC, 13, unsigned long) 18 | #define IOCTL_GETENTRY _IOWR(IOC_MAGIC, 14, unsigned long) 19 | #define IOCTL_STEP1_BEGIN _IOWR(IOC_MAGIC, 15, unsigned long) 20 | #define IOCTL_STEP1_END _IOWR(IOC_MAGIC, 16, unsigned long) 21 | #define IOCTL_USER_STACK _IOWR(IOC_MAGIC, 17, unsigned long) 22 | #define IOCTL_DNAME 
_IOWR(IOC_MAGIC, 18, unsigned long) 23 | #define IOCTL_SPINLOCK _IOWR(IOC_MAGIC, 19, unsigned long) 24 | #define IOCTL_ADDUDS _IOWR(IOC_MAGIC, 20, unsigned long) 25 | #define IOCTL_COPYWAIT _IOWR(IOC_MAGIC, 21, unsigned long) 26 | #define IOCTL_ADDWAIT _IOWR(IOC_MAGIC, 22, unsigned long) 27 | #define IOCTL_TIME _IOWR(IOC_MAGIC, 23, unsigned long) 28 | 29 | 30 | int readTarget(void); 31 | 32 | struct socket; 33 | 34 | struct msghdr { 35 | void *msg_name; /* ptr to socket address structure */ 36 | int msg_namelen; /* size of socket address structure */ 37 | struct iov_iter msg_iter; /* data */ 38 | void *msg_control; /* ancillary data */ 39 | __kernel_size_t msg_controllen; /* ancillary data buffer length */ 40 | unsigned int msg_flags; /* flags on received message */ 41 | struct kiocb *msg_iocb; /* ptr to iocb for async requests */ 42 | }; 43 | 44 | struct futex_q { 45 | struct plist_node list; 46 | 47 | struct task_struct *task; 48 | spinlock_t *lock_ptr; 49 | union futex_key key; 50 | struct futex_pi_state *pi_state; 51 | struct rt_mutex_waiter *rt_waiter; 52 | union futex_key *requeue_pi_key; 53 | u32 bitset; 54 | }; 55 | 56 | struct futex_hash_bucket { 57 | atomic_t waiters; 58 | spinlock_t lock; 59 | struct plist_head chain; 60 | } ____cacheline_aligned_in_smp; 61 | 62 | struct uds_spin_res { 63 | long addr; 64 | int type; 65 | }; 66 | 67 | struct fang_spin_uds { 68 | u64 ts; 69 | int pid; 70 | long lock; 71 | short type; 72 | } __attribute__((packed)); 73 | 74 | struct fang_uds { 75 | u64 ts; 76 | int pid; 77 | short type; 78 | } __attribute__((packed)); 79 | 80 | 81 | struct fang_result { 82 | short type; 83 | u64 ts; 84 | short core; 85 | int pid1; 86 | int pid2; 87 | short irq; 88 | short pid1state; 89 | short pid2state; 90 | u64 perfts; 91 | } __attribute__((packed)); 92 | 93 | struct futex_result { 94 | int pid; 95 | u64 time; 96 | }__attribute__((packed)); 97 | 98 | struct softirq_result { 99 | char type; 100 | u64 stime; 101 | u64 etime; 102 | short 
core; 103 | }__attribute__((packed)); 104 | 105 | typedef struct NESTIRQ { 106 | int hardirq; 107 | struct NESTIRQ* next; 108 | } nestirq; 109 | 110 | //Fang newly added 111 | #define TASKLET_HI -1 112 | #define TIMER -2 113 | #define NET_TX -3 114 | #define NET_RX -4 115 | #define BLK_DONE -5 116 | #define BLK_IOPOLL -6 117 | #define TASKLET -7 118 | #define SCHED -8 119 | #define HRTIMER -9 120 | #define RCU -10 121 | #define APICTIMER -11 122 | #define IRQ -12 123 | #define KSOFTIRQ -13 124 | #define RANDOM -14 125 | #define HARDIRQ -15 126 | #define SOFTIRQ -16 127 | 128 | #define CORENUM 100 129 | #define PIDNUM 500 130 | -------------------------------------------------------------------------------- /recorder/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | gcc recorder.c -o recorder 3 | clean: 4 | rm recorder 5 | -------------------------------------------------------------------------------- /recorder/cpufreq.sh: -------------------------------------------------------------------------------- 1 | lscpu | grep "GHz" | awk '{print $NF}' | sed 's/GHz//' 2 | -------------------------------------------------------------------------------- /recorder/gdb.script: -------------------------------------------------------------------------------- 1 | t a a bt 2 | -------------------------------------------------------------------------------- /recorder/record-c.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Arg1: Thread Name 3 | # Arg2: Test Time 4 | # Arg3: Disk Name; Arg4: Perf Frequency; Arg5: NIC 5 | echo '0. Prepare perf probe and start module' 6 | # Prepare perf probe 7 | sudo rm /tmp/wperf-* 8 | sudo perf probe 'try_to_wake_up pid=p->pid s1=p->state s2=state' 9 | sudo dmesg -c 10 | cd ../module 11 | sh 1prepare.sh $1 $3 12 | cd - 13 | 14 | echo '1. 
Get threads information' 15 | pid=$(pgrep $1) 16 | echo "pid:" ${pid} 17 | echo "time:" ${2} 18 | 19 | cp /tmp/target target.tinfo 20 | pgrep ksoftirq > soft.tinfo 21 | #Ignore multiple threads 22 | 23 | echo '2. Start module to record kernel events & record I/O' 24 | nohup sudo ./recorder 0 $3 > /dev/null 2>&1 & 25 | 26 | perflist=$(tr '\n' , < /tmp/perf_target | sed 's/,$//g') 27 | echo $perflist 28 | nohup sudo perf record -F $4 -e 'probe:try_to_wake_up' -e sched:sched_switch --call-graph dwarf -o presult -p ${perflist} sleep $2 > /dev/null 2>&1 & 29 | 30 | hname=$(hostname) 31 | ifstat -i $5 > nstat & 32 | iostat $3 -d 1 > iostat & 33 | 34 | sleep $2 35 | sudo pkill recorder 36 | sudo pkill ifstat 37 | sudo pkill iostat 38 | 39 | echo '3. Start gdb to record the function name of all threads' 40 | sudo gdb -p $pid < gdb.script > gresult 2>&1 41 | 42 | echo '4. Copy all the files to result directory.' 43 | time=$(date +%Y-%m-%d-%H-%M) 44 | mkdir $1-${2}-${time} 45 | sudo mv target.tinfo $1-${2}-${time}/ 46 | sudo mv soft.tinfo $1-${2}-${time}/ 47 | sudo mv result* $1-${2}-${time}/ 48 | sudo mv presult $1-${2}-${time}/ 49 | sudo mv gresult $1-${2}-${time}/ 50 | sudo dmesg -c > $1-${2}-${time}/kmesg 51 | awk '{if (NF==2) {a+=$2;b+=1;}} END{print "Network:", a/b}' nstat >> $1-${2}-${time}/kmesg 52 | 53 | sudo mv nstat $1-${2}-${time}/ 54 | sudo mv iostat $1-${2}-${time}/ 55 | 56 | sh cpufreq.sh > cpufreq 57 | sudo mv cpufreq $1-${2}-${time}/ 58 | 59 | sudo rm $1-${2}-${time}/result_fake 60 | sudo cat /tmp/wperf-* > $1-${2}-${time}/result_fake 61 | 62 | echo '5. Finish' 63 | -------------------------------------------------------------------------------- /recorder/record-java.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Arg1: Thread Name 3 | # Arg2: Test Time 4 | # Arg3: Disk Name; Arg4: Perf Frequency; Arg5: NIC 5 | echo '0. 
Prepare perf probe and start module' 6 | # Prepare perf probe 7 | sudo rm /tmp/wperf-* 8 | sudo perf probe 'try_to_wake_up pid=p->pid s1=p->state s2=state' 9 | sudo dmesg -c 10 | cd ../module 11 | sh 1prepare-java.sh $1 $3 12 | cd ../recorder 13 | 14 | echo '1. Get threads information' 15 | #pid=$1 16 | #pid=$(pgrep $1 | head -n 1) 17 | pid=$(sudo jps | grep $1 | awk '{print $1}') 18 | echo "pid:" ${pid} 19 | echo "time:" ${2} 20 | 21 | cp /tmp/target target.tinfo 22 | pgrep ksoftirq > soft.tinfo 23 | 24 | echo '2. Start module to record kernel events' 25 | nohup sudo ./recorder 0 $3 > /dev/null 2>&1 & 26 | 27 | perflist=$(tr '\n' , < /tmp/perf_target | sed 's/,$//g') 28 | echo ${perflist} 29 | sudo perf record -e sched:sched_switch -e 'probe:try_to_wake_up' -F $4 -o presult -g -p ${perflist} sleep $2 > /dev/null 2>&1 & 30 | 31 | #echo '5. Start to record network status' 32 | hname=$(hostname) 33 | ifstat -i $5 > nstat & 34 | iostat $3 -d 1 > iostat & 35 | sleep $2 36 | sudo pkill recorder 37 | sudo pkill ifstat 38 | sudo pkill iostat 39 | ./perf-map-agent/bin/create-java-perf-map.sh $pid 40 | 41 | echo '3. Start jstack to record the function stack of all threads' 42 | jstack $pid > gresult 43 | 44 | echo '4. Copy all the files to result directory.' 
45 | time=$(date +%Y-%m-%d-%H-%M) 46 | mkdir ${1}-${2}-${time} 47 | sudo mv target.tinfo ${1}-${2}-${time}/ 48 | sudo mv soft.tinfo ${1}-${2}-${time}/ 49 | sudo mv result* ${1}-${2}-${time}/ 50 | sudo mv presult ${1}-${2}-${time}/ 51 | sudo mv gresult ${1}-${2}-${time}/ 52 | sudo dmesg -c > ${1}-${2}-${time}/kmesg 53 | 54 | awk '{if (NF==2) {a+=$2;b+=1;}} END{print "Network:", a/b}' nstat >> ${1}-${2}-${time}/kmesg 55 | 56 | sudo mv nstat $1-${2}-${time}/ 57 | sudo mv iostat $1-${2}-${time}/ 58 | 59 | sh cpufreq.sh > cpufreq 60 | sudo mv cpufreq $1-${2}-${time}/ 61 | 62 | sudo cp /tmp/perf-${pid}.map $1-${2}-${time}/ 63 | 64 | #If fake-wake-up, then copy all files to result_fake 65 | sudo rm $1-${2}-${time}/result_fake 66 | sudo cat /tmp/wperf-* > $1-${2}-${time}/result_fake 67 | 68 | echo '5. Finish' 69 | #killall java 70 | -------------------------------------------------------------------------------- /recorder/record.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | echo "wPerf recorder:" 3 | echo -n "Thread name[a partial name is fine, but it must be unique]:" 4 | read -r tname 5 | echo -n "Programming language[c or java, default c]:" 6 | read -r lang 7 | echo -n "Profile length in seconds[default 90]:" 8 | read -r length 9 | echo -n "Perf frequency[default 100]:" 10 | read -r pfreq 11 | echo -n "Disk[default sda]:" 12 | read -r pdisk 13 | echo -n "NIC[default eth0]:" 14 | read -r pnic 15 | 16 | if [ "$tname" == "" ]; then 17 | echo "No thread name." 
18 | exit 19 | fi 20 | 21 | if [ "$lang" == "" ]; then 22 | lang="c" 23 | fi 24 | 25 | if [ "$length" == "" ]; then 26 | length=90 27 | fi 28 | 29 | if [ "$pfreq" == "" ]; then 30 | pfreq=100 31 | fi 32 | 33 | if [ "$pdisk" == "" ]; then 34 | pdisk="sda" 35 | fi 36 | 37 | if [ "$pnic" == "" ]; then 38 | pnic="eth0" 39 | fi 40 | 41 | echo $lang 42 | if [ "$lang" == "c" ]; then 43 | ./record-c.sh ${tname} ${length} ${pdisk} ${pfreq} ${pnic} 44 | else 45 | ./record-java.sh ${tname} ${length} ${pdisk} ${pfreq} ${pnic} 46 | fi 47 | -------------------------------------------------------------------------------- /recorder/recorder.c: -------------------------------------------------------------------------------- 1 | #include "recorder.h" 2 | 3 | int i,fd; 4 | clock_t clk1,clk2; 5 | struct timespec tbuf1; 6 | struct timespec tbuf2; 7 | double rt; 8 | FILE *fp, *fp1, *fp2, *fp3; 9 | char *mem; 10 | long mem_pos = 0; 11 | 12 | void term(int signum) 13 | { 14 | ioctl(fd,IOCTL_INIT, -1); 15 | 16 | // Final Copy 17 | mem_pos = ioctl(fd,IOCTL_COPYSWITCH, mem); 18 | fwrite(mem, 1, mem_pos, fp); 19 | 20 | mem_pos = ioctl(fd,IOCTL_COPYFUTEX, mem); 21 | fwrite(mem, 1, mem_pos, fp1); 22 | 23 | mem_pos = ioctl(fd,IOCTL_COPYSTATE, mem); 24 | //printf("[1] COPYSTATE mem_pos = %d\n", mem_pos); 25 | fwrite(mem, 1, mem_pos, fp2); 26 | 27 | mem_pos = ioctl(fd,IOCTL_COPYWAIT, mem); 28 | fwrite(mem, 1, mem_pos, fp3); 29 | 30 | fclose(fp); 31 | fclose(fp1); 32 | fclose(fp2); 33 | fclose(fp3); 34 | close(fd); 35 | exit(0); 36 | } 37 | 38 | int main (int argc, char *argv[]) { 39 | int i = 0; 40 | int j = 0; 41 | 42 | mem = (char*)malloc(500*1024*1024*sizeof(char)); 43 | struct sigaction action; 44 | memset(&action, 0, sizeof(struct sigaction)); 45 | action.sa_handler = term; 46 | sigaction(SIGTERM, &action, NULL); 47 | 48 | char *dname = (char*)calloc(20,sizeof(char)); 49 | 50 | fd = open("/dev/wperf", O_RDWR); 51 | 52 | if (fd == -1) 53 | { 54 | printf("Error in opening file \n"); 55 | 
exit(-1); 56 | } 57 | 58 | fp = fopen("result_switch","w"); 59 | fp1 = fopen("result_softirq","w"); 60 | fp2 = fopen("result_fake","w"); 61 | fp3 = fopen("result_wait","w"); 62 | 63 | if (atoi(argv[1])==0) { 64 | ioctl(fd,IOCTL_INIT, 0); 65 | ioctl(fd,IOCTL_PID, 0); 66 | 67 | for ( j = 2; j < argc; j++) { 68 | strcpy(dname, argv[j]); 69 | ioctl(fd, IOCTL_DNAME, dname); 70 | } 71 | 72 | while(i<36000) { 73 | sleep(1); 74 | mem_pos = ioctl(fd,IOCTL_COPYSWITCH, mem); 75 | fwrite(mem, 1, mem_pos, fp); 76 | mem_pos = ioctl(fd,IOCTL_COPYFUTEX, mem); 77 | fwrite(mem, 1, mem_pos, fp1); 78 | mem_pos = ioctl(fd,IOCTL_COPYSTATE, mem); 79 | //printf("[0] COPYSTATE mem_pos = %d\n", mem_pos); 80 | fwrite(mem, 1, mem_pos, fp2); 81 | mem_pos = ioctl(fd,IOCTL_COPYWAIT, mem); 82 | fwrite(mem, 1, mem_pos, fp3); 83 | i++; 84 | } 85 | } 86 | 87 | ioctl(fd,IOCTL_INIT, -1); 88 | //fclose(fp); 89 | close(fd); 90 | return 0; 91 | } 92 | -------------------------------------------------------------------------------- /recorder/recorder.h: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | 13 | #define IOC_MAGIC 'f' 14 | 15 | #define IOCTL_STATE_BEGIN _IOWR(IOC_MAGIC,0, unsigned long) 16 | #define IOCTL_PID _IOWR(IOC_MAGIC, 1, unsigned long) 17 | #define IOCTL_JPROBE _IOWR(IOC_MAGIC, 2, unsigned long) 18 | #define IOCTL_UNJPROBE _IOWR(IOC_MAGIC, 3, unsigned long) 19 | #define IOCTL_COPYSWITCH _IOWR(IOC_MAGIC, 4, unsigned long) 20 | #define IOCTL_COPYSTATE _IOWR(IOC_MAGIC, 5, unsigned long) 21 | #define IOCTL_COPYFUTEX _IOWR(IOC_MAGIC, 6, unsigned long) 22 | #define IOCTL_JSWITCH _IOWR(IOC_MAGIC, 7, unsigned long) 23 | #define IOCTL_INIT _IOWR(IOC_MAGIC, 8, unsigned long) 24 | #define IOCTL_GDBPID _IOWR(IOC_MAGIC, 9, unsigned long) 25 | #define IOCTL_STATE_END _IOWR(IOC_MAGIC, 10, unsigned long) 26 | #define 
IOCTL_FUTEX _IOWR(IOC_MAGIC, 11, unsigned long) 27 | #define IOCTL_COPYBUFFER _IOWR(IOC_MAGIC, 12, unsigned long) 28 | #define IOCTL_MEMINIT _IOWR(IOC_MAGIC, 13, unsigned long) 29 | #define IOCTL_GETENTRY _IOWR(IOC_MAGIC, 14, unsigned long) 30 | #define IOCTL_STEP1_BEGIN _IOWR(IOC_MAGIC, 15, unsigned long) 31 | #define IOCTL_STEP1_END _IOWR(IOC_MAGIC, 16, unsigned long) 32 | #define IOCTL_USER_STACK _IOWR(IOC_MAGIC, 17, unsigned long) 33 | #define IOCTL_DNAME _IOWR(IOC_MAGIC, 18, unsigned long) 34 | #define IOCTL_COPYWAIT _IOWR(IOC_MAGIC, 21, unsigned long) 35 | #define IOCTL_ADDWAIT _IOWR(IOC_MAGIC, 22, unsigned long) 36 | #define IOCTL_TIME _IOWR(IOC_MAGIC, 23, unsigned long) 37 | --------------------------------------------------------------------------------
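The ioctl command numbers are defined twice, once in /module/ioctl_perf.h and once in /recorder/recorder.h, and the recorder's copy omits commands 19 and 20 (IOCTL_SPINLOCK and IOCTL_ADDUDS). Because the module and the recorder only agree on a command when the magic byte, the sequence number, and the argument size all match, it can help to decode a command back into those fields. The following is a minimal sketch, assuming a Linux userspace build with kernel headers installed; the helpers `wperf_cmd_matches` and `wperf_print_cmd` are hypothetical names introduced for illustration and are not part of wPerf.

```c
#include <stdio.h>
#include <linux/ioctl.h>   /* _IOWR, _IOC_TYPE, _IOC_NR, _IOC_SIZE */

/* Two commands copied from the headers above. */
#define IOC_MAGIC 'f'
#define IOCTL_DNAME    _IOWR(IOC_MAGIC, 18, unsigned long)
#define IOCTL_COPYWAIT _IOWR(IOC_MAGIC, 21, unsigned long)

/* Hypothetical helper: returns 1 when cmd carries wPerf's magic byte,
 * the expected sequence number nr, and an unsigned long argument size. */
static inline int wperf_cmd_matches(unsigned long cmd, unsigned int nr)
{
    return _IOC_TYPE(cmd) == (unsigned int)IOC_MAGIC
        && _IOC_NR(cmd) == nr
        && _IOC_SIZE(cmd) == (unsigned int)sizeof(unsigned long);
}

/* Hypothetical helper: prints the fields packed into an ioctl command. */
static inline void wperf_print_cmd(const char *name, unsigned long cmd)
{
    printf("%-16s magic='%c' nr=%u size=%u\n",
           name,
           (char)_IOC_TYPE(cmd),
           (unsigned int)_IOC_NR(cmd),
           (unsigned int)_IOC_SIZE(cmd));
}
```

Calling `wperf_print_cmd("IOCTL_COPYWAIT", IOCTL_COPYWAIT)` from a small `main` in both build environments is one way to confirm that the two headers still encode the same value after either file is edited.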