├── .gitignore ├── README.md ├── arch.py ├── asmAnalyser.py ├── bt.py ├── config.py ├── gdb.py ├── graph.py ├── prune.py ├── pruneConfig.py ├── pyTracer.py └── trace.png /.gitignore: -------------------------------------------------------------------------------- 1 | result/ 2 | __pycache__/ 3 | .vscode/ 4 | *.tracer 5 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TOSView 2 | Draw the running traces of OS (Linux, xv6, ...) kernel functions in a dynamic call graph and link the graph nodes to the source code 3 | 4 | 5 | 6 | # Why you need this 7 | If you try reading the Linux kernel source code, you 8 | will find yourself sinking into a sea of code. It is hard to figure out where the definition of a function or a macro is (among dozens of different definitions with the same name), not to mention what a function pointer actually points to. 9 | 10 | Using gdb seems like a good idea, but typing "break" and "continue" all the time is boring and inefficient. 11 | 12 | This project types "break" and "continue" for you automatically, draws the running traces of OS kernel functions in a graph, and prunes this graph according to your configuration. You can open the graph with a browser like Firefox and click the nodes of the graph to see the corresponding source code. 13 | 14 | This README shows you how to use TOSView to learn Linux and xv6. You can inspect another OS using TOSView as long as you can compile that OS and run it in qemu. 15 | 16 | # Dependencies 17 | 1. A Linux distribution (Ubuntu is recommended) 18 | 2. Source code of the operating system you want to learn. 19 | 3. qemu 20 | 4. gdb 21 | 5. python3 22 | 23 | # Prepare 24 | 1. Compile the OS 25 | 2. Disassemble the binary file of the OS kernel. 26 | 3. Configure TOSView/config.py 27 | 4. Configure pruneConfig.py (optional) 28 | 29 | # Prepare Linux 30 | 1. 
Compile the Linux kernel source code 31 | 1. make mrproper 32 | 2. make defconfig 33 | 3. make menuconfig 34 | 1. enable or disable "64-bit kernel" 35 | 2. disable "Processor type and features/Randomize the address of the kernel image (KASLR)" 36 | 3. in "Kernel hacking/Compile-time checks and compiler options" 37 | 1. enable "Compile the kernel with debug info" 38 | 2. disable "Reduce debug information" 39 | 3. disable "Provide split debuginfo in .dwo files" 40 | 4. enable "Generate dwarf4 debuginfo" 41 | 5. enable "Provide GDB scripts for kernel debugging" 42 | 6. enable "Generate readable assembler code" 43 | 7. enable "Debug Filesystem" 44 | 4. save and quit menuconfig 45 | 5. make -j* (* is the number of CPU cores your computer has) 46 | 6. make modules 47 | 2. You can find "vmlinux" in your Linux kernel source code folder after compiling. Use objdump to disassemble it. Run: 48 | 1. objdump -d vmlinux > vmlinux.txt 49 | 3. Create initrd 50 | 1. mkinitramfs -o initrd.img 51 | 4. Configure TOSView/config.py 52 | 1. ARCH is 'i386' or 'x86_64', depending on whether your kernel is 32-bit or 64-bit. 53 | 2. SOURCEFOLDER is the path of your Linux kernel source code folder. 54 | 3. ASMFILE is the path of your vmlinux.txt. 55 | 4. KERNELOBJ is the path of the binary file of your kernel. 56 | 5. QEMUCOMMAND is the command you use to run your Linux kernel with qemu. Please notice that the path of your initrd file is included in it. 57 | 6. PRUNED should be True or False. If PRUNED is True, the program will prune the graph according to TOSView/pruneConfig.py. 58 | 7. PRUNELEVEL is an integer, only used when PRUNED is True. All topics outside TOSView/pruneConfig.py/LEVELTABLE[0:PRUNELEVEL + 1] will be pruned. 59 | 8. PRUNEOUTCOME is an integer, only used when PRUNED is True. All topics outside TOSView/pruneConfig.py/OUTCOMETABLE[PRUNEOUTCOME] will be pruned. 60 | 5. Configure TOSView/pruneConfig.py if PRUNED is True 61 | 1. TOPICNUMBERS is the number of topics you are interested in. 62 | 2. 
LEVELTABLE is a 2d array that consists of numbers from 0 to TOPICNUMBERS - 1. Each inner array is a topic level. You should divide the topics into several groups, and every topic must appear in exactly one group. The lower groups should contain the more fundamental topics. Combined with PRUNELEVEL, this lets you prune the graph according to different detail levels. 63 | 3. OUTCOMETABLE is a 2d array that consists of numbers from 0 to TOPICNUMBERS - 1. You can have many learning outcomes; each inner array lists the topics that belong to one learning outcome. Combined with PRUNEOUTCOME, this lets you prune the graph according to different learning outcomes. 64 | 4. FILETABLE is a 2d array that consists of files/folders (for folders, please keep a '/' at the end) in the Linux kernel source folder. Each inner array lists the files that belong to a specific topic. 65 | 66 | # Prepare xv6 67 | 1. Compile xv6. 68 | 1. Change the Makefile. Make sure that your CFLAGS contains '-O0', '-fno-omit-frame-pointer' and '-g'. The simplest way to do this is to use the default debug CFLAGS option (it is commented out by default, so you should uncomment it and then comment out the default release CFLAGS option). 69 | 2. make 70 | 2. Use objdump to disassemble the kernel file. 71 | 1. objdump -d kernel > kernel.asm (The original kernel.asm is generated by objdump -S, which is not suitable for TOSView.) 72 | 3. Configure TOSView/config.py and TOSView/pruneConfig.py (please refer to 'Prepare Linux') 73 | 74 | ### gdb 75 | If you debug the Linux kernel with the official version of gdb, you will encounter a problem: "Remote 'g' packet reply is too long", so you need to download the gdb source code, fix this problem and rebuild it. 
76 | 77 | Change function process_g_packet in gdb/remote.c from 78 | 79 | if (buf_len > 2 * rsa->sizeof_g_packet) 80 | error (_("Remote 'g' packet reply is too long: %s"), rs->buf); 81 | 82 | to 83 | 84 | if (buf_len > 2 * rsa->sizeof_g_packet) { 85 | rsa->sizeof_g_packet = buf_len ; 86 | for (i = 0; i < gdbarch_num_regs (gdbarch); i++) 87 | { 88 | if (rsa->regs[i].pnum == -1) 89 | continue; 90 | if (rsa->regs[i].offset >= rsa->sizeof_g_packet) 91 | rsa->regs[i].in_g_packet = 0; 92 | else 93 | rsa->regs[i].in_g_packet = 1; 94 | } 95 | } 96 | 97 | Then, compile and install gdb: 98 | 99 | ./configure 100 | make 101 | sudo make install 102 | 103 | 104 | Note: this change works for gdb 8.1. For other versions of gdb, the change may be slightly different ([for example](https://blog.csdn.net/u013592097/article/details/70549657)). If you can compile gdb after the change, it should work. 105 | 106 | 107 | # Run 108 | python3 TOSView/pyTracer.py functionYouWantToTrace 109 | 110 | 111 | Results are in the TOSView/result folder. You can open the .svg file with the Firefox browser and enjoy the kernel source code. 
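As a rough illustration of how PRUNELEVEL and PRUNEOUTCOME combine, here is a minimal sketch modelled on the topic selection in prune.py: topics from levels 0 to PRUNELEVEL are intersected with the topics of the chosen outcome, and only files of the surviving topics keep their graph nodes. The helper names `selected_topics` and `valid_files` and the tiny tables are made up for this example and are not part of TOSView.

```python
# Hypothetical toy tables; the real ones live in TOSView/pruneConfig.py.
LEVELTABLE = [[0, 1], [2, 3], [4]]
OUTCOMETABLE = [[0, 2, 4], [1, 3]]
FILETABLE = [['fs/open.c'], ['fs/stat.c'], ['fs/inode.c'],
             ['fs/buffer.c'], ['fs/ext2/']]


def selected_topics(prune_level, prune_outcome):
    """Topics kept: levels 0..prune_level intersected with the chosen outcome."""
    by_level = set()
    for level in LEVELTABLE[:prune_level + 1]:
        by_level.update(level)
    return by_level & set(OUTCOMETABLE[prune_outcome])


def valid_files(topics):
    """Files and folders whose graph nodes survive pruning."""
    return {path for t in topics for path in FILETABLE[t]}


topics = selected_topics(prune_level=1, prune_outcome=0)
print(sorted(topics))
print(sorted(valid_files(topics)))
```

Raising `prune_level` adds the more detailed levels, while switching `prune_outcome` changes which learning outcome the graph is filtered for.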
112 | -------------------------------------------------------------------------------- /arch.py: -------------------------------------------------------------------------------- 1 | def getArch(arch): 2 | dt = {'x86_64': x86_64_arch, 'i386': i386_arch} 3 | return dt[arch](arch) 4 | 5 | 6 | class base_arch: 7 | def __init__(self, arch): 8 | self._arch = arch 9 | def getBit(self): 10 | pass 11 | def getRets(self): 12 | pass 13 | def getEip(self): 14 | pass 15 | 16 | class x86_64_arch(base_arch): 17 | def getBit(self): 18 | return 64 19 | def getRets(self): 20 | return [['retq']] 21 | def getEip(self): 22 | return 'rip' 23 | 24 | class i386_arch(base_arch): 25 | def getBit(self): 26 | return 32 27 | def getRets(self): 28 | return [['ret'], ['leave'], ['pop', '%ebp']] 29 | def getEip(self): 30 | return 'eip' 31 | 32 | -------------------------------------------------------------------------------- /asmAnalyser.py: -------------------------------------------------------------------------------- 1 | import string 2 | import time 3 | import os 4 | from config import ARCH 5 | from arch import getArch 6 | 7 | class asmAnalyser: 8 | def __init__(self, addr): 9 | 10 | self.calls = {} 11 | self.rets = {} 12 | self.callDsts = {} 13 | self.funcAddrs = {} 14 | ah = getArch(ARCH) 15 | self.BIT = ah.getBit() // 4 16 | self.RET = ah.getRets() 17 | 18 | tracerFile = addr.split('/')[-1] + '.tracer' 19 | if(not self.load(tracerFile, addr)): 20 | self.analyze(addr) 21 | self.save(tracerFile, addr) 22 | 23 | 24 | def load(self, tracerFileName, asmFileName): 25 | cur = -1 26 | if(not os.path.isfile(tracerFileName)): 27 | return False 28 | with open(tracerFileName) as f: 29 | ls = f.readlines() 30 | if(ls[0].strip() != str(os.path.getmtime(asmFileName))): 31 | return False 32 | for l in ls[1:]: 33 | if(l[0] == '#'): 34 | if('rets' in l): 35 | cur = 0 36 | elif('calls' in l): 37 | cur = 1 38 | elif('callDsts' in l): 39 | cur = 2 40 | elif('funcAddrs' in l): 41 | cur = 3 42 | else: 43 | ws = 
l.strip().split(':') 44 | if(cur == 0): 45 | self.rets[ws[0]] = ws[1:] 46 | elif(cur == 1): 47 | self.calls[ws[0]] = ws[1:] 48 | elif(cur == 2): 49 | self.callDsts[ws[0]] = ws[1] 50 | elif(cur == 3): 51 | self.funcAddrs[ws[0]] = ws[1] 52 | return True 53 | 54 | def save(self, tracerFileName, asmFileName): 55 | f = open(tracerFileName, 'w') 56 | f.write(str(os.path.getmtime(asmFileName)) + '\n') 57 | f.write('#rets:\n') 58 | for fname in self.rets: 59 | f.write(fname) 60 | for r in self.rets[fname]: 61 | f.write(':' + r) 62 | f.write('\n') 63 | f.write('#calls:\n') 64 | for fname in self.calls: 65 | f.write(fname) 66 | for call in self.calls[fname]: 67 | f.write(':' + call) 68 | f.write('\n') 69 | f.write('#callDsts:\n') 70 | for addr in self.callDsts: 71 | f.write(addr + ':' + self.callDsts[addr] + '\n') 72 | f.write('#funcAddrs:\n') 73 | for func in self.funcAddrs: 74 | f.write(func + ':' + self.funcAddrs[func] + '\n') 75 | f.close() 76 | 77 | 78 | def analyze(self, fName): 79 | def beginWithHex(s): 80 | return all(c in string.hexdigits for c in s[:self.BIT]) 81 | 82 | def dst(d): 83 | s = d.split()[-1] 84 | s = s[1:].replace('%', '$') 85 | s = s.replace(')', '').replace('(', '+') 86 | return s 87 | 88 | def checkRet(ws): 89 | for ret in self.RET: 90 | if(all(w in ws for w in ret)): 91 | return True 92 | return False 93 | 94 | 95 | with open(fName) as f: 96 | name = '' 97 | for l in f: 98 | l = l.strip() 99 | ws = l.split() 100 | if(len(l) < self.BIT or not beginWithHex(l)): 101 | name = '' 102 | continue 103 | elif(l[self.BIT] == ' '): 104 | name = l[l.index('<') + 1: l.index('>')] 105 | self.calls[name] = [] 106 | self.rets[name] = [] 107 | funcAddr = '0x' + l[:self.BIT] 108 | self.funcAddrs[name] = funcAddr 109 | elif(l[self.BIT] == ':'): 110 | addr = '0x' + l[:self.BIT] 111 | if('#' in l): 112 | l = l[:l.index('#')] 113 | b = False 114 | if('*' in l ): 115 | b = True 116 | self.callDsts[addr] = dst(l) 117 | target = '*' + addr 118 | elif('<' in l and '>' in 
l): 119 | cur = l[l.index('<') + 1: l.index('>')] 120 | if('+' in cur): 121 | cur = cur.split('+')[0] 122 | if(cur != name): 123 | b = True 124 | target = '*0x' + ws[-2] 125 | elif(checkRet(ws)): 126 | self.rets[name] += ['*' + addr] 127 | if(b == True and addr != funcAddr): 128 | self.calls[name] += [target, '*' + addr] 129 | 130 | def getCalls(self, func): 131 | return self.calls[func] 132 | 133 | def getRets(self, func): 134 | return self.rets[func] 135 | 136 | def getCallDst(self, addr): 137 | return self.callDsts[addr] 138 | 139 | def getCallSrcs(self): 140 | return self.callDsts.keys() 141 | 142 | def getFuncAddr(self, func): 143 | return self.funcAddrs[func] 144 | 145 | def funcExist(self, func): 146 | return func in self.calls.keys() 147 | 148 | 149 | if __name__=='__main__': 150 | start = time.perf_counter() # time.clock() was removed in Python 3.8 151 | asm = asmAnalyser('/home/alan/xv6-public/kernel.asm') 152 | print(time.perf_counter() - start) 153 | print('call func, ret func, callDst addr') 154 | while(True): 155 | ws = input().split() 156 | if(ws[0] == 'call'): 157 | print(sorted(asm.getCalls(ws[1]))) 158 | elif(ws[0] == 'ret'): 159 | print(sorted(asm.getRets(ws[1]))) 160 | elif(ws[0] == 'callDst'): 161 | print(asm.getCallDst(ws[1])) 162 | else: 163 | break 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | 178 | 179 | -------------------------------------------------------------------------------- /bt.py: -------------------------------------------------------------------------------- 1 | def parse_bt(ls): 2 | l = ' '.join(ls) 3 | tbts = l.split('#')[1:] 4 | bts = [] 5 | num = 0 6 | for bt in tbts: 7 | tmp = bt.split()[0] 8 | if(tmp.isdigit() and int(tmp) == num): 9 | bts.append(bt) 10 | num += 1 11 | elif(len(bts) > 0): 12 | bts[-1] += bt 13 | 14 | infos = [] 15 | for bt in bts: 16 | left = bt.index('(') 17 | right = bt.rindex(')') 18 | bt = bt[:left] + bt[right + 1:] 19 | ws = bt.split() 20 | start = 1 21 | if(len(ws) > 2 and ws[2] == 'in'): 22 | start = 3 23 | func = 
ws[start] 24 | start += 1 25 | file = '' 26 | line = '' 27 | if(start < len(ws) and ws[start] == 'at'): 28 | tmp = ws[start + 1].split(':') 29 | file, line = tmp[0], tmp[1] 30 | 31 | 32 | infos.append([func, file, line]) 33 | return infos 34 | 35 | def cmp_bt(bt, base): 36 | n = len(base) 37 | if(len(bt) >= n): 38 | if(n == 1): 39 | return bt[-1][:2] == base[0][:2] 40 | elif(bt[-n + 1:] == base[1:] and 41 | bt[-n][:2] == base[0][:2]): 42 | return True 43 | return False 44 | 45 | def common_bt(cur, pre, n): 46 | x = -n 47 | m = -min(len(cur), len(pre)) 48 | while(x >= m and cur[x][:2] == pre[x][:2]): 49 | x -= 1 50 | return len(cur) + x + 1 51 | 52 | -------------------------------------------------------------------------------- /config.py: -------------------------------------------------------------------------------- 1 | ### GDB Config: GDBPORT 2 | 3 | GDBPORT = '1234' 4 | 5 | ### OS config: ADDRESSBIT, SOURCEFOLDER, ASMFILE, KERNELOBJ, QEMUCOMMAND 6 | 7 | ## 8 | # Linux config 9 | ### 10 | ARCH = 'x86_64' 11 | SOURCEFOLDER = '/home/alan/linux-4.16' 12 | ASMFILE = SOURCEFOLDER + '/vmlinux.txt' 13 | KERNELOBJ = SOURCEFOLDER + '/vmlinux' 14 | QEMUCOMMAND = 'qemu-system-x86_64 -m 512M -kernel %s/arch/x86/boot/bzImage \ 15 | -initrd %s -gdb tcp::%s -S' % (SOURCEFOLDER, SOURCEFOLDER + '/initrd.img', GDBPORT) 16 | 17 | ### 18 | # xv6 config 19 | ### 20 | # ARCH = 'i386' 21 | # SOURCEFOLDER = '/home/alan/xv6-public' 22 | # ASMFILE = SOURCEFOLDER + '/kernel.asm' 23 | # KERNELOBJ = SOURCEFOLDER + '/kernel' 24 | # QEMUCOMMAND = 'qemu-system-i386 -drive \ 25 | # file=%s/fs.img,index=1,media=disk,format=raw -drive \ 26 | # file=%s/xv6.img,index=0,media=disk,format=raw -m 512M -gdb tcp::%s -S' \ 27 | # % (SOURCEFOLDER, SOURCEFOLDER, GDBPORT) 28 | 29 | 30 | ### Prune Config: PRUNED, PRUNELEVEL, PRUNEOUTCOME 31 | 32 | PRUNED = False 33 | PRUNELEVEL = 1 34 | PRUNEOUTCOME = 0 35 | -------------------------------------------------------------------------------- /gdb.py: 
-------------------------------------------------------------------------------- 1 | from bt import parse_bt, cmp_bt 2 | from graph import graphPainter 3 | from asmAnalyser import asmAnalyser 4 | import time 5 | from config import SOURCEFOLDER, ASMFILE, GDBPORT, ARCH 6 | from arch import getArch 7 | 8 | class gdbTracer: 9 | def __init__(self, p, funcName): 10 | self.p = p 11 | self.func = funcName 12 | self.sourceFolder = SOURCEFOLDER 13 | ah = getArch(ARCH) 14 | self.eip = ah.getEip() 15 | self.max_brkTime = 30 # If a breakpoint is triggered more than this number of times, it will be removed 16 | self.ed = '394743516231415926' # A random number that marks the end of the output of a session 17 | 18 | print('loading %s' % ASMFILE) 19 | cur_time = time.perf_counter() # time.clock() was removed in Python 3.8 20 | self.asm = asmAnalyser(ASMFILE) 21 | print('finished loading, cost %ss' % str(time.perf_counter() - cur_time)) 22 | if(not self.asm.funcExist(funcName)): 23 | print('ERROR! Function %s not exist!' % funcName) 24 | return 25 | p.stdin.write(bytes('target remote:%s\n' % GDBPORT, encoding='utf8')) 26 | 27 | 28 | def configure(self, dotFileName, logFileName, depth): 29 | self.dotFileName = dotFileName 30 | self.log = open(logFileName, 'w') 31 | self.maxDepth = depth 32 | self.endAddrs = [] 33 | self. 
existBreakpoints = set() 34 | self.brks = {} 35 | 36 | def flush(self): 37 | self.write('p ' + self.ed) 38 | self.p.stdin.flush() 39 | 40 | def write(self, s): 41 | self.p.stdin.write(bytes(s + '\n', encoding='utf8')) 42 | self.log.write(s + '\n') 43 | 44 | def read(self): 45 | ls = [] 46 | while(True): 47 | l = self.p.stdout.readline().decode() 48 | ls.append(l) 49 | if(self.ed in l): 50 | break 51 | assert(len(ls) > 0) 52 | self.log.writelines(ls) 53 | ls.pop() 54 | return ls 55 | 56 | def getRip(self): 57 | self.write('p/x $' + self.eip) 58 | 59 | self.flush() 60 | ls = self.read() 61 | assert(len(ls) > 0 and '= ' in ls[-1]) 62 | return ls[-1].split('= ')[-1].strip() 63 | 64 | 65 | def bk(self, point): 66 | if(point[0] != '*'): 67 | self.write('p/x %s' % point) 68 | self.flush() 69 | ls = self.read() 70 | assert(len(ls) > 0 and '=' in ls[-1]) 71 | point = '*' + ls[-1].split('= ')[-1].strip() 72 | 73 | if(point not in self.existBreakpoints): 74 | self.write('b %s' % point) 75 | self.existBreakpoints.add(point) 76 | 77 | def checkBreak(self, ls): 78 | b = 'Breakpoint' 79 | lb = len(b) 80 | bn = -1 81 | for l in ls: 82 | if(len(l) > lb and l[:lb] == b): 83 | bn = int(l.split(',')[0].split()[1]) 84 | break 85 | assert(bn != -1) 86 | if(bn in self.brks.keys()): 87 | self.brks[bn] += 1 88 | else: 89 | self.brks[bn] = 1 90 | if(self.brks[bn] > self.max_brkTime): 91 | self.write('d %d' % bn) 92 | 93 | 94 | def run(self): 95 | self.write('delete') 96 | self.write('b ' + self.func) 97 | self.write('c') 98 | self.write('bt 100') 99 | self.flush() 100 | 101 | ls = self.read() 102 | bt = parse_bt(ls) 103 | base_bt = bt 104 | rip = self.getRip() 105 | func = bt[0][0] 106 | assert(func == self.func) 107 | for point in self.asm.getRets(func): 108 | self.endAddrs.append(point[1:]) 109 | self.bk(point) 110 | self.painter = graphPainter(self.dotFileName, bt, self.sourceFolder, self.maxDepth) 111 | 112 | while(True): 113 | if(cmp_bt(bt, base_bt)): 114 | if(rip in self.endAddrs and 
len(bt) == len(base_bt)): 115 | break 116 | new = self.asm.funcExist(func) and rip == self.asm.getFuncAddr(func) 117 | self.painter.paint(bt, new) 118 | if(len(bt) - len(base_bt) <= self.maxDepth): 119 | if(new): 120 | for point in self.asm.getCalls(func): 121 | self.bk(point) 122 | if(rip in self.asm.getCallSrcs()): 123 | self.bk(self.asm.getCallDst(rip)) 124 | 125 | self.write('c') 126 | self.write('bt 100') 127 | self.flush() 128 | ls = self.read() 129 | bt = parse_bt(ls) 130 | rip = self.getRip() 131 | func = bt[0][0] 132 | self.checkBreak(ls) 133 | 134 | self.painter.close() 135 | self.log.close() 136 | 137 | 138 | 139 | 140 | 141 | -------------------------------------------------------------------------------- /graph.py: -------------------------------------------------------------------------------- 1 | from bt import common_bt 2 | class graphPainter: 3 | def __init__(self, fname, base_bt, sourceFolder, depth): 4 | self.f = open(fname, 'w') 5 | self.existLinks = {} 6 | self.f.write('digraph {\n') 7 | self.base_bt = base_bt 8 | self.pre_bt = base_bt 9 | self.sourceFolder = sourceFolder + '/' 10 | self.maxDepth = depth 11 | self.nodes = {} 12 | self.cnt = 0 13 | 14 | def insert(self, key, n): 15 | if(key not in self.nodes.keys()): 16 | self.nodes[key] = 'n' + str(len(self.nodes.keys())) 17 | self.f.write(self.nodes[key] + ' [label = %s URL = %s];\n' % 18 | (key[:-1] + ':' + n + '"', '"' + self.sourceFolder + key.split('\\n')[1])) 19 | 20 | def paint(self, bt, new): 21 | height = len(bt) - len(self.base_bt) 22 | if(height <= 0): 23 | return 24 | top = common_bt(bt, self.pre_bt, len(self.base_bt)) 25 | if(new): 26 | top = max(top, 1) 27 | cur = '"' + bt[top][0] + '\\n' + bt[top][1] + '"' 28 | if (height - top < self.maxDepth and top > 0): 29 | self.insert(cur, bt[top][2]) 30 | while(height - top < self.maxDepth and top > 0): 31 | top -= 1 32 | pre = cur 33 | cur = '"' + bt[top][0] + '\\n' + bt[top][1] + '"' 34 | self.insert(cur, bt[top][2]) 35 | link = 
self.nodes[pre] + ' -> ' + self.nodes[cur] 36 | self.cnt += 1 37 | if(link not in self.existLinks.keys()): 38 | self.existLinks[link] = [self.cnt] 39 | else: 40 | self.existLinks[link].append(self.cnt) 41 | self.pre_bt = bt 42 | 43 | def close(self): 44 | for link in sorted(self.existLinks.keys()): 45 | if(len(self.existLinks[link]) <= 5): 46 | label = ','.join([str(n) for n in self.existLinks[link]]) 47 | else: 48 | ns = self.existLinks[link][:5] 49 | label = ','.join([str(n) for n in ns]) + ',...' 50 | self.f.write('%s [label = "%s"];\n' % (link, label)) 51 | self.f.write("}\n") 52 | self.f.close() 53 | -------------------------------------------------------------------------------- /prune.py: -------------------------------------------------------------------------------- 1 | from pruneConfig import * 2 | from config import PRUNED, PRUNELEVEL, PRUNEOUTCOME 3 | 4 | def pruneCheck(): 5 | topics = [] 6 | for t in LEVELTABLE: 7 | topics += t 8 | topics.sort() 9 | assert(topics == list(range(len(topics)))) 10 | assert(PRUNELEVEL < len(LEVELTABLE)) 11 | assert(PRUNEOUTCOME < len(OUTCOMETABLE)) 12 | for l in OUTCOMETABLE: 13 | for t in l: 14 | assert(t < TOPICNUMBERS and t >= 0) 15 | assert(len(FILETABLE) == TOPICNUMBERS) 16 | 17 | 18 | def prune(dotFileName, prunedFileName): 19 | pruneCheck() 20 | topics = LEVELTABLE[0].copy() 21 | for k in range(1, PRUNELEVEL + 1): 22 | topics += LEVELTABLE[k].copy() 23 | 24 | outcome = set(OUTCOMETABLE[PRUNEOUTCOME]) 25 | topics = set(topics).intersection(outcome) 26 | validFiles = set() 27 | for t in topics: 28 | for f in FILETABLE[t]: 29 | validFiles.add(f) 30 | 31 | parent = {} 32 | validNodes = set() 33 | validNodes.add('n0') 34 | 35 | def valid(x, n): 36 | if(x in validNodes): 37 | return [x, n] 38 | else: 39 | r = ['', 0] 40 | for p in parent[x]: 41 | if(p[1] < n and p[1] > r[1]): 42 | r = p 43 | return valid(r[0], r[1]) 44 | 45 | prunedFile = open(prunedFileName, 'w') 46 | 47 | with open(dotFileName, 'r') as f: 48 | ls = 
f.readlines() 49 | 50 | prunedFile.write(ls[0]) 51 | prunedFile.write(ls[1]) 52 | 53 | for l in ls[2:]: 54 | ws = l.split() 55 | if(len(ws) == 6): 56 | p = ws[0] 57 | s = ws[2] 58 | ns = ws[5].split('"')[1].split(',') 59 | for t in ns: 60 | if t != '...': 61 | if(s not in parent.keys()): 62 | parent[s] = [[p, int(t)]] 63 | else: 64 | parent[s].append([p, int(t)]) 65 | 66 | elif(len(ws) == 7): 67 | fname = ws[3].split('\\n')[1].split(':')[0] 68 | flag = fname in validFiles 69 | if(len(fname) > 2 and fname[:2] == './'): 70 | start = 2 71 | else: 72 | start = 0 73 | for k in range(start, len(fname)): 74 | if(fname[k] == '/'): 75 | if(fname[start:k + 1] in validFiles): 76 | flag = True 77 | if(flag): 78 | validNodes.add(ws[0]) 79 | nl = ' '.join(ws) 80 | prunedFile.write(nl + '\n') 81 | 82 | 83 | for l in ls[2:]: 84 | ws = l.split() 85 | if(len(ws) == 6): 86 | if(ws[2] in validNodes): 87 | ns = ws[5].split('"')[1].split(',') 88 | ps = {} 89 | for n in ns: 90 | if(n == '...'): 91 | continue 92 | n = int(n) 93 | p = valid(ws[0], n) 94 | if(len(p) > 0): 95 | if(p[1] != n): 96 | p[1] = '%d-%d' % (p[1], n) 97 | else: 98 | p[1] = str(p[1]) 99 | if(p[0] not in ps.keys()): 100 | ps[p[0]] = [p[1]] 101 | else: 102 | ps[p[0]].append(p[1]) 103 | 104 | for p in ps.keys(): 105 | ws[0] = p 106 | ws[5] = '"%s"];' % ','.join(ps[p]) 107 | nl = ' '.join(ws) 108 | prunedFile.write(nl + '\n') 109 | 110 | prunedFile.write(ls[-1]) 111 | prunedFile.close() 112 | 113 | 114 | -------------------------------------------------------------------------------- /pruneConfig.py: -------------------------------------------------------------------------------- 1 | # Topics (of filesystem): 2 | #0 data 3 | #1 metadata 4 | #2 operations 5 | #3 organization 6 | #4 buffering 7 | #5 sequential 8 | #6 nonsequential 9 | #7 directories 10 | #8 partitioning 11 | #9 mount/unmount 12 | #10 virtual file systems 13 | #11 Standard implementation techniques 14 | #12 Memory-mapped files 15 | #13 Special-purpose file 
systems 16 | #14 Naming, searching 17 | #15 access 18 | #16 backups 19 | #17 Journaling and log-structured file systems 20 | 21 | 22 | # Learning outcomes 23 | #0 Describe the choices to be made in designing file systems. 24 | #1 Compare and contrast different approaches to file organization, recognizing the strengths and weaknesses 25 | # of each. 26 | #2 Summarize how hardware developments have led to changes in the priorities for the design and the 27 | # management of file systems. 28 | #3 Summarize the use of journaling and how log-structured file systems enhance fault tolerance. 29 | 30 | 31 | TOPICNUMBERS = 18 32 | LEVELTABLE = [[2, 7, 9, 14, 15, 16], 33 | [0, 1, 3, 4, 5, 6, 8, 10], 34 | [11, 12, 13, 17]] 35 | OUTCOMETABLE = [[0, 1, 2, 3, 4, 7, 8, 10, 14, 15, 16], 36 | [3, 7, 8], 37 | [3, 4, 5, 6, 9, 11, 12, 13], 38 | [17]] 39 | 40 | FILETABLE = [['fs/binfmt_aout.c', 'fs/binfmt_elf_fdpic.c', 'fs/binfmt_elf.c', 41 | 'fs/binfmt_em86.c', 'fs/binfmt_flat.c', 'fs/binfmt_misc.c', 'fs/binfmt_script.c', 'fs/compat_binfmt_elf.c'], 42 | ['fs/attr.c', 'fs/fhandle.c', 'fs/file_table.c', 'fs/file.c', 'fs/stat.c', 'fs/statfs.c', 'fs/xattr.c', 43 | 'fs/ext2/xattr_security.c', 'fs/ext2/xattr_trusted.c', 'fs/ext2/xattr_user.c', 'fs/ext2/xattr_user.c', 44 | 'fs/ext2/xattr.c', 'fs/ext2/xattr.h'], 45 | ['fs/aio.c', 'fs/direct-io.c', 'fs/exec.c', 'fs/fcntl.c', 'fs/fs-writeback.c', 'fs/libfs.c', 46 | 'fs/mpage.c', 'fs/open.c', 'fs/read_write.c', 'fs/splice.c', 'fs/sync.c'], 47 | ['fs/inode.c', 'fs/super.c', 'fs/ext2/balloc.c', 'fs/ext2/ialloc.c', 'fs/ext2/inode.c', 'fs/ext2/super.c'], 48 | ['fs/buffer.c', 'fs/dcache.c', 'fs/drop_caches.c', 'fs/mbcache.c', 'fs/cachefiles/'], 49 | ['fs/char_dev.c', 'fs/pipe.c', 'fs/seq_file.c'], 50 | ['fs/block_dev.c'], 51 | ['fs/readdir.c', 'fs/ext2/dir.c'], 52 | [], 53 | ['fs/mount.h', 'fs/pnode.c', 'fs/proc_namespace.c'], 54 | ['fs/devpts/'], 55 | ['fs/bad_inode.c', 'fs/eventpoll.c', 'fs/iomap.c', 'fs/locks.c', 'fs/select.c', 'fs/signalfd.c', 56 
| 'fs/userfaultfd.c', 'fs/dlm/', 'security/'], 57 | [], 58 | ['fs/9p/', 'fs/afs/', 'fs/cifs/', 'fs/configfs/', 'fs/debugfs/', 'fs/ecryptfs/', 'fs/fuse/'], 59 | ['fs/namei.c', 'fs/namespace.c', 'fs/ext2/namei.c'], 60 | ['fs/dax.c', 'fs/posix_acl.c', 'fs/ext2/acl.c', 'fs/ext2/acl.h'], 61 | ['fs/coredump.c'], 62 | ['fs/nilfs2/']] 63 | 64 | -------------------------------------------------------------------------------- /pyTracer.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import subprocess 3 | import time 4 | import os 5 | import signal 6 | import shutil 7 | 8 | from bt import parse_bt, cmp_bt 9 | from threading import Thread 10 | from graph import graphPainter 11 | from gdb import gdbTracer 12 | from config import QEMUCOMMAND, SOURCEFOLDER, KERNELOBJ 13 | from prune import prune 14 | from config import PRUNED 15 | 16 | 17 | if __name__=='__main__': 18 | funcName = sys.argv[1] 19 | qemu_p = subprocess.Popen(QEMUCOMMAND, shell=True, preexec_fn=os.setsid) 20 | g = subprocess.Popen('gdb ' + KERNELOBJ, shell=True, stdin=subprocess.PIPE, 21 | stdout=subprocess.PIPE, stderr=subprocess.STDOUT, preexec_fn=os.setsid) 22 | 23 | if os.path.exists('result'): 24 | shutil.rmtree('result') 25 | os.makedirs('result') 26 | 27 | logCnt = 0 28 | gt = gdbTracer(g, funcName) 29 | 30 | while True: 31 | dotFileName = 'result/trace%d.dot' % logCnt 32 | svgFileName = 'result/trace%d.svg' % logCnt 33 | logFileName = 'result/log%d.txt' % logCnt 34 | print('input the max depth of calling tree: (max means no limit)') 35 | r = input() 36 | if('max' in r.lower()): 37 | maxDepth = 1000000 38 | else: 39 | maxDepth = int(r.strip()) 40 | gt.configure(dotFileName, logFileName, maxDepth) 41 | try: 42 | gt.run() 43 | except Exception as e: 44 | print(e) 45 | gt.painter.close() 46 | gt.log.close() 47 | 48 | 49 | subprocess.Popen('dot -Tsvg %s -o %s' %(dotFileName, svgFileName), shell=True) 50 | if(PRUNED): 51 | prunedDotFileName = 
dotFileName.split('.')[0] + '_pruned.dot' 52 | prunedSvgFileName = svgFileName.split('.')[0] + '_pruned.svg' 53 | prune(dotFileName, prunedDotFileName) 54 | subprocess.Popen('dot -Tsvg %s -o %s' %(prunedDotFileName, prunedSvgFileName), shell=True) 55 | 56 | print('Finish ploting %s' % svgFileName) 57 | print('Continue and plot another graph on %s?(n/y)' % funcName) 58 | r = input() 59 | if(r.lower() != 'y'): 60 | break 61 | 62 | logCnt += 1 63 | 64 | os.killpg(os.getpgid(g.pid), signal.SIGTERM) 65 | os.killpg(os.getpgid(qemu_p.pid), signal.SIGTERM) 66 | 67 | 68 | 69 | -------------------------------------------------------------------------------- /trace.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alan-Lee123/TOSView/53f6a252581eaddae6c83e9c58540076ee8f0284/trace.png --------------------------------------------------------------------------------