├── .gitignore ├── LICENSE ├── Makefile ├── README.md ├── addnop.py ├── bin_write.py ├── brute_force_disassembler.py ├── brute_force_mapper.py ├── context.py ├── disassembler.py ├── icount.py ├── mapper.py ├── msearch.py ├── multiverse.py ├── parse_popgm.sh ├── rewrite.py ├── runtime.py ├── simplest.c ├── translator.py ├── x64_assembler.py ├── x64_populate_gm.c ├── x64_runtime.py ├── x64_translator.py ├── x86_assembler.py ├── x86_populate_gm.c ├── x86_runtime.py └── x86_translator.py /.gitignore: -------------------------------------------------------------------------------- 1 | * 2 | !*/ 3 | !*.* 4 | *~ 5 | nolibc 6 | teeny 7 | *.o 8 | .* 9 | *.pyc 10 | peda-session-* 11 | uncached.txt 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU LESSER GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | 9 | This version of the GNU Lesser General Public License incorporates 10 | the terms and conditions of version 3 of the GNU General Public 11 | License, supplemented by the additional permissions listed below. 12 | 13 | 0. Additional Definitions. 14 | 15 | As used herein, "this License" refers to version 3 of the GNU Lesser 16 | General Public License, and the "GNU GPL" refers to version 3 of the GNU 17 | General Public License. 18 | 19 | "The Library" refers to a covered work governed by this License, 20 | other than an Application or a Combined Work as defined below. 21 | 22 | An "Application" is any work that makes use of an interface provided 23 | by the Library, but which is not otherwise based on the Library. 24 | Defining a subclass of a class defined by the Library is deemed a mode 25 | of using an interface provided by the Library. 26 | 27 | A "Combined Work" is a work produced by combining or linking an 28 | Application with the Library. The particular version of the Library 29 | with which the Combined Work was made is also called the "Linked 30 | Version". 31 | 32 | The "Minimal Corresponding Source" for a Combined Work means the 33 | Corresponding Source for the Combined Work, excluding any source code 34 | for portions of the Combined Work that, considered in isolation, are 35 | based on the Application, and not on the Linked Version. 36 | 37 | The "Corresponding Application Code" for a Combined Work means the 38 | object code and/or source code for the Application, including any data 39 | and utility programs needed for reproducing the Combined Work from the 40 | Application, but excluding the System Libraries of the Combined Work. 41 | 42 | 1. Exception to Section 3 of the GNU GPL. 43 | 44 | You may convey a covered work under sections 3 and 4 of this License 45 | without being bound by section 3 of the GNU GPL. 46 | 47 | 2. Conveying Modified Versions. 
48 | 49 | If you modify a copy of the Library, and, in your modifications, a 50 | facility refers to a function or data to be supplied by an Application 51 | that uses the facility (other than as an argument passed when the 52 | facility is invoked), then you may convey a copy of the modified 53 | version: 54 | 55 | a) under this License, provided that you make a good faith effort to 56 | ensure that, in the event an Application does not supply the 57 | function or data, the facility still operates, and performs 58 | whatever part of its purpose remains meaningful, or 59 | 60 | b) under the GNU GPL, with none of the additional permissions of 61 | this License applicable to that copy. 62 | 63 | 3. Object Code Incorporating Material from Library Header Files. 64 | 65 | The object code form of an Application may incorporate material from 66 | a header file that is part of the Library. You may convey such object 67 | code under terms of your choice, provided that, if the incorporated 68 | material is not limited to numerical parameters, data structure 69 | layouts and accessors, or small macros, inline functions and templates 70 | (ten or fewer lines in length), you do both of the following: 71 | 72 | a) Give prominent notice with each copy of the object code that the 73 | Library is used in it and that the Library and its use are 74 | covered by this License. 75 | 76 | b) Accompany the object code with a copy of the GNU GPL and this license 77 | document. 78 | 79 | 4. Combined Works. 80 | 81 | You may convey a Combined Work under terms of your choice that, 82 | taken together, effectively do not restrict modification of the 83 | portions of the Library contained in the Combined Work and reverse 84 | engineering for debugging such modifications, if you also do each of 85 | the following: 86 | 87 | a) Give prominent notice with each copy of the Combined Work that 88 | the Library is used in it and that the Library and its use are 89 | covered by this License. 90 | 91 | b) Accompany the Combined Work with a copy of the GNU GPL and this license 92 | document. 93 | 94 | c) For a Combined Work that displays copyright notices during 95 | execution, include the copyright notice for the Library among 96 | these notices, as well as a reference directing the user to the 97 | copies of the GNU GPL and this license document. 98 | 99 | d) Do one of the following: 100 | 101 | 0) Convey the Minimal Corresponding Source under the terms of this 102 | License, and the Corresponding Application Code in a form 103 | suitable for, and under terms that permit, the user to 104 | recombine or relink the Application with a modified version of 105 | the Linked Version to produce a modified Combined Work, in the 106 | manner specified by section 6 of the GNU GPL for conveying 107 | Corresponding Source. 108 | 109 | 1) Use a suitable shared library mechanism for linking with the 110 | Library. A suitable mechanism is one that (a) uses at run time 111 | a copy of the Library already present on the user's computer 112 | system, and (b) will operate properly with a modified version 113 | of the Library that is interface-compatible with the Linked 114 | Version. 
115 | 116 | e) Provide Installation Information, but only if you would otherwise 117 | be required to provide such information under section 6 of the 118 | GNU GPL, and only to the extent that such information is 119 | necessary to install and execute a modified version of the 120 | Combined Work produced by recombining or relinking the 121 | Application with a modified version of the Linked Version. (If 122 | you use option 4d0, the Installation Information must accompany 123 | the Minimal Corresponding Source and Corresponding Application 124 | Code. If you use option 4d1, you must provide the Installation 125 | Information in the manner specified by section 6 of the GNU GPL 126 | for conveying Corresponding Source.) 127 | 128 | 5. Combined Libraries. 129 | 130 | You may place library facilities that are a work based on the 131 | Library side by side in a single library together with other library 132 | facilities that are not Applications and are not covered by this 133 | License, and convey such a combined library under terms of your 134 | choice, if you do both of the following: 135 | 136 | a) Accompany the combined library with a copy of the same work based 137 | on the Library, uncombined with any other library facilities, 138 | conveyed under the terms of this License. 139 | 140 | b) Give prominent notice with the combined library that part of it 141 | is a work based on the Library, and explaining where to find the 142 | accompanying uncombined form of the same work. 143 | 144 | 6. Revised Versions of the GNU Lesser General Public License. 145 | 146 | The Free Software Foundation may publish revised and/or new versions 147 | of the GNU Lesser General Public License from time to time. Such new 148 | versions will be similar in spirit to the present version, but may 149 | differ in detail to address new problems or concerns. 150 | 151 | Each version is given a distinguishing version number. If the 152 | Library as you received it specifies that a certain numbered version 153 | of the GNU Lesser General Public License "or any later version" 154 | applies to it, you have the option of following the terms and 155 | conditions either of that published version or of any later version 156 | published by the Free Software Foundation. If the Library as you 157 | received it does not specify a version number of the GNU Lesser 158 | General Public License, you may choose any version of the GNU Lesser 159 | General Public License ever published by the Free Software Foundation. 160 | 161 | If the Library as you received it specifies that a proxy can decide 162 | whether future versions of the GNU Lesser General Public License shall 163 | apply, that proxy's public statement of acceptance of any version is 164 | permanent authorization for you to choose that version for the 165 | Library. 
166 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | 2 | all: popgm simplest 3 | 4 | popgm: 5 | gcc -o x86_populate_gm -m32 -Wall -nostdlib -fno-toplevel-reorder -masm=intel -O1 x86_populate_gm.c 6 | gcc -o x64_populate_gm -m64 -Wall -nostdlib -fno-toplevel-reorder -masm=intel -O1 x64_populate_gm.c 7 | bash parse_popgm.sh 8 | 9 | simplest: 10 | gcc -o simplest64 -m64 -O1 simplest.c 11 | gcc -o simplest32 -m32 simplest.c 12 | 13 | clean: 14 | rm -f x86_populate_gm x64_populate_gm x86_popgm x64_popgm simplest64 simplest32 simplest64-r simplest32-r 15 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Multiverse 2 | 3 | *Multiverse* is a static binary rewriter with an emphasis on simplicity and correctness. It does not rely on heuristics to perform its rewriting, and it attempts to make as few assumptions as possible to produce a rewritten binary. Details about Multiverse can be found in the paper "Superset Disassembly: Statically Rewriting x86 Binaries Without Heuristics." 4 | 5 | Multiverse currently supports 32-bit and 64-bit x86 binaries. 6 | 7 | ## Requirements 8 | 9 | Multiverse requires the following Python libraries: 10 | * capstone (linear disassembler) (we use a slightly modified version that is needed to rewrite 64-bit binaries. Our modified version can be found [here](https://github.com/baumane/capstone)) 11 | * pwntools (for its assembler bindings) 12 | * pyelftools (for reading elf binaries) 13 | * elfmanip (for modifying elf binaries) (can be found [here](https://github.com/schieb/ELFManip)) 14 | 15 | ## Compiling 16 | 17 | Multiverse is written in Python, but its code to generate a binary's global mapping is written in C. This must be compiled before binaries can be rewritten. To do so, run `make` and the global mapping code will be compiled. 18 | 19 | ## Running 20 | 21 | Multiverse can be run directly, but this will only rewrite binaries with no instrumentation. This can be used to make sure that everything is installed correctly or to debug changes to the rewriter. Running `multiverse.py` on a binary will rewrite it. It can be run like this: `./multiverse.py [options] `. There are several flags that can be passed to Multiverse to control how a binary is rewritten: 22 | * --so to rewrite a shared object 23 | * --execonly to rewrite only a main binary (it will use the original, unmodified libraries) 24 | * --nopic to write a binary without support for arbitrary position-independent code. It still supports common compiler-generated pic, but not arbitrary accesses to the program counter. This is not currently recommended for 64-bit binaries. 25 | * --arch to select the architecture of the binary. Current supported architectures are `x86` and `x86-64`. The default is `x86`. 26 | 27 | Rewritten binaries are named as the original filename with "-r" appended (e.g. `simplest64` becomes `simplest64-r`). 28 | 29 | Rewritten binaries *must* be run with the `LD_BIND_NOW` environment variable set to 1. This prevents control from flowing to the dynamic linker at runtime. Since we do not rewrite the dynamic linker, this is necessary for correct execution (e.g. to run `simplest-r`, type `LD_BIND_NOW=1 ./simplest-r`). 
30 | 31 | A very simple example program is provided (`simplest.c`), which is automatically compiled when building Multiverse's global mapping code. This can be used to test that Multiverse is installed correctly. For example, to rewrite only the main executable for `simplest64`, the 64-bit version of `simplest`, type `./multiverse.py --execonly --arch x86-64 simplest64` and then run it with `LD_BIND_NOW=1 ./simplest64-r`. 32 | 33 | `rewrite.py` is a utility script to rewrite a binary and its libraries, so that `multiverse.py` does not have to be run manually for each library, and it automatically creates a directory for the rewritten libraries, plus a shell script to run the rewritten binary. For simplicity when rewriting binaries, we recommend using this script. For example, to rewrite `simplest64`, type `./rewrite.py -64 simplest64`, and the script will rewrite the main binary and all its required libraries (as long as they are not dynamically loaded via a mechanism such as `dlopen`; since statically determining dynamically loaded libraries is difficult, they must be manually extracted and their paths be placed in `-dynamic-libs.txt`, and then `rewrite.py` will rewrite them). This may take several minutes. When it is complete, run the rewritten binary with `bash simplest64-r.sh`. 34 | 35 | ## Instrumentation 36 | 37 | Multiverse is used as a Python library to instrument binaries. Right now, the instrumentation API is very simple and consists only of the function `set_before_inst_callback`, which takes a function that is called for every instruction that is encountered and will insert whichever bytes the callback function returns before the corresponding instruction. The callback function should accept a single argument: an instruction object, as created by the Capstone disassembler. It should return a byte array containing the assembled instructions to be inserted. 38 | 39 | In order to use multiverse, a script should import the Rewriter object (`from multiverse import Rewriter`) and then create an instance of Rewriter. Its constructor takes three boolean arguments: 40 | * `write_so` to rewrite a shared object 41 | * `exec_only` to rewrite only a main binary (it will use the original, unmodified libraries) 42 | * `no_pic` to write a binary without support for arbitrary position-independent code. It still supports common compiler-generated pic, but not arbitrary accesses to the program counter. This is not currently recommended for 64-bit binaries. 43 | 44 | `exec_only` and `no_pic` are performance optimizations that will not work on all binaries. For a main executable, `write_so` should be False, and for shared objects, `write_so` should be True. If `exec_only` is False, then all shared objects used by the binary must be rewritten. 45 | 46 | Two simple instrumentation examples can be found in `icount.py` (insert code to increment a counter before every instruction) and `addnop.py` (insert a nop before every instruction). These are currently configured to instrument only the main executable of 64-bit binaries. For example, to insert nops into `simplest64`, type `python addnop.py simplest64`, and to run the instrumented binary, type `LD_BIND_NOW=1 ./simplest64-r`. 47 | 48 | We are working on a higher-level API that will allow code written in C to be seamlessly called at instrumentation points, but it is not yet available. 
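As a concrete reference, the callback API described above amounts to only a few lines of Python. The sketch below is essentially a trimmed-down version of `addnop.py` (included in this repository) and inserts a `nop` before every instruction of a 64-bit main executable:

```python
#!/usr/bin/python
# Minimal instrumentation sketch: insert a nop before every instruction.
import sys
from multiverse import Rewriter
from x64_assembler import _asm

def insert_nop(inst):
    # inst is the Capstone instruction about to be translated; return the
    # assembled bytes to insert before it (or None to insert nothing).
    return _asm('nop')

if __name__ == '__main__':
    # write_so=False, exec_only=True, no_pic=False
    rewriter = Rewriter(False, True, False)
    rewriter.set_before_inst_callback(insert_nop)
    rewriter.rewrite(sys.argv[1], 'x86-64')
```

Saved as, say, `insert_nops.py` (the name is arbitrary), it is used the same way as the bundled examples: `python insert_nops.py simplest64`, then `LD_BIND_NOW=1 ./simplest64-r`.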
49 | 50 | ## Citing 51 | 52 | If you create a research work that uses Multiverse, please cite the associated paper: 53 | 54 | ``` 55 | @inproceedings{Multiverse:NDSS18, 56 | author = {Erick Bauman and Zhiqiang Lin and Kevin Hamlen}, 57 | title = {Superset Disassembly: Statically Rewriting x86 Binaries Without Heuristics}, 58 | booktitle = {Proceedings of the 25th Annual Network and Distributed System Security Symposium (NDSS'18)}, 59 | address = {San Diego, CA}, 60 | month = {February}, 61 | year = 2018, 62 | } 63 | ``` 64 | -------------------------------------------------------------------------------- /addnop.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | import sys 4 | from elftools.elf.elffile import ELFFile 5 | from multiverse import Rewriter 6 | from x64_assembler import _asm 7 | 8 | def count_instruction(inst): 9 | template = ''' 10 | nop 11 | ''' 12 | inc = template 13 | return _asm( inc ) 14 | 15 | if __name__ == '__main__': 16 | if len(sys.argv) == 2: 17 | f = open(sys.argv[1]) 18 | e = ELFFile(f) 19 | entry_point = e.header.e_entry 20 | f.close() 21 | #write_so = False, exec_only = True, no_pic = True 22 | rewriter = Rewriter(False,True,False) 23 | rewriter.set_before_inst_callback(count_instruction) 24 | rewriter.rewrite(sys.argv[1],'x86-64') 25 | else: 26 | print "Error: must pass executable filename.\nCorrect usage: %s "%sys.argv[0] 27 | -------------------------------------------------------------------------------- /bin_write.py: -------------------------------------------------------------------------------- 1 | #import sys 2 | #sys.path.insert(0,'/home/erick/git/delinker/Delinker/src') 3 | from elfmanip.elfmanip import ELFManip, CustomSection, CustomSegment 4 | from elfmanip.constants import PT_LOAD, SHF_TLS, PT_TLS 5 | 6 | from elftools.elf.elffile import ELFFile 7 | 8 | tls_section_added = False 9 | tls_section_contents = b'' 10 | tls_section_offset = 0 11 | 12 | def add_tls_section(fname,contents): 13 | # This does not require ELFManip because it must 14 | # be called earlier on, before we actually rewrite the 15 | # binary, because I need the new TLS offset. 16 | # We could obviously create the ELFManip object now, 17 | # but it won't be used again until we write it out at 18 | # the end. 19 | global tls_section_added 20 | global tls_section_contents 21 | tls_section_added = True 22 | #Pad contents to 4-byte alignment 23 | tls_section_contents = contents+('\0'*(4-len(contents)%4)) 24 | with open(fname) as f: 25 | elf = ELFFile(f) 26 | for s in elf.iter_segments(): 27 | #Assume only one TLS segment exists (will fail on an already modified binary) 28 | if s.header['p_type'] == 'PT_TLS': 29 | tls_section_offset = s.header['p_memsz']+len(tls_section_contents) 30 | print 'old section is 0x%x (%x with padding)'%(s.header['p_memsz'], s.header['p_memsz']+(4-s.header['p_memsz']%4)) 31 | print 'new content is 0x%x (%x with padding)'%(len(contents), len(contents)+(4-len(contents)%4)) 32 | print 'overall 0x%x (%x with padding)'%(tls_section_offset, tls_section_offset+(4-tls_section_offset%4)) 33 | return tls_section_offset + (4-tls_section_offset%4) 34 | return len(contents) + (4-len(contents)%4) #If there is no TLS segment 35 | 36 | def get_tls_content(elf): 37 | # For now assume that the TLS sections are adjacent and 38 | # we can append their contents directly 39 | # I also am assuming that there will probably be only 40 | # two sections, .tdata and .tbss, which seems likely. 
41 | # This may work under different circumstances but it is 42 | # hard to predict. 43 | content = b'' 44 | if tls_section_added: 45 | content+=tls_section_contents 46 | print 'length of new contents: 0x%x'%len(content) 47 | for entry in elf.shdrs['entries']: 48 | if (entry.sh_flags & SHF_TLS) == SHF_TLS: 49 | if entry.sh_type == SHT_NOBITS: # bss has no contents 50 | content+='\0'*entry.sh_size # fill bss space with 0 51 | print 'adding .tbss section of length: 0x%x'%entry.sh_size 52 | else: 53 | content+=entry.contents 54 | print 'adding .tdata section of length: 0x%x'%len(entry.contents) 55 | return content 56 | 57 | def rewrite_noglobal(fname,nname,newcode,newbase,entry): 58 | elf = ELFManip(fname,num_adtl_segments=1) 59 | with open(newcode) as f: 60 | newbytes = f.read() 61 | elf.relocate_phdrs() 62 | newtext_section = CustomSection(newbytes, sh_addr = newbase) 63 | if newtext_section is None: 64 | raise Exception 65 | newtext_segment = CustomSegment(PT_LOAD) 66 | newtext_segment = elf.add_segment(newtext_segment) 67 | elf.add_section(newtext_section, newtext_segment) 68 | elf.set_entry_point(entry) 69 | elf.write_new_elf(nname) 70 | 71 | def rewrite(fname,nname,newcode,newbase,newglobal,newglobalbase,entry,text_section_offs,text_section_size,num_new_segments,arch): 72 | #TODO: change rewrite to take the context instead, and just retrieve the data it needs from that. 73 | elf = ELFManip(fname,num_adtl_segments=num_new_segments) 74 | if text_section_size >= elf.ehdr['e_phentsize']*(elf.ehdr['e_phnum']+num_new_segments+1): 75 | num_new_segments += 1 # Add an extra segment for the overwritten contents of the text section 76 | newtls = get_tls_content(elf) #Right now there will ALWAYS be a new TLS section 77 | with open(newcode) as f: 78 | newbytes = f.read() 79 | # IF the text section is large enough to hold the phdrs (true for a nontrivial program) 80 | # AND the architecture is x86-64, because I have not written 32-bit code to restore the text section yet 81 | # TODO: add support to 32-bit rewriter to use .text section for phdrs 82 | if arch == 'x86-64' and text_section_size >= elf.ehdr['e_phentsize']*(elf.ehdr['e_phnum']+num_new_segments): 83 | # Place the phdrs at the start of the (original) text section, overwriting the contents 84 | print 'placing phdrs in .text section, overwriting contents until runtime' 85 | #print 'BUT for now, still do it the original way so we can do a quick test...' 86 | #elf.relocate_phdrs() 87 | elf.relocate_phdrs(custom_offset=text_section_offs,new_size=elf.ehdr['e_phentsize']*(elf.ehdr['e_phnum']+num_new_segments)) 88 | # Assume that the phdrs won't be larger than a page, and just copy that entire first page of the text section. 
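# (The "duptext" section created below keeps a copy of that overwritten first page
# of .text at a fixed address so the original bytes remain available at runtime;
# see the "overwriting contents until runtime" note above and the remark in
# multiverse.py that 32-bit restore code has not been written yet.)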
89 | duptext_section = CustomSection(elf.elf.get_section_by_name('.text').data()[:4096], sh_addr = newglobalbase-0x20000) #TODO: make this address flexible 90 | duptext_segment = CustomSegment(PT_LOAD) 91 | duptext_segment = elf.add_segment(duptext_segment) 92 | elf.add_section(duptext_section, duptext_segment) 93 | else: 94 | # Use the previous heuristics to relocate the phdrs and hope for the best 95 | print '.text section too small to hold phdrs (or 32-bit binary); using other heuristics to relocate phdrs' 96 | elf.relocate_phdrs() 97 | newtext_section = CustomSection(newbytes, sh_addr = newbase) 98 | newglobal_section = CustomSection(newglobal, sh_addr = newglobalbase) 99 | newtls_section = CustomSection(newtls, sh_addr = newglobalbase-0x10000) #TODO: make this address flexible 100 | if newtext_section is None or newglobal_section is None: 101 | raise Exception 102 | newtext_segment = CustomSegment(PT_LOAD) 103 | newtext_segment = elf.add_segment(newtext_segment) 104 | newglobal_segment = CustomSegment(PT_LOAD) 105 | newglobal_segment = elf.add_segment(newglobal_segment) 106 | elf.add_section(newtext_section, newtext_segment) 107 | elf.add_section(newglobal_section, newglobal_segment) 108 | 109 | newtls_segment = CustomSegment(PT_LOAD) 110 | newtls_segment = elf.add_segment(newtls_segment) 111 | elf.add_section(newtls_section, newtls_segment) 112 | newtls_segment = CustomSegment(PT_TLS, p_align=4) 113 | newtls_segment = elf.add_segment(newtls_segment) 114 | elf.add_section(newtls_section, newtls_segment) 115 | 116 | elf.set_entry_point(entry) 117 | elf.write_new_elf(nname) 118 | 119 | if __name__ == '__main__': 120 | if len(sys.argv) != 2: 121 | print "needs filename" 122 | 123 | fn = sys.argv[1] 124 | 125 | elf = ELFManip(fn) 126 | 127 | newcode = 'newbytes' 128 | 129 | elf.add_section(newcode, sh_addr = 0x09000000) 130 | #elf.set_entry_point(0x09000200) #teeny 131 | #elf.set_entry_point(0x09000854) #simplest main 132 | #elf.set_entry_point(0x09000230) #eip 133 | #elf.set_entry_point(0x09000228) #mem 134 | #elf.set_entry_point(0x09002278) #64-bit echo (which therefore wouldn't work regardless) 135 | #elf.set_entry_point(0x09000765) #simplest (_init at 0xc78) 136 | #elf.set_entry_point(0x0900026c) #lookup 137 | #(0x8048cf0 - 0x8048000)+0x59838 = 0x5a428 (lookup index) 138 | #elf.set_entry_point(0x09001ce8) #bzip2 139 | elf.set_entry_point(0x090013ef) #ssimplest 140 | 141 | elf.write_new_elf('relocated') 142 | 143 | -------------------------------------------------------------------------------- /brute_force_disassembler.py: -------------------------------------------------------------------------------- 1 | import capstone 2 | from disassembler import Disassembler 3 | 4 | class BruteForceDisassembler(Disassembler): 5 | ''' Brute-force disassembler that disassembles bytes 6 | from every offset; all possible code that could 7 | execute is disassembled. Overlapping instructions are 8 | flattened out and duplicate sequences are connected 9 | with jump instructions. 10 | 11 | Uses Capstone as its underlying linear disassembler.''' 12 | 13 | def __init__(self,arch): 14 | if arch == 'x86': 15 | self.md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_32) 16 | elif arch == 'x86-64': 17 | self.md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_64) 18 | else: 19 | raise NotImplementedError( 'Architecture %s is not supported'%arch ) 20 | self.md.detail = True 21 | 22 | def disasm(self,bytes,base): 23 | print 'Starting disassembly...' 
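# Superset disassembly: begin a linear sweep at every byte offset of the code.
# dummymap records each address already emitted; when a sweep reaches one of
# these, yield None so the caller (the mapper) knows to end the current
# sequence and connect it to the previously translated copy with a jump,
# rather than translating the same instructions twice.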
24 | dummymap = {} 25 | ten_percent = len(bytes)/10 26 | for instoff in range(0,len(bytes)): 27 | if instoff%ten_percent == 0: 28 | print 'Disassembly %d%% complete...'%((instoff/ten_percent)*10) 29 | while instoff < len(bytes): 30 | off = base+instoff 31 | try: 32 | if not off in dummymap: #If this offset has not been disassembled 33 | insts = self.md.disasm(bytes[instoff:instoff+15],base+instoff)#longest x86/x64 instr is 15 bytes 34 | ins = insts.next() #May raise StopIteration 35 | instoff+=len(ins.bytes) 36 | dummymap[ins.address] = True # Show that we have disassembled this address 37 | yield ins 38 | else: #If this offset has already been disassembled 39 | yield None #Indicates we encountered this offset before 40 | break #Stop disassembling from this starting offset 41 | except StopIteration: #Not a valid instruction 42 | break #Stop disassembling from this starting offset 43 | raise StopIteration 44 | 45 | -------------------------------------------------------------------------------- /brute_force_mapper.py: -------------------------------------------------------------------------------- 1 | import struct 2 | from mapper import Mapper 3 | from brute_force_disassembler import BruteForceDisassembler 4 | 5 | class BruteForceMapper(Mapper): 6 | ''' This mapper disassembled from every offset and includes a 7 | mapping for instructions at every byte offset in the code. 8 | To avoid duplicate code, when the disassembler encounters instructions 9 | it has encountered before, the mapper simply includes a jump instruction 10 | to link the current sequence to a previously mapped sequence.''' 11 | 12 | def __init__(self,arch,bytes,base,entry,context): 13 | self.disassembler = BruteForceDisassembler(arch) 14 | self.bytes = bytes 15 | self.base = base 16 | self.entry = entry 17 | self.context = context 18 | if arch == 'x86': 19 | #NOTE: We are currently NOT supporting instrumentation because we are passing 20 | #None to the translator. TODO: Add back instrumentation after everything gets 21 | #working again, and make instrumentation feel more organized 22 | from x86_translator import X86Translator 23 | from x86_runtime import X86Runtime 24 | self.translator = X86Translator(context.before_inst_callback,self.context) 25 | self.runtime = X86Runtime(self.context) 26 | global assembler 27 | import x86_assembler as assembler 28 | elif arch == 'x86-64': 29 | from x64_translator import X64Translator 30 | from x64_runtime import X64Runtime 31 | self.translator = X64Translator(context.before_inst_callback,self.context) 32 | self.runtime = X64Runtime(self.context) 33 | global assembler 34 | import x64_assembler as assembler 35 | else: 36 | raise NotImplementedError( 'Architecture %s is not supported'%arch ) 37 | 38 | def gen_mapping(self): 39 | print 'Generating mapping...' 
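# First of two passes over the superset disassembly.  This pass only needs the
# size of each translated instruction (translate_one is called without a
# mapping), so that every original address can be assigned an offset into the
# new text section; gen_newcode() repeats the walk with the completed mapping
# to emit the actual bytes and resolve branch targets.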
40 | mapping = {} 41 | maplist = [] 42 | currmap = {} 43 | last = None #Last instruction disassembled 44 | reroute = assembler.asm('jmp $+0x8f') #Dummy jmp to imitate connecting jmp; we may not know dest yet 45 | for ins in self.disassembler.disasm(self.bytes,self.base): 46 | if ins is None and last is not None: # Encountered a previously disassembled instruction and have not redirected 47 | currmap[last.address] += len(reroute) 48 | last = None #If we have not found any more new instructions since our last redirect, don't redirect again 49 | maplist.append(currmap) 50 | currmap = {} 51 | elif ins is not None: 52 | last = ins #Remember the last disassembled instruction 53 | newins = self.translator.translate_one(ins,None) #In this pass, the mapping is incomplete 54 | if newins is not None: 55 | currmap[ins.address] = len(newins) 56 | else: 57 | currmap[ins.address] = len(ins.bytes) 58 | self.context.lookup_function_offset = 0 #Place lookup function at start of new text section 59 | lookup_size = len(self.runtime.get_lookup_code(self.base,len(self.bytes),0,0x8f)) #TODO: Issue with mapping offset & size 60 | offset = lookup_size 61 | if self.context.exec_only: 62 | self.context.secondary_lookup_function_offset = offset 63 | secondary_lookup_size = len(self.runtime.get_secondary_lookup_code(self.base,len(self.bytes),offset,0x8f)) 64 | offset += secondary_lookup_size 65 | for m in maplist: 66 | for k in sorted(m.keys()): 67 | size = m[k] 68 | mapping[k] = offset 69 | offset+=size #Add the size of this instruction to the total offset 70 | #Now that the mapping is complete, we know the length of it 71 | self.context.mapping_offset = len(self.bytes)+self.base #Where we pretend the mapping was in the old code 72 | if not self.context.write_so: 73 | self.context.new_entry_off = offset #Set entry point to start of auxvec 74 | offset+=len(self.runtime.get_auxvec_code(0x8f))#Unknown entry addr here, but not needed b/c we just need len 75 | mapping[self.context.lookup_function_offset] = self.context.lookup_function_offset 76 | if self.context.exec_only: 77 | #This is a very low number and therefore will not be written out into the final mapping. 78 | #It is used to convey this offset for the second phase when generating code, specifically 79 | #for the use of remap_target. Without setting this it always sets the target to 0x8f. Sigh. 80 | mapping[self.context.secondary_lookup_function_offset] = self.context.secondary_lookup_function_offset 81 | #Don't yet know mapping offset; we must compute it 82 | mapping[len(self.bytes)+self.base] = offset 83 | print 'final offset for mapping is: 0x%x' % offset 84 | if not self.context.write_so: 85 | #For NOW, place the global data/function at the end of this because we can't necessarily fit 86 | #another section. TODO: put this somewhere else 87 | #The first time, sysinfo's and flag's location is unknown, 88 | #so they are wrong in the first call to get_global_lookup_code 89 | #However, the global_flag is moving to a TLS section, so it takes 90 | #up no space in the global lookup 91 | #global_flag = global_lookup + len(get_global_lookup_code()) 92 | #popgm goes directly after the global lookup, and global_sysinfo directly after that. 
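# The region at context.global_lookup is laid out as
#   [global lookup code][popgm code][global_sysinfo slot][...]
# so popgm_offset and global_sysinfo below are simply the running lengths of
# the pieces placed before them.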
93 | self.context.popgm_offset = len(self.runtime.get_global_lookup_code()) 94 | self.context.global_sysinfo = self.context.global_lookup + self.context.popgm_offset + len(self.runtime.get_popgm_code()) 95 | #Now that this is set, the auxvec code should work 96 | return mapping 97 | 98 | def gen_newcode(self,mapping): 99 | print 'Generating new code...' 100 | newbytes = '' 101 | bytemap = {} 102 | maplist = [] 103 | last = None #Last instruction disassembled 104 | for ins in self.disassembler.disasm(self.bytes,self.base): 105 | if ins is None and last is not None: # Encountered a previously disassembled instruction and have not redirected 106 | target = last.address + len(last.bytes) #address of where in the original code we would want to jmp to 107 | next_target = self.translator.remap_target(last.address, mapping, target, len(bytemap[last.address]) ) 108 | reroute = assembler.asm( 'jmp $+%s'%(next_target) ) 109 | #Maximum relative displacement is 32 for x86 and x64, so this works for both platforms 110 | if len(reroute) == 2: #Short encoding, which we do not want 111 | reroute+='\x90\x90\x90' #Add padding of 3 NOPs 112 | bytemap[last.address] += reroute 113 | last = None 114 | maplist.append(bytemap) 115 | bytemap = {} 116 | elif ins is not None: 117 | last = ins 118 | newins = self.translator.translate_one(ins,mapping) #In this pass, the mapping is incomplete 119 | if newins is not None: 120 | bytemap[ins.address] = newins #Old address maps to these new instructions 121 | else: 122 | bytemap[ins.address] = str(ins.bytes) #This instruction is unchanged, and its old address maps to it 123 | #Add the lookup function as the first thing in the new text section 124 | newbytes+=self.runtime.get_lookup_code(self.base,len(self.bytes),self.context.lookup_function_offset,mapping[self.context.mapping_offset]) 125 | if self.context.exec_only: 126 | newbytes += self.runtime.get_secondary_lookup_code(self.base,len(self.bytes),self.context.secondary_lookup_function_offset,mapping[self.context.mapping_offset]) 127 | count = 0 128 | for m in maplist: 129 | for k in sorted(m.keys()): #For each original address to code, in order of original address 130 | newbytes+=m[k] 131 | if not self.context.write_so: 132 | newbytes+=self.runtime.get_auxvec_code(mapping[self.entry]) 133 | print 'mapping is being placed at offset: 0x%x' % len(newbytes) 134 | #Append mapping to end of bytes 135 | newbytes+=self.write_mapping(mapping,self.base,len(self.bytes)) 136 | return newbytes 137 | 138 | def write_mapping(self,mapping,base,size): 139 | bytes = b'' 140 | for addr in range(base,base+size): 141 | if addr in mapping: 142 | if addr < 10: 143 | print 'offset for 0x%x: 0x%x' % (addr, mapping[addr]) 144 | bytes+=struct.pack('"%sys.argv[0] 46 | -------------------------------------------------------------------------------- /mapper.py: -------------------------------------------------------------------------------- 1 | 2 | class Mapper(object): 3 | ''' A mapper maps old addresses to new addresses and old 4 | instructions to new instructions. 5 | 6 | This is a generic Mapper object. 
All mappers 7 | used by this system should inherit from this parent 8 | object and provide implementations for all functions listed.''' 9 | 10 | def __init__(self,arch,bytes,base,entry,context): 11 | raise NotImplementedError('Override __init__() in a child class') 12 | def gen_mapping(self): 13 | raise NotImplementedError('Override gen_mapping() in a child class') 14 | def gen_newcode(self): 15 | raise NotImplementedError('Override gen_newcode() in a child class') 16 | -------------------------------------------------------------------------------- /msearch.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | import json,sys 3 | 4 | def search(item): 5 | with open('mapdump.json','rb') as f: 6 | mapping = json.load(f) 7 | if str(item) in mapping: 8 | return '0x%x'%int(mapping[str(item)]) 9 | else: 10 | return 'not found' 11 | 12 | def rsearch(item): 13 | with open('mapdump.json','rb') as f: 14 | mapping = json.load(f) 15 | for key,value in mapping.iteritems(): 16 | if item == value: 17 | return '0x%x'%int(key) 18 | return 'not found' 19 | 20 | if __name__ == '__main__': 21 | if len(sys.argv) < 2 or len(sys.argv) > 3: 22 | print "Correct usage: %s [-r]
" 23 | if len(sys.argv) == 2: 24 | print search(int(sys.argv[1],16)) 25 | if len(sys.argv) == 3 and sys.argv[1] == '-r': 26 | print rsearch(int(sys.argv[2],16)) 27 | -------------------------------------------------------------------------------- /multiverse.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | from elftools.elf.elffile import ELFFile 3 | import capstone 4 | import sys 5 | #import cProfile 6 | import x64_assembler 7 | import bin_write 8 | import json 9 | import os 10 | import re 11 | 12 | from context import Context 13 | from brute_force_mapper import BruteForceMapper 14 | 15 | save_reg_template = ''' 16 | mov DWORD PTR [esp%s], %s 17 | ''' 18 | restore_reg_template = ''' 19 | mov %s, DWORD PTR [esp%s] 20 | ''' 21 | 22 | save_register = ''' 23 | mov %s, -12 24 | mov %s, %s''' 25 | 26 | memory_ref_string = re.compile(u'^dword ptr \[(?P
0x[0-9a-z]+)\]$') 27 | 28 | ''' 29 | call X 30 | '''#%('eax',cs_insn.reg_name(opnd.reg),'eax') 31 | ''' 32 | class Context(object): 33 | def __init__(): 34 | self''' 35 | 36 | #Transforms the 'r_info' field in a relocation entry to the offset into another table 37 | #determined by the host reloc table's 'sh_link' entry. In our case it's the dynsym table. 38 | def ELF32_R_SYM(val): 39 | return (val) >> 8 40 | def ELF64_R_SYM(val): 41 | return (val) >> 32 42 | 43 | #Globals: If there end up being too many of these, put them in a Context & pass them around 44 | '''plt = {} 45 | newbase = 0x09000000 46 | #TODO: Set actual address of function 47 | lookup_function_offset = 0x8f 48 | secondary_lookup_function_offset = 0x8f #ONLY used when rewriting ONLY main executable 49 | mapping_offset = 0x8f 50 | global_sysinfo = 0x8f #Address containing sysinfo's address 51 | global_flag = 0x8f 52 | global_lookup = 0x7000000 #Address containing global lookup function 53 | popgm = 'popgm' 54 | popgm_offset = 0x8f 55 | new_entry_off = 0x8f 56 | write_so = False 57 | exec_only = False 58 | no_pic = False 59 | get_pc_thunk = None 60 | stat = {} 61 | stat['indcall'] = 0 62 | stat['indjmp'] = 0 63 | stat['dircall'] = 0 64 | stat['dirjmp'] = 0 65 | stat['jcc'] = 0 66 | stat['ret'] = 0 67 | stat['origtext'] = 0 68 | stat['newtext'] = 0 69 | stat['origfile'] = 0 70 | stat['newfile'] = 0 71 | stat['mapsize'] = 0 72 | stat['lookupsize'] = 0 73 | #stat['auxvecsize'] = 0 74 | #stat['globmapsize'] = 0 75 | #stat['globlookupsize'] = 0 76 | #List of library functions that have callback args; each function in the dict has a list of 77 | #the arguments passed to it that are a callback (measured as the index of which argument it is) 78 | #TODO: Handle more complex x64 calling convention 79 | #TODO: Should I count _rtlf_fini (offset 5)? It seems to be not in the binary 80 | callbacks = {'__libc_start_main':[0,3,4]}''' 81 | 82 | class Rewriter(object): 83 | 84 | def __init__(self,write_so,exec_only,no_pic): 85 | self.context = Context() 86 | self.context.write_so = write_so 87 | self.context.exec_only = exec_only 88 | self.context.no_pic = no_pic 89 | 90 | def set_before_inst_callback(self,func): 91 | '''Pass a function that will be called when translating each instruction. 92 | This function should accept an instruction argument (the instruction type returned from capstone), 93 | which can be read to determine what code to insert (if any). A byte string of assembled bytes 94 | should be returned to be inserted before the instruction, or if none are to be inserted, return None. 95 | 96 | NOTE: NOTHING is done to protect the stack, registers, flags, etc! If ANY of these are changed, there 97 | is a chance that EVERYTHING will go wrong! Leave everything as you found it or suffer the consequences! 98 | ''' 99 | self.context.before_inst_callback = func 100 | 101 | def alloc_globals(self,size,arch): 102 | '''Allocate an arbitrary amount of contiguous space for global variables for use by instrumentation code. 103 | Returns the address of the start of this space. 
104 | ''' 105 | #create a temporary mapper to get where the globals would be inserted 106 | self.context.alloc_globals = 0 107 | mapper = BruteForceMapper(arch,b'',0,0,self.context) 108 | retval = self.context.global_lookup + len(mapper.runtime.get_global_mapping_bytes()) 109 | #Now actually set the size of allocated space 110 | self.context.alloc_globals = size 111 | return retval 112 | 113 | #Find the earliest address we can place the new code 114 | def find_newbase(self,elffile): 115 | maxaddr = 0 116 | for seg in elffile.iter_segments(): 117 | segend = seg.header['p_vaddr']+seg.header['p_memsz'] 118 | if segend > maxaddr: 119 | maxaddr = segend 120 | maxaddr += ( 0x1000 - maxaddr%0x1000 ) # Align to page boundary 121 | return maxaddr 122 | 123 | def rewrite(self,fname,arch): 124 | offs = size = addr = 0 125 | with open(fname,'rb') as f: 126 | elffile = ELFFile(f) 127 | relplt = None 128 | relaplt = None 129 | dynsym = None 130 | entry = elffile.header.e_entry #application entry point 131 | for section in elffile.iter_sections(): 132 | if section.name == '.text': 133 | print "Found .text" 134 | offs = section.header.sh_offset 135 | size = section.header.sh_size 136 | addr = section.header.sh_addr 137 | self.context.oldbase = addr 138 | # If .text section is large enough to hold all new segments, we can move the phdrs there 139 | if size >= elffile.header['e_phentsize']*(elffile.header['e_phnum']+self.context.num_new_segments+1): 140 | self.context.move_phdrs_to_text = True 141 | if section.name == '.plt': 142 | self.context.plt['addr'] = section.header['sh_addr'] 143 | self.context.plt['size'] = section.header['sh_size'] 144 | self.context.plt['data'] = section.data() 145 | if section.name == '.rel.plt': 146 | relplt = section 147 | if section.name == '.rela.plt': #x64 has .rela.plt 148 | relaplt = section 149 | if section.name == '.dynsym': 150 | dynsym = section 151 | if section.name == '.symtab': 152 | for sym in section.iter_symbols(): 153 | if sym.name == '__x86.get_pc_thunk.bx': 154 | self.context.get_pc_thunk = sym.entry['st_value'] #Address of thunk 155 | #section.get_symbol_by_name('__x86.get_pc_thunk.bx')) #Apparently this is in a newer pyelftools 156 | self.context.plt['entries'] = {} 157 | if relplt is not None: 158 | for rel in relplt.iter_relocations(): 159 | got_off = rel['r_offset'] #Get GOT offset address for this entry 160 | ds_ent = ELF32_R_SYM(rel['r_info']) #Get offset into dynamic symbol table 161 | if dynsym: 162 | name = dynsym.get_symbol(ds_ent).name #Get name of symbol 163 | self.context.plt['entries'][got_off] = name #Insert this mapping from GOT offset address to symbol name 164 | elif relaplt is not None: 165 | for rel in relaplt.iter_relocations(): 166 | got_off = rel['r_offset'] #Get GOT offset address for this entry 167 | ds_ent = ELF64_R_SYM(rel['r_info']) #Get offset into dynamic symbol table 168 | if dynsym: 169 | name = dynsym.get_symbol(ds_ent).name #Get name of symbol 170 | self.context.plt['entries'][got_off] = name #Insert this mapping from GOT offset address to symbol name 171 | #print self.context.plt 172 | else: 173 | print 'binary does not contain plt' 174 | if self.context.write_so: 175 | print 'Writing as .so file' 176 | self.context.newbase = self.find_newbase(elffile) 177 | elif self.context.exec_only: 178 | print 'Writing ONLY main binary, without support for rewritten .so files' 179 | self.context.newbase = 0x09000000 180 | else: 181 | print 'Writing as main binary' 182 | self.context.newbase = 0x09000000 183 | if self.context.no_pic: 
184 | print 'Rewriting without support for generic PIC' 185 | for seg in elffile.iter_segments(): 186 | if seg.header['p_flags'] == 5 and seg.header['p_type'] == 'PT_LOAD': #Executable load seg 187 | print "Base address: %s"%hex(seg.header['p_vaddr']) 188 | bytes = seg.data() 189 | base = seg.header['p_vaddr'] 190 | mapper = BruteForceMapper(arch,bytes,base,entry,self.context) 191 | mapping = mapper.gen_mapping() 192 | newbytes = mapper.gen_newcode(mapping) 193 | #Perhaps I could find a better location to set the value of global_flag 194 | #(which is the offset from gs) 195 | #I only need one byte for the global flag, so I am adding a tiny bit to TLS 196 | #add_tls_section returns the offset, but we must make it negative 197 | self.context.global_flag = -bin_write.add_tls_section(fname,b'\0') 198 | print 'just set global_flag value to 0x%x'%self.context.global_flag 199 | #maptext = write_mapping(mapping,base,len(bytes)) 200 | #(mapping,newbytes) = translate_all(seg.data(),seg.header['p_vaddr']) 201 | #insts = md.disasm(newbytes[0x8048360-seg.header['p_vaddr']:0x8048441-seg.header['p_vaddr']],0x8048360) 202 | #The "mysterious" bytes between the previously patched instruction 203 | #(originally at 0x804830b) are the remaining bytes from that jmp instruction! 204 | #So even though there was nothing between that jmp at the end of that plt entry 205 | #and the start of the next plt entry, now there are 4 bytes from the rest of the jmp. 206 | #This is a good example of why I need to take a different approach to generating the mapping. 207 | #insts = md.disasm(newbytes[0x80483af-seg.header['p_vaddr']:0x80483bf-seg.header['p_vaddr']],0x80483af) 208 | #insts = md.disasm(newbytes,0x8048000) 209 | #for ins in insts: 210 | # print '0x%x:\t%s\t%s'%(ins.address,ins.mnemonic,ins.op_str) 211 | #tmpdct = {hex(k): (lambda x:hex(x+seg.header['p_vaddr']))(v) for k,v in mapping.items()} 212 | #keys = tmpdct.keys() 213 | #keys.sort() 214 | #output = '' 215 | #for key in keys: 216 | # output+='%s:%s '%(key,tmpdct[key]) 217 | with open('newbytes','wb') as f2: 218 | f2.write(newbytes) 219 | if not self.context.write_so: 220 | with open('newglobal','wb') as f2: 221 | f2.write(mapper.runtime.get_global_mapping_bytes()) 222 | #print output 223 | print mapping[base] 224 | print mapping[base+1] 225 | maptext = mapper.write_mapping(mapping,base,len(bytes)) 226 | cache = '' 227 | for x in maptext: 228 | #print x 229 | cache+='%d,'%int(x.encode('hex'),16) 230 | #print cache 231 | #print maptext.encode('hex') 232 | print '0x%x'%(base+len(bytes)) 233 | print 'code increase: %d%%'%(((len(newbytes)-len(bytes))/float(len(bytes)))*100) 234 | lookup = mapper.runtime.get_lookup_code(base,len(bytes),self.context.lookup_function_offset,0x8f) 235 | print 'lookup w/unknown mapping %s'%len(lookup) 236 | #insts = md.disasm(lookup,0x0) 237 | #for ins in insts: 238 | # print '0x%x:\t%s\t%s\t%s'%(ins.address,str(ins.bytes).encode('hex'),ins.mnemonic,ins.op_str) 239 | lookup = mapper.runtime.get_lookup_code(base,len(bytes),self.context.lookup_function_offset,mapping[self.context.mapping_offset]) 240 | print 'lookup w/known mapping %s'%len(lookup) 241 | #insts = md.disasm(lookup,0x0) 242 | #for ins in insts: 243 | # print '0x%x:\t%s\t%s\t%s'%(ins.address,str(ins.bytes).encode('hex'),ins.mnemonic,ins.op_str) 244 | if not self.context.write_so: 245 | print 'new entry point: 0x%x'%(self.context.newbase + self.context.new_entry_off) 246 | print 'new _start point: 0x%x'%(self.context.newbase + mapping[entry]) 247 | print 'global lookup: 
0x%x'%self.context.global_lookup 248 | print 'local lookup: 0x%x'%self.context.lookup_function_offset 249 | print 'secondary local lookup: 0x%x'%self.context.secondary_lookup_function_offset 250 | print 'mapping offset: 0x%x'%mapping[self.context.mapping_offset] 251 | with open('%s-r-map.json'%fname,'wb') as f: 252 | json.dump(mapping,f) 253 | if not self.context.write_so: 254 | bin_write.rewrite(fname,fname+'-r','newbytes',self.context.newbase,mapper.runtime.get_global_mapping_bytes(),self.context.global_lookup,self.context.newbase+self.context.new_entry_off,offs,size,self.context.num_new_segments,arch) 255 | else: 256 | self.context.new_entry_off = mapping[entry] 257 | bin_write.rewrite_noglobal(fname,fname+'-r','newbytes',self.context.newbase,self.context.newbase+self.context.new_entry_off) 258 | self.context.stat['origtext'] = len(bytes) 259 | self.context.stat['newtext'] = len(newbytes) 260 | self.context.stat['origfile'] = os.path.getsize(fname) 261 | self.context.stat['newfile'] = os.path.getsize(fname+'-r') 262 | self.context.stat['mapsize'] = len(maptext) 263 | self.context.stat['lookupsize'] = \ 264 | len(mapper.runtime.get_lookup_code(base,len(bytes),self.context.lookup_function_offset,mapping[self.context.mapping_offset])) 265 | if self.context.exec_only: 266 | self.context.stat['secondarylookupsize'] = \ 267 | len(mapper.runtime.get_secondary_lookup_code(base,len(bytes), \ 268 | self.context.secondary_lookup_function_offset,mapping[self.context.mapping_offset])) 269 | if not self.context.write_so: 270 | self.context.stat['auxvecsize'] = len(mapper.runtime.get_auxvec_code(mapping[entry])) 271 | popgm = 'x86_popgm' if arch == 'x86' else 'x64_popgm' # TODO: if other architectures are added, this will need to be changed 272 | with open(popgm) as f: 273 | tmp=f.read() 274 | self.context.stat['popgmsize'] = len(tmp) 275 | self.context.stat['globmapsectionsize'] = len(mapper.runtime.get_global_mapping_bytes()) 276 | self.context.stat['globlookupsize'] = len(mapper.runtime.get_global_lookup_code()) 277 | with open('%s-r-stat.json'%fname,'wb') as f: 278 | json.dump(self.context.stat,f,sort_keys=True,indent=4,separators=(',',': ')) 279 | 280 | ''' 281 | with open(fname,'rb') as f: 282 | f.read(offs) 283 | bytes = f.read(size) 284 | (mapping,newbytes) = translate_all(bytes,addr) 285 | md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_64) 286 | for i in range(0,size): 287 | #print dir(md.disasm(bytes[i:i+15],addr+i)) 288 | insts = md.disasm(newbytes[i:i+15],addr+i) 289 | ins = None 290 | try: 291 | ins = insts.next()#longest possible x86/x64 instruction is 15 bytes 292 | #print str(ins.bytes).encode('hex') 293 | #print ins.size 294 | #print dir(ins) 295 | except StopIteration: 296 | pass 297 | if ins is None: 298 | pass#print 'no legal decoding' 299 | else: 300 | pass#print '0x%x:\t%s\t%s'%(ins.address,ins.mnemonic,ins.op_str) 301 | print {k: (lambda x:x+addr)(v) for k,v in mapping.items()} 302 | print asm(save_register%('eax','eax','eax')).encode('hex')''' 303 | 304 | if __name__ == '__main__': 305 | import argparse 306 | 307 | parser = argparse.ArgumentParser(description='''Rewrite a binary so that the code is relocated. 308 | Running this script from the terminal does not allow any instrumentation. 
309 | For that, use this as a library instead.''') 310 | parser.add_argument('filename',help='The executable file to rewrite.') 311 | parser.add_argument('--so',action='store_true',help='Write a shared object.') 312 | parser.add_argument('--execonly',action='store_true',help='Write only a main executable without .so support.') 313 | parser.add_argument('--nopic',action='store_true',help='Write binary without support for arbitrary pic. It still supports common compiler-generated pic.') 314 | parser.add_argument('--arch',default='x86',help='The architecture of the binary. Default is \'x86\'.') 315 | args = parser.parse_args() 316 | rewriter = Rewriter(args.so,args.execonly,args.nopic) 317 | rewriter.rewrite(args.filename,args.arch) 318 | #cProfile.run('renable(args.filename,args.arch)') 319 | 320 | -------------------------------------------------------------------------------- /parse_popgm.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Match the offset (index 1) and size (index 2) of the .text section so we can create a file 3 | # containing only the raw bytes of the .text section. 4 | re='.text[[:space:]]+PROGBITS[[:space:]]+[0-9a-f]+[[:space:]]+([0-9a-f]+)[[:space:]]+([0-9a-f]+)' 5 | textsection=$(readelf -S -W x86_populate_gm | grep '.text') 6 | if [[ ${textsection} =~ ${re} ]]; then 7 | dd if=x86_populate_gm of=x86_popgm skip=$((0x${BASH_REMATCH[1]})) bs=1 count=$((0x${BASH_REMATCH[2]})) 8 | fi 9 | textsection=$(readelf -S -W x64_populate_gm | grep '.text') 10 | if [[ ${textsection} =~ ${re} ]]; then 11 | dd if=x64_populate_gm of=x64_popgm skip=$((0x${BASH_REMATCH[1]})) bs=1 count=$((0x${BASH_REMATCH[2]})) 12 | fi 13 | -------------------------------------------------------------------------------- /rewrite.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | import sys,os 3 | import subprocess 4 | import shutil 5 | 6 | from multiverse import Rewriter 7 | 8 | def extract_libraries(fname): 9 | result = subprocess.check_output('ldd %s'%fname, shell=True) 10 | libs = result.split('\n') 11 | paths = [] 12 | for lib in libs: 13 | if '=>' in lib: 14 | path = lib[lib.find('=>')+2:lib.find(' (0x')].strip() 15 | if path != '': 16 | paths.append(path) 17 | return paths 18 | 19 | def extract_dynamic_libraries(fname, libpath): 20 | paths = [] 21 | dynlib = os.path.join(libpath, fname+'-dynamic-libs.txt') 22 | if os.path.exists(dynlib): 23 | with open(dynlib) as f: 24 | path = f.readline() 25 | while path != '': 26 | paths.append(path.strip()) 27 | path = f.readline() 28 | return paths 29 | 30 | def rewrite_libraries(libpath,paths,arch): 31 | rewriter = Rewriter(True,False,False) 32 | for path in paths: 33 | (base,fname) = os.path.split(path) 34 | libname = os.path.join(libpath,fname) 35 | shutil.copy(path,libname) 36 | rewriter.rewrite(libname,arch) 37 | os.remove(libname) 38 | shutil.move(libname+'-r',libname) 39 | shutil.move(libname+'-r-map.json',libname+'-map.json') 40 | shutil.move(libname+'-r-stat.json',libname+'-stat.json') 41 | 42 | if __name__ == '__main__': 43 | arch = 'x86' 44 | if len(sys.argv) == 2 or len(sys.argv) == 3: 45 | fpath = '' 46 | dynamic_only = False 47 | if len(sys.argv) == 2: 48 | fpath = sys.argv[1] 49 | else: 50 | fpath = sys.argv[2] 51 | if sys.argv[1] == '-d': 52 | dynamic_only = True 53 | if sys.argv[1] == '-64': 54 | arch = 'x86-64' 55 | 56 | paths = [] 57 | 58 | if not dynamic_only: 59 | print 'Getting required libraries for %s'%fpath 60 | paths = 
extract_libraries(fpath) 61 | 62 | (base,fname) = os.path.split(fpath) 63 | libpath = os.path.join(base,fname+'-libs-r') 64 | if not os.path.exists(libpath): 65 | os.makedirs(libpath) 66 | print 'Getting dynamic libraries' 67 | paths.extend(extract_dynamic_libraries(fname,libpath)) 68 | print 'Rewriting libraries' 69 | print paths 70 | rewrite_libraries(libpath,paths,arch) 71 | 72 | if not dynamic_only: 73 | print 'Rewriting main binary' 74 | rewriter = Rewriter(False,False,False) 75 | rewriter.rewrite(fpath,arch) 76 | 77 | print 'Writing runnable .sh' 78 | with open(fpath+'-r.sh', 'w') as f: 79 | ld_preload = '' 80 | for path in extract_dynamic_libraries(fname,libpath): 81 | (lbase,lname) = os.path.split(path) 82 | ld_preload += os.path.join(libpath,lname) + ' ' 83 | f.write('#!/bin/bash\nLD_LIBRARY_PATH=./%s LD_BIND_NOW=1 LD_PRELOAD="%s" ./%s'%( fname+'-libs-r', ld_preload, fname+'-r' ) ) 84 | else: 85 | print "Error: must pass executable filename.\nCorrect usage: %s [-d -64] \nUse -d flag to rewrite only dynamic libaries.\nUse -64 flag to rewrite 64-bit binaries."%sys.argv[0] 86 | -------------------------------------------------------------------------------- /runtime.py: -------------------------------------------------------------------------------- 1 | 2 | class Runtime(object): 3 | ''' The BinForce runtime library includes all code needed to run 4 | the rewritten binary. This includes the functions to populate 5 | the global mapping and perform lookups in mappings. 6 | 7 | This is a generic Runtime object. All runtimes 8 | used by this system should inherit from this parent 9 | object and provide implementations for all functions listed.''' 10 | def __init__(self,context): 11 | raise NotImplementedError('Override __init__() in a child class') 12 | def get_lookup_code(self,base,size,lookup_off,mapping_off): 13 | raise NotImplementedError('Override get_lookup_code() in a child class') 14 | def get_secondary_lookup_code(self,base,size,sec_lookup_off,mapping_off): 15 | raise NotImplementedError('Override get_secondary_lookup_code() in a child class') 16 | def get_global_lookup_code(self): 17 | raise NotImplementedError('Override get_global_lookup_code() in a child class') 18 | def get_auxvec_code(self,entry): 19 | raise NotImplementedError('Override get_auxvec_code() in a child class') 20 | def get_popgm_code(self): 21 | raise NotImplementedError('Override get_popgm_code() in a child class') 22 | def get_global_mapping_bytes(self): 23 | raise NotImplementedError('Override get_global_mapping_bytes() in a child class') 24 | -------------------------------------------------------------------------------- /simplest.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | int add(int a, int b){ 4 | return a+b; 5 | } 6 | 7 | int main(int argc, char** argv){ 8 | printf("%d\n",add(2,4)); 9 | } 10 | -------------------------------------------------------------------------------- /translator.py: -------------------------------------------------------------------------------- 1 | 2 | class Translator(object): 3 | ''' A Translator converts the original instructions from a source 4 | binary into their corresponding translated instructions for 5 | the rewritten binary. This includes translating addresses 6 | for jmp/JCC/call/ret destinations and inserting user-defined 7 | instrumentation code around instructions. 8 | 9 | This is a generic Translator object. 
All translators 10 | used by this system should inherit from this parent 11 | object and provide implementations for all functions listed.''' 12 | def __init__(self,before_callback,context): 13 | raise NotImplementedError('Override __init__() in a child class') 14 | def translate_one(self,ins,mapping): 15 | raise NotImplementedError('Override translate_one() in a child class') 16 | def translate_uncond(self,ins,mapping): 17 | raise NotImplementedError('Override translate_uncond() in a child class') 18 | def translate_cond(self,ins,mapping): 19 | raise NotImplementedError('Override translate_cond() in a child class') 20 | def translate_ret(self,ins,mapping): 21 | raise NotImplementedError('Override translate_ret() in a child class') 22 | def remap_target(self,addr,mapping,target,offs): 23 | raise NotImplementedError('Override remap_target() in a child class') 24 | -------------------------------------------------------------------------------- /x64_assembler.py: -------------------------------------------------------------------------------- 1 | import pwn 2 | pwn.context(os='linux',arch='amd64') 3 | import re 4 | import struct 5 | 6 | cache = {} 7 | # Metacache stores data about an assembled instruction. 8 | # Specifically, right now it only holds the offset of the 9 | # displacement value (if the instruction encodes a 4-byte displacement). 10 | # This is only used for efficient modification of 11 | # already-assembled instructions containing a reference to rip. 12 | # This value allows us to change the offset from rip regardless of 13 | # the instruction. 14 | # even if 15 | # there is an immediate value (which appears at the end of an 16 | # encoded instruction's bytes). 17 | metacache = {} 18 | pat = re.compile('\$\+[-]?0x[0-9a-f]+') 19 | pat2 = re.compile('[ ]*push [0-9]+[ ]*') 20 | pat3 = re.compile('[ ]*mov eax, (d)?word ptr \[0x[0-9a-f]+\][ ]*') 21 | pat4 = re.compile('[ ]*mov eax, (dword ptr )?\[(?Pe[a-z][a-z])( )?[+-]( )?(0x)?[0-9a-f]+\][ ]*') 22 | pat5 = re.compile('(0x[0-9a-f]+|[0-9]+)') 23 | pat6 = re.compile('[ ]*(?P(add)|(sub)) (?P(esp)|(ebx)),(?P[0-9]+)[ ]*') 24 | pat7 = re.compile('[ ]*mov eax, word ptr.*')#Match stupid size mismatch 25 | pat8 = re.compile('[ ]*mov eax, .[xip]')#Match ridiculous register mismatch 26 | rip_with_offset = re.compile(u'\[rip(?: (?P[\+\-] [0x]?[0-9a-z]+))?\]') #Apparently the hex prefix is optional if the number is...unambiguous? 27 | 28 | #jcxz and jecxz are removed because they don't have a large expansion 29 | JCC = ['jo','jno','js','jns','je','jz','jne','jnz','jb','jnae', 30 | 'jc','jnb','jae','jnc','jbe','jna','ja','jnbe','jl','jnge','jge', 31 | 'jnl','jle','jng','jg','jnle','jp','jpe','jnp','jpo'] 32 | 33 | #Simple cache code. Called after more complex preprocessing of assembly source. 
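# Illustrative usage sketch (not part of the original file): because _asm() keys
# the cache on the exact assembly text, a given fragment is only sent to pwn.asm()
# once per run, e.g.
#   _asm('push rbx')   # first call: assembles via pwn.asm, logs to uncached.txt, caches b'\x53'
#   _asm('push rbx')   # later calls: the bytes come straight back from the cache dict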
34 | def _asm(text): 35 | if text in cache: 36 | return cache[text] 37 | else: 38 | with open('uncached.txt','a') as f: 39 | f.write(text+'\n') 40 | code = pwn.asm(text) 41 | cache[text] = code 42 | return code 43 | 44 | def asm(text): 45 | code = b'' 46 | for line in text.split('\n'): 47 | if not line.find(';') == -1: 48 | line = line[:line.find(';')]#Eliminate comments 49 | #Check for offsets ($+) 50 | match = pat.search(line) 51 | if match and match.group() != '$+0x8f': 52 | off = int(match.group()[2:],16) 53 | line = line.strip() 54 | mnemonic = line[:line.find(' ')] 55 | line = pat.sub('$+0x8f',line) #Replace actual offset with dummy 56 | newcode = _asm(line) #Assembled code with dummy offset 57 | if mnemonic in ['jmp','call']: 58 | off-=5 #Subtract 5 because the large encoding knows it's 5 bytes long 59 | newcode = newcode[0]+struct.pack(']" to "mov eax, dword ptr []"' 72 | code+=b'\xa1' + struct.pack(' 0x7f: 91 | line = pat5.sub('0x8f',line) 92 | original = struct.pack(' 0x7f: 116 | newcode = _asm('%s %s,0x8f'%(mnemonic,register) ) 117 | newcode = newcode[:2] + struct.pack(']" to "mov eax, dword ptr []"' 127 | code+=_asm(line.replace(' word',' dword')) 128 | elif pat8.match(line): 129 | print 'WARNING: silently converting "mov eax, [xip]" to "mov eax, e[xip]"' 130 | code+=_asm(line.replace(', ',', e')) 131 | elif rip_with_offset.search(line): 132 | #print 'WARNING: using assumption to efficiently assemble "%s"' % line 133 | m = rip_with_offset.search(line) 134 | newstr = rip_with_offset.sub('[rip]', line) 135 | if newstr in metacache: 136 | # Assemble it with no offset, which must have have already been added to the cache 137 | newcode = _asm( newstr ) 138 | if m.group('offset'): 139 | #immediate = newcode[-metacache[newstr]:] if newstr in metacache else b'' 140 | #print 'WARNING: using assumption to efficiently assemble "%s"' % line 141 | # Replace 4 bytes of displacement with little-endian encoded offset retrieved from the original assembly 142 | #code += newcode[:-(4+len(immediate))] + struct.pack( ' 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include 26 | #else 27 | #define NULL ( (void *) 0) 28 | #endif 29 | 30 | struct gm_entry { 31 | unsigned long lookup_function; 32 | unsigned long start; 33 | unsigned long length; 34 | }; 35 | 36 | unsigned int __attribute__ ((noinline)) my_read(int, char *, unsigned int); 37 | int __attribute__ ((noinline)) my_open(const char *); 38 | void populate_mapping(unsigned int, unsigned long, unsigned long, unsigned long, struct gm_entry *); 39 | void process_maps(char *, struct gm_entry *); 40 | struct gm_entry lookup(unsigned long, struct gm_entry *); 41 | 42 | #ifdef DEBUG 43 | int wrapper(struct gm_entry *global_mapping){ 44 | #else 45 | int _start(struct gm_entry *global_mapping){ 46 | #endif 47 | // force string to be stored on the stack even with optimizations 48 | //char maps_path[] = "/proc/self/maps\0"; 49 | volatile int maps_path[] = { 50 | 0x6f72702f, 51 | 0x65732f63, 52 | 0x6d2f666c, 53 | 0x00737061, 54 | }; 55 | 56 | unsigned int buf_size = 0x10000; 57 | char buf[buf_size]; 58 | int proc_maps_fd; 59 | int cnt, offset = 0; 60 | 61 | 62 | proc_maps_fd = my_open((char *) &maps_path); 63 | cnt = my_read(proc_maps_fd, buf, buf_size); 64 | while( cnt != 0 && offset < buf_size ){ 65 | offset += cnt; 66 | cnt = my_read(proc_maps_fd, buf+offset, buf_size-offset); 67 | } 68 | buf[offset] = '\0';// must null terminate 69 | 70 | #ifdef DEBUG 71 | printf("READ:\n%s\n", buf); 72 | process_maps(buf,global_mapping); 73 | int items 
= global_mapping[0].lookup_function; 74 | // simulation for testing 75 | populate_mapping(items + 0, 0x08800000, 0x08880000, 0x07000000, global_mapping); 76 | populate_mapping(items + 1, 0x09900000, 0x09990000, 0x07800000, global_mapping); 77 | global_mapping[0].lookup_function += 2;//Show that we have added these 78 | /* 79 | int i; 80 | for (i = 0x08800000; i < 0x08880000; i++){ 81 | if (lookup(i, global_mapping) != 0x07000000){ 82 | printf("Failed lookup of 0x%08x\n", i); 83 | } 84 | } 85 | */ 86 | //check edge cases 87 | 88 | printf("Testing %x (out of range)\n",0x08800000-1); 89 | lookup(0x08800000-1, global_mapping); 90 | printf("Testing %x (in range)\n",0x08800000); 91 | lookup(0x08800000, global_mapping); 92 | printf("Testing %x (in range)\n",0x08800001); 93 | lookup(0x08800001, global_mapping); 94 | printf("Testing %x (in range)\n",0x08880000); 95 | lookup(0x08880000, global_mapping); 96 | printf("Testing %x (out of range)\n",0x08880000+1); 97 | lookup(0x08880000+1, global_mapping); 98 | //printf("0x08812345 => 0x%08x\n", lookup(0x08812345, global_mapping)); 99 | #else 100 | process_maps(buf, global_mapping); 101 | #endif 102 | return 0; 103 | } 104 | 105 | #ifdef DEBUG 106 | struct gm_entry lookup(unsigned long addr, struct gm_entry *global_mapping){ 107 | unsigned int index; 108 | unsigned long gm_size = global_mapping[0].lookup_function;//Size is stored in first entry 109 | global_mapping++;//Now we point at the true first entry 110 | //Use binary search on the already-sorted entries 111 | //Here is a linear search for simple testing purposes. 112 | //For small arrays, binary search may not be as useful, so I may for now just use linear search. 113 | //I can try using binary search later and doing a performance comparison. 114 | //However, if I want to do binary search, I should do a conditional mov to reduce the number of branches 115 | for(index = 0; index < gm_size; index++){ 116 | //printf("SEARCHING 0x%lx :: mapping[%d] :: 0x%lx :: 0x%lx :: 0x%lx\n", addr, index, global_mapping[index].lookup_function, global_mapping[index].start, global_mapping[index].length); 117 | if( addr - global_mapping[index].start <= global_mapping[index].length){ 118 | printf("0x%lx :: mapping[%d] :: 0x%lx :: 0x%lx :: 0x%lx\n", addr, index, global_mapping[index].lookup_function, global_mapping[index].start, global_mapping[index].length); 119 | } 120 | } 121 | 122 | return global_mapping[index]; 123 | } 124 | #endif 125 | 126 | unsigned int __attribute__ ((noinline)) my_read(int fd, char *buf, unsigned int count){ 127 | unsigned long bytes_read; 128 | asm volatile( 129 | ".intel_syntax noprefix\n" 130 | "mov rax, 0\n" 131 | "mov rdi, %1\n" 132 | "mov rsi, %2\n" 133 | "mov rdx, %3\n" 134 | "syscall\n" 135 | "mov %0, rax\n" 136 | : "=g" (bytes_read) 137 | : "g" ((long)fd), "g" (buf), "g" ((long)count) 138 | : "rax", "rdi", "rsi", "rdx", "rcx", "r11" 139 | ); 140 | return (unsigned int) bytes_read; 141 | } 142 | 143 | int __attribute__ ((noinline)) my_open(const char *path){ 144 | unsigned long fp; 145 | asm volatile( 146 | ".intel_syntax noprefix\n" 147 | "mov rax, 2\n" 148 | "mov rdi, %1\n" 149 | "mov rsi, 0\n" 150 | "mov rdx, 0\n" 151 | "syscall\n" 152 | "mov %0, rax\n" 153 | : "=r" (fp) 154 | : "g" (path) 155 | : "rcx", "r11" 156 | ); 157 | return (int) fp; 158 | } 159 | 160 | #define PERM_WRITE 1 161 | #define PERM_EXEC 2 162 | unsigned char get_permissions(char *line){ 163 | // e.g., "08048000-08049000 r-xp ..." or "08048000-08049000 rw-p ..." 
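	// Worked example (hypothetical line, same format as above): for
	// "08048000-08049000 rw-p ...", the loop below stops at the first space,
	// line+2 then points at 'w', so PERM_WRITE is set and PERM_EXEC is not
	// (returns 1); for an "r-xp" mapping the 'w' test fails and the 'x' test
	// sets PERM_EXEC (returns 2).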
164 | unsigned char permissions = 0; 165 | while( *line != ' ' ) line++; 166 | line+=2; //Skip space and 'r' entry, go to 'w' 167 | if( *line == 'w' ) permissions |= PERM_WRITE; 168 | line++; //Go to 'x' 169 | if( *line == 'x' ) permissions |= PERM_EXEC; 170 | return permissions; 171 | } 172 | 173 | #define is_write(p) (p & PERM_WRITE) 174 | #define is_exec(p) (p & PERM_EXEC) 175 | 176 | #define NUM_EXTERNALS 3 177 | 178 | /* 179 | Check whether the memory range is not rewritten by our system: 180 | This includes [vsyscall], [vdso], and the dynamic loader 181 | */ 182 | unsigned char is_external(char *line){ 183 | volatile char externals[][11] = { 184 | "/ld-", 185 | "[vdso]", 186 | "[vsyscall]" 187 | }; 188 | unsigned int offset,i; 189 | char *lineoff; 190 | while( *line != ' ' ) line++; // Skip memory ranges 191 | line += 21; // Skip permissions and some other fields 192 | while( *line != ' ' ) line++; // Skip last field 193 | while( *line == ' ' ) line++; // Skip whitespace 194 | if( *line != '\n'){ // If line has text at the end 195 | // Could have done a string matching state machine here, but 196 | // it would be harder to add extra strings to later. 197 | for( i = 0; i < NUM_EXTERNALS; i++ ){ 198 | offset = 0; 199 | lineoff = line-1; 200 | while( *lineoff != '\n' && *lineoff != '\0' ){ 201 | // This is not perfect string matching, and will not work in general cases 202 | // because we do not backtrack. It should work with the strings we are searching 203 | // for now, plus it's relatively simple to do it this way, so I'm leaving it like 204 | // this for the time being. 205 | lineoff++; //Increment lineoff here so that we compare to the previous char for the loop 206 | if( externals[i][offset] == '\0' ){ 207 | return 1;// Matched 208 | } 209 | if( *lineoff == externals[i][offset] ){ 210 | offset++; // If we are matching, move forward one in external 211 | }else{ 212 | offset = 0; // If they failed to match, start over at the beginning 213 | } 214 | } 215 | } 216 | } 217 | return 0; //Not an external 218 | } 219 | 220 | char *next_line(char *line){ 221 | /* 222 | * finds the next line to process 223 | */ 224 | for (; line[0] != '\0'; line++){ 225 | if (line[0] == '\n'){ 226 | if (line[1] == '\0') 227 | return NULL; 228 | return line+1; 229 | } 230 | } 231 | return NULL; 232 | } 233 | 234 | unsigned long my_atol(char *a){ 235 | /* 236 | * convert unknown length (max 16) hex string into its integer representation 237 | * assumes input is from /proc/./maps 238 | * i.e., 'a' is a left-padded 16 byte lowercase hex string 239 | * e.g., "000000000804a000" 240 | */ 241 | #ifdef DEBUG 242 | //printf("Converting string to long: \"%s\"\n", a); 243 | #endif 244 | unsigned long l = 0; 245 | unsigned char digit = *a; 246 | while( (digit >= '0' && digit <= '9') || (digit >= 'a' && digit <= 'f') ){ 247 | digit -= '0'; 248 | if( digit > 9 ) digit -= 0x27; // digit was hex character 249 | l <<= 4; // Shift by half a byte 250 | l += digit; 251 | digit = *(++a); 252 | } 253 | #ifdef DEBUG 254 | //printf("Resulting value: %lx\n", l); 255 | #endif 256 | return l; 257 | } 258 | 259 | void parse_range(char *line, unsigned long *start, unsigned long *end){ 260 | /* 261 | * e.g., "08048000-08049000 ..." 262 | * Unfortunately, for 64-bit applications, the address ranges do not have a 263 | * consistent length! We must determine how many digits are in each number. 
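	 * For instance, with a hypothetical 64-bit line "7f1200400000-7f1200423000 ...",
	 * my_atol() stops at the '-' (not a hex digit) and yields start=0x7f1200400000,
	 * and my_atol(line+1) stops at the trailing space, yielding end=0x7f1200423000,
	 * regardless of how many digits each number happens to have.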
264 | */ 265 | char *line_start = line; 266 | while( *line != '-' ) line++; 267 | *start = my_atol(line_start); 268 | *end = my_atol(line+1); 269 | } 270 | 271 | void populate_mapping(unsigned int gm_index, unsigned long start, unsigned long end, unsigned long lookup_function, struct gm_entry *global_mapping){ 272 | global_mapping[gm_index].lookup_function = lookup_function; 273 | global_mapping[gm_index].start = start; 274 | global_mapping[gm_index].length = end - start; 275 | #ifdef DEBUG 276 | printf("Added gm entry @ %d: (0x%lx, 0x%lx, 0x%lx)\n", gm_index, global_mapping[gm_index].lookup_function, global_mapping[gm_index].start, global_mapping[gm_index].length); 277 | #endif 278 | } 279 | 280 | void process_maps(char *buf, struct gm_entry *global_mapping){ 281 | /* 282 | * Process buf which contains output of /proc/self/maps 283 | * populate global_mapping for each executable set of pages 284 | */ 285 | char *line = buf; 286 | unsigned int gm_index = 1;//Reserve first entry for metadata 287 | unsigned char permissions = 0; 288 | //unsigned int global_start, global_end; 289 | unsigned long old_text_start, old_text_end = 0; 290 | unsigned long new_text_start, new_text_end = 0; 291 | 292 | //Assume global mapping is first entry at 0x200000 and that there is nothing before 293 | //Skip global mapping (put at 0x200000 in 64-bit binaries, as opposed to 0x7000000 for x86) 294 | line = next_line(line); 295 | do{ // process each block of maps 296 | permissions = get_permissions(line); 297 | // process all segments from this object under very specific assumptions 298 | if ( is_exec(permissions) ){ 299 | if( !is_write(permissions) ){ 300 | parse_range(line, &old_text_start, &old_text_end); 301 | #ifdef DEBUG 302 | printf("Parsed range for r-xp: %lx-%lx\n", old_text_start, old_text_end); 303 | #endif 304 | if( is_external(line) ){ 305 | #ifdef DEBUG 306 | printf("Region is external: %lx-%lx\n", old_text_start, old_text_end); 307 | #endif 308 | // Populate external regions with 0x00000000, which will be checked for in the global lookup. 309 | // It will then rewrite the return address on the stack and return the original address. 
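				// For instance (hypothetical addresses), a dynamic loader segment
				// "7f88c4a00000-7f88c4a26000 r-xp ... /lib/ld-2.27.so" would yield the
				// entry {lookup_function=0x0, start=0x7f88c4a00000, length=0x26000},
				// and the zero lookup_function is what the global lookup treats as
				// "external, return the original address".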
310 | populate_mapping(gm_index, old_text_start, old_text_end, 0x00000000, global_mapping); 311 | gm_index++; 312 | } 313 | }else{ 314 | parse_range(line, &new_text_start, &new_text_end); 315 | #ifdef DEBUG 316 | printf("Parsed range for rwxp: %lx-%lx\n", new_text_start, new_text_end); 317 | #endif 318 | populate_mapping(gm_index, old_text_start, old_text_end, new_text_start, global_mapping); 319 | gm_index++; 320 | } 321 | } 322 | line = next_line(line); 323 | } while(line != NULL); 324 | global_mapping[0].lookup_function = gm_index;// Use first entry for storing how many entries there are 325 | } 326 | 327 | #ifdef DEBUG 328 | int main(void){ 329 | void *mapping_base = (void *)0x200000; 330 | void *new_section = (void *)0x8000000; 331 | int fd = open("/dev/zero", O_RDWR); 332 | void *global_mapping = mmap(mapping_base, 0x10000, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); 333 | mmap(new_section, 0x4000, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE, fd, 0); //Create a mock new "text" section that would be added by process_maps 334 | if (global_mapping != mapping_base){ 335 | printf("failed to get requested base addr\n"); 336 | exit(1); 337 | } 338 | wrapper(global_mapping); 339 | 340 | return 0; 341 | } 342 | #endif 343 | 344 | -------------------------------------------------------------------------------- /x64_runtime.py: -------------------------------------------------------------------------------- 1 | from x64_assembler import _asm,asm 2 | 3 | class X64Runtime(object): 4 | def __init__(self,context): 5 | self.context = context 6 | self.context.global_lookup = 0x200000 # Set global lookup offset for 64-bit 7 | 8 | def get_lookup_code(self,base,size,lookup_off,mapping_off): 9 | #Example assembly for lookup function 10 | ''' 11 | push edx 12 | mov edx,eax 13 | call get_eip 14 | get_eip: 15 | pop eax ;Get current instruction pointer 16 | sub eax,0x8248 ;Subtract offset from instruction pointer val to get new text base addr 17 | sub edx,0x8048000 ;Compare to start (exclusive) and set edx to an offset in the mapping 18 | jl outside ;Out of bounds (too small) 19 | cmp edx,0x220 ;Compare to end (inclusive) (note we are now comparing to the size) 20 | jge outside ;Out of bounds (too big) 21 | mov edx,[mapping+edx*4] ;Retrieve mapping entry (can't do this directly in generated func) 22 | cmp edx, 0xffffffff ;Compare to invalid entry 23 | je failure ;It was an invalid entry 24 | add eax,edx ;Add the offset of the destination to the new text section base addr 25 | pop edx 26 | ret 27 | outside: ;If the address is out of the mapping bounds, return original address 28 | add edx,0x8048000 ;Undo subtraction of base, giving us the originally requested address 29 | mov eax,edx ;Place the original request back in eax 30 | pop edx 31 | jmp global_lookup ;Check if global lookup can find this 32 | failure: 33 | hlt 34 | ''' 35 | #TODO: support lookup for binary/library combination 36 | lookup_template = ''' 37 | push rbx 38 | mov rbx,rax 39 | lea rax, [rip-%s] 40 | %s 41 | jb outside 42 | cmp rbx,%s 43 | jae outside 44 | mov ebx,[rax+rbx*4+%s] 45 | cmp ebx, 0xffffffff 46 | je failure 47 | add rax,rbx 48 | pop rbx 49 | ret 50 | outside: 51 | %s 52 | mov rax,rbx 53 | pop rbx 54 | mov QWORD PTR [rsp-8],%s 55 | jmp [rsp-8] 56 | failure: 57 | hlt 58 | ''' 59 | exec_code = ''' 60 | sub rbx,%s 61 | ''' 62 | exec_restore = ''' 63 | add rbx,%s 64 | ''' 65 | #Notice that we only move a DWORD from the mapping (into ebx) because the 66 | #mapping only stores 4-byte offsets. 
Therefore, if a text section is >4GB, 67 | #this mapping strategy will fail 68 | exec_only_lookup = ''' 69 | lookup: 70 | push rbx 71 | mov rbx,rax 72 | lea rax, [rip-%s] 73 | sub rbx,%s 74 | jb outside 75 | cmp rbx,%s 76 | jae outside 77 | mov ebx, [rax+rbx*4+%s] 78 | add rax,rbx 79 | pop rbx 80 | ret 81 | 82 | outside: 83 | add rbx,%s 84 | mov rax,[rsp+16] 85 | call lookup 86 | mov [rsp+16],rax 87 | mov rax,rbx 88 | pop rbx 89 | ret 90 | ''' 91 | #For an .so, it can be loaded at an arbitrary address, so we cannot depend on 92 | #the base address being in a fixed location. Therefore, we instead compute 93 | #the old text section's start address by using the new text section's offset 94 | #from it. 95 | # rax holds the address of the lookup function, which is at the start of the new 96 | # section we are adding. 97 | # rbx at the start holds the address we want to look up, and we want to compute 98 | # how many bytes the address is from the start of the original text section. So 99 | # we add the newbase address to rbx to add the offset there is between the old and 100 | # new text sections, and then subtract off the address of the lookup. 101 | so_code = ''' 102 | add rbx, %s 103 | sub rbx, rax 104 | ''' 105 | so_restore = ''' 106 | add rbx, rax 107 | sub rbx, %s 108 | ''' 109 | #retrieve rip 11 bytes after start of lookup function (right after first lea instruction) 110 | if self.context.write_so: 111 | return _asm(lookup_template%(lookup_off+11,so_code%(self.context.newbase),size,mapping_off,so_restore%(self.context.newbase),self.context.global_lookup)) 112 | elif self.context.exec_only: 113 | return _asm( exec_only_lookup%(lookup_off+11,base,size,mapping_off,base) ) 114 | else: 115 | return _asm(lookup_template%(lookup_off+11,exec_code%base,size,mapping_off,exec_restore%base,self.context.global_lookup)) 116 | 117 | def get_secondary_lookup_code(self,base,size,sec_lookup_off,mapping_off): 118 | '''This secondary lookup is only used when rewriting only the main executable. It is a second, simpler 119 | lookup function that is used by ret instructions and does NOT rewrite a return address on the stack 120 | when the destination is outside the mapping. It instead simply returns the original address and that's 121 | it. The only reason I'm doing this by way of a secondary lookup is this should be faster than a 122 | a parameter passed at runtime, so I need to statically have an offset to jump to in the case of returns. 123 | This is a cleaner way to do it than split the original lookup to have two entry points.''' 124 | #Notice that we only move a DWORD from the mapping (into ebx) because the 125 | #mapping only stores 4-byte offsets. Therefore, if a text section is >4GB, 126 | #this mapping strategy will fail 127 | secondary_lookup = ''' 128 | lookup: 129 | push rbx 130 | mov rbx,rax 131 | lea rax, [rip-%s] 132 | sub rbx,%s 133 | jb outside 134 | cmp rbx,%s 135 | jae outside 136 | mov ebx,[rax+rbx*4+%s] 137 | add rax,rbx 138 | pop rbx 139 | ret 140 | 141 | outside: 142 | add rbx,%s 143 | mov rax,rbx 144 | pop rbx 145 | ret 146 | ''' 147 | return _asm( secondary_lookup%(sec_lookup_off+11,base,size,mapping_off,base) ) 148 | 149 | def get_global_lookup_code(self): 150 | #TODO: Support global lookup, executable + library rewriting 151 | #I have to modify it so it will assemble since we write out the global lookup 152 | #regardless of whether it's used, but it obviously won't work in this state... 
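    # Rough Python model of the search implemented in the assembly below
    # (illustrative only; gm, call_through and fixup_return_address are made-up
    # names): entry 0 stores the entry count, every entry is 24 bytes, and each
    # candidate is range-tested with addr - start <= length:
    #   count = gm[0].lookup_function
    #   for entry in gm[1:1+count]:
    #       if addr - entry.start <= entry.length:
    #           return call_through(entry.lookup_function) if entry.lookup_function else fixup_return_address()
    #   hlt()  # no entry matched: failure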
153 | #addr - global_mapping[index].start <= global_mapping[index].length 154 | # rbx = length 155 | # rcx = base/entry 156 | # rdx = index 157 | # r10 = entry 158 | #struct gm_entry { 159 | # unsigned long lookup_function; 160 | # unsigned long start; 161 | # unsigned long length; 162 | #}; 163 | #TODO: still need to handle code entering the loader region.... 164 | ''' 165 | ; Get rid of sysinfo comparison because we instead are going to be comparing based on entire address ranges 166 | ;cmp rax,[%s] ; If rax is sysinfo 167 | ;je sysinfo ; Go to rewrite return address 168 | glookup: 169 | push rcx ; Save working registers 170 | push rbx 171 | push rdx 172 | push r10 173 | mov rcx, %s ; Load address of first entry 174 | mov rbx, [rcx] ; Load first value in first entry (lookup_function, serving as length) 175 | xor rdx, rdx ; Clear rdx 176 | searchloop: 177 | cmp rbx, rdx ; Check if we are past last entry 178 | je failure ; Did not find successful entry, so fail 179 | add rcx, 24 ; Set rcx to next entry 180 | mov r10, [rcx+8] ; Load second item in entry (start) 181 | neg r10 ; Negate r10 so it can act like it is being subtracted 182 | add r10, rax ; Get difference between lookup address and start 183 | cmp r10, [rcx+16] ; Compare: address - start <= end - start (length) 184 | jle success ; If so, we found the right entry. 185 | inc rdx ; Add one to our index 186 | jmp searchloop ; Loop for next entry 187 | success: 188 | mov rcx,[rcx] ; Load lookup address into rcx so we can compare it to 0 189 | test rcx,rcx ; If lookup address is zero it means this region is not rewritten! 190 | jz external ; Jump to external so we can rewrite return address on the stack (assume only calls into external regions) 191 | pop r10 ; Restore the saved values first to grow the stack as little as possible 192 | pop rdx 193 | pop rbx 194 | call rcx ; Call the lookup, as specified by the first value in global mapping entry (lookup_function) 195 | pop rcx ; Restore rcx since we were using it to save the lookup function address 196 | ret ; rax should now have the right value, so return 197 | external: 198 | pop r10 ; Restore all saved registers, as the subsequent call to glookup will save them again. 199 | pop rdx ; Restoring the saved registers before the recursive call means the stack will not grow as much, 200 | pop rbx ; avoiding overwriting the value of rax saved outside the stack before the local lookup call without 201 | pop rcx ; having to increase the distance that rax is saved outside the stack as much as we would otherwise. 
202 | mov [rsp-64],rax ; Save original rax (not with push so we don't increase the stack pointer any more) 203 | mov rax,[rsp+8] ; Load the return address we want to overwrite (address of instruction calling the local lookup) 204 | call glookup ; Lookup the translated value 205 | mov [rsp+8],rax ; Overwrite with the translated value 206 | mov rax,[rsp-64] ; Restore original rax, returned unmodified so we call unmodified external code 207 | ret 208 | failure: 209 | hlt 210 | ''' 211 | global_lookup_template = ''' 212 | glookup: 213 | push rcx 214 | push rbx 215 | push rdx 216 | push r10 217 | mov rcx, %s 218 | mov rbx, [rcx] 219 | xor rdx, rdx 220 | searchloop: 221 | cmp rbx, rdx 222 | je failure 223 | add rcx, 24 224 | mov r10, [rcx+8] 225 | neg r10 226 | add r10, rax 227 | cmp r10, [rcx+16] 228 | jle success 229 | inc rdx 230 | jmp searchloop 231 | success: 232 | mov rcx,[rcx] 233 | test rcx,rcx 234 | jz external 235 | pop r10 236 | pop rdx 237 | pop rbx 238 | call rcx 239 | pop rcx 240 | ret 241 | external: 242 | pop r10 243 | pop rdx 244 | pop rbx 245 | pop rcx 246 | mov [rsp-64],rax 247 | mov rax,[rsp+8] 248 | call glookup 249 | mov [rsp+8],rax 250 | mov rax,[rsp-64] 251 | ret 252 | failure: 253 | hlt 254 | ''' 255 | return _asm(global_lookup_template%(self.context.global_sysinfo+8)) 256 | 257 | def get_auxvec_code(self,entry): 258 | #Example assembly for searching the auxiliary vector 259 | #TODO: this commented assembly needs to be updated, as it's still (mostly) 32-bit code 260 | ''' 261 | mov [esp-4],esi ;I think there's no need to save these, but in case somehow the 262 | mov [esp-8],ecx ;linker leaves something of interest for _start, let's save them 263 | mov esi,[esp] ;Retrieve argc 264 | mov ecx,esp ;Retrieve address of argc 265 | lea ecx,[ecx+esi*4+4] ;Skip argv 266 | loopenv: ;Iterate through each environment variable 267 | add ecx,4 ;The first loop skips over the NULL after argv 268 | mov esi,[ecx] ;Retrieve environment variable 269 | test esi,esi ;Check whether it is NULL 270 | jnz loopenv ;If not, continue through environment vars 271 | add ecx,4 ;Hop over 0 byte to first entry 272 | loopaux: ;Iterate through auxiliary vector, looking for AT_SYSINFO (32) 273 | mov esi,[ecx] ;Retrieve the type field of this entry 274 | cmp esi,32 ;Compare to 32, the entry we want 275 | jz foundsysinfo ;Found it 276 | test esi,esi ;Check whether we found the entry signifying the end of auxv 277 | jz restore ;Go to _start if we reach the end 278 | add ecx,8 ;Each entry is 8 bytes; go to next 279 | jmp loopaux 280 | foundsysinfo: 281 | mov esi,[ecx+4] ;Retrieve sysinfo address 282 | mov [sysinfo],esi ;Save address 283 | restore: 284 | mov esi,[esp-4] 285 | mov ecx,[esp-8] 286 | push global_mapping ;Push address of global mapping for popgm 287 | call popgm 288 | ;place restoretext here if we need to restore .text 289 | add esp,4 ;Pop address of global mapping 290 | jmp realstart 291 | 292 | ;restoretext 293 | mov BYTE PTR [gs:%s],0 ;Restore flag to original state 294 | push rax ;Save registers required for syscall 295 | push rdi 296 | push rsi 297 | push rdx 298 | mov rax, 10 ;sys_mprotect 299 | mov rdi, text_base ;Location of start of text section (rounded down to nearest page size) 300 | mov rsi, 4096 ;One page 301 | mov rdx, 7 ;rwx 302 | syscall ;Make page writable 303 | mov rax, 0 ;Use rax as an index (starting at an offset that skips plt entries and other things preceding .text) 304 | mov rsi, saved_text_addr;Use rsi as a base address (address of the saved first page) (global lookup 
address - offset) 305 | mov rdi, text_addr ;Load actual text section location 306 | looprestore: 307 | mov rdx, [rsi+rax] ;Load 8 bytes from saved .text page 308 | mov [rdi+rax], rdx ;Restore this data 309 | add rax,8 ;Move index forward 8 bytes 310 | cmp rax,page_end ;If less than 4096-text_offset, continue looping 311 | jb looprestore 312 | mov rax, 10 ;sys_mprotect 313 | mov rdi, text_base ;Location of start of text section (rounded down to nearest page size) 314 | mov rsi, 4096 ;One page 315 | mov rdx, 5 ;r-x 316 | syscall ;Remove writable permission 317 | pop rdx ;Restore registers required for syscall 318 | pop rsi 319 | pop rdi 320 | pop rax 321 | ret 322 | ''' 323 | auxvec_template = ''' 324 | mov [rsp-8],rsi 325 | mov [rsp-16],rcx 326 | mov rsi,[rsp] 327 | mov rcx,rsp 328 | lea rcx,[rcx+rsi*8+8] 329 | loopenv: 330 | add rcx,8 331 | mov rsi,[rcx] 332 | test rsi,rsi 333 | jnz loopenv 334 | add rcx,8 335 | loopaux: 336 | mov rsi,[rcx] 337 | cmp rsi,32 338 | jz foundsysinfo 339 | test rsi,rsi 340 | jz restore 341 | add rcx,16 342 | jmp loopaux 343 | foundsysinfo: 344 | mov rsi,[rcx+8] 345 | mov [%s],rsi 346 | restore: 347 | mov rsi,[rsp-8] 348 | mov rcx,[rsp-16] 349 | push %s 350 | call [rsp] 351 | add rsp,8 352 | %s 353 | mov QWORD PTR [rsp-16], %s 354 | jmp [rsp-16]''' 355 | restoretext = ''' 356 | push rax 357 | push rdi 358 | push rsi 359 | push rdx 360 | mov rax, 10 361 | mov rdi, %s 362 | mov rsi, 4096 363 | mov rdx, 7 364 | syscall 365 | mov rax, 0 366 | mov rsi, %s 367 | mov rdi, %s 368 | looprestore: 369 | mov rdx, [rsi+rax] 370 | mov [rdi+rax], rdx 371 | add rax,8 372 | cmp rax,%s 373 | jb looprestore 374 | mov rax, 10 375 | mov rdi, %s 376 | mov rsi, 4096 377 | mov rdx, 5 378 | syscall 379 | pop rdx 380 | pop rsi 381 | pop rdi 382 | pop rax 383 | ''' % ( (self.context.oldbase/0x1000)*0x1000, self.context.global_lookup - 0x20000, self.context.oldbase, 0x1000-(self.context.oldbase%0x1000), (self.context.oldbase/0x1000)*0x1000 ) 384 | 385 | return _asm(auxvec_template%(self.context.global_sysinfo,self.context.global_lookup+self.context.popgm_offset,restoretext if self.context.move_phdrs_to_text else '',self.context.newbase+entry)) 386 | 387 | def get_popgm_code(self): 388 | #pushad and popad do NOT exist in x64, 389 | #so we must choose which registers must be preserved at program start 390 | #TODO: For now we skip actually calling popgm, because it will have to be 391 | #completely re-engineered, so we will need to change the offset to 0x11 392 | #once we have fixed popgm for x64 393 | call_popgm = ''' 394 | push rax 395 | push rcx 396 | push rdx 397 | push rbx 398 | push rbp 399 | push rsi 400 | push rdi 401 | mov rdi, %s 402 | call $+0x0d 403 | pop rdi 404 | pop rsi 405 | pop rbp 406 | pop rbx 407 | pop rdx 408 | pop rcx 409 | pop rax 410 | ret 411 | ''' 412 | popgmbytes = asm(call_popgm%(self.context.global_sysinfo+8)) 413 | with open('x64_%s' % self.context.popgm) as f: 414 | popgmbytes+=f.read() 415 | return popgmbytes 416 | 417 | def get_global_mapping_bytes(self): 418 | #TODO: support global mapping 419 | globalbytes = self.get_global_lookup_code() 420 | #globalbytes+='\0' #flag field 421 | globalbytes += self.get_popgm_code() 422 | globalbytes += '\0\0\0\0\0\0\0\0' #sysinfo field 423 | # Global mapping (0x6000 0x00 bytes). This contains space for 1024 entries: 424 | # 8 * 3 = 24 bytes per entry * 1024 entries = 0x6000 (24576) bytes. If a binary 425 | # has more than 1024 libraries, the program will most likely segfault. 
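    # Rough layout of the blob assembled by this method (offsets depend on the
    # sizes of the generated code and are not fixed constants):
    #   [global lookup code][popgm code][8-byte sysinfo][0x6000-byte mapping][alloc_globals]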
426 | globalbytes += '\x00'*0x6000 427 | # Allocate extra space for any additional global variables that 428 | # instrumentation code might require 429 | if self.context.alloc_globals > 0: 430 | globalbytes += '\x00'*self.context.alloc_globals 431 | return globalbytes 432 | -------------------------------------------------------------------------------- /x64_translator.py: -------------------------------------------------------------------------------- 1 | from x64_assembler import asm,cache,metacache 2 | from capstone.x86 import X86_OP_REG,X86_OP_MEM,X86_OP_IMM 3 | import struct 4 | import re 5 | from translator import Translator 6 | 7 | class X64Translator(Translator): 8 | 9 | def __init__(self,before_callback,context): 10 | self.before_inst_callback = before_callback 11 | self.context = context 12 | self.memory_ref_string = re.compile(u'^qword ptr \[rip \+ (?P0x[0-9a-z]+)\]$') 13 | self.rip_with_offset = re.compile(u'\[rip(?: (?P[\+\-] [0x]?[0-9a-z]+))?\]') #Apparently the hex prefix is optional if the number is...unambiguous? 14 | # Pre-populate this instruction in the metacache so we can avoid rewriting variations of it 15 | metacache[' lea rbx,[rip]'] = 3 16 | metacache[' lea rbx,[rip]'] = 3 17 | #From Brian's Static_phase.py 18 | self.JCC = ['jo','jno','js','jns','je','jz','jne','jnz','jb','jnae', 19 | 'jc','jnb','jae','jnc','jbe','jna','ja','jnbe','jl','jnge','jge', 20 | 'jnl','jle','jng','jg','jnle','jp','jpe','jnp','jpo','jrcxz','jecxz'] 21 | 22 | def replace_rip(self,ins,mapping,newlen): 23 | code = b'' 24 | # In the main binary, we technically do not need to use rip; 25 | # since we know the location our main binary code will be at, 26 | # we can replace it with an absolute address. HOWEVER, if we want 27 | # to support position-independent main binaries, and if we don't 28 | # want to have to re-assemble any instructions that our assembler 29 | # cannot currently handle correctly (such as ljmp), then it is better 30 | # to simply replace rip in the same way as in shared objects. 31 | # 32 | # For shared objects we *need* to use rip, but calculate 33 | # (rip - (newbase + after new instruction address)) + address after old instruction 34 | # or (rip + ( (address after old instruction) - (newbase + after new instruction address) ) ) 35 | # The goal is to compute the value rip WOULD have had if the original binary were run, and replace 36 | # rip with that value, derived from the NEW value in rip... 37 | match = self.rip_with_offset.search(ins.op_str) #TODO: all this new stuff with the match and then the assembler optimization 38 | if mapping is not None: 39 | #print 'rewriting %s instruction with rip: %s %s' % (ins.mnemonic,ins.mnemonic,ins.op_str) 40 | oldoffset = 0 #Assume at first that there is no offset from rip 41 | if match.group('offset') != None: 42 | #print 'match on offset: %s' % match.group('offset') 43 | oldoffset = int(match.group('offset'), 16) 44 | oldaddr = ins.address + len(ins.bytes) 45 | # For completely rewritten instructions, the new length will indeed change, because the original instruction 46 | # may be rewritten into multiple instructions, with potentially many instructions inserted before the one 47 | # that references rip. Because an instruction referring to rip has it pointing after that instruction, we need 48 | # the length of all code preceding it and then the length of the new instruction referencing rip to know the 49 | # *real* new address. Then we can determine the offset between them and add the old offset, thereby giving our new offset. 
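      # Hypothetical worked example: ins.address=0x1000 and len(ins.bytes)=7 give
      # oldaddr=0x1007; mapping[0x1000]=0x500 and newlen=10 give newaddr=0x50a; with
      # newbase=0x9000000 and oldoffset=0x100, newoffset =
      # (0x1007 - (0x9000000 + 0x50a)) + 0x100 = -0x8fff403, so at runtime
      # (newbase + newaddr) + newoffset == oldaddr + oldoffset, i.e. exactly the
      # address the original rip-relative operand pointed to.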
50 | # All instructions may potentially have code inserted before them, so we will always need this new length. 51 | newaddr = mapping[ins.address] + newlen 52 | newoffset = (oldaddr - (self.context.newbase + newaddr)) + oldoffset 53 | newopstr = '' 54 | # If the new offset cannot be encoded in 4 bytes, replace it with a placeholder 55 | if newoffset <= -0x80000000 or newoffset >= 0x7fffffff: 56 | print 'WARNING: unencodable offset for instruction @ 0x%x: %x' % (ins.address,newoffset) 57 | newoffset = -0x7faddead 58 | # Check whether it's negative so we can prefix with 0x even with negative numbers 59 | if newoffset < 0: 60 | newopstr = self.rip_with_offset.sub('[rip - 0x%x]' % -newoffset, ins.op_str) 61 | else: 62 | newopstr = self.rip_with_offset.sub('[rip + 0x%x]' % newoffset, ins.op_str) 63 | #print 'Old offset: 0x%x / Old address: 0x%x / New address: 0x%x / New base: 0x%x' % (oldoffset,oldaddr,newaddr,self.context.newbase) 64 | #print 'New instruction: %s %s' % (ins.mnemonic,newopstr) 65 | return newopstr 66 | else: 67 | #Placeholder until we know the new instruction location 68 | newopstr = self.rip_with_offset.sub('[rip]', ins.op_str) 69 | #print 'rewriting %s instruction with rip: %s %s' % (ins.mnemonic,ins.mnemonic,ins.op_str) 70 | #print 'assembling %s %s' % (ins.mnemonic, newopstr) 71 | #print 'instruction is %s' % str(ins.bytes[:-4] + (b'\0'*4)).encode('hex') 72 | newins = '%s %s' % (ins.mnemonic, newopstr) 73 | # Pre-populate cache with version of this instruction with NO offset; this means we never have to call assembler for this instruction. 74 | # The assembler can just replace the offset, which we assume is the last 4 bytes in the instruction 75 | if newins not in cache: 76 | # Only add to the cache ONCE. If you keep adding to the cache, some instructions have prefixes that ALTER the base instruction length 77 | # for that instruction with no offset. Therefore, if another instruction comes along with the same mnemonic and opstring, but containing 78 | # a different number of garbage prefixes before it, then the length of these instructions fluctuates, throwing off all the careful alignment 79 | # required for mapping these instructions. Due to these garbage prefixes, some instructions may increase by a few bytes and semantics could 80 | # potentially, theoretically be altered, but this could be solved with a better assembler or disassembler. 81 | # --- 82 | # The displacement size and offset are not easily obtainable in the current version of capstone, so this requires a customized version that 83 | # provides access to this data. With this, we can determine exactly the position of the displacement and replace it 84 | disp_size = ins._detail.arch.x86.encoding.disp_size 85 | disp_offset = ins._detail.arch.x86.encoding.disp_offset 86 | # We will only automatically replace 4-byte displacements, because smaller ones will very likely not fit the new displacement, and 4-byte 87 | # displacements are much more common. This means we will need to re-assemble any instructions that do not have a 4-byte displacement, however. 88 | if disp_size == 4: 89 | metacache[newins] = disp_offset # Save displacement offset for assembler 90 | # Populate version in cache with the instruction with a displacement of all 0s. Leave the immediate value (if there is one) intact. 
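          # Hypothetical encoding example: 'mov rax, qword ptr [rip + 0x1234]'
          # assembles to 48 8b 05 34 12 00 00, i.e. disp_offset=3 and disp_size=4,
          # so the cached copy becomes 48 8b 05 00 00 00 00 and the assembler only
          # has to splice the new little-endian displacement into those 4 bytes.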
91 | cache[newins] = ins.bytes[:disp_offset] + (b'\0'*4) + ins.bytes[disp_offset+disp_size:] 92 | else: 93 | # TODO: Changing the instruction to use a larger displacement WILL change the instruction length, and thus WILL result in an incorrect new 94 | # displacement as we calculate it now. This needs to be fixed to use the correct new displacement as it would be calculated after knowing 95 | # the new instruction length. 96 | print 'WARNING: instruction %s has small displacement: %d'%(newins,disp_size) 97 | return newopstr 98 | 99 | def translate_one(self,ins,mapping): 100 | if ins.mnemonic in ['call','jmp']: #Unconditional jump 101 | return self.translate_uncond(ins,mapping) 102 | elif ins.mnemonic in self.JCC: #Conditional jump 103 | return self.translate_cond(ins,mapping) 104 | elif ins.mnemonic == 'ret': 105 | return self.translate_ret(ins,mapping) 106 | elif ins.mnemonic in ['retn','retf','repz']: #I think retn is not used in Capstone 107 | #print 'WARNING: unimplemented %s %s'%(ins.mnemonic,ins.op_str) 108 | return '\xf4\xf4\xf4\xf4' #Create obvious cluster of hlt instructions 109 | else: #Any other instruction 110 | inserted = self.before_inst_callback(ins) 111 | #Even for non-control-flow instructions, we need to replace all references to rip 112 | #with the address pointing directly after the instruction. 113 | #TODO: This will NOT work for shared libraries or any PIC, because it depends on 114 | #knowing the static instruction address. For all shared objects, we would need to 115 | #subtract off the offset between the original and new text; as long as the offset is 116 | #fixed, then we should be able to just precompute that offset, without it being affected 117 | #by the position of the .so code 118 | #TODO: abandon rewriting ljmp instructions for now because the assembler doesn't like them 119 | #and we haven't been rewriting their destinations anyway; if they *are* used, they were already 120 | #broken before this 121 | #TODO: I have also abandoned rewriting the following instructions because I can't get it to 122 | #re-assemble with the current assembler: 123 | # fstp 124 | # fldenv 125 | # fld 126 | #TODO: Since I am now doing a crazy optimization in which I use the original instruction's bytes 127 | #and only change the last 4 bytes (the offset), I should actually be able to support these incompatible 128 | #instructions by saving their original bytes in the assembler cache and therefore never actually sending 129 | #the disassembled instruction to the assembler at all. 
130 | incompatible = ['ljmp', 'fstp', 'fldenv', 'fld', 'fbld'] 131 | if 'rip' in ins.op_str:# and (ins.mnemonic not in incompatible): 132 | '''asm1 = asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,mapping) ) ) 133 | asm2 = asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,None) ) ) 134 | if len(asm1) != len(asm2): 135 | print '%s %s @ 0x%x LENGTH FAIL1: %s vs %s' % (ins.mnemonic, ins.op_str, ins.address, str(asm1).encode('hex'), str(asm2).encode('hex') ) 136 | newone = len( asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,mapping) ) ) ) 137 | oldone = len( asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,None) ) ) ) 138 | print '%d vs %d, %d vs %d' % (newone,oldone,len(asm1),len(asm2))''' 139 | code = b'' 140 | if inserted is not None: 141 | code = asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,mapping,len(inserted) + len(ins.bytes) ) ) ) 142 | code = inserted + code 143 | else: 144 | code = asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,mapping,len(ins.bytes) ) ) ) 145 | return code 146 | else: 147 | '''if 'rip' in ins.op_str and (ins.mnemonic in incompatible): 148 | print 'NOT rewriting %s instruction with rip: %s %s' % (ins.mnemonic,ins.mnemonic,ins.op_str) 149 | if ins.mnemonic == 'ljmp': 150 | print 'WARNING: unhandled %s %s @ %x'%(ins.mnemonic,ins.op_str,ins.address)''' 151 | if inserted is not None: 152 | return inserted + str(ins.bytes) 153 | return None #No translation needs to be done 154 | 155 | def translate_ret(self,ins,mapping): 156 | ''' 157 | mov [esp-28], eax ;save old eax value 158 | pop eax ;pop address from stack from which we will get destination 159 | call $+%s ;call lookup function 160 | mov [esp-4], eax ;save new eax value (destination mapping) 161 | mov eax, [esp-32] ;restore old eax value (the pop has shifted our stack so we must look at 28+4=32) 162 | jmp [esp-4] ;jmp/call to new address 163 | ''' 164 | template_before = ''' 165 | mov [rsp-56], rax 166 | pop rax 167 | ''' 168 | template_after = ''' 169 | call $+%s 170 | %s 171 | mov [rsp-8], rax 172 | mov rax, [rsp-%d] 173 | jmp [rsp-8] 174 | ''' 175 | self.context.stat['ret']+=1 176 | code = b'' 177 | inserted = self.before_inst_callback(ins) 178 | if inserted is not None: 179 | code += inserted 180 | # Since thunks do not need to be used for 64-bit code, there is no specific 181 | # place we need to treat as a special case. It is unlikely that code will 182 | # try to use the pushed return address to obtain the instruction pointer 183 | # (after all, it can just access it directly!), but should it TRY to do this, 184 | # the program will crash! Thus the no_pic optimization is a heuristic that 185 | # won't work for some code (in this case only very unusual code?) 186 | if self.context.no_pic: # and ins.address != self.context.get_pc_thunk + 3: 187 | #Perform a normal return UNLESS this is the ret for the thunk. 188 | #Currently its position is hardcoded as three bytes after the thunk entry. 
189 | code = asm( 'ret %s'%ins.op_str ) 190 | else: 191 | code = asm(template_before) 192 | size = len(code) 193 | lookup_target = b'' 194 | if self.context.exec_only: 195 | #Special lookup for not rewriting arguments when going outside new main text address space 196 | lookup_target = self.remap_target(ins.address,mapping,self.context.secondary_lookup_function_offset,size) 197 | else: 198 | lookup_target = self.remap_target(ins.address,mapping,self.context.lookup_function_offset,size) 199 | if ins.op_str == '': 200 | code+=asm(template_after%(lookup_target,'',64)) #64 because of the value we popped 201 | else: #For ret instructions that pop imm16 bytes from the stack, add that many bytes to esp 202 | pop_amt = int(ins.op_str,16) #We need to retrieve the right eax value from where we saved it 203 | code+=asm(template_after%(lookup_target,'add rsp,%d'%pop_amt,64+pop_amt)) 204 | return code 205 | 206 | def translate_cond(self,ins,mapping): 207 | self.context.stat['jcc']+=1 208 | patched = b'' 209 | inserted = self.before_inst_callback(ins) 210 | if inserted is not None: 211 | patched += inserted 212 | if ins.mnemonic in ['jrcxz','jecxz']: #These instructions have no long encoding (and jcxz is not allowed in 64-bit) 213 | jrcxz_template = ''' 214 | test rcx,rcx 215 | ''' 216 | jecxz_template = ''' 217 | test ecx,ecx 218 | ''' 219 | target = ins.operands[0].imm # int(ins.op_str,16) The destination of this instruction 220 | #newtarget = remap_target(ins.address,mapping,target,0) 221 | if ins.mnemonic == 'jrcxz': 222 | patched+=asm(jrcxz_template) 223 | else: 224 | patched+=asm(jecxz_template) 225 | newtarget = self.remap_target(ins.address,mapping,target,len(patched)) 226 | #print 'want %s, but have %s instead'%(remap_target(ins.address,mapping,target,len(patched)), newtarget) 227 | #Apparently the offset for jcxz and jecxz instructions may have been wrong? How did it work before? 228 | patched += asm('jz $+%s'%newtarget) 229 | #print 'code length: %d'%len(patched) 230 | 231 | #TODO: some instructions encode to 6 bytes, some to 5, some to 2. How do we know which? 232 | #For example, for CALL, it seems to only be 5 or 2 depending on offset. 233 | #But for jg, it can be 2 or 6 depending on offset, I think because it has a 2-byte opcode. 234 | #while len(patched) < 6: #Short encoding, which we do not want 235 | # patched+='\x90' #Add padding of NOPs 236 | #The previous commented out code wouldn't even WORK now, since we insert another instruction 237 | #at the MINIMUM. I'm amazed the jcxz/jecxz code even worked at all before 238 | else: 239 | target = ins.operands[0].imm # int(ins.op_str,16) The destination of this instruction 240 | newtarget = self.remap_target(ins.address,mapping,target,len(patched)) 241 | patched+=asm(ins.mnemonic + ' $+' + newtarget) 242 | #TODO: some instructions encode to 6 bytes, some to 5, some to 2. How do we know which? 243 | #For example, for CALL, it seems to only be 5 or 2 depending on offset. 244 | #But for jg, it can be 2 or 6 depending on offset, I think because it has a 2-byte opcode. 245 | #while len(patched) < 6: #Short encoding, which we do not want 246 | # patched+='\x90' #Add padding of NOPs 247 | return patched 248 | 249 | def translate_uncond(self,ins,mapping): 250 | op = ins.operands[0] #Get operand 251 | if op.type == X86_OP_REG: # e.g. call eax or jmp ebx 252 | target = ins.reg_name(op.reg) 253 | return self.get_indirect_uncond_code(ins,mapping,target) 254 | elif op.type == X86_OP_MEM: # e.g. 
call [eax + ecx*4 + 0xcafebabe] or jmp [ebx+ecx] 255 | target = ins.op_str 256 | return self.get_indirect_uncond_code(ins,mapping,target) 257 | elif op.type == X86_OP_IMM: # e.g. call 0xdeadbeef or jmp 0xcafebada 258 | target = op.imm 259 | code = b'' 260 | inserted = self.before_inst_callback(ins) 261 | if inserted is not None: 262 | code += inserted 263 | # Again, there is no thunk special case for 64-bit code 264 | if self.context.no_pic: # and target != self.context.get_pc_thunk: 265 | #push nothing if no_pic UNLESS it's the thunk 266 | #We only support DIRECT calls to the thunk 267 | if ins.mnemonic == 'call': 268 | self.context.stat['dircall']+=1 269 | else: 270 | self.context.stat['dirjmp']+=1 271 | elif ins.mnemonic == 'call': #If it's a call, push the original address of the next instruction 272 | self.context.stat['dircall']+=1 273 | exec_call = ''' 274 | push %s 275 | ''' 276 | so_call = ''' 277 | push rbx 278 | lea rbx,[rip - 0x%x] 279 | xchg rbx,[rsp] 280 | ''' 281 | if self.context.write_so: 282 | if mapping is not None: 283 | # 8 is the length of push rbx;lea rbx,[rip-%s] 284 | code += asm(so_call%( (self.context.newbase+(mapping[ins.address]+8)) - (ins.address+len(ins.bytes)) ) ) 285 | else: 286 | code += asm(so_call%( (self.context.newbase) - (ins.address+len(ins.bytes)) ) ) 287 | else: 288 | code += asm(exec_call%(ins.address+len(ins.bytes))) 289 | else: 290 | self.context.stat['dirjmp']+=1 291 | newtarget = self.remap_target(ins.address,mapping,target,len(code)) 292 | #print "(pre)new length: %s"%len(callback_code) 293 | #print "target: %s"%hex(target) 294 | #print "newtarget: %s"%newtarget 295 | # Again, there is no thunk special case for 64-bit code 296 | if self.context.no_pic: # and target != self.context.get_pc_thunk: 297 | code += asm( '%s $+%s'%(ins.mnemonic,newtarget) ) 298 | else: 299 | patched = asm('jmp $+%s'%newtarget) 300 | if len(patched) == 2: #Short encoding, which we do not want 301 | patched+='\x90\x90\x90' #Add padding of 3 NOPs 302 | code += patched 303 | #print "new length: %s"%len(callback_code+patched) 304 | return code 305 | return None 306 | 307 | def get_indirect_uncond_code(self,ins,mapping,target): 308 | #Commented assembly 309 | ''' 310 | mov [esp-28], eax ;save old eax value (very far above the stack because of future push/call) 311 | mov eax, %s ;read location in memory from which we will get destination 312 | %s ;if a call, we push return address here 313 | call $+%s ;call lookup function 314 | mov [esp-4], eax ;save new eax value (destination mapping) 315 | mov eax, [esp-%s] ;restore old eax value (offset depends on whether return address pushed) 316 | jmp [esp-4] ;jmp to new address 317 | ''' 318 | #If the argument is an offset from rip, then we must change the reference to rip. Any rip-relative 319 | #addressing is destroyed because all the offsets are completely different; we need the 320 | #original address that rip WOULD have pointed to, so we must replace any references to it. 
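    # Sketch of the common case built from the templates below (indirect jmp, main
    # executable, no_pic disabled; operands and offsets are made up): 'jmp [rax]'
    # becomes roughly
    #   mov [rsp-64], rax    ; save rax out of the way of the upcoming call
    #   mov rax, [rax]       ; load the original destination
    #   call $+<lookup>      ; translate it through the lookup function
    #   mov [rsp-8], rax     ; stash the translated destination
    #   mov rax, [rsp-64]    ; restore the caller's rax
    #   jmp [rsp-8]          ; jump into the rewritten code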
321 | template_before = ''' 322 | mov [rsp-64], rax 323 | mov rax, %s 324 | %s 325 | ''' 326 | exec_call = ''' 327 | push %s 328 | ''' 329 | so_call_before = ''' 330 | push rbx 331 | ''' 332 | so_call_after = ''' 333 | lea rbx,[rip - 0x%x] 334 | xchg rbx,[rsp] 335 | ''' 336 | template_after = ''' 337 | call $+%s 338 | mov [rsp-8], rax 339 | mov rax, [rsp-%s] 340 | jmp [rsp-8] 341 | ''' 342 | template_nopic = ''' 343 | call $+%s 344 | mov [rsp-8], rax 345 | mov rax, [rsp-%s] 346 | %s [rsp-8] 347 | ''' 348 | #TODO: This is somehow still the bottleneck, so this needs to be optimized 349 | code = b'' 350 | if self.context.exec_only: 351 | code += self.get_remap_callbacks_code(ins,mapping,target) 352 | #NOTE: user instrumentation code comes after callbacks code. No particular reason to put it either way, 353 | #other than perhaps consistency, but for now this is easier. 354 | inserted = self.before_inst_callback(ins) 355 | if inserted is not None: 356 | code += inserted 357 | #Replace references to rip with the original address after this instruction so that we 358 | #can look up the new address using the original 359 | if 'rip' in target: 360 | '''if len( asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,mapping) ) ) ) != len( asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,None) ) ) ): 361 | print '%s %s @ 0x%x LENGTH FAIL2: %s vs %s' % (ins.mnemonic, ins.op_str, ins.address, str(asm('%s %s' % (ins.mnemonic, self.replace_rip(ins,mapping) ))).encode('hex'), str(asm('%s %s' % (ins.mnemonic, self.replace_rip(ins,None)) )).encode('hex') ) 362 | newone = len( asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,mapping) ) ) ) 363 | oldone = len( asm( '%s %s' % (ins.mnemonic, self.replace_rip(ins,None) ) ) ) 364 | print '%d vs %d, %s' % (newone,oldone,newone == oldone)''' 365 | # The new "instruction length" is the length of all preceding code, plus the instructions up through the one referencing rip 366 | target = self.replace_rip(ins,mapping,len(code) + len(asm('mov [rsp-64],rax\nmov rax,[rip]')) ) 367 | if self.context.no_pic: 368 | if ins.mnemonic == 'call': 369 | self.context.stat['indcall']+=1 370 | else: 371 | self.context.stat['indjmp']+=1 372 | code += asm( template_before%(target,'') ) 373 | elif ins.mnemonic == 'call': 374 | self.context.stat['indcall']+=1 375 | if self.context.write_so: 376 | code += asm( template_before%(target,so_call_before) ) 377 | if mapping is not None: 378 | # 7 is the length of the lea rbx,[rip-%s] instruction, which needs to be added to the length of the code preceding where we access RIP 379 | code += asm(so_call_after%( (mapping[ins.address]+len(code)+7+self.context.newbase) - (ins.address+len(ins.bytes)) ) ) 380 | else: 381 | code += asm(so_call_after%( (0x8f+self.context.newbase) - (ins.address+len(ins.bytes)) ) ) 382 | else: 383 | code += asm(template_before%(target,exec_call%(ins.address+len(ins.bytes)) )) 384 | else: 385 | self.context.stat['indjmp']+=1 386 | code += asm(template_before%(target,'')) 387 | size = len(code) 388 | lookup_target = self.remap_target(ins.address,mapping,self.context.lookup_function_offset,size) 389 | #Always transform an unconditional control transfer to a jmp, but 390 | #for a call, insert a push instruction to push the original return address on the stack. 391 | #At runtime, our rewritten ret will look up the right address to return to and jmp there. 392 | #If we push a value on the stack, we have to store even FURTHER away from the stack. 
393 | #Note that calling the lookup function can move the stack pointer temporarily up to 394 | #20 bytes, which will obliterate anything stored too close to the stack pointer. That, plus 395 | #the return value we push on the stack, means we need to put it at least 28 bytes away. 396 | if self.context.no_pic: 397 | #Change target to secondary lookup function instead 398 | lookup_target = self.remap_target(ins.address,mapping,self.context.secondary_lookup_function_offset,size) 399 | code += asm( template_nopic%(lookup_target,64,ins.mnemonic) ) 400 | elif ins.mnemonic == 'call': 401 | code += asm(template_after%(lookup_target,56)) 402 | else: 403 | code += asm(template_after%(lookup_target,64)) 404 | return code 405 | 406 | def get_remap_callbacks_code(self,ins,mapping,target): 407 | '''Checks whether the target destination (expressed as the opcode string from a jmp/call instruction) 408 | is in the got, then checks if it matches a function with callbacks. It then rewrites the 409 | addresses if necessary. This will *probably* always be from jmp instructions in the PLT. 410 | NOTE: This assumes it does not have any code inserted before it, and that it comprises 411 | the first special instructions inserted for an instruction.''' 412 | if self.memory_ref_string.match(target): 413 | match = self.memory_ref_string.match(target) 414 | #Add address of instruction after this one and the offset to get destination 415 | address = (ins.address + len(ins.bytes)) + int(match.group('offset'), 16) 416 | if address in self.context.plt['entries']: 417 | if self.context.plt['entries'][address] in self.context.callbacks: 418 | print 'Found library call with callbacks: %s'%self.context.plt['entries'][address] 419 | return self.get_callback_code( ins.address, mapping, self.context.callbacks[self.context.plt['entries'][address]] ) 420 | return b'' 421 | 422 | def get_callback_code(self,address,mapping,cbargs): 423 | '''Remaps each callback argument based on index. cbargs is an array of argument indices 424 | that let us know which argument (a register in x64) we must rewrite. 425 | We insert code for each we must rewrite.''' 426 | arg_registers = ['rdi','rsi','rdx','rcx','r8','r9'] #Order of arguments in x86-64 427 | callback_template_before = ''' 428 | mov rax, %s 429 | ''' 430 | callback_template_after = ''' 431 | call $+%s 432 | mov %s, rax 433 | ''' 434 | code = asm('push rax') #Save rax, use to hold callback address 435 | for ind in cbargs: 436 | #Move value in register for that argument to rax 437 | cb_before = callback_template_before%( arg_registers[ind] ) 438 | code += asm(cb_before) #Assemble this part first so we will know the offset to the lookup function 439 | size = len(code) 440 | #Use secondary lookup function so it won't try to rewrite arguments if the callback is outside the main binary 441 | lookup_target = self.remap_target( address, mapping, self.context.secondary_lookup_function_offset, size ) 442 | cb_after = callback_template_after%( lookup_target, arg_registers[ind] ) 443 | code += asm(cb_after) #Save the new address over the original 444 | code += asm('pop rax') #Restore rax 445 | return code 446 | 447 | def in_plt(self,target): 448 | return target in range(self.context.plt['addr'],self.context.plt['addr']+self.context.plt['size']) 449 | 450 | '''def get_plt_entry(self,target): 451 | #It seems that an elf does not directly give a mapping from each entry in the plt. 
452 | #Instead, it maps from the got entries instead, making it unclear exactly where objdump 453 | #gets the information. For our purposes, since all the entries in the plt jump to the got 454 | #entry, we can read the destination address from the jmp instruction. 455 | #TODO: ensure works for x64 456 | offset = target - self.context.plt['addr'] #Get the offset into the plt 457 | #TODO: The following assumes an absolute jmp, whereas I believe it is a rip-relative jmp in x64 458 | dest = self.context.plt['data'][offset+2:offset+2+4] #Get the four bytes of the GOT address 459 | dest = struct.unpack('e[a-z][a-z])( )?[+-]( )?(0x)?[0-9a-f]+\][ ]*') 11 | pat5 = re.compile('(0x[0-9a-f]+|[0-9]+)') 12 | pat6 = re.compile('[ ]*(?P(add)|(sub)) (?P(esp)|(ebx)),(?P[0-9]*)[ ]*') 13 | pat7 = re.compile('[ ]*mov eax, word ptr.*')#Match stupid size mismatch 14 | pat8 = re.compile('[ ]*mov eax, .[xip]')#Match ridiculous register mismatch 15 | 16 | #jcxz and jecxz are removed because they don't have a large expansion 17 | JCC = ['jo','jno','js','jns','je','jz','jne','jnz','jb','jnae', 18 | 'jc','jnb','jae','jnc','jbe','jna','ja','jnbe','jl','jnge','jge', 19 | 'jnl','jle','jng','jg','jnle','jp','jpe','jnp','jpo'] 20 | 21 | #Simple cache code. Called after more complex preprocessing of assembly source. 22 | def _asm(text): 23 | if text in cache: 24 | return cache[text] 25 | else: 26 | with open('uncached.txt','a') as f: 27 | f.write(text+'\n') 28 | code = pwn.asm(text) 29 | cache[text] = code 30 | return code 31 | 32 | def asm(text): 33 | code = b'' 34 | for line in text.split('\n'): 35 | if not line.find(';') == -1: 36 | line = line[:line.find(';')]#Eliminate comments 37 | #Check for offsets ($+) 38 | match = pat.search(line) 39 | if match and match.group() != '$+0x8f': 40 | off = int(match.group()[2:],16) 41 | line = line.strip() 42 | mnemonic = line[:line.find(' ')] 43 | line = pat.sub('$+0x8f',line) #Replace actual offset with dummy 44 | newcode = _asm(line) #Assembled code with dummy offset 45 | if mnemonic in ['jmp','call']: 46 | off-=5 #Subtract 5 because the large encoding knows it's 5 bytes long 47 | newcode = newcode[0]+struct.pack(']" to "mov eax, dword ptr []"' 60 | code+=b'\xa1' + struct.pack(' 0x7f: 79 | line = pat5.sub('0x8f',line) 80 | original = struct.pack(' 0x7f: 103 | newcode = _asm('%s %s,0x8f'%(mnemonic,register) ) 104 | newcode = newcode[:2] + struct.pack(']" to "mov eax, dword ptr []"' 114 | code+=_asm(line.replace(' word',' dword')) 115 | elif pat8.match(line): 116 | print 'WARNING: silently converting "mov eax, [xip]" to "mov eax, e[xip]"' 117 | code+=_asm(line.replace(', ',', e')) 118 | else: 119 | code+=_asm(line) 120 | return code 121 | 122 | def oldasm(text): 123 | if 'mov [esp-16], eax\n mov eax, ' in text: 124 | print text 125 | if not pat3.search(text): 126 | print str(pwn.asm(text)).encode('hex') 127 | text2 = ''' 128 | mov [esp-16], eax 129 | mov eax, dword ptr [eax*4 + 0x80597bc] 130 | ''' 131 | print str(pwn.asm(text2)).encode('hex') 132 | raise Exception 133 | if '$+' in text: 134 | code = b'' 135 | for line in text.split('\n'): 136 | match = pat.search(line) 137 | if match and match.group() != '$+0x8f': 138 | #print 'ORIGINAL: %s'%line 139 | #print 'MATCH %s'%match.group() 140 | off = int(match.group()[2:],16) 141 | #print 'offset %x'%off 142 | line = line.strip() 143 | mnemonic = line[:line.find(' ')] 144 | #print 'mnemonic %s'%mnemonic 145 | #before = _asm(line) 146 | #print 'BEFORE: %s'%before.encode('hex') 147 | line = pat.sub('$+0x8f',line) #Replace actual offset 
with dummy 148 | newcode = _asm(line) #Assembled code with dummy offset 149 | #print 'DUMMY: %s'%newcode.encode('hex') 150 | if mnemonic in ['jmp','call']: 151 | off-=5 #Subtract 5 because the large encoding knows it's 5 bytes long 152 | newcode = newcode[0]+struct.pack(' 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include 26 | #else 27 | #define NULL ( (void *) 0) 28 | #endif 29 | 30 | unsigned int __attribute__ ((noinline)) my_read(int, char *, unsigned int); 31 | int __attribute__ ((noinline)) my_open(const char *); 32 | void populate_mapping(unsigned int, unsigned int, unsigned int, unsigned int *); 33 | void process_maps(char *, unsigned int *); 34 | unsigned int lookup(unsigned int, unsigned int *); 35 | 36 | #ifdef DEBUG 37 | int wrapper(unsigned int *global_mapping){ 38 | #else 39 | int _start(void *global_mapping){ 40 | #endif 41 | // force string to be stored on the stack even with optimizations 42 | //char maps_path[] = "/proc/self/maps\0"; 43 | volatile int maps_path[] = { 44 | 0x6f72702f, 45 | 0x65732f63, 46 | 0x6d2f666c, 47 | 0x00737061, 48 | }; 49 | 50 | unsigned int buf_size = 0x10000; 51 | char buf[buf_size]; 52 | int proc_maps_fd; 53 | int cnt, offset = 0; 54 | 55 | 56 | proc_maps_fd = my_open((char *) &maps_path); 57 | cnt = my_read(proc_maps_fd, buf, buf_size); 58 | while( cnt != 0 && offset < buf_size ){ 59 | offset += cnt; 60 | cnt = my_read(proc_maps_fd, buf+offset, buf_size-offset); 61 | } 62 | buf[offset] = '\0';// must null terminate 63 | 64 | #ifdef DEBUG 65 | printf("READ:\n%s\n", buf); 66 | // simulation for testing - dont call process maps 67 | populate_mapping(0x08800000, 0x08880000, 0x07000000, global_mapping); 68 | /* 69 | int i; 70 | for (i = 0x08800000; i < 0x08880000; i++){ 71 | if (lookup(i, global_mapping) != 0x07000000){ 72 | printf("Failed lookup of 0x%08x\n", i); 73 | } 74 | } 75 | */ 76 | //chedck edge cases 77 | 78 | lookup(0x08800000-1, global_mapping); 79 | lookup(0x08800000, global_mapping); 80 | lookup(0x08880000+1, global_mapping); 81 | //printf("0x08812345 => 0x%08x\n", lookup(0x08812345, global_mapping)); 82 | #else 83 | process_maps(buf, global_mapping); 84 | #endif 85 | return 0; 86 | } 87 | 88 | #ifdef DEBUG 89 | unsigned int lookup(unsigned int addr, unsigned int *global_mapping){ 90 | unsigned int index = addr >> 12; 91 | //if (global_mapping[index] == 0xffffffff){ 92 | printf("0x%08x :: mapping[%d] :: &0x%p :: 0x%08x\n", addr, index, &(global_mapping[index]), global_mapping[index]); 93 | //} 94 | return global_mapping[index]; 95 | } 96 | #endif 97 | 98 | unsigned int __attribute__ ((noinline)) my_read(int fd, char *buf, unsigned int count){ 99 | unsigned int bytes_read; 100 | asm volatile( 101 | ".intel_syntax noprefix\n" 102 | "mov eax, 3\n" 103 | "mov ebx, %1\n" 104 | "mov ecx, %2\n" 105 | "mov edx, %3\n" 106 | "int 0x80\n" 107 | "mov %0, eax\n" 108 | : "=g" (bytes_read) 109 | : "g" (fd), "g" (buf), "g" (count) 110 | : "ebx", "esi", "edi" 111 | ); 112 | return bytes_read; 113 | } 114 | 115 | int __attribute__ ((noinline)) my_open(const char *path){ 116 | int fp; 117 | asm volatile( 118 | ".intel_syntax noprefix\n" 119 | "mov eax, 5\n" 120 | "mov ebx, %1\n" 121 | "mov ecx, 0\n" 122 | "mov edx, 0\n" 123 | "int 0x80\n" 124 | "mov %0, eax\n" 125 | : "=r" (fp) 126 | : "g" (path) 127 | : "ebx", "esi", "edi" 128 | ); 129 | return fp; 130 | } 131 | 132 | int is_exec(char *line){ 133 | // e.g., "08048000-08049000 r-xp ..." 
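// column layout of "08048000-08049000 r-xp ...": 8-char start address, '-', 8-char end
// address, a space, then the permission flags, so 'r' is line[18], 'w' is line[19] and
// 'x' is line[20]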
134 | return line[20] == 'x'; 135 | } 136 | 137 | int is_write(char *line){ 138 | // e.g., "08048000-08049000 rw-p ..." 139 | return line[19] == 'w'; 140 | } 141 | 142 | char *next_line(char *line){ 143 | /* 144 | * finds the next line to process 145 | */ 146 | for (; line[0] != '\0'; line++){ 147 | if (line[0] == '\n'){ 148 | if (line[1] == '\0') 149 | return NULL; 150 | return line+1; 151 | } 152 | } 153 | return NULL; 154 | } 155 | 156 | unsigned int my_atoi(char *a){ 157 | /* 158 | * convert 8 byte hex string into its integer representation 159 | * assumes input is from /proc/./maps 160 | * i.e., 'a' is a left-padded 8 byte lowercase hex string 161 | * e.g., "0804a000" 162 | */ 163 | unsigned int i = 0; 164 | int place, digit; 165 | for (place = 7; place >= 0; place--, a++){ 166 | digit = (int)(*a) - 0x30; 167 | if (digit > 9) 168 | digit -= 0x27; // digit was [a-f] 169 | i += digit << (place << 2); 170 | } 171 | return i; 172 | } 173 | 174 | void parse_range(char *line, unsigned int *start, unsigned int *end){ 175 | // e.g., "08048000-08049000 ..." 176 | *start = my_atoi(line); 177 | *end = my_atoi(line+9); 178 | } 179 | 180 | void populate_mapping(unsigned int start, unsigned int end, unsigned int lookup_function, unsigned int *global_mapping){ 181 | unsigned int index = start >> 12; 182 | int i; 183 | for(i = 0; i < (end - start) / 0x1000; i++){ 184 | global_mapping[index + i] = lookup_function; 185 | } 186 | #ifdef DEBUG 187 | printf("Wrote %d entries\n", i); 188 | #endif 189 | } 190 | 191 | void process_maps(char *buf, unsigned int *global_mapping){ 192 | /* 193 | * Process buf which contains output of /proc/self/maps 194 | * populate global_mapping for each executable set of pages 195 | */ 196 | char *line = buf; 197 | //unsigned int global_start, global_end; 198 | unsigned int old_text_start, old_text_end; 199 | unsigned int new_text_start, new_text_end; 200 | 201 | //Assume global mapping is first entry at 0x7000000 and that there is nothing before 202 | //Skip global mapping 203 | line = next_line(line); 204 | do{ // process each block of maps 205 | // process all segments from this object under very specific assumptions 206 | if ( is_exec(line) ){ 207 | if( !is_write(line) ){ 208 | parse_range(line, &old_text_start, &old_text_end); 209 | }else{ 210 | parse_range(line, &new_text_start, &new_text_end); 211 | populate_mapping(old_text_start, old_text_end, new_text_start, global_mapping); 212 | } 213 | } 214 | line = next_line(line); 215 | } while(line != NULL); 216 | // assume the very last executable and non-writable segment is that of the dynamic linker (ld-X.X.so) 217 | // populate those ranges with the value 0x00000000 which will be compared against in the global lookup function 218 | populate_mapping(old_text_start, old_text_end, 0x00000000, global_mapping); 219 | } 220 | 221 | #ifdef DEBUG 222 | int main(void){ 223 | void *mapping_base = (void *)0x09000000; 224 | int fd = open("./map_shell", O_RDWR); 225 | void *global_mapping = mmap(mapping_base, 0x400000, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); 226 | if (global_mapping != mapping_base){ 227 | printf("failed to get requested base addr\n"); 228 | exit(1); 229 | } 230 | wrapper(global_mapping); 231 | 232 | return 0; 233 | } 234 | #endif 235 | 236 | -------------------------------------------------------------------------------- /x86_runtime.py: -------------------------------------------------------------------------------- 1 | from x86_assembler import _asm,asm 2 | 3 | class X86Runtime(object): 4 | def 
__init__(self,context): 5 | self.context = context 6 | 7 | def get_lookup_code(self,base,size,lookup_off,mapping_off): 8 | #Example assembly for lookup function 9 | ''' 10 | push edx 11 | mov edx,eax 12 | call get_eip 13 | get_eip: 14 | pop eax ;Get current instruction pointer 15 | sub eax,0x8248 ;Subtract offset from instruction pointer val to get new text base addr 16 | sub edx,0x8048000 ;Compare to start (exclusive) and set edx to an offset in the mapping 17 | jl outside ;Out of bounds (too small) 18 | cmp edx,0x220 ;Compare to end (inclusive) (note we are now comparing to the size) 19 | jge outside ;Out of bounds (too big) 20 | mov edx,[mapping+edx*4] ;Retrieve mapping entry (can't do this directly in generated func) 21 | cmp edx, 0xffffffff ;Compare to invalid entry 22 | je failure ;It was an invalid entry 23 | add eax,edx ;Add the offset of the destination to the new text section base addr 24 | pop edx 25 | ret 26 | outside: ;If the address is out of the mapping bounds, return original address 27 | add edx,0x8048000 ;Undo subtraction of base, giving us the originally requested address 28 | mov eax,edx ;Place the original request back in eax 29 | pop edx 30 | jmp global_lookup ;Check if global lookup can find this 31 | failure: 32 | hlt 33 | ''' 34 | lookup_template = ''' 35 | push ebx 36 | mov ebx,eax 37 | call get_eip 38 | get_eip: 39 | pop eax 40 | sub eax,%s 41 | %s 42 | jb outside 43 | cmp ebx,%s 44 | jae outside 45 | mov ebx,[eax+ebx*4+%s] 46 | cmp ebx, 0xffffffff 47 | je failure 48 | add eax,ebx 49 | pop ebx 50 | ret 51 | outside: 52 | %s 53 | mov eax,ebx 54 | pop ebx 55 | mov DWORD PTR [esp-32],%s 56 | jmp [esp-32] 57 | failure: 58 | hlt 59 | ''' 60 | exec_code = ''' 61 | sub ebx,%s 62 | ''' 63 | exec_restore = ''' 64 | add ebx,%s 65 | ''' 66 | exec_only_lookup = ''' 67 | lookup: 68 | push ebx 69 | mov ebx,eax 70 | call get_eip 71 | get_eip: 72 | pop eax 73 | sub eax,%s 74 | sub ebx,%s 75 | jb outside 76 | cmp ebx,%s 77 | jae outside 78 | mov ebx,[eax+ebx*4+%s] 79 | add eax,ebx 80 | pop ebx 81 | ret 82 | 83 | outside: 84 | add ebx,%s 85 | mov eax,[esp+8] 86 | call lookup 87 | mov [esp+8],eax 88 | mov eax,ebx 89 | pop ebx 90 | ret 91 | ''' 92 | #For an .so, it can be loaded at an arbitrary address, so we cannot depend on 93 | #the base address being in a fixed location. Therefore, we instead compute 94 | #the old text section's start address by using the new text section's offset 95 | #from it. The new text section's offset equals the lookup address and is 96 | #stored in eax. I use lea instead of add because it doesn't affect the flags, 97 | #which are used to determine if ebx is outside the range. 98 | so_code = ''' 99 | sub eax,%s 100 | sub ebx,eax 101 | lea eax,[eax+%s] 102 | ''' 103 | so_restore = ''' 104 | sub eax,%s 105 | add ebx,eax 106 | add eax,%s 107 | ''' 108 | #retrieve eip 8 bytes after start of lookup function 109 | if self.context.write_so: 110 | return _asm(lookup_template%(lookup_off+8,so_code%(self.context.newbase,self.context.newbase),size,mapping_off,so_restore%(self.context.newbase,self.context.newbase),self.context.global_lookup)) 111 | elif self.context.exec_only: 112 | return _asm( exec_only_lookup%(lookup_off+8,base,size,mapping_off,base) ) 113 | else: 114 | return _asm(lookup_template%(lookup_off+8,exec_code%base,size,mapping_off,exec_restore%base,self.context.global_lookup)) 115 | 116 | def get_secondary_lookup_code(self,base,size,sec_lookup_off,mapping_off): 117 | '''This secondary lookup is only used when rewriting only the main executable. 
It is a second, simpler 118 | lookup function that is used by ret instructions and does NOT rewrite a return address on the stack 119 | when the destination is outside the mapping. It instead simply returns the original address and that's 120 | it. The only reason I'm doing this by way of a secondary lookup is this should be faster than a 121 | a parameter passed at runtime, so I need to statically have an offset to jump to in the case of returns. 122 | This is a cleaner way to do it than split the original lookup to have two entry points.''' 123 | secondary_lookup = ''' 124 | lookup: 125 | push ebx 126 | mov ebx,eax 127 | call get_eip 128 | get_eip: 129 | pop eax 130 | sub eax,%s 131 | sub ebx,%s 132 | jb outside 133 | cmp ebx,%s 134 | jae outside 135 | mov ebx,[eax+ebx*4+%s] 136 | add eax,ebx 137 | pop ebx 138 | ret 139 | 140 | outside: 141 | add ebx,%s 142 | mov eax,ebx 143 | pop ebx 144 | ret 145 | ''' 146 | return _asm( secondary_lookup%(sec_lookup_off+8,base,size,mapping_off,base) ) 147 | 148 | def get_global_lookup_code(self): 149 | global_lookup_template = ''' 150 | cmp eax,[%s] 151 | jz sysinfo 152 | glookup: 153 | cmp BYTE PTR [gs:%s],1 154 | jz failure 155 | mov BYTE PTR [gs:%s],1 156 | push eax 157 | shr eax,12 158 | shl eax,2 159 | mov eax,[%s+eax] 160 | mov DWORD PTR [esp-32],eax 161 | cmp eax, 0xffffffff 162 | jz abort 163 | test eax,eax 164 | jz loader 165 | pop eax 166 | call [esp-36] 167 | mov BYTE PTR [gs:%s],0 168 | ret 169 | loader: 170 | mov BYTE PTR [gs:%s],0 171 | pop eax 172 | sysinfo: 173 | push eax 174 | mov eax,[esp+8] 175 | call glookup 176 | mov [esp+8],eax 177 | pop eax 178 | ret 179 | failure: 180 | hlt 181 | abort: 182 | hlt 183 | mov eax,1 184 | int 0x80 185 | ''' 186 | return _asm(global_lookup_template%(self.context.global_sysinfo,self.context.global_flag,self.context.global_flag,self.context.global_sysinfo+4,self.context.global_flag,self.context.global_flag)) 187 | 188 | def get_auxvec_code(self,entry): 189 | #Example assembly for searching the auxiliary vector 190 | ''' 191 | mov [esp-4],esi ;I think there's no need to save these, but in case somehow the 192 | mov [esp-8],ecx ;linker leaves something of interest for _start, let's save them 193 | mov esi,[esp] ;Retrieve argc 194 | mov ecx,esp ;Retrieve address of argc 195 | lea ecx,[ecx+esi*4+4] ;Skip argv 196 | loopenv: ;Iterate through each environment variable 197 | add ecx,4 ;The first loop skips over the NULL after argv 198 | mov esi,[ecx] ;Retrieve environment variable 199 | test esi,esi ;Check whether it is NULL 200 | jnz loopenv ;If not, continue through environment vars 201 | add ecx,4 ;Hop over 0 byte to first entry 202 | loopaux: ;Iterate through auxiliary vector, looking for AT_SYSINFO (32) 203 | mov esi,[ecx] ;Retrieve the type field of this entry 204 | cmp esi,32 ;Compare to 32, the entry we want 205 | jz foundsysinfo ;Found it 206 | test esi,esi ;Check whether we found the entry signifying the end of auxv 207 | jz restore ;Go to _start if we reach the end 208 | add ecx,8 ;Each entry is 8 bytes; go to next 209 | jmp loopaux 210 | foundsysinfo: 211 | mov esi,[ecx+4] ;Retrieve sysinfo address 212 | mov [sysinfo],esi ;Save address 213 | restore: 214 | mov esi,[esp-4] 215 | mov ecx,[esp-8] 216 | push global_mapping ;Push address of global mapping for popgm 217 | call popgm 218 | add esp,4 ;Pop address of global mapping 219 | jmp realstart 220 | ''' 221 | auxvec_template = ''' 222 | mov [esp-4],esi 223 | mov [esp-8],ecx 224 | mov esi,[esp] 225 | mov ecx,esp 226 | lea ecx,[ecx+esi*4+4] 227 | 
loopenv: 228 | add ecx,4 229 | mov esi,[ecx] 230 | test esi,esi 231 | jnz loopenv 232 | add ecx,4 233 | loopaux: 234 | mov esi,[ecx] 235 | cmp esi,32 236 | jz foundsysinfo 237 | test esi,esi 238 | jz restore 239 | add ecx,8 240 | jmp loopaux 241 | foundsysinfo: 242 | mov esi,[ecx+4] 243 | mov [%s],esi 244 | restore: 245 | mov esi,[esp-4] 246 | mov ecx,[esp-8] 247 | push %s 248 | call [esp] 249 | add esp,4 250 | mov DWORD PTR [esp-12], %s 251 | jmp [esp-12] 252 | ''' 253 | return _asm(auxvec_template%(self.context.global_sysinfo,self.context.global_lookup+self.context.popgm_offset,self.context.newbase+entry)) 254 | 255 | def get_popgm_code(self): 256 | call_popgm = ''' 257 | pushad 258 | push %s 259 | call $+0xa 260 | add esp,4 261 | popad 262 | ret 263 | ''' 264 | popgmbytes = asm(call_popgm%(self.context.global_sysinfo+4)) 265 | with open('x86_%s' % self.context.popgm) as f: 266 | popgmbytes+=f.read() 267 | return popgmbytes 268 | 269 | def get_global_mapping_bytes(self): 270 | globalbytes = self.get_global_lookup_code() 271 | #globalbytes+='\0' #flag field 272 | globalbytes += self.get_popgm_code() 273 | globalbytes += '\0\0\0\0' #sysinfo field 274 | #Global mapping (0x3ffff8 0xff bytes) ending at kernel addresses. Note it is NOT ending 275 | #at 0xc0000000 because this boundary is only true for 32-bit kernels. For 64-bit kernels, 276 | #the application is able to use most of the entire 4GB address space, and the kernel only 277 | #holds onto a tiny 8KB at the top of the address space. 278 | globalbytes += '\xff'*((0xffffe000>>12)<<2) 279 | # Allocate extra space for any additional global variables that 280 | # instrumentation code might require 281 | if self.context.alloc_globals > 0: 282 | globalbytes += '\x00'*self.context.alloc_globals 283 | return globalbytes 284 | -------------------------------------------------------------------------------- /x86_translator.py: -------------------------------------------------------------------------------- 1 | from x86_assembler import asm 2 | from capstone.x86 import X86_OP_REG,X86_OP_MEM,X86_OP_IMM 3 | import struct 4 | import re 5 | from translator import Translator 6 | 7 | class X86Translator(Translator): 8 | 9 | def __init__(self,before_callback,context): 10 | self.before_inst_callback = before_callback 11 | self.context = context 12 | self.memory_ref_string = re.compile(u'^dword ptr \[(?P
0x[0-9a-z]+)\]$') 13 | #From Brian's Static_phase.py 14 | self.JCC = ['jo','jno','js','jns','je','jz','jne','jnz','jb','jnae', 15 | 'jc','jnb','jae','jnc','jbe','jna','ja','jnbe','jl','jnge','jge', 16 | 'jnl','jle','jng','jg','jnle','jp','jpe','jnp','jpo','jcxz','jecxz'] 17 | 18 | def translate_one(self,ins,mapping): 19 | if ins.mnemonic in ['call','jmp']: #Unconditional jump 20 | return self.translate_uncond(ins,mapping) 21 | elif ins.mnemonic in self.JCC: #Conditional jump 22 | return self.translate_cond(ins,mapping) 23 | elif ins.mnemonic == 'ret': 24 | return self.translate_ret(ins,mapping) 25 | elif ins.mnemonic in ['retn','retf','repz']: #I think retn is not used in Capstone 26 | #print 'WARNING: unimplemented %s %s'%(ins.mnemonic,ins.op_str) 27 | return '\xf4\xf4\xf4\xf4' #Create obvious cluster of hlt instructions 28 | else: #Any other instruction 29 | inserted = self.before_inst_callback(ins) 30 | if inserted is not None: 31 | return inserted + str(ins.bytes) 32 | return None #No translation needs to be done 33 | 34 | def translate_ret(self,ins,mapping): 35 | ''' 36 | mov [esp-28], eax ;save old eax value 37 | pop eax ;pop address from stack from which we will get destination 38 | call $+%s ;call lookup function 39 | mov [esp-4], eax ;save new eax value (destination mapping) 40 | mov eax, [esp-32] ;restore old eax value (the pop has shifted our stack so we must look at 28+4=32) 41 | jmp [esp-4] ;jmp/call to new address 42 | ''' 43 | template_before = ''' 44 | mov [esp-28], eax 45 | pop eax 46 | ''' 47 | template_after = ''' 48 | call $+%s 49 | %s 50 | mov [esp-4], eax 51 | mov eax, [esp-%d] 52 | jmp [esp-4] 53 | ''' 54 | self.context.stat['ret']+=1 55 | code = b'' 56 | inserted = self.before_inst_callback(ins) 57 | if inserted is not None: 58 | code += inserted 59 | if self.context.no_pic and ins.address != self.context.get_pc_thunk + 3: 60 | #Perform a normal return UNLESS this is the ret for the thunk. 61 | #Currently its position is hardcoded as three bytes after the thunk entry. 
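#(The thunk body is normally just "mov ebx,[esp]" followed by "ret"; that mov encodes in
# three bytes, which is where the +3 offset comes from.)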
62 | code = asm( 'ret %s'%ins.op_str ) 63 | else: 64 | code = asm(template_before) 65 | size = len(code) 66 | lookup_target = b'' 67 | if self.context.exec_only: 68 | #Special lookup for not rewriting arguments when going outside new main text address space 69 | lookup_target = self.remap_target(ins.address,mapping,self.context.secondary_lookup_function_offset,size) 70 | else: 71 | lookup_target = self.remap_target(ins.address,mapping,self.context.lookup_function_offset,size) 72 | if ins.op_str == '': 73 | code+=asm(template_after%(lookup_target,'',32)) #32 because of the value we popped 74 | else: #For ret instructions that pop imm16 bytes from the stack, add that many bytes to esp 75 | pop_amt = int(ins.op_str,16) #We need to retrieve the right eax value from where we saved it 76 | code+=asm(template_after%(lookup_target,'add esp,%d'%pop_amt,32+pop_amt)) 77 | return code 78 | 79 | def translate_cond(self,ins,mapping): 80 | self.context.stat['jcc']+=1 81 | patched = b'' 82 | inserted = self.before_inst_callback(ins) 83 | if inserted is not None: 84 | patched += inserted 85 | if ins.mnemonic in ['jcxz','jecxz']: #These instructions have no long encoding 86 | jcxz_template = ''' 87 | test cx,cx 88 | ''' 89 | jecxz_template = ''' 90 | test ecx,ecx 91 | ''' 92 | target = ins.operands[0].imm # int(ins.op_str,16) The destination of this instruction 93 | #newtarget = remap_target(ins.address,mapping,target,0) 94 | if ins.mnemonic == 'jcxz': 95 | patched+=asm(jcxz_template) 96 | else: 97 | patched+=asm(jecxz_template) 98 | newtarget = self.remap_target(ins.address,mapping,target,len(patched)) 99 | #print 'want %s, but have %s instead'%(remap_target(ins.address,mapping,target,len(patched)), newtarget) 100 | #Apparently the offset for jcxz and jecxz instructions may have been wrong? How did it work before? 101 | patched += asm('jz $+%s'%newtarget) 102 | #print 'code length: %d'%len(patched) 103 | 104 | #TODO: some instructions encode to 6 bytes, some to 5, some to 2. How do we know which? 105 | #For example, for CALL, it seems to only be 5 or 2 depending on offset. 106 | #But for jg, it can be 2 or 6 depending on offset, I think because it has a 2-byte opcode. 107 | #while len(patched) < 6: #Short encoding, which we do not want 108 | # patched+='\x90' #Add padding of NOPs 109 | #The previous commented out code wouldn't even WORK now, since we insert another instruction 110 | #at the MINIMUM. I'm amazed the jcxz/jecxz code even worked at all before 111 | else: 112 | target = ins.operands[0].imm # int(ins.op_str,16) The destination of this instruction 113 | newtarget = self.remap_target(ins.address,mapping,target,len(patched)) 114 | patched+=asm(ins.mnemonic + ' $+' + newtarget) 115 | #TODO: some instructions encode to 6 bytes, some to 5, some to 2. How do we know which? 116 | #For example, for CALL, it seems to only be 5 or 2 depending on offset. 117 | #But for jg, it can be 2 or 6 depending on offset, I think because it has a 2-byte opcode. 118 | #while len(patched) < 6: #Short encoding, which we do not want 119 | # patched+='\x90' #Add padding of NOPs 120 | return patched 121 | 122 | def translate_uncond(self,ins,mapping): 123 | op = ins.operands[0] #Get operand 124 | if op.type == X86_OP_REG: # e.g. call eax or jmp ebx 125 | target = ins.reg_name(op.reg) 126 | return self.get_indirect_uncond_code(ins,mapping,target) 127 | elif op.type == X86_OP_MEM: # e.g. 
call [eax + ecx*4 + 0xcafebabe] or jmp [ebx+ecx] 128 | target = ins.op_str 129 | return self.get_indirect_uncond_code(ins,mapping,target) 130 | elif op.type == X86_OP_IMM: # e.g. call 0xdeadbeef or jmp 0xcafebada 131 | target = op.imm 132 | code = b'' 133 | inserted = self.before_inst_callback(ins) 134 | if inserted is not None: 135 | code += inserted 136 | if self.context.no_pic and target != self.context.get_pc_thunk: 137 | #push nothing if no_pic UNLESS it's the thunk 138 | #We only support DIRECT calls to the thunk 139 | if ins.mnemonic == 'call': 140 | self.context.stat['dircall']+=1 141 | else: 142 | self.context.stat['dirjmp']+=1 143 | elif ins.mnemonic == 'call': #If it's a call, push the original address of the next instruction 144 | self.context.stat['dircall']+=1 145 | exec_call = ''' 146 | push %s 147 | ''' 148 | so_call_before = ''' 149 | push ebx 150 | call $+5 151 | ''' 152 | so_call_after = ''' 153 | pop ebx 154 | sub ebx,%s 155 | xchg ebx,[esp] 156 | ''' 157 | if self.context.write_so: 158 | code += asm(so_call_before) 159 | if mapping is not None: 160 | # Note that if somehow newbase is a very small value we could have problems with the small 161 | # encoding of sub. This could result in different lengths between the mapping and code gen phases 162 | code += asm(so_call_after%( (self.context.newbase+(mapping[ins.address]+len(code))) - (ins.address+len(ins.bytes)) ) ) 163 | else: 164 | code += asm(so_call_after%( (self.context.newbase) - (ins.address+len(ins.bytes)) ) ) 165 | else: 166 | code += asm(exec_call%(ins.address+len(ins.bytes))) 167 | else: 168 | self.context.stat['dirjmp']+=1 169 | newtarget = self.remap_target(ins.address,mapping,target,len(code)) 170 | #print "(pre)new length: %s"%len(callback_code) 171 | #print "target: %s"%hex(target) 172 | #print "newtarget: %s"%newtarget 173 | if self.context.no_pic and target != self.context.get_pc_thunk: 174 | code += asm( '%s $+%s'%(ins.mnemonic,newtarget) ) 175 | else: 176 | patched = asm('jmp $+%s'%newtarget) 177 | if len(patched) == 2: #Short encoding, which we do not want 178 | patched+='\x90\x90\x90' #Add padding of 3 NOPs 179 | code += patched 180 | #print "new length: %s"%len(callback_code+patched) 181 | return code 182 | return None 183 | 184 | def get_indirect_uncond_code(self,ins,mapping,target): 185 | #Commented assembly 186 | ''' 187 | mov [esp-28], eax ;save old eax value (very far above the stack because of future push/call) 188 | mov eax, %s ;read location in memory from which we will get destination 189 | %s ;if a call, we push return address here 190 | call $+%s ;call lookup function 191 | mov [esp-4], eax ;save new eax value (destination mapping) 192 | mov eax, [esp-%s] ;restore old eax value (offset depends on whether return address pushed) 193 | jmp [esp-4] ;jmp to new address 194 | ''' 195 | template_before = ''' 196 | mov [esp-32], eax 197 | mov eax, %s 198 | %s 199 | ''' 200 | exec_call = ''' 201 | push %s 202 | ''' 203 | so_call_before = ''' 204 | push ebx 205 | call $+5 206 | ''' 207 | so_call_after = ''' 208 | pop ebx 209 | sub ebx,%s 210 | xchg ebx,[esp] 211 | ''' 212 | template_after = ''' 213 | call $+%s 214 | mov [esp-4], eax 215 | mov eax, [esp-%s] 216 | jmp [esp-4] 217 | ''' 218 | template_nopic = ''' 219 | call $+%s 220 | mov [esp-4], eax 221 | mov eax, [esp-%s] 222 | %s [esp-4] 223 | ''' 224 | #TODO: This is somehow still the bottleneck, so this needs to be optimized 225 | code = b'' 226 | if self.context.exec_only: 227 | code += self.get_remap_callbacks_code(ins.address,mapping,target) 
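#(Offset bookkeeping for the templates above: eax is saved at [esp-32]; when a return
# address is pushed for a call, esp drops by 4 and that slot becomes [esp-28], which is
# why template_after is filled in with 28 for calls and 32 for jmps further down.)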
228 | #NOTE: user instrumentation code comes after callbacks code. No particular reason to put it either way, 229 | #other than perhaps consistency, but for now this is easier. 230 | inserted = self.before_inst_callback(ins) 231 | if inserted is not None: 232 | code += inserted 233 | if self.context.no_pic: 234 | if ins.mnemonic == 'call': 235 | self.context.stat['indcall']+=1 236 | else: 237 | self.context.stat['indjmp']+=1 238 | code += asm( template_before%(target,'') ) 239 | elif ins.mnemonic == 'call': 240 | self.context.stat['indcall']+=1 241 | if self.context.write_so: 242 | code += asm( template_before%(target,so_call_before) ) 243 | if mapping is not None: 244 | code += asm(so_call_after%( (mapping[ins.address]+len(code)+self.context.newbase) - (ins.address+len(ins.bytes)) ) ) 245 | #print 'CODE LEN/1: %d\n%s'%(len(code),code.encode('hex')) 246 | else: 247 | code += asm(so_call_after%( (0x8f+self.context.newbase) - (ins.address+len(ins.bytes)) ) ) 248 | #print 'CODE LEN/0: %d\n%s'%(len(code),code.encode('hex')) 249 | else: 250 | code += asm(template_before%(target,exec_call%(ins.address+len(ins.bytes)) )) 251 | else: 252 | self.context.stat['indjmp']+=1 253 | code += asm(template_before%(target,'')) 254 | size = len(code) 255 | lookup_target = self.remap_target(ins.address,mapping,self.context.lookup_function_offset,size) 256 | #Always transform an unconditional control transfer to a jmp, but 257 | #for a call, insert a push instruction to push the original return address on the stack. 258 | #At runtime, our rewritten ret will look up the right address to return to and jmp there. 259 | #If we push a value on the stack, we have to store even FURTHER away from the stack. 260 | #Note that calling the lookup function can move the stack pointer temporarily up to 261 | #20 bytes, which will obliterate anything stored too close to the stack pointer. That, plus 262 | #the return value we push on the stack, means we need to put it at least 28 bytes away. 263 | if self.context.no_pic: 264 | #Change target to secondary lookup function instead 265 | lookup_target = self.remap_target(ins.address,mapping,self.context.secondary_lookup_function_offset,size) 266 | code += asm( template_nopic%(lookup_target,32,ins.mnemonic) ) 267 | elif ins.mnemonic == 'call': 268 | code += asm(template_after%(lookup_target,28)) 269 | else: 270 | code += asm(template_after%(lookup_target,32)) 271 | return code 272 | 273 | def get_remap_callbacks_code(self,insaddr,mapping,target): 274 | '''Checks whether the target destination (expressed as the opcode string from a jmp/call instruction) 275 | is in the got, then checks if it matches a function with callbacks. It then rewrites the 276 | addresses if necessary. This will *probably* always be from jmp instructions in the PLT. 
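    (The callbacks themselves are function-pointer arguments the program passes to the library
    call, e.g. a handler or comparison routine; their values are original code addresses, so they
    are translated here before the library ever stores or invokes them.)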
277 | NOTE: This assumes it does not have any code inserted before it, and that it comprises the first 278 | special instructions inserted for an instruction.''' 279 | if self.memory_ref_string.match(target): 280 | address = int(self.memory_ref_string.match(target).group('address'), 16) 281 | if address in self.context.plt['entries']: 282 | if self.context.plt['entries'][address] in self.context.callbacks: 283 | print 'Found library call with callbacks: %s'%self.context.plt['entries'][address] 284 | return self.get_callback_code( insaddr, mapping, self.context.callbacks[self.context.plt['entries'][address]] ) 285 | return b'' 286 | 287 | def get_callback_code(self,address,mapping,cbargs): 288 | '''Remaps each callback argument on the stack based on index. cbargs is an array of argument indices 289 | that let us know where on the stack we must rewrite. We insert code for each we must rewrite.''' 290 | callback_template_before = ''' 291 | mov eax, [esp+(%s*4)] 292 | ''' 293 | callback_template_after = ''' 294 | call $+%s 295 | mov [esp+(%s*4)], eax 296 | ''' 297 | code = asm('push eax') #Save eax, use to hold callback address 298 | for ind in cbargs: 299 | #Add 2 because we must skip over the saved value of eax and the return value already pushed 300 | #ASSUMPTION: before this instruction OR this instruction if it IS a call, a return address was 301 | #pushed. Since this *probably* is taking place inside the PLT, in all probability this is a 302 | #jmp instruction, and the call that got us *into* the PLT pushed a return address, so we can't rely 303 | #on the current instruction to tell us this either way. Therefore, we are *assuming* that the PLT 304 | #is always entered via a call instruction, or that somebody is calling an address in the GOT directly. 305 | #If code ever jmps based on an address in the got, we will probably corrupt the stack. 306 | cb_before = callback_template_before%( ind + 2 ) 307 | code += asm(cb_before) #Assemble this part first so we will know the offset to the lookup function 308 | size = len(code) 309 | lookup_target = self.remap_target( address, mapping, self.context.lookup_function_offset, size ) 310 | cb_after = callback_template_after%( lookup_target, ind + 2 ) 311 | code += asm(cb_after) #Save the new address over the original 312 | code += asm('pop eax') #Restore eax 313 | return code 314 | 315 | def in_plt(self,target): 316 | return target in range(self.context.plt['addr'],self.context.plt['addr']+self.context.plt['size']) 317 | 318 | def get_plt_entry(self,target): 319 | #It seems that an elf does not directly give a mapping from each entry in the plt. 320 | #Instead, it maps from the got entries instead, making it unclear exactly where objdump 321 | #gets the information. For our purposes, since all the entries in the plt jump to the got 322 | #entry, we can read the destination address from the jmp instruction. 323 | #TODO: ensure works for x64 324 | offset = target - self.context.plt['addr'] #Get the offset into the plt 325 | dest = self.context.plt['data'][offset+2:offset+2+4] #Get the four bytes of the GOT address 326 | dest = struct.unpack('