├── README.md
├── imgs
│   ├── capstone-archs.png
│   └── capstone-fields.png
└── chapter1_capstone_preview.md

/README.md:
--------------------------------------------------------------------------------
## Capstone learning
[Chapter 1: Capstone Preview](chapter1_capstone_preview.md)

--------------------------------------------------------------------------------
/imgs/capstone-archs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/w0lfzhang/capstone_learning/HEAD/imgs/capstone-archs.png

--------------------------------------------------------------------------------
/imgs/capstone-fields.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/w0lfzhang/capstone_learning/HEAD/imgs/capstone-fields.png

--------------------------------------------------------------------------------
/chapter1_capstone_preview.md:
--------------------------------------------------------------------------------
## Introduction
Capstone is a lightweight disassembly engine. It supports multiple hardware architectures, such as ARM, ARM64, MIPS, and X86. The framework is implemented in C and has bindings for C++, Python, Ruby, OCaml, C#, Java, and Go, which makes it highly extensible.
For installation, see:
http://www.capstone-engine.org/documentation.html

## Simple Sample
Let's start with a simple example that shows the basic usage of Capstone:
```python
from capstone import *

# Raw x86-64 machine code to disassemble
CODE = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"

# Create a disassembler for x86 in 64-bit mode
md = Cs(CS_ARCH_X86, CS_MODE_64)
# 0x1000 is the address assigned to the first instruction
for i in md.disasm(CODE, 0x1000):
    print("0x%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))
```
First, a disassembler instance is created with the `Cs` class. The class takes two arguments: the hardware architecture and the hardware mode. The supported values are shown in the figure `imgs/capstone-archs.png`.

Then the `disasm` method of that instance is called to disassemble the binary code. It yields one `CsInsn` object per instruction; the fields of `CsInsn` are listed in the figure `imgs/capstone-fields.png`.

Since version 2.1 there is also a function `disasm_lite`. It works much like `disasm`, but returns tuples of the form (address, size, mnemonic, op_str).

## Details
The `Cs` class has a `detail` field. When we need additional information, such as the registers an instruction reads or writes, or the semantic groups it belongs to, we can enable it like this:
```python
md.detail = True
```
Consider the following example:
```python
from capstone import *
from capstone.arm import *

CODE = b"\xf1\x02\x03\x0e\x00\x00\xa0\xe3\x02\x30\xc1\xe7\x00\x00\x53\xe3"

md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
md.detail = True  # needed for regs_read and groups

for i in md.disasm(CODE, 0x1000):
    # Only report BL and CMP instructions
    if i.id in (ARM_INS_BL, ARM_INS_CMP):
        print("0x%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))

        if len(i.regs_read) > 0:
            print("\tImplicit registers read: ", end="")
            for r in i.regs_read:
                print("%s " % i.reg_name(r), end="")
            print()

        if len(i.groups) > 0:
            print("\tThis instruction belongs to groups:", end=" ")
            for g in i.groups:
                print("%u" % g, end=" ")
            print()
```
Compared with the previous example, this code additionally accesses the implicitly read registers and the instruction groups; the rest is much the same.

## More Details
Here is another example:
```python
from capstone import *
from capstone.arm64 import *

CODE = b"\xe1\x0b\x40\xb9\x20\x04\x81\xda\x20\x08\x02\x8b"

md = Cs(CS_ARCH_ARM64, CS_MODE_ARM)
md.detail = True  # needed to access operands

for insn in md.disasm(CODE, 0x38):
    print("0x%x:\t%s\t%s" % (insn.address, insn.mnemonic, insn.op_str))

    if len(insn.operands) > 0:
        print("\tNumber of operands: %u" % len(insn.operands))
        # c is the operand index, i is the operand itself
        for c, i in enumerate(insn.operands):
            if i.type == ARM64_OP_REG:
                print("\t\toperands[%u].type: REG = %s" % (c, insn.reg_name(i.value.reg)))
            if i.type == ARM64_OP_IMM:
                print("\t\toperands[%u].type: IMM = 0x%x" % (c, i.value.imm))
            if i.type == ARM64_OP_CIMM:
                print("\t\toperands[%u].type: C-IMM = %u" % (c, i.value.imm))
            if i.type == ARM64_OP_FP:
                print("\t\toperands[%u].type: FP = %f" % (c, i.value.fp))
            if i.type == ARM64_OP_MEM:
                print("\t\toperands[%u].type: MEM" % c)
                if i.value.mem.base != 0:
                    print("\t\t\toperands[%u].mem.base: REG = %s"
                          % (c, insn.reg_name(i.value.mem.base)))
                if i.value.mem.index != 0:
                    print("\t\t\toperands[%u].mem.index: REG = %s"
                          % (c, insn.reg_name(i.value.mem.index)))
                if i.value.mem.disp != 0:
                    print("\t\t\toperands[%u].mem.disp: 0x%x"
                          % (c, i.value.mem.disp))

            if i.shift.type != ARM64_SFT_INVALID and i.shift.value:
                print("\t\t\tShift: type = %u, value = %u"
                      % (i.shift.type, i.shift.value))

            if i.ext != ARM64_EXT_INVALID:
                print("\t\t\tExt: %u" % i.ext)

    if insn.writeback:
        print("\tWrite-back: True")
    if insn.cc not in (ARM64_CC_AL, ARM64_CC_INVALID):
        print("\tCode condition: %u" % insn.cc)
    if insn.update_flags:
        print("\tUpdate-flags: True")
```
This example checks the type of each operand and then prints the corresponding details.

## Run-time Options
### Syntax option
The assembly syntax can be set with the following option:
```python
md.syntax = CS_OPT_SYNTAX_ATT
# or
md.syntax = CS_OPT_SYNTAX_INTEL
```

### Change disassemble mode
The disassembly mode can be changed with the following option:
```python
md.mode = CS_MODE_THUMB # dynamically change to Thumb mode
# from now on disassemble Thumb code ....

md.mode = CS_MODE_ARM # change back to Arm mode again
# from now on disassemble Arm code ....
```

Next, we will read the Capstone source code.
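
As a supplement to the `disasm_lite` function mentioned in the Simple Sample section, here is a minimal sketch of how its tuples might be consumed. The helper names `format_lite` and `disassemble_lite` are hypothetical and not part of Capstone; only `Cs`, `disasm_lite`, `CS_ARCH_X86`, and `CS_MODE_64` come from the library. The Capstone import is done lazily inside `disassemble_lite`, so the formatting helper can be tried even without the bindings installed.

```python
def format_lite(insns):
    """Format (address, size, mnemonic, op_str) tuples, the shape that
    Cs.disasm_lite yields, into printable lines.

    Hypothetical helper for illustration; not part of Capstone.
    """
    lines = []
    for address, size, mnemonic, op_str in insns:
        lines.append("0x%x:\t%s\t%s\tsize=%d" % (address, mnemonic, op_str, size))
    return lines


def disassemble_lite(code, base=0x1000):
    """Disassemble x86-64 bytes with disasm_lite and format the result.

    Requires the capstone Python bindings; imported here so that
    format_lite stays usable without them.
    """
    from capstone import Cs, CS_ARCH_X86, CS_MODE_64

    md = Cs(CS_ARCH_X86, CS_MODE_64)
    # disasm_lite yields plain tuples instead of CsInsn objects
    return format_lite(md.disasm_lite(code, base))
```

With Capstone installed, `for line in disassemble_lite(b"\x55\x48\x8b\x05\xb8\x13\x00\x00"): print(line)` prints one line per instruction. Because `disasm_lite` skips building `CsInsn` objects, it is a reasonable choice when only the address, size, and text of each instruction are needed.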