├── README.md
├── imgs
│   ├── capstone-archs.png
│   └── capstone-fields.png
└── chapter1_capstone_preview.md
/README.md:
--------------------------------------------------------------------------------
1 | ## Capstone learning
2 | [Chapter 1: Capstone Preview](chapter1_capstone_preview.md)
--------------------------------------------------------------------------------
/imgs/capstone-archs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/w0lfzhang/capstone_learning/HEAD/imgs/capstone-archs.png
--------------------------------------------------------------------------------
/imgs/capstone-fields.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/w0lfzhang/capstone_learning/HEAD/imgs/capstone-fields.png
--------------------------------------------------------------------------------
/chapter1_capstone_preview.md:
--------------------------------------------------------------------------------
1 | ## Introduction
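2 | Capstone is a lightweight disassembly engine. It supports multiple hardware architectures, such as ARM, ARM64, MIPS, and X86. The framework is implemented in C, but also provides bindings for C++, Python, Ruby, OCaml, C#, Java, and Go, which makes it highly extensible.
3 | For installation, see the following link: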
4 | http://www.capstone-engine.org/documentation.html
5 |
6 | ## Simple Sample
7 | Let's start with a simple example that shows the basic usage of Capstone:
8 | ```python
9 | from capstone import *
10 |
11 | CODE = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"
12 |
13 | md = Cs(CS_ARCH_X86, CS_MODE_64)  # x86 architecture, 64-bit mode
14 | for i in md.disasm(CODE, 0x1000):  # 0x1000 is the address of the first instruction
15 |     print("0x%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))
16 | ```
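17 | First, the Cs class is used to create a disassembler instance. It takes two arguments: the hardware architecture and the hardware mode. See the figure below.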
18 | ![capstone-archs](imgs/capstone-archs.png)
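19 | Next, the instance's disasm method is called to disassemble the binary code. It yields CsInsn objects, one per instruction, and each of them has the following fields: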
20 | ![capstone-fields](imgs/capstone-fields.png)
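21 | Since version 2.1 there is also a function named disasm_lite. It works much like disasm, but yields a tuple (address, size, mnemonic, op_str) for each instruction instead of a CsInsn object.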
22 |
23 | ## Details
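24 | The Cs class has a detail field. When we need additional important information, such as the registers an instruction implicitly reads or writes, or the semantic groups an instruction belongs to, we can enable it like this: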
25 | ```python
26 | md.detail = True
27 | ```
28 | Consider the following example:
29 | ```python
30 | from capstone import *
31 | from capstone.arm import *
32 |
33 | CODE = b"\xf1\x02\x03\x0e\x00\x00\xa0\xe3\x02\x30\xc1\xe7\x00\x00\x53\xe3"
34 |
35 | md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
36 | md.detail = True  # required for regs_read and groups
37 |
38 | for i in md.disasm(CODE, 0x1000):
39 |     if i.id in (ARM_INS_BL, ARM_INS_CMP):
40 |         print("0x%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))
41 |
42 |         if len(i.regs_read) > 0:
43 |             print("\tImplicit registers read: ", end="")
44 |             for r in i.regs_read:
45 |                 print("%s " % i.reg_name(r), end="")
46 |             print()
47 |
48 |         if len(i.groups) > 0:
49 |             print("\tThis instruction belongs to groups:", end=" ")
50 |             for g in i.groups:
51 |                 print("%u" % g, end=" ")
52 |             print()
53 | ```
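54 | Compared with the previous example, this code additionally accesses the implicitly read registers and the instruction groups; the rest is much the same.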
55 |
56 | ## More Details
57 | Here is another example:
58 | ```python
59 | from capstone import *
60 | from capstone.arm64 import *
61 |
62 | CODE = b"\xe1\x0b\x40\xb9\x20\x04\x81\xda\x20\x08\x02\x8b"
63 |
64 | md = Cs(CS_ARCH_ARM64, CS_MODE_ARM)
65 | md.detail = True
66 |
67 | for insn in md.disasm(CODE, 0x38):
68 |     print("0x%x:\t%s\t%s" % (insn.address, insn.mnemonic, insn.op_str))
69 |
70 |     if len(insn.operands) > 0:
71 |         print("\tNumber of operands: %u" % len(insn.operands))
72 |         c = -1
73 |         for i in insn.operands:
74 |             c += 1
75 |             if i.type == ARM64_OP_REG:
76 |                 print("\t\toperands[%u].type: REG = %s" % (c, insn.reg_name(i.value.reg)))
77 |             if i.type == ARM64_OP_IMM:
78 |                 print("\t\toperands[%u].type: IMM = 0x%x" % (c, i.value.imm))
79 |             if i.type == ARM64_OP_CIMM:
80 |                 print("\t\toperands[%u].type: C-IMM = %u" % (c, i.value.imm))
81 |             if i.type == ARM64_OP_FP:
82 |                 print("\t\toperands[%u].type: FP = %f" % (c, i.value.fp))
83 |             if i.type == ARM64_OP_MEM:
84 |                 print("\t\toperands[%u].type: MEM" % c)
85 |                 if i.value.mem.base != 0:
86 |                     print("\t\t\toperands[%u].mem.base: REG = %s" \
87 |                         % (c, insn.reg_name(i.value.mem.base)))
88 |                 if i.value.mem.index != 0:
89 |                     print("\t\t\toperands[%u].mem.index: REG = %s" \
90 |                         % (c, insn.reg_name(i.value.mem.index)))
91 |                 if i.value.mem.disp != 0:
92 |                     print("\t\t\toperands[%u].mem.disp: 0x%x" \
93 |                         % (c, i.value.mem.disp))
94 |
95 |             if i.shift.type != ARM64_SFT_INVALID and i.shift.value:
96 |                 print("\t\t\tShift: type = %u, value = %u" \
97 |                     % (i.shift.type, i.shift.value))
98 |
99 |             if i.ext != ARM64_EXT_INVALID:
100 |                 print("\t\t\tExt: %u" % i.ext)
101 |
102 |     if insn.writeback:
103 |         print("\tWrite-back: True")
104 |     if insn.cc not in [ARM64_CC_AL, ARM64_CC_INVALID]:
105 |         print("\tCode condition: %u" % insn.cc)
106 |     if insn.update_flags:
107 |         print("\tUpdate-flags: True")
108 | ```
109 | This example checks the type of each operand and then prints the corresponding details.
110 |
111 | ## Run-time Options
112 | ### Syntax option
113 | The assembly syntax of the output can be set with the following options:
114 | ```python
115 | md.syntax = CS_OPT_SYNTAX_ATT
116 | # or
117 | md.syntax = CS_OPT_SYNTAX_INTEL
118 | ```
119 |
120 | ### Change disassembly mode
121 | The disassembly mode can be changed at run time with the following option:
122 | ```python
123 | md.mode = CS_MODE_THUMB # dynamically change to Thumb mode
124 | # from now on disassemble Thumb code ....
125 |
126 | md.mode = CS_MODE_ARM # change back to Arm mode again
127 | # from now on disassemble Arm code ....
128 | ```
129 |
130 | Next, we will read the Capstone source code.
--------------------------------------------------------------------------------