├── Chapter-01
    └── Chapter-1.md
├── Chapter-02
    ├── 1.jpg
    └── Chapter-2.md
├── Chapter-03
    └── Chapter-3.md
├── Chapter-04
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.jpg
    └── Chapter-4.md
├── Chapter-05
    ├── 1.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    ├── 5.png
    ├── 6.png
    └── Chapter-5.md
├── Chapter-06
    ├── 1.png
    ├── 10.png
    ├── 11.png
    ├── 12.png
    ├── 13.png
    ├── 14.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    ├── 5.png
    ├── 6.png
    ├── 7.png
    ├── 8.png
    ├── 9.png
    └── Chapter-6.md
├── Chapter-07
    ├── 1.png
    └── Chapter-7.md
├── Chapter-08
    └── Chapter-8.md
├── Chapter-09
    ├── 1.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    ├── 5.png
    ├── 6.png
    ├── 7.png
    ├── 8.png
    └── Chapter-9.md
├── Chapter-10
    ├── 1.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    ├── 5.png
    ├── 6.png
    ├── 7.png
    ├── 8.png
    └── Chapter-10.md
├── Chapter-11
    └── Chapter-11.md
├── Chapter-12
    ├── 1.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    └── Chapter-12.md
├── Chapter-13
    ├── 1.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    ├── 5.jpg
    └── Chapter-13.md
├── Chapter-14
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.jpg
    ├── 4.jpg
    ├── 5.jpg
    ├── 6.jpg
    ├── 7.jpg
    └── Chapter-14.md
├── Chapter-15
    └── Chapter-15.md
├── Chapter-16
    └── Chapter-16.md
├── Chapter-17
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.jpg
    └── Chapter-17.md
├── Chapter-18
    ├── 1.png
    ├── 2.png
    └── Chapter-18.md
├── Chapter-19
    └── Chapter-19.md
├── Chapter-20
    ├── 1.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    └── Chapter-20.md
├── Chapter-21
    └── Chapter-21.md
├── Chapter-22
    └── Chapter-22.md
├── Chapter-23
    └── Chapter-23.md
├── Chapter-24
    └── Chapter-24.md
├── Chapter-25
    └── Chapter-25.md
├── Chapter-26
    └── Chapter-26.md
├── Chapter-27
    └── Chapter-27.md
├── Chapter-28
    └── Chapter-28.md
├── Chapter-29
    └── Chapter-29.md
├── Chapter-30
    └── Chapter-30.md
├── Chapter-31
    ├── 1.png
    ├── 2.png
    ├── 3.png
    ├── 4.png
    ├── 5.png
    ├── 6.png
    ├── 7.png
    └── Chapter-31.md
├── Chapter-32
    └── Chapter-32.md
├── Chapter-33
    └── Chapter-33.md
├── Chapter-54
    ├── 54.10位.md
    ├── 54.11循环.md
    ├── 54.12switch函数.md
    ├── 54.13数组.md
    ├── 54.14字符串.md
    ├── 54.15异常.md
    ├── 54.16类.md
    ├── 54.17简单的补丁.md
    ├── 54.1介绍.md
    ├── 54.2返回一个值.md
    ├── 54.3简单的计算函数.md
    ├── 54.4JVM内存模型.md
    ├── 54.5简单的函数调用.md
    ├── 54.6调用beep函数.md
    ├── 54.7线性同余随机生成器.md
    ├── 54.8条件跳转.md
    └── 54.9传递参数值.md
├── Chapter-55
    ├── 55.1_MicrosoftVisualC++.md
    └── img
    │   └── 55-1.png
├── Chapter-56
    └── 56_communication_with_the_outer_world_(win32).md
├── Chapter-57
    ├── 57.1_text_strings.md
    └── img
    │   ├── 57-1.png
    │   ├── 57-2.png
    │   ├── 57-3.png
    │   ├── 57-4.png
    │   ├── 57-5.png
    │   └── 57-6.png
├── Chapter-58
    └── 58_call_to_assert.md
├── Chapter-59
    └── 59_constans.md
├── Chapter-60
    ├── finding_the_right_instructions.md
    └── img
    │   └── 60-1.png
├── Chapter-61
    └── 61.1_xor_instructions.md
├── Chapter-62
    └── 62_using_magic_numbers_while_tracing.md
├── Chapter-63
    ├── 63.1_general_idea.md
    └── img
    │   └── 63-1.png
├── Chapter-64
    └── ArgumentsPassingMethods.md
├── Chapter-65
    └── ThreadLocalStorage.md
├── Chapter-66
    └── SystemCalls.md
├── Chapter-67
    └── Linux.md
├── Chapter-68
    ├── Windows-NT.md
    └── img
    │   ├── Figure_68.1_A_scheme_that_unites_all_PE-file_structures_related_to_imports.jpg
    │   ├── Figure_68.2_Windows_XP.jpg
    │   ├── Figure_68.3_Windows_XP.jpg
    │   ├── Figure_68.4_Windows_7.jpg
    │   ├── Figure_68.5_Windows_8.1.jpg
    │   ├── exception.jpg
    │   ├── seh3.jpg
    │   └── seh4.jpg
├── Chapter-69
    └── Disassembler.md
├── Chapter-70
    └── Debugger.md
├── Chapter-71
    └── SystemCallTracing.md
├── Chapter-72
    └── Decompilers.md
├── Chapter-73
    └── OtherTools.md
├── Chapter-84
    ├── Primitive XOR-encryption.md
    └── img
    │   ├── 84-1.png
    │   ├── 84-2.png
    │   ├── 84-3.png
    │   ├── 84-4.png
    │   ├── 84-5.png
    │   └── 84-6.png
├── Chapter-85
    ├── Millenium game save file.md
    └── img
    │   ├── 85-1.png
    │   ├── 85-2.png
    │   ├── 85-3.png
    │   ├── 85-4.png
    │   ├── 85-5.png
    │   ├── 85-6.png
    │   ├── 85-7.png
    │   └── 85-8.png
├── Chapter-86
    ├── Oracle RDBMS SYM-files.md
    └── img
    │   ├── 86-1.png
    │   ├── 86-2.png
    │   ├── 86-3.png
    │   ├── 86-4.png
    │   └── 86-5.png
├── Chapter-87
    ├── Oracle RDBMS MSB-files.md
    └── img
    │   ├── 87-1.png
    │   ├── 87-2.png
    │   ├── 87-3.png
    │   └── 87-4.png
├── README.md
└── index.md


/Chapter-01/Chapter-1.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-01/Chapter-1.md


--------------------------------------------------------------------------------
/Chapter-02/1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-02/1.jpg


--------------------------------------------------------------------------------
/Chapter-02/Chapter-2.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-02/Chapter-2.md


--------------------------------------------------------------------------------
/Chapter-03/Chapter-3.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-03/Chapter-3.md


--------------------------------------------------------------------------------
/Chapter-04/1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-04/1.jpg


--------------------------------------------------------------------------------
/Chapter-04/2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-04/2.jpg


--------------------------------------------------------------------------------
/Chapter-04/3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-04/3.jpg


--------------------------------------------------------------------------------
/Chapter-04/Chapter-4.md:
--------------------------------------------------------------------------------
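  1 | The stack is one of the most fundamental data structures in computer science.
  2 | 
  3 | Strictly speaking, it is just a region of process memory pointed to by the ESP register on x86, RSP on x64, or SP on ARM. The instructions most frequently used to access it are PUSH and POP (on x86 and in ARM Thumb mode).
  4 | 
  5 | PUSH subtracts 4 from ESP/RSP/SP in 32-bit mode (8 in 64-bit mode) and then writes its operand to the memory address that ESP/RSP/SP now points to.
  6 | 
  7 | POP is the reverse operation: it reads the data at the address SP points to, stores it in the operand (usually a register), and then adds 4 (or 8) to the stack pointer.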
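
The PUSH and POP behaviour described above can be modelled in a few lines of C. This is only a sketch of the 32-bit case: the array `stack_mem`, the offset `sp` and the helpers `sim_push()` and `sim_pop()` are hypothetical names invented for the illustration, and a real CPU of course works on process memory rather than on an array.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16                /* hypothetical stack size, in 32-bit words */

uint32_t stack_mem[STACK_WORDS];      /* stands in for the stack memory region */
uint32_t sp = STACK_WORDS * 4;        /* byte offset playing the role of ESP */

/* PUSH: decrement the stack pointer by 4 first, then store the operand there. */
void sim_push(uint32_t value)
{
    assert(sp >= 4);                  /* no room left would be a stack overflow */
    sp -= 4;
    stack_mem[sp / 4] = value;
}

/* POP: load the value at the stack pointer first, then increment it by 4. */
uint32_t sim_pop(void)
{
    assert(sp < STACK_WORDS * 4);     /* popping an empty stack is a bug */
    uint32_t value = stack_mem[sp / 4];
    sp += 4;
    return value;
}
```

Calling `sim_push(0x11)` and then `sim_push(0x22)` lowers `sp` by 8 in total; two calls to `sim_pop()` then return 0x22 and 0x11, in that order, and leave `sp` where it started.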
  8 | 
  9 | # 4.1 Why does the stack grow backwards?
 10 | 
 11 | Intuitively, we might expect the stack to grow forwards, i.e. towards higher addresses, like any other data structure.
 12 | 
 13 | ![](1.jpg)
 14 | 
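 15 | We know that:
 16 | 
 17 | An image is divided into three segments. The program text segment sits at the start of the address space; during execution it is write-protected, and a single copy of it is shared among all processes running the same program. At the first 8K-byte boundary above the text segment begins a non-shared, writable data segment, whose size can be extended by a system call. At the highest address is the stack segment, which grows downward as the hardware stack pointer moves.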
 18 | 
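 19 | # 4.2 What is the stack used for?
 20 | 
 21 | ## 4.2.1 Saving the function's return address so control can return when it finishes
 22 | 
 23 | ### x86
 24 | 
 25 | When a function is called with the CALL instruction, the address of the instruction right after CALL is saved on the stack, and an unconditional jump to the address in the CALL operand is then executed. CALL is equivalent to a PUSH of the return address followed by a JMP. Overflowing the stack is therefore easy; endless recursion is enough: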
 26 | 
 27 | ```
 28 | void f()
 29 | {
 30 |     f();
 31 | };
 32 | ```
 33 | 
 34 | MSVC 2008 reports the problem as a warning:
 35 | 
 36 | ```
 37 | c:\tmp6>cl ss.cpp /Fass.asm
 38 | Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
 39 | Copyright (C) Microsoft Corporation.  All rights reserved.
 40 | ss.cpp
 41 | c:\tmp6\ss.cpp(4) : warning C4717: 'f' : recursive on all control paths, function will cause
 42 |     runtime stack overflow
 43 | ```
 44 | 
 45 | But it generates the right code anyway:
 46 | 
 47 | ```
 48 | ?f@@YAXXZ PROC                                          ; f
 49 | ; File c:\tmp6\ss.cpp
 50 | ; Line 2
 51 |         push    ebp
 52 |         mov     ebp, esp
 53 | ; Line 3
 54 |         call    ?f@@YAXXZ                               ; f
 55 | ; Line 4
 56 |         pop     ebp
 57 |         ret     0
 58 | ?f@@YAXXZ ENDP  ; f
 59 | ```
 60 | 
 61 | If we enable optimization (the /Ox option), the generated code no longer overflows the stack; it simply loops forever:
 62 | 
 63 | ```
 64 | ?f@@YAXXZ PROC                                          ; f
 65 | ; File c:\tmp6\ss.cpp
 66 | ; Line 2
 67 | $LL3@f:
 68 | ; Line 3
 69 |         jmp     SHORT $LL3@f
 70 | ?f@@YAXXZ ENDP                                          ; f
 71 | ```
 72 | 
 73 | GCC 4.4.1 generates similar code in both cases, without issuing any warning about the problem.
 74 | 
 75 | ### ARM
 76 | 
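 77 | ARM programs also use the stack to save return addresses, but in a different way. As mentioned in "Hello, world!" (2.3), the RA (return address) is kept in the LR (link register). If the function needs to call another function, however, LR is used again, so its value has to be saved first. It is usually saved in the function prologue. We often see an instruction like "PUSH R4-R7, LR" at the start of a function together with "POP R4-R7, PC" at its end: the registers that the function uses are saved on the stack, including LR.
 78 | 
 79 | Nevertheless, a function that never calls any other function is called a leaf function in ARM terminology. Leaf functions therefore do not need to save the LR register. If such a function is small and uses only a few registers, it may not use the stack at all, so a leaf function can be called without touching the stack. This can be faster than on x86, because no external memory is accessed for the stack. It is also useful when stack memory has not been allocated yet or is not available.
 80 | 
 81 | ## 4.2.2 Passing function arguments
 82 | 
 83 | The most common way to pass arguments on x86 is called `cdecl`: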
 84 | 
 85 | ```
 86 | push arg3
 87 | push arg2
 88 | push arg1
 89 | call f
 90 | add esp, 4*3
 91 | ```
 92 | 
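 93 | The callee accesses its arguments through the stack pointer. This is therefore how the values are laid out on the stack just before the first instruction of f() executes: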
 94 | 
 95 | ![](2.jpg)
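
To make the layout concrete, here is a small C model of the `cdecl` sequence shown earlier. It is a sketch under stated assumptions, not how a compiler implements calls: the array `mem`, the variable `esp` and the functions `push()`, `f_callee()` and `caller()` are all hypothetical, and the constant 0xDEADBEEF merely stands in for the return address that CALL would push.

```c
#include <assert.h>
#include <stdint.h>

enum { WORDS = 32 };                  /* hypothetical stack size, in 32-bit words */

uint32_t mem[WORDS];                  /* simulated stack memory */
uint32_t esp = WORDS * 4;             /* simulated ESP, as a byte offset into mem */

void push(uint32_t v)
{
    assert(esp >= 4);
    esp -= 4;
    mem[esp / 4] = v;
}

/* The callee finds its arguments at fixed offsets from ESP on entry. */
uint32_t f_callee(void)
{
    uint32_t arg1 = mem[(esp + 4) / 4];   /* [ESP+4]  */
    uint32_t arg2 = mem[(esp + 8) / 4];   /* [ESP+8]  */
    uint32_t arg3 = mem[(esp + 12) / 4];  /* [ESP+12] */
    return arg1 * 100 + arg2 * 10 + arg3; /* order-sensitive on purpose */
}

uint32_t caller(uint32_t a1, uint32_t a2, uint32_t a3)
{
    push(a3);                         /* push arg3 */
    push(a2);                         /* push arg2 */
    push(a1);                         /* push arg1 */
    push(0xDEADBEEF);                 /* CALL pushes the return address */
    uint32_t result = f_callee();
    esp += 4;                         /* RET pops the return address */
    esp += 4 * 3;                     /* add esp, 4*3: the caller cleans up */
    return result;
}
```

Because the arguments are pushed in reverse order, `caller(1, 2, 3)` lets the callee see 1 at [ESP+4], 2 at [ESP+8] and 3 at [ESP+12], so the model returns 123 and leaves `esp` at its initial value.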
 96 | 
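 97 | See also the section on other calling conventions. Nothing obliges programmers to pass arguments through the stack.
 98 | 
 99 | It is not a requirement: any other method can be implemented without using the stack at all.
100 | 
101 | For example, one could allocate space for the arguments on the heap, fill it in, and pass a pointer to that block in EAX; this would work. On x86 and ARM, however, using the stack for this purpose is more convenient.
102 | 
103 | Another point: the callee does not know how many arguments were passed. Functions that take a variable number of arguments (such as printf()) determine the count from the format specifiers (which begin with the % symbol). If we write something like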
104 | 
105 | `printf("%d %d %d", 1234);`
106 | 
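107 | printf() will print 1234 and then two more random numbers that happen to lie next to it on the stack. This is also why it does not matter much how we declare main(): as `main()`, `main(int argc, char *argv[])` or `main(int argc, char *argv[], char *envp[])`.
108 | 
109 | In fact, the CRT code calls main() roughly like this: `push envp`, `push argv`, `push argc`, `call main`, and so on.
110 | 
111 | If you declare main() without arguments, they are nevertheless still present on the stack, just not used. If you declare `main(int argc, char *argv[])`, you can use the first two arguments, and the third remains "invisible" to your function. It is even possible to declare `main(int argc)`, and it will still work.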
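
The point about variable argument counts can be illustrated with a small variadic function written in standard C. This is a minimal sketch: `sum_ints()` is a hypothetical helper, and its explicit `count` parameter plays the role that the format string plays for printf().

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds up `count` int arguments.
 * The callee cannot discover how many arguments were really passed;
 * it simply trusts `count`, just as printf() trusts its format string. */
int sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);     /* fetch the next int argument */
    va_end(ap);

    return total;
}
```

A call such as `sum_ints(3, 1, 2, 3)` yields 6. Passing a `count` larger than the number of arguments actually supplied is undefined behaviour, for the same reason as the `printf("%d %d %d", 1234)` call discussed above: the function would read whatever happens to lie where the missing arguments should be.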
112 | 
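113 | ## 4.2.3 Local variable storage
114 | 
115 | Local variables can be stored anywhere you like, but traditionally a function allocates space for them simply by moving the stack pointer towards the stack bottom. This is not a requirement, of course.
116 | 
117 | ## 4.2.4 x86: the alloca() function
118 | 
119 | The alloca() function is worth noting.
120 | 
121 | It works like malloc(), but allocates memory directly on the stack.
122 | 
123 | The allocated chunk does not need to be freed with a call to free(): when the function ends and ESP is restored, the memory is released automatically. It is also worth noting how alloca() is implemented.
124 | 
125 | In simple terms, this function just shifts ESP towards the stack bottom by the number of bytes you need and makes ESP point to the allocated block. Let's try: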
126 | 
127 | ```
128 | #include <malloc.h>
129 | #include <stdio.h>
130 | void f() {
131 |     char *buf=(char*)alloca (600);
132 |     _snprintf (buf, 600, "hi! %d, %d, %d\n", 1, 2, 3);
133 |     puts (buf);
134 | };
135 | ```
136 | 
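137 | (The _snprintf() function works much like printf(), but instead of writing the result to stdout it stores it in memory, here in the buf buffer. The last two lines could be replaced by a single printf() call, but the point is to illustrate the use of a small buffer.)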
138 | 
139 | ### MSVC
140 | 
141 | Let's compile (MSVC 2010):
142 | 
143 | ```
144 | ...
145 |         mov    eax, 600         ; 00000258H
146 |         call   __alloca_probe_16
147 |         mov    esi, esp
148 |  
149 |         push   3
150 |         push   2
151 |         push   1
152 |         push   OFFSET $SG2672
153 |         push   600              ; 00000258H
154 |         push   esi
155 |         call   __snprintf
156 |  
157 |         push   esi
158 |         call   _puts
159 |         add    esp, 28          ; 0000001cH
160 | ...
161 | ```
162 | 
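163 | The sole alloca() argument is passed via EAX (instead of being pushed onto the stack). After the alloca() call, ESP points to the 600-byte block, and we can use it like ordinary memory as the buf buffer.
164 | 
165 | ### GCC + Intel syntax
166 | 
167 | GCC 4.4.1 does the same without calling any external function: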
168 | 
169 | ```
170 | .LC0:
171 |            .string "hi! %d, %d, %d\n"
172 | f:
173 |            push    ebp
174 |            mov     ebp, esp
175 |            push    ebx
176 |            sub     esp, 660
177 |            lea     ebx, [esp+39]
178 |            and     ebx, -16                             ; align pointer by 16-byte border
179 |            mov     DWORD PTR [esp], ebx                 ; s
180 |            mov     DWORD PTR [esp+20], 3
181 |            mov     DWORD PTR [esp+16], 2
182 |            mov     DWORD PTR [esp+12], 1
183 |            mov     DWORD PTR [esp+8], OFFSET FLAT:.LC0  ; "hi! %d, %d, %d\n"
184 |            mov     DWORD PTR [esp+4], 600               ; maxlen
185 |            call    _snprintf
186 |            mov     DWORD PTR [esp], ebx
187 |            call    puts
188 |            mov     ebx, DWORD PTR [ebp-4]
189 |            leave
190 |            ret
191 | ```
192 | 
193 | ### GCC + AT&T syntax
194 | 
195 | Let's see the same code, but in AT&T syntax:
196 | 
197 | ```
198 | .LC0:
199 |         .string "hi! %d, %d, %d\n"
200 | f:
201 |         pushl %ebp
202 |         movl    %esp, %ebp
203 |         pushl   %ebx
204 |         subl    $660, %esp
205 |         leal    39(%esp), %ebx
206 |         andl    $-16, %ebx
207 |         movl    %ebx, (%esp)
208 |         movl    $3, 20(%esp)
209 |         movl    $2, 16(%esp)
210 |         movl    $1, 12(%esp)
211 |         movl    $.LC0, 8(%esp)
212 |         movl    $600, 4(%esp)
213 |         call    _snprintf
214 |         movl    %ebx, (%esp)
215 |         call    puts
216 |         movl    -4(%ebp), %ebx
217 |         leave
218 |         ret
219 | ```
220 | 
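221 | The code is the same as in the previous listing.
222 | 
223 | For example, `movl $3, 20(%esp)` is equivalent to `mov DWORD PTR [esp+20], 3`: Intel syntax writes a memory operand as register+offset, while AT&T syntax writes it as offset(%register).
224 | 
225 | ## 4.2.5 (Windows) Structured exception handling
226 | 
227 | SEH records are also stored on the stack (if they are present). For more, see (51.3), which is yet to be translated.
228 | 
229 | ## 4.2.6 Buffer overflow protection
230 | 
231 | For more, see (16.2), which is yet to be translated.
232 | 
233 | # 4.3 Typical stack layout
234 | 
235 | In a 32-bit environment, the stack is laid out as follows at the start of a function: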
236 | 
237 | ![](3.jpg)


--------------------------------------------------------------------------------
/Chapter-05/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-05/1.png


--------------------------------------------------------------------------------
/Chapter-05/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-05/2.png


--------------------------------------------------------------------------------
/Chapter-05/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-05/3.png


--------------------------------------------------------------------------------
/Chapter-05/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-05/4.png


--------------------------------------------------------------------------------
/Chapter-05/5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-05/5.png


--------------------------------------------------------------------------------
/Chapter-05/6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-05/6.png


--------------------------------------------------------------------------------
/Chapter-06/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/1.png


--------------------------------------------------------------------------------
/Chapter-06/10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/10.png


--------------------------------------------------------------------------------
/Chapter-06/11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/11.png


--------------------------------------------------------------------------------
/Chapter-06/12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/12.png


--------------------------------------------------------------------------------
/Chapter-06/13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/13.png


--------------------------------------------------------------------------------
/Chapter-06/14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/14.png


--------------------------------------------------------------------------------
/Chapter-06/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/2.png


--------------------------------------------------------------------------------
/Chapter-06/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/3.png


--------------------------------------------------------------------------------
/Chapter-06/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/4.png


--------------------------------------------------------------------------------
/Chapter-06/5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/5.png


--------------------------------------------------------------------------------
/Chapter-06/6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/6.png


--------------------------------------------------------------------------------
/Chapter-06/7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/7.png


--------------------------------------------------------------------------------
/Chapter-06/8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/8.png


--------------------------------------------------------------------------------
/Chapter-06/9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-06/9.png


--------------------------------------------------------------------------------
/Chapter-07/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-07/1.png


--------------------------------------------------------------------------------
/Chapter-08/Chapter-8.md:
--------------------------------------------------------------------------------
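  1 | # Returning values of one or more words
  2 | 
  3 | On x86, the result of a function is usually returned in the EAX register. If it is a single-byte char, only the lowest 8 bits of EAX (AL) are used. A float is returned in the FPU register ST(0) instead. On ARM, the result is usually returned in R0.
  4 | 
  5 | What happens if main() is declared to return void rather than int?
  6 | 
  7 | The startup code usually calls main() roughly like this: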
  8 | 
  9 | ```
 10 | push envp
 11 | push argv
 12 | push argc
 13 | call main
 14 | push eax
 15 | call exit
 16 | ```
 17 | 
 18 | In other words:
 19 | 
 20 | `exit(main(argc,argv,envp));`
 21 | 
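 22 | If main() is declared void and returns no value explicitly, whatever EAX holds at the end of main() is passed as the argument to exit(). Most likely this is a random value left over from the function's execution, so the program's exit code is pseudorandom.
 23 | 
 24 | Let's look at an example. Note that main() here has return type void: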
 25 | 
 26 | ```
 27 | #include <stdio.h>
 28 | void main()
 29 | {
 30 |     printf ("Hello, world!");
 31 | };
 32 | ```
 33 | 
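 34 | Let's compile it on Linux.
 35 | 
 36 | GCC 4.8.1 replaces printf() with puts() (see section 2.3.3 above). That is fine here, because puts() returns the number of characters printed, just as printf() does. Note that EAX is not zeroed before main() returns, which means that at the end of main() EAX still holds the value puts() left there.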
 37 | 
 38 | Listing 8.1: GCC 4.8.1
 39 | 
 40 | ```
 41 | .LC0:
 42 |         .string "Hello, world!"
 43 | main:
 44 |         push    ebp
 45 |         mov     ebp, esp
 46 |         and     esp, -16
 47 |         sub     esp, 16
 48 |         mov     DWORD PTR [esp], OFFSET FLAT:.LC0
 49 |         call    puts
 50 |         leave
 51 |         ret
 52 | ```
 53 | 
 54 | Let's write a shell script that shows the exit status:
 55 | 
 56 | Listing 8.2: tst.sh
 57 | 
 58 | ```
 59 | #!/bin/sh
 60 | ./hello_world
 61 | echo $?
 62 | ```
 63 | 
 64 | Run it:
 65 | 
 66 | ```
 67 | $ tst.sh
 68 | Hello, world!
 69 | 14
 70 | ```
 71 | 
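 72 | 14 is the number of characters printed (the 13 characters of the string plus the newline that puts() appends).
 73 | 
 74 | Let's return to the fact that the return value is left in EAX. This is why old C compilers cannot create functions that return something that does not fit in one register (usually an int). If that is needed, the information has to be returned through pointers passed as arguments. Nowadays it is possible to return, say, an entire structure, although this is generally avoided. When a function must return a large structure, the caller allocates the storage and passes a pointer to it as the first argument, transparently to the programmer. This is almost the same as passing the pointer manually in the first argument, except that the compiler hides it.
 75 | 
 76 | A small example: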
 77 | 
 78 | ```
 79 | struct s
 80 | {
 81 |     int a;
 82 |     int b;
 83 |     int c;
 84 | };
 85 | 
 86 | struct s get_some_values (int a)
 87 | {
 88 |     struct s rt;
 89 |     rt.a=a+1;
 90 |     rt.b=a+2;
 91 |     rt.c=a+3;
 92 | 
 93 |     return rt;
 94 | };
 95 | ```
 96 | 
 97 | …which gives us (MSVC 2010 /Ox):
 98 | 
 99 | ```
100 | $T3853 = 8                  ; size = 4
101 | _a$ = 12                    ; size = 4
102 | ?get_some_values@@YA?AUs@@H@Z PROC      ; get_some_values
103 |     mov     ecx, DWORD PTR _a$[esp-4]
104 |     mov     eax, DWORD PTR $T3853[esp-4]
105 |     lea     edx, DWORD PTR [ecx+1]
106 |     mov     DWORD PTR [eax], edx
107 |     lea     edx, DWORD PTR [ecx+2]
108 |     add     ecx, 3
109 |     mov     DWORD PTR [eax+4], edx
110 |     mov     DWORD PTR [eax+8], ecx
111 |     ret     0
112 | ?get_some_values@@YA?AUs@@H@Z ENDP      ; get_some_values
113 | ```
114 | 
115 | The macro name for the internally passed pointer to the structure is $T3853.
116 | 
117 | This example can be rewritten using C99 language extensions:
118 | 
119 | ```
120 | struct s
121 | {
122 |     int a;
123 |     int b;
124 |     int c;
125 | };
126 | 
127 | struct s get_some_values (int a)
128 | {
129 |     return (struct s){.a=a+1, .b=a+2, .c=a+3};
130 | };
131 | ```
132 | 
133 | Listing 8.3: GCC 4.8.1
134 | 
135 | ```
136 | _get_some_values proc near
137 | 
138 | ptr_to_struct   = dword ptr 4
139 | a               = dword ptr 8
140 |                 mov     edx, [esp+a]
141 |                 mov     eax, [esp+ptr_to_struct]
142 |                 lea     ecx, [edx+1]
143 |                 mov     [eax], ecx
144 |                 lea     ecx, [edx+2]
145 |                 add     edx, 3
146 |                 mov     [eax+4], ecx
147 |                 mov     [eax+8], edx
148 |                 retn
149 | _get_some_values endp
150 | ```
151 | 
152 | As we can see, the function simply fills in the fields of the structure allocated by the caller, so there is no performance penalty.
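
For comparison, here is roughly what the compiler does behind the scenes, written out by hand in C. It is only a sketch of the idea: the name `get_some_values_into()` and its explicit `out` parameter are hypothetical and stand for the hidden pointer that appears in the listings above.

```c
#include <assert.h>
#include <stddef.h>

struct s
{
    int a;
    int b;
    int c;
};

/* Hand-written equivalent of returning a struct by value:
 * the caller owns the storage and passes its address as the first argument. */
void get_some_values_into(struct s *out, int a)
{
    assert(out != NULL);
    out->a = a + 1;                   /* corresponds to the store to [eax]   */
    out->b = a + 2;                   /* corresponds to the store to [eax+4] */
    out->c = a + 3;                   /* corresponds to the store to [eax+8] */
}
```

A caller would declare `struct s v;` and call `get_some_values_into(&v, 10);`, after which `v` holds 11, 12 and 13. The version that returns `struct s` directly is more pleasant to write, and as the listings show it costs nothing extra.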


--------------------------------------------------------------------------------
/Chapter-09/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/1.png


--------------------------------------------------------------------------------
/Chapter-09/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/2.png


--------------------------------------------------------------------------------
/Chapter-09/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/3.png


--------------------------------------------------------------------------------
/Chapter-09/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/4.png


--------------------------------------------------------------------------------
/Chapter-09/5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/5.png


--------------------------------------------------------------------------------
/Chapter-09/6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/6.png


--------------------------------------------------------------------------------
/Chapter-09/7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/7.png


--------------------------------------------------------------------------------
/Chapter-09/8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-09/8.png


--------------------------------------------------------------------------------
/Chapter-09/Chapter-9.md:
--------------------------------------------------------------------------------
  1 | # Pointers
  2 | 
  3 | Pointers are often used to return values from functions (recall the scanf() case (6)), for example when a function needs to return two values.
  4 | 
  5 | ## 9.1 Global variables example
  6 | 
  7 | ```
  8 | #include <stdio.h>
  9 | 
 10 | void f1 (int x, int y, int *sum, int *product)
 11 | {
 12 |     *sum=x+y;
 13 |     *product=x*y;
 14 | };
 15 | 
 16 | int sum, product;
 17 | 
 18 | void main()
 19 | {
 20 |     f1(123, 456, &sum, &product);
 21 |     printf ("sum=%d, product=%d\n", sum, product);
 22 | };
 23 | ```
 24 | 
 25 | This compiles to:
 26 | 
 27 | Listing 9.1: Optimizing MSVC 2010 (/Ox /Ob0)
 28 | 
 29 | ```
 30 | COMM        _product:DWORD
 31 | COMM        _sum:DWORD
 32 | $SG2803 DB              'sum=%d, product=%d', 0aH, 00H
 33 |  
 34 | _x$ = 8                                     ; size = 4
 35 | _y$ = 12                                    ; size = 4
 36 | _sum$ = 16                                  ; size = 4
 37 | _product$ = 20                              ; size = 4
 38 | _f1         PROC
 39 |             mov     ecx, DWORD PTR _y$[esp-4]
 40 |             mov     eax, DWORD PTR _x$[esp-4]
 41 |             lea     edx, DWORD PTR [eax+ecx]
 42 |             imul    eax, ecx
 43 |             mov     ecx, DWORD PTR _product$[esp-4]
 44 |             push    esi
 45 |             mov     esi, DWORD PTR _sum$[esp]
 46 |             mov     DWORD PTR [esi], edx
 47 |             mov     DWORD PTR [ecx], eax
 48 |             pop     esi
 49 |             ret     0
 50 | _f1         ENDP
 51 |  
 52 | _main       PROC
 53 |             push    OFFSET _product
 54 |             push    OFFSET _sum
 55 |             push    456                     ; 000001c8H
 56 |             push    123                     ; 0000007bH
 57 |             call    _f1
 58 |             mov     eax, DWORD PTR _product
 59 |             mov     ecx, DWORD PTR _sum
 60 |             push    eax
 61 |             push    ecx
 62 |             push    OFFSET $SG2803
 63 |             call    DWORD PTR __imp__printf
 64 |             add     esp, 28                 ; 0000001cH
 65 |             xor     eax, eax
 66 |             ret     0
 67 | _main   ENDP
 68 | ```
 69 | 
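 70 | Let's look at this in OllyDbg: fig. 9.1. First, the addresses of the global variables are passed to f1(). We can click "Follow in dump" on a stack element and see the space allocated in the data segment for the two variables. These variables are zeroed, because uninitialized data (BSS) is cleared before execution begins. They reside in the data segment, which we can verify by pressing Alt+M and viewing the memory map: fig. 9.5.
 71 | 
 72 | Let's trace (F7) to the start of f1(): fig. 9.2. The stack holds the two values 456 (0x1C8) and 123 (0x7B), followed by the addresses of the two global variables.
 73 | 
 74 | Let's trace to the end of f1(): fig. 9.3. The two global variables now hold the computed results.
 75 | 
 76 | Now the values of the global variables are loaded into registers, ready to be passed to printf(): fig. 9.4.
 77 | 
 78 | ![](1.png)
 79 | 
 80 | Figure 9.1: OllyDbg: addresses of the global variables being passed to f1()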
 81 | 
 82 | ![](2.png)
 83 | 
 84 | Figure 9.2: OllyDbg: f1() starting
 85 | 
 86 | ![](3.png)
 87 | 
 88 | Figure 9.3: OllyDbg: f1() finished
 89 | 
 90 | ![](4.png)
 91 | 
 92 | Figure 9.4: OllyDbg: values of the global variables being passed to printf()
 93 | 
 94 | ![](5.png)
 95 | 
 96 | Figure 9.5: OllyDbg: memory map
 97 | 
 98 | ## 9.2 Local variables example
 99 | 
100 | Let's rework the example slightly:
101 | 
102 | Listing 9.2: Local variables
103 | 
104 | ```
105 | void main()
106 | {
107 |     int sum, product; // now variables are here
108 |  
109 |     f1(123, 456, &sum, &product);
110 |     printf ("sum=%d, product=%d\n", sum, product);
112 | };
113 | ```
114 | 
115 | The code of f1() does not change; only main() does.
116 | 
117 | Listing 9.3: Optimizing MSVC 2010 (/Ox /Ob0)
118 | 
119 | ```
120 | _product$ = -8              ; size = 4
121 | _sum$ = -4                  ; size = 4
122 | _main   PROC
123 | ; Line 10
124 |         sub     esp, 8
125 | ; Line 13
126 |         lea     eax, DWORD PTR _product$[esp+8]
127 |         push    eax
128 |         lea     ecx, DWORD PTR _sum$[esp+12]
129 |         push    ecx
130 |         push    456         ; 000001c8H
131 |         push    123         ; 0000007bH
132 |         call    _f1
133 | ; Line 14
134 |         mov     edx, DWORD PTR _product$[esp+24]
135 |         mov     eax, DWORD PTR _sum$[esp+24]
136 |         push    edx
137 |         push    eax
138 |         push    OFFSET $SG2803
139 |         call    DWORD PTR __imp__printf
140 | ; Line 15
141 |         xor     eax, eax
142 |         add     esp, 36     ; 00000024H
143 |         ret     0
144 | ```
145 | 
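146 | Let's look in OllyDbg. The local variables are located on the stack at addresses 0x35FCF4 and 0x35FCF8. We can see these addresses being pushed onto the stack: fig. 9.6.
147 | 
148 | f1() starts. At this point the stack locations 0x35FCF4 and 0x35FCF8 still contain random garbage: fig. 9.7.
149 | 
150 | When f1() finishes, the results 0xDB18 and 0x243 are stored at addresses 0x35FCF4 and 0x35FCF8: fig. 9.8.
151 | 
152 | ![](6.png)
153 | 
154 | Figure 9.6: OllyDbg: addresses of the local variables being pushed onto the stack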
155 | 
156 | ![](7.png)
157 | 
158 | Figure 9.7: OllyDbg: f1() starting
159 | 
160 | ![](8.png)
161 | 
162 | Figure 9.8: OllyDbg: f1() finished
163 | 
164 | ## 9.3 Conclusion
165 | 
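166 | f1() can return its results to any place in memory. This is the essence and the usefulness of pointers. By the way, C++ references work in much the same way; see the related chapter (33) for details.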
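
The same idea can be sketched in plain C. The body of f1() below is an assumption reconstructed from the call in Listing 9.2 and from the values seen in the debugger (0x243 = 579, 0xDB18 = 56088); the helper run_f1_on_locals() is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of f1(): it returns nothing and hands both results
   back through the two pointers it receives. */
void f1(int x, int y, int *sum, int *product)
{
    assert(sum != NULL && product != NULL);
    *sum = x + y;
    *product = x * y;
}

/* Hypothetical helper: calls f1() on two local variables, as main()
   does in Listing 9.2, and copies the results out for inspection. */
void run_f1_on_locals(int *sum_out, int *product_out)
{
    int sum, product;            /* both live on this function's stack */

    f1(123, 456, &sum, &product);
    *sum_out = sum;              /* 579   == 0x243  */
    *product_out = product;      /* 56088 == 0xDB18 */
}
```

f1() itself does not care whether the addresses it receives belong to global variables, to the caller's stack frame, or to anything else.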


--------------------------------------------------------------------------------
/Chapter-10/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/1.png


--------------------------------------------------------------------------------
/Chapter-10/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/2.png


--------------------------------------------------------------------------------
/Chapter-10/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/3.png


--------------------------------------------------------------------------------
/Chapter-10/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/4.png


--------------------------------------------------------------------------------
/Chapter-10/5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/5.png


--------------------------------------------------------------------------------
/Chapter-10/6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/6.png


--------------------------------------------------------------------------------
/Chapter-10/7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/7.png


--------------------------------------------------------------------------------
/Chapter-10/8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-10/8.png


--------------------------------------------------------------------------------
/Chapter-12/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-12/1.png


--------------------------------------------------------------------------------
/Chapter-12/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-12/2.png


--------------------------------------------------------------------------------
/Chapter-12/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-12/3.png


--------------------------------------------------------------------------------
/Chapter-12/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-12/4.png


--------------------------------------------------------------------------------
/Chapter-13/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-13/1.png


--------------------------------------------------------------------------------
/Chapter-13/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-13/2.png


--------------------------------------------------------------------------------
/Chapter-13/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-13/3.png


--------------------------------------------------------------------------------
/Chapter-13/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-13/4.png


--------------------------------------------------------------------------------
/Chapter-13/5.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-13/5.jpg


--------------------------------------------------------------------------------
/Chapter-13/Chapter-13.md:
--------------------------------------------------------------------------------
  1 | # strlen()
  2 | 
  3 | Now let's take another look at loops. The strlen() function is often implemented with a while() statement. This is how it is done in the MSVC standard library:
  4 | 
  5 | ```
  6 | int my_strlen (const char * str)
  7 | {
  8 |     const char *eos = str;
  9 |     while( *eos++ ) ;
 10 |     return( eos - str - 1 );
 11 | }
 12 | int main()
 13 | {
 14 |     // test
 15 |     return my_strlen("hello!");
 16 | };
 17 | ```
 18 | 
 19 | ## 13.1 x86
 20 | 
 21 | ### 13.1.1 Non-optimizing MSVC
 22 | 
 23 | Let's compile it:
 24 | 
 25 | ```
 26 | _eos$ = -4                  ; size = 4
 27 | _str$ = 8                   ; size = 4
 28 | _strlen_ PROC
 29 |     push    ebp
 30 |     mov     ebp, esp
 31 |     push    ecx
 32 |     mov     eax, DWORD PTR _str$[ebp]   ; place pointer to string from str
 33 |     mov     DWORD PTR _eos$[ebp], eax   ; place it to local variable eos
 34 | $LN2@strlen_:
 35 |     mov     ecx, DWORD PTR _eos$[ebp]   ; ECX=eos
 36 |  
 37 |     ; take 8-bit byte from address in ECX and place it as 32-bit value to EDX with sign extension
 38 |  
 39 |     movsx   edx, BYTE PTR [ecx]
 40 |     mov     eax, DWORD PTR _eos$[ebp]   ; EAX=eos
 41 |     add     eax, 1 ; increment EAX
 42 |     mov     DWORD PTR _eos$[ebp], eax   ; place EAX back to eos
 43 |     test    edx, edx                    ; EDX is zero?
 44 |     je      SHORT $LN1@strlen_          ; yes, then finish loop
 45 |     jmp     SHORT $LN2@strlen_          ; continue loop
 46 | $LN1@strlen_:
 47 |  
 48 |     ; here we calculate the difference between two pointers
 49 |  
 50 |     mov     eax, DWORD PTR _eos$[ebp]
 51 |     sub     eax, DWORD PTR _str$[ebp]
 52 |     sub     eax, 1                      ; subtract 1 and return result
 53 |     mov     esp, ebp
 54 |     pop     ebp
 55 |     ret     0
 56 | _strlen_ ENDP
 57 | ```
 58 | 
 59 | We see two new instructions here: MOVSX (see section 13.1.1) and TEST.
 60 | 
 61 | The first one, MOVSX, takes a byte from memory and stores it in a 32-bit register. MOVSX stands for MOV with Sign-Extend. It sets bits 8 through 31 to 1 if the source byte is negative, and to 0 otherwise.
 62 | 
 63 | And here is why.
 64 | 
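 65 | By default, the char type (translator's note: 1 byte) is signed in MSVC and GCC (the C/C++ standards leave the signedness of plain char to the implementation). Suppose we have two values, one a char and the other an int (int is signed too), and the char holds -2 (encoded as 0xFE). If we simply copy this byte into the int (translator's note: usually 4 bytes), the int will hold 0x000000FE, which is 254 and not -2, because in a signed int -2 is encoded as 0xFFFFFFFE. So to convert 0xFE from char to int, we have to identify its sign and extend it. That is what MOVSX does.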
 66 | 
 67 | See also the chapter "Signed number representations" (chapter 35).
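
The difference between a plain copy and a sign-extending copy can be modelled in C with explicit bit operations. This is only a sketch of what the instruction does to the bits; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Models a sign-extending load (what MOVSX does): copy the byte,
   then fill bits 8..31 with copies of its sign bit (bit 7). */
uint32_t sign_extend_byte(uint8_t b)
{
    uint32_t v = b;
    if (b & 0x80u)
        v |= 0xFFFFFF00u;
    return v;
}

/* Models a plain copy of the byte into a cleared 32-bit register:
   bits 8..31 stay 0. */
uint32_t plain_copy_byte(uint8_t b)
{
    return (uint32_t)b;
}

/* Usage: the byte 0xFE, which is -2 when read as a signed char. */
void extension_demo(void)
{
    assert(sign_extend_byte(0xFE) == 0xFFFFFFFEu); /* still -2 as a signed 32-bit value */
    assert(plain_copy_byte(0xFE) == 0x000000FEu);  /* 254 */
    assert(sign_extend_byte(0x7F) == plain_copy_byte(0x7F)); /* non-negative bytes are unaffected */
}
```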
 68 | 
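 69 | I am not sure whether the compiler needs to keep a char variable in EDX at all; it could use just the 8-bit part of the register (DL, that is). Apparently, this is simply how the compiler's register allocator works.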
 70 | 
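 71 | Then we see TEST EDX, EDX. You can read more about the TEST instruction in the chapter on bit fields (chapter 17). For now, it is enough to say that this TEST instruction just checks whether the value in EDX equals 0.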
 72 | 
 73 | ### 13.1.2 Non-optimizing GCC
 74 | 
 75 | Let's try GCC 4.4.1:
 76 | 
 77 | ```
 78 |         public strlen
 79 | strlen  proc near
 80 |  
 81 | eos     = dword ptr -4
 82 | arg_0   = dword ptr 8
 83 |  
 84 |         push    ebp
 85 |         mov     ebp, esp
 86 |         sub     esp, 10h
 87 |         mov     eax, [ebp+arg_0]
 88 |         mov     [ebp+eos], eax
 89 |  
 90 | loc_80483F0:
 91 |         mov     eax, [ebp+eos]
 92 |         movzx   eax, byte ptr [eax]
 93 |         test    al, al
 94 |         setnz   al
 95 |         add     [ebp+eos], 1
 96 |         test    al, al
 97 |         jnz     short loc_80483F0
 98 |         mov     edx, [ebp+eos]
 99 |         mov     eax, [ebp+arg_0]
100 |         mov     ecx, edx
101 |         sub     ecx, eax
102 |         mov     eax, ecx
103 |         sub     eax, 1
104 |         leave
105 |         retn
106 | strlen  endp
107 | ```
108 | 
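109 | The result is almost the same as MSVC's, but here we see MOVZX instead of MOVSX. MOVZX stands for MOV with Zero-Extend. This instruction copies an 8-bit or 16-bit value into a 32-bit register and sets the remaining bits to 0. In fact, the instruction is convenient only because it replaces this pair of instructions: xor eax, eax / mov al, [...].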
110 | 
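111 | On the other hand, the compiler could obviously have produced this code instead: mov al, byte ptr [eax] / test al, al. It is almost the same, except that the upper bits of EAX would contain random noise. Let's regard this as a shortcoming of the compiler: it cannot produce more understandable code. Strictly speaking, the compiler is not obliged to emit code that humans find easy to understand.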
112 | 
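113 | There is one more new instruction, SETNZ. Here, if AL is not zero, test al, al sets the ZF flag to 0, and SETNZ then sets AL to 1 if ZF == 0 (NZ stands for Not Zero). In plain language: if AL is not zero, jump to loc_80483F0. The compiler emits some redundant code, but let's not forget that optimization is turned off.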
114 | 
115 | ### 13.1.3 Optimizing MSVC
116 | 
117 | Now let's compile it in MSVC 2012 with optimization turned on (/Ox):
118 | 
119 | Listing 13.1: MSVC 2010 /Ox /Ob0
120 | 
121 | ```
122 | _str$ = 8               ; size = 4
123 | _strlen     PROC
124 |             mov     edx, DWORD PTR _str$[esp-4] ; EDX -> pointer to the string
125 |             mov     eax, edx                    ; copy it to EAX
126 | $LL2@strlen:
127 |             mov     cl, BYTE PTR [eax]          ; CL = *EAX
128 |             inc     eax                         ; EAX++
129 |             test    cl, cl                      ; CL==0?
130 |             jne     SHORT $LL2@strlen           ; no, continue loop
131 |             sub     eax, edx                    ; calculate the pointer difference
132 |             dec     eax                         ; decrement EAX
133 |             ret     0
134 | _strlen ENDP
135 | ```
136 | 
137 | Now it is all simpler. Needless to say, the compiler can use registers this efficiently only in small functions with few local variables.
138 | 
139 | INC / DEC are the increment / decrement instructions; in other words, they add 1 to or subtract 1 from a variable.
140 | 
141 | ### 13.1.4 Optimizing MSVC + OllyDbg
142 | 
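143 | We can try this (optimized) example in OllyDbg. Here is the very first iteration: fig. 13.1. We see that OllyDbg has found a loop and, for convenience, wrapped its instructions in a bracket. By right-clicking EAX and choosing "Follow in Dump", we make the memory window scroll to the right place. There we can see the string "hello!" in memory. It is followed by at least one zero byte and then by random garbage. If OllyDbg sees that a register holds a pointer to a string, it shows the string.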
146 | 
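147 | Let's press F8 (step over) a few times until the current address returns to the start of the loop body: fig. 13.2. We see that EAX now contains the address of the second character in the string.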
148 | 
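149 | We keep pressing F8 until the loop has finished: fig. 13.3. We see that EAX now points just past the terminating zero byte (\0), that is, just past the end of the string. Meanwhile, EDX has not changed, so it still points to the beginning of the string. The difference between these two addresses can now be calculated.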
150 | 
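151 | Then the SUB instruction is executed: fig. 13.4. The difference, now in EAX, is 7. The string "hello!" is only 6 characters long; the 7 includes the terminating zero byte. But strlen() must return the number of non-zero characters in the string, so EAX is decremented by 1 before it is returned and the function exits.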
152 | 
153 | ![](1.png)
154 | 
155 | Figure 13.1: start of the first loop iteration
156 | 
157 | ![](2.png)
158 | 
159 | Figure 13.2: start of the second loop iteration
160 | 
161 | ![](3.png)
162 | 
163 | Figure 13.3: the difference is about to be calculated
164 | 
165 | ![](4.png)
166 | 
167 | Figure 13.4: EAX is about to be decremented
168 | 
169 | ### 13.1.5 Optimizing GCC
170 | 
171 | Let's turn on optimization in GCC 4.4.1 (-O3):
172 | 
173 | ```
174 |         public strlen
175 | strlen  proc near
176 |  
177 | arg_0   = dword ptr 8
178 |  
179 |         push    ebp
180 |         mov     ebp, esp
181 |         mov     ecx, [ebp+arg_0]
182 |         mov     eax, ecx
183 |  
184 | loc_8048418:
185 |         movzx   edx, byte ptr [eax]
186 |         add     eax, 1
187 |         test    dl, dl
188 |         jnz     short loc_8048418
189 |         not     ecx
190 |         add     eax, ecx
191 |         pop     ebp
192 |         retn
193 | strlen endp
194 | ```
195 | 
196 | Here GCC does almost the same as MSVC, except for the presence of MOVZX.
197 | 
198 | However, MOVZX here could be replaced with mov dl, byte ptr [eax].
199 | 
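200 | Perhaps it is simpler for GCC's code generator to remember that the whole 32-bit register is allocated to the char variable, so it can be sure that the upper bits never contain any noise.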
201 | 
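202 | After that we see a new instruction, NOT. It inverts all bits of its operand; in effect it is the same as XOR ECX, 0ffffffffh. NOT and the ADD that follows compute the pointer difference and subtract 1, just in a different way. ECX initially holds the pointer to str; inverting it yields -str-1.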
203 | 
204 | See also "Signed number representations" (chapter 35).
205 | 
206 | In other words, at the end of the function, right after the loop body, these operations are executed:
207 | 
208 | ```
209 | ecx=str;
210 | eax=eos;
211 | ecx=(-ecx)-1;
212 | eax=eax+ecx
213 | return eax
214 | ```
215 | 
216 | ...which is effectively equivalent to:
217 | 
218 | ```
219 | ecx=str;
220 | eax=eos;
221 | eax=eax-ecx;
222 | eax=eax-1;
223 | return eax
224 | ```
225 | 
226 | Why did GCC decide this would be better? I cannot say for sure, but both variants should be equally efficient.
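
The equivalence of the two variants follows from the two's complement identity ~x = -x - 1. A minimal sketch, assuming 32-bit pointers modelled as uint32_t values (the function names and the sample address are made up):

```c
#include <assert.h>
#include <stdint.h>

/* GCC's variant: NOT ECX, then ADD EAX, ECX. */
uint32_t len_not_add(uint32_t str, uint32_t eos)
{
    uint32_t ecx = ~str;   /* NOT: ~str == -str - 1 (mod 2^32) */
    return eos + ecx;      /* ADD: eos - str - 1 */
}

/* MSVC's variant: SUB EAX, EDX, then DEC EAX. */
uint32_t len_sub_dec(uint32_t str, uint32_t eos)
{
    return eos - str - 1;
}

/* Usage: "hello!" at a made-up address; after the loop, eos is
   7 bytes past str (6 characters plus the terminating zero). */
void compare_variants(void)
{
    assert(len_not_add(0x00403000u, 0x00403007u) == 6);
    assert(len_sub_dec(0x00403000u, 0x00403007u) == 6);
}
```

Unsigned arithmetic is used so that wrap-around is well defined; both functions agree for every pair of inputs.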
227 | 
228 | ## 13.2 ARM
229 | 
230 | ### 13.2.1 Non-optimizing Xcode (LLVM) + ARM mode
231 | 
232 | Listing 13.2: Non-optimizing Xcode (LLVM) + ARM mode
233 | 
234 | ```
235 | _strlen
236 |  
237 | eos     = -8
238 | str     = -4
239 |         SUB     SP, SP, #8 ; allocate 8 bytes for local variables
240 |         STR     R0, [SP,#8+str]
241 |         LDR     R0, [SP,#8+str]
242 |         STR     R0, [SP,#8+eos]
243 |  
244 | loc_2CB8                ; CODE XREF: _strlen+28
245 |         LDR     R0, [SP,#8+eos]
246 |         ADD     R1, R0, #1
247 |         STR     R1, [SP,#8+eos]
248 |         LDRSB   R0, [R0]
249 |         CMP     R0, #0
250 |         BEQ     loc_2CD4
251 |         B       loc_2CB8
252 | ; ----------------------------------------------------------------
253 |  
254 | loc_2CD4                ; CODE XREF: _strlen+24
255 |         LDR     R0, [SP,#8+eos]
256 |         LDR     R1, [SP,#8+str]
257 |         SUB     R0, R0, R1 ; R0=eos-str
258 |         SUB     R0, R0, #1 ; R0=R0-1
259 |         ADD     SP, SP, #8 ; deallocate 8 bytes for local variables
260 |         BX      LR
261 | ```
262 | 
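263 | Non-optimizing LLVM generates too much code; however, here we can see how the function works with local variables on the stack. There are only two local variables in our function: eos and str.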
264 | 
265 | In this listing, generated by IDA, I renamed var_8 and var_4 to eos and str.
266 | 
267 | The first instructions simply save the input value into both str and eos.
268 | 
269 | The loop body starts at the label loc_2CB8.
270 | 
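271 | The first three instructions of the loop body (LDR, ADD, STR) load the value of eos into R0, increment it, and store it back into the local variable eos on the stack.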
272 | 
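273 | The next instruction, "LDRSB R0, [R0]" (Load Register Signed Byte), loads a byte from memory at the address in R0 and sign-extends it to 32 bits. This is similar to the MOVSX instruction in x86 (see section 13.1.1). The compiler treats this byte as signed because char is a signed type for this compiler (the C standard leaves the choice to the implementation). I already wrote about this in section 13.1.1, although in relation to x86. Note that in ARM it is impossible to use the 8-bit or 16-bit part of a 32-bit register separately from the whole register, as can be done in x86. Apparently, this is because x86 has a long history of backwards compatibility with its ancestors, down to the 16-bit 8086 and even the 8-bit 8080, whereas ARM was developed from scratch as a 32-bit RISC processor. Consequently, in order to process separate bytes in ARM, one has to use 32-bit registers anyway. So LDRSB loads the bytes of the string into R0 one by one. The CMP and BEQ instructions that follow check whether the loaded byte is 0. If it is not 0, control passes back to the start of the loop body; if it is 0, the loop ends. At the end of the function, the difference between eos and str is calculated, 1 is subtracted from it, and the result is returned via R0.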
274 | 
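275 | N.B. Registers are not saved in this function. That is because, according to the ARM calling convention, registers R0-R3 are "scratch registers" intended for passing arguments; the callee is not required to restore their values on exit, since the caller will not rely on them afterwards. Consequently, they may be used for anything we want. No other registers are used here, which is why nothing is saved on the stack. Thus, control can be returned to the caller with a simple jump (BX) to the address stored in the LR register.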
276 | 
277 | ### 13.2.2 Optimizing Xcode (LLVM) + thumb mode
278 | 
279 | Listing 13.3: Optimizing Xcode (LLVM) + thumb mode
280 | 
281 | ```
282 | _strlen
283 |         MOV     R1, R0
284 |  
285 | loc_2DF6                ; CODE XREF: _strlen+8
286 |         LDRB.W  R2, [R1],#1
287 |         CMP     R2, #0
288 |         BNE     loc_2DF6
289 |         MVNS    R0, R0
290 |         ADD     R0, R1
291 |         BX      LR
292 | ```
293 | 
294 | Optimizing LLVM concludes that eos and str need no space on the stack and can always be kept in registers. Before the loop body starts, str is always in R0 and eos in R1.
295 | 
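296 | The `"LDRB.W R2, [R1],#1"` instruction loads a byte from memory at the address in R1 into R2, zero-extending it to a 32-bit value, but not only that. The #1 at the end of the instruction is called "post-indexed addressing", which means that 1 is added to R1 after the byte has been loaded. This is very convenient when working with arrays.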
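
Post-indexed addressing corresponds closely to the C expression *p++. A rough model of the instruction and of the loop in Listing 13.3, with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models LDRB Rt, [Rn],#1: load the byte at the address, zero-extend it
   to 32 bits, then add 1 to the address register (post-indexing). */
uint32_t ldrb_post_index(const uint8_t **addr)
{
    uint32_t value = **addr;   /* the load happens first */
    *addr += 1;                /* the write-back happens afterwards */
    return value;
}

/* The loop from the listing, written with the helper above. */
size_t strlen_post_index(const char *str)
{
    const uint8_t *eos = (const uint8_t *)str;

    while (ldrb_post_index(&eos) != 0)
        ;                      /* keep loading until the zero byte */
    return (size_t)(eos - (const uint8_t *)str) - 1;
}

/* Usage example. */
void post_index_demo(void)
{
    assert(strlen_post_index("hello!") == 6);
}
```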
297 | 
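298 | There is no such addressing mode in x86, but it is present in some other processors, even in the PDP-11. The PDP-11 had pre-increment, post-increment, pre-decrement and post-decrement modes, and legend has it that they are "guilty" of the C constructs ptr++, ++ptr, ptr-- and --ptr (C was developed on the PDP-11). By the way, these are among the hardest features of C to memorize. Here is how they work: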
299 | 
300 | ![](5.jpg)
301 | 
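302 | Dennis Ritchie, one of the creators of C, mentioned that this feature was probably invented by Ken Thompson (another creator of C) because the corresponding processor feature was present in the PDP-7 (references [28][29]). Thus, C compilers may use such instructions when the target processor supports them.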
303 | 
304 | Next come CMP and BNE in the loop body; these instructions keep the loop running until a 0 is found in the string.
305 | 
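306 | The MVNS (invert all bits, like NOT in x86) and ADD instructions compute eos-str-1. In fact, these two instructions compute R0 = ~str + eos, which is equivalent to what is in the source code; I already explained why in section 13.1.5.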
307 | 
308 | Apparently, LLVM, just like GCC, concludes that this code is shorter or faster.
309 | 
310 | ### 13.2.3 Optimizing Keil + ARM mode
311 | 
312 | Listing 13.4: Optimizing Keil + ARM mode
313 | 
314 | ```
315 | _strlen
316 |         MOV     R1, R0
317 | loc_2C8                 ; CODE XREF: _strlen+14
318 |         LDRB    R2, [R1],#1
319 |         CMP     R2, #0
320 |         SUBEQ   R0, R1, R0
321 |         SUBEQ   R0, R0, #1
322 |         BNE     loc_2C8
323 |         BX      LR
324 | ```
325 | 
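326 | This is almost the same as what we saw before, except that the expression eos-str-1 is computed not at the end of the function but right in the loop body. Recall that the -EQ suffix means the instruction is executed only if the operands of the preceding CMP were equal. Thus, if R2 contains 0, both SUBEQ instructions are executed and the result is left in R0.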


--------------------------------------------------------------------------------
/Chapter-14/1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-14/1.jpg


--------------------------------------------------------------------------------
/Chapter-14/2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-14/2.jpg


--------------------------------------------------------------------------------
/Chapter-14/3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-14/3.jpg


--------------------------------------------------------------------------------
/Chapter-14/4.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-14/4.jpg


--------------------------------------------------------------------------------
/Chapter-14/5.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-14/5.jpg


--------------------------------------------------------------------------------
/Chapter-14/6.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-14/6.jpg


--------------------------------------------------------------------------------
/Chapter-14/7.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-14/7.jpg


--------------------------------------------------------------------------------
/Chapter-14/Chapter-14.md:
--------------------------------------------------------------------------------
  1 | # Division
  2 | 
  3 | Here is a very simple function:
  4 | 
  5 | ```
  6 | int f(int a)
  7 | {
  8 |     return a/9;
  9 | };
 10 | ```
 11 | 
 12 | ## 14.1 x86
 13 | 
 14 | It is compiled in a very predictable way:
 15 | 
 16 | ```
 17 | _a$ = 8             ; size = 4
 18 | _f   PROC
 19 |     push    ebp
 20 |     mov     ebp, esp
 21 |     mov     eax, DWORD PTR _a$[ebp]
 22 |     cdq             ; sign extend EAX to EDX:EAX
 23 |     mov     ecx, 9
 24 |     idiv    ecx
 25 |     pop     ebp
 26 |     ret     0
 27 | _f  ENDP
 28 | ```
 29 | 
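 30 | IDIV is the signed division instruction. It divides the 64-bit dividend stored in the EDX:EAX register pair by the divisor in ECX. The quotient is left in EAX and the remainder in EDX. f() returns its result in EAX, so the value is not moved anywhere after the division; it is already in the right place. Because IDIV requires the dividend in the EDX:EAX pair, the CDQ instruction is used before the division to sign-extend the value in EAX to 64 bits, just as MOVSX (13.1.1) does. If we turn optimization on (/Ox), we get: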
 31 | 
 32 | Listing 14.2: Optimizing MSVC
 33 | 
 34 | ```
 35 | _a$ = 8                         ; size = 4
 36 | _f   PROC
 37 | 
 38 |     mov     ecx, DWORD PTR _a$[esp-4]
 39 |     mov     eax, 954437177      ; 38e38e39H
 40 |     imul    ecx
 41 |     sar     edx, 1
 42 |     mov     eax, edx
 43 |     shr     eax, 31             ; 0000001fH
 44 |     add     eax, edx
 45 |     ret     0
 46 | _f   ENDP
 47 | ```
 48 | 
 49 | Here the division is replaced by a multiplication. Multiplication is much faster, so this trick yields more efficient code.
 50 | 
 51 | In compiler optimization, this is also called "strength reduction".
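
The optimized listing can be transcribed into C almost instruction by instruction. This is a sketch, not the compiler's own algorithm; it assumes that >> on a negative signed value is an arithmetic shift (what SAR does), which holds for mainstream compilers but is implementation-defined in C. The function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Division by 9 the way Listing 14.2 does it. */
int32_t div9_by_multiplication(int32_t a)
{
    int64_t product = (int64_t)a * 0x38E38E39;    /* IMUL: 64-bit result in EDX:EAX */
    int32_t edx = (int32_t)(product >> 32);       /* keep only the high half (EDX) */

    edx >>= 1;                                    /* SAR EDX, 1 */
    return edx + (int32_t)((uint32_t)edx >> 31);  /* SHR EAX, 31 / ADD EAX, EDX */
}

/* Usage: compare against the ordinary division operator. */
void div9_demo(void)
{
    for (int32_t a = -100000; a <= 100000; a++)
        assert(div9_by_multiplication(a) == a / 9);
}
```

The final addition of the sign bit corrects the result for negative inputs: the shifts round toward minus infinity, while C division rounds toward zero.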
 52 | 
 53 | GCC 4.4.1, even with optimization turned off, generates almost the same code as MSVC does with optimization turned on:
 54 | 
 55 | Listing 14.3: Non-optimizing GCC 4.4.1
 56 | 
 57 | ```
 58 |         public f
 59 | f       proc near
 60 | arg_0   = dword ptr 8
 61 |  
 62 |         push    ebp
 63 |         mov     ebp, esp
 64 |         mov     ecx, [ebp+arg_0]
 65 |         mov     edx, 954437177 ; 38E38E39h
 66 |         mov     eax, ecx
 67 |         imul    edx
 68 |         sar     edx, 1
 69 |         mov     eax, ecx
 70 |         sar     eax, 1Fh
 71 |         mov     ecx, edx
 72 |         sub     ecx, eax
 73 |         mov     eax, ecx
 74 |         pop     ebp
 75 |         retn
 76 | f       endp
 77 | ```
 78 | 
 79 | ## 14.2 ARM
 80 | 
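 81 | The ARM processor, like other "pure" RISC processors, lacks a division instruction. It also lacks a single instruction for multiplication by a 32-bit constant. With a clever trick, however, division can be implemented using only additions, subtractions and bit shifts. Here is an example that divides a 32-bit number by 10 (from reference [20], section 3.3 "Division by a Constant"); it outputs the quotient and the remainder.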
 82 | 
 83 | ```
 84 | ; takes argument in a1
 85 | ; returns quotient in a1, remainder in a2
 86 | ; cycles could be saved if only divide or remainder is required
 87 |     SUB     a2, a1, #10         ; keep (x-10) for later
 88 |     SUB     a1, a1, a1, lsr #2
 89 |     ADD     a1, a1, a1, lsr #4
 90 |     ADD     a1, a1, a1, lsr #8
 91 |     ADD     a1, a1, a1, lsr #16
 92 |     MOV     a1, a1, lsr #3
 93 |     ADD     a3, a1, a1, asl #2
 94 |     SUBS    a2, a2, a3, asl #1  ; calc (x-10) - (x/10)*10
 95 |     ADDPL   a1, a1, #1          ; fix-up quotient
 96 |     ADDMI   a2, a2, #10         ; fix-up remainder
 97 |     MOV     pc, lr
 98 | ```
 99 | 
100 | ### 14.2.1 Optimizing Xcode (LLVM) + ARM mode
101 | 
102 | ```
103 | __text:00002C58 39 1E 08 E3 E3 18 43 E3     MOV     R1, 0x38E38E39
104 | __text:00002C60 10 F1 50 E7                 SMMUL   R0, R0, R1
105 | __text:00002C64 C0 10 A0 E1                 MOV     R1, R0,ASR#1
106 | __text:00002C68 A0 0F 81 E0                 ADD     R0, R1, R0,LSR#31
107 | __text:00002C6C 1E FF 2F E1                 BX      LR
108 | ```
109 | 
110 | How it works
111 | 
112 | This code is almost the same as that generated by optimizing MSVC and GCC. Apparently, LLVM uses the same algorithm for generating the constants.
113 | 
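114 | The observant reader may ask how MOV writes a 32-bit value into a register, since that is impossible in ARM mode. It is indeed impossible, but as we can see, there are 8 bytes per instruction here instead of the standard 4; in fact, these are two instructions. The first instruction loads 0x8E39 into the low 16 bits of the register, and the second one is MOVT, which loads 0x38E3 into the high 16 bits. IDA recognizes such sequences and, for compactness, collapses them into a single pseudo-instruction.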
115 | 
116 | SMMUL (Signed Most Significant Word Multiply) multiplies two 32-bit signed numbers, leaves the high 32 bits of the result in R0, and drops the low 32 bits.
117 | 
118 | ```
119 | MOV R1, R0,ASR#1       ; R1 = R0 >> 1 (arithmetic shift right by one bit)
120 | ADD R0, R1, R0,LSR#31  ; R0 = R1 + (R0 >> 31) (logical shift right)
121 | ```
122 | 
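123 | In fact, there are no separate shift instructions in ARM mode. Instead, data processing instructions such as MOV, ADD, SUB and RSB can have their second operand shifted. ASR stands for Arithmetic Shift Right, LSR for Logical Shift Right.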
124 | 
125 | ### 14.2.2 Optimizing Xcode (LLVM) + thumb-2 mode
126 | 
127 | ```
128 | MOV         R1, 0x38E38E39
129 | SMMUL.W     R0, R0, R1
130 | ASRS        R1, R0, #1
131 | ADD.W       R0, R1, R0,LSR#31
132 | BX          LR
133 | ```
134 | 
135 | Thumb mode does have separate shift instructions; one of them, ASRS (arithmetic shift right), is used in this example.
136 | 
137 | ### 14.2.3 Non-optimizing Xcode (LLVM) and Keil
138 | 
139 | Non-optimizing LLVM does not generate the kind of code we saw above; it inserts a call to the library function __divsi3 instead.
140 | 
141 | As for Keil: it generally inserts a call to the library function __aeabi_idivmod.
142 | 
143 | ## 14.3 How it works
144 | 
145 | Here is how division can be replaced by a multiplication followed by a division by a power of two, 2^n:
146 | 
147 | ![](1.jpg)
148 | 
149 | Here M is the magic coefficient.
150 | 
151 | M is calculated as follows:
152 | 
153 | ![](2.jpg)
154 | 
155 | So these code fragments usually have this form:
156 | 
157 | ![](3.jpg)
158 | 
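159 | n can be any number: it may be 32 (in which case the high part of the multiplication result is simply taken from the EDX or RDX register) or greater than 32, for example 33 (in which case the high part of the multiplication result is additionally shifted right).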
160 | 
161 | n is chosen so as to minimize the error.
162 | 
163 | When doing signed division, the sign of the multiplication result is also added to the output result.
164 | 
165 | Let's take a look at the difference:
166 | 
167 | ```
168 | int f3_32_signed(int a)
169 | {
170 |     return a/3;
171 | };
172 | unsigned int f3_32_unsigned(unsigned int a)
173 | {
174 |     return a/3;
175 | };
176 | ```
177 | 
178 | In the unsigned version of the function, the magic coefficient is 0xAAAAAAAB and the multiplication result is divided by 2^33.
179 | 
180 | In the signed version of the function, the magic coefficient is 0x55555556 and the multiplication result is divided by 2^32.
181 | 
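182 | The sign is taken from the multiplication result: the high 32 bits of the result are shifted right by 31 (which leaves the sign bit in the least significant bit of EAX). If the result is negative, 1 is added to it as a correction.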
183 | 
184 | Listing 14.4: MSVC 2012 /Ox
185 | 
186 | ```
187 | _f3_32_unsigned     PROC
188 |         mov     eax, -1431655765        ; aaaaaaabH
189 |         mul     DWORD PTR _a$[esp-4]    ; unsigned multiply
190 |         shr     edx, 1
191 |         mov     eax, edx
192 |         ret     0
193 | _f3_32_unsigned ENDP
194 |  
195 | _f3_32_signed PROC
196 |         mov     eax, 1431655766         ; 55555556H
197 |         imul    DWORD PTR _a$[esp-4]    ; signed multiply
198 |         mov     eax, edx
199 |         shr     eax, 31                 ; 0000001fH
200 |         add     eax, edx                ; add 1 if sign is negative
201 |         ret     0
202 | _f3_32_signed ENDP
203 | ```
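
Both functions of Listing 14.4 can be written out in C to check the constants. As before, this is only a sketch: it assumes an arithmetic right shift for negative signed values, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned: multiply by 0xAAAAAAAB, then divide the product by 2^33. */
uint32_t div3_unsigned(uint32_t a)
{
    uint64_t product = (uint64_t)a * 0xAAAAAAABu;  /* MUL */
    return (uint32_t)(product >> 33);              /* high half, then SHR EDX, 1 */
}

/* Signed: multiply by 0x55555556, divide by 2^32, add the sign bit. */
int32_t div3_signed(int32_t a)
{
    int64_t product = (int64_t)a * 0x55555556;     /* IMUL */
    int32_t edx = (int32_t)(product >> 32);        /* high half only */
    return edx + (int32_t)((uint32_t)edx >> 31);   /* add 1 if negative */
}

/* Usage example. */
void div3_demo(void)
{
    assert(div3_unsigned(0xFFFFFFFFu) == 0x55555555u);
    assert(div3_signed(7) == 2);
    assert(div3_signed(-7) == -2);
}
```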
204 | 
205 | ## 14.4 Getting the divisor
206 | 
207 | ### 14.4.1 Variant #1
208 | 
209 | Often, the code has this form:
210 | 
211 | ```
212 | mov     eax, MAGICAL CONSTANT
213 | imul    input value
214 | sar     edx, SHIFTING COEFFICIENT ; signed division by 2^x using arithmetic shift right
215 | mov     eax, edx
216 | shr     eax, 31
217 | add     eax, edx
218 | ```
219 | 
220 | Let's denote the 32-bit magic coefficient as M, the shifting coefficient as C, and the divisor as D.
221 | 
222 | The divisor we need to get is:
223 | 
224 | ![](4.jpg)
225 | 
226 | For example:
227 | 
228 | Listing 14.5: Optimizing MSVC 2012
229 | 
230 | ```
231 | mov     eax, 2021161081     ; 78787879H
232 | imul    DWORD PTR _a$[esp-4]
233 | sar     edx, 3
234 | mov     eax, edx
235 | shr     eax, 31             ; 0000001fH
236 | add     eax, edx
237 | ```
238 | 
239 | This is:
240 | 
241 | ![](5.jpg)
242 | 
243 | The numbers are larger than 32 bits, so for convenience we use Wolfram Mathematica:
244 | 
245 | ```
246 | In[1]:=N[2^(32+3)/2021161081]
247 | Out[1]:=17.
248 | ```
249 | 
250 | So the divisor in the example code is 17.
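
The same calculation can be done with 64-bit integers in C. A minimal sketch for the 32-bit case only; the function name is hypothetical, and the result is rounded to the nearest integer, which is where the divisor is expected to lie.

```c
#include <assert.h>
#include <stdint.h>

/* D = 2^(32+C) / M, rounded to the nearest integer.
   Assumes shift <= 31 so that 2^(32+shift) fits in 64 bits. */
uint32_t recover_divisor(uint32_t magic, unsigned shift)
{
    assert(magic != 0 && shift <= 31);

    uint64_t power = (uint64_t)1 << (32 + shift);
    return (uint32_t)((power + magic / 2) / magic);
}

/* Usage: the constants seen in this chapter's listings. */
void recover_divisor_demo(void)
{
    assert(recover_divisor(0x78787879u, 3) == 17); /* Listing 14.5 */
    assert(recover_divisor(0x38E38E39u, 1) == 9);  /* the a/9 example */
}
```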
251 | 
252 | For 64-bit division the principle is the same, but 2^64 should be used instead of 2^32.
253 | 
254 | ```
255 | uint64_t f1234(uint64_t a)
256 | {
257 |     return a/1234;
258 | };
259 | ```
260 | 
261 | Listing 14.7: MSVC 2012 /Ox
262 | 
263 | ```
264 | f1234   PROC
265 |         mov     rax, 7653754429286296943 ; 6a37991a23aead6fH
266 |         mul     rcx
267 |         shr     rdx, 9
268 |         mov     rax, rdx
269 |         ret     0
270 | f1234   ENDP
271 | ```
272 | 
273 | Listing 14.8: Wolfram Mathematica
274 | 
275 | ```
276 | In[1]:=N[2^(64+9)/16^^6a37991a23aead6f]
277 | Out[1]:=1234.
278 | ```
279 | 
280 | ### 14.4.2 Variant #2
281 | 
282 | A variant in which the arithmetic shift is omitted also exists:
283 | 
284 | ```
285 | mov     eax, 55555556h ; 1431655766
286 | imul    ecx
287 | mov     eax, edx
288 | shr     eax, 1Fh
289 | ```
290 | 
291 | The formula for the divisor is simpler:
292 | 
293 | ![](6.jpg)
294 | 
295 | In this example:
296 | 
297 | ![](7.jpg)
298 | 
299 | Using Wolfram Mathematica once again:
300 | 
301 | ```
302 | In[1]:=N[2^32/16^^55555556]
303 | Out[1]:=3.
304 | ```
305 | 
306 | The divisor we get is 3.


--------------------------------------------------------------------------------
/Chapter-17/1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-17/1.jpg


--------------------------------------------------------------------------------
/Chapter-17/2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-17/2.jpg


--------------------------------------------------------------------------------
/Chapter-17/3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-17/3.jpg


--------------------------------------------------------------------------------
/Chapter-18/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-18/1.png


--------------------------------------------------------------------------------
/Chapter-18/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-18/2.png


--------------------------------------------------------------------------------
/Chapter-19/Chapter-19.md:
--------------------------------------------------------------------------------
 1 | # Unions
 2 | 
 3 | ## 19.1 Pseudo-random number generator example
 4 | 
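 5 | If we need random floating-point numbers in the range 0..1, the simplest way is to take a PRNG (pseudo-random number generator) such as the Mersenne Twister, which produces random 32-bit DWORD values, convert the value to float and divide it by RAND_MAX (0xFFFFFFFF in our case); the result lies in the 0..1 interval. But as we know, division is slow. Can we get rid of it, as we did when replacing division by multiplication (chapter 14)? Let's recall what a floating-point number consists of: a sign bit, significand bits and exponent bits. We only need to store random bits in the significand. The exponent field cannot be zero (the number would be denormalized in that case), so we store 01111111 in it, which means the number is scaled by 2^0 = 1. We then fill the significand with random bits and set the sign bit to 0 (positive). The numbers generated this way lie in the interval from 1 to 2, so we also have to subtract 1. My example uses a very simple linear congruential generator that produces 32-bit numbers (translator's note: 32 bits, not 32 digits). The PRNG is initialized with the current time as a UNIX timestamp. We then treat the float as a union, a C/C++ construct that lets us interpret the same piece of memory as differently typed data. In our case, we create a union and access it either as float or as uint32_t. This is just a hack, and a dirty one.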
 6 | 
 7 | ```
 8 | #include <stdio.h>
 9 | #include <stdint.h>
10 | #include <time.h>
11 | union uint32_t_float
12 | {
13 |     uint32_t i;
14 |     float f;
15 | };
16 | // from the Numerical Recipes book
17 | const uint32_t RNG_a=1664525;
18 | const uint32_t RNG_c=1013904223;
19 | int main()
20 | {
21 |     union uint32_t_float tmp;
22 |     uint32_t RNG_state=time(NULL); // initial seed
23 |     for (int i=0; i<100; i++)
24 |     {
25 |         RNG_state=RNG_state*RNG_a+RNG_c;
26 |         tmp.i=(RNG_state & 0x007fffff) | 0x3F800000;
27 |         float x=tmp.f-1;
28 |         printf ("%f\n", x);
29 |     };
30 |     return 0;
31 | };
32 | ```
33 | 
34 | Listing 19.1: MSVC 2010 (/Ox)
35 | 
36 | ```
37 | $SG4232 DB '%f', 0aH, 00H
38 | __real@3ff0000000000000 DQ 03ff0000000000000r ; 1
39 | tv140= -4 ; size = 4
40 | _tmp$= -4 ; size = 4
41 | _main PROC
42 |     push ebp
43 |     mov ebp, esp
44 |     and esp, -64 ; ffffffc0H
45 |     sub esp, 56 ; 00000038H
46 |     push esi
47 |     push edi
48 |     push 0
49 |     call __time64
50 |     add esp, 4
51 |     mov esi, eax
52 |     mov edi, 100 ; 00000064H
53 | $LN3@main:
54 |     ; let's generate random 32-bit number
55 |     imul esi, 1664525 ; 0019660dH
56 |     add esi, 1013904223 ; 3c6ef35fH
57 |     mov eax, esi
58 |     ; leave bits for significand only
59 |     and eax, 8388607 ; 007fffffH
60 |     ; set exponent to 1
61 |     or eax, 1065353216 ; 3f800000H
62 |     ; store this value as int
63 |     mov DWORD PTR _tmp$[esp+64], eax
64 |     sub esp, 8
65 |     ; load this value as float
66 |     fld DWORD PTR _tmp$[esp+72]
67 |     ; subtract one from it
68 |     fsub QWORD PTR __real@3ff0000000000000
69 |     fstp DWORD PTR tv140[esp+72]
70 |     fld DWORD PTR tv140[esp+72]
71 |     fstp QWORD PTR [esp]
72 |     push OFFSET $SG4232
73 |     call _printf
74 |     add esp, 12 ; 0000000cH
75 |     dec edi
76 |     jne SHORT $LN3@main
77 |     pop edi
78 |     xor eax, eax
79 |     pop esi
80 |     mov esp, ebp
81 |     pop ebp
82 |     ret 0
83 | _main ENDP
84 | _TEXT ENDS
85 | END
86 | ```
87 | 
88 | GCC also generates very similar code.


--------------------------------------------------------------------------------
/Chapter-20/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-20/1.png


--------------------------------------------------------------------------------
/Chapter-20/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-20/2.png


--------------------------------------------------------------------------------
/Chapter-20/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-20/3.png


--------------------------------------------------------------------------------
/Chapter-20/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-20/4.png


--------------------------------------------------------------------------------
/Chapter-21/Chapter-21.md:
--------------------------------------------------------------------------------
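  1 | # 64-bit values in a 32-bit environment
  2 | 
  3 | In a 32-bit environment the general-purpose registers are 32 bits wide, so each 64-bit value is handled as a pair of 32-bit values.
  4 | 
  5 | ## 21.1 Passing arguments, addition, subtraction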
  6 | 
  7 | ```
  8 | #include <stdint.h>
  9 | uint64_t f1 (uint64_t a, uint64_t b)
 10 | {
 11 |         return a+b;
 12 | };
 13 | void f1_test ()
 14 | {
 15 | #ifdef __GNUC__
 16 |         printf ("%lld", f1(12345678901234, 23456789012345));
 17 | #else
 18 |         printf ("%I64d", f1(12345678901234, 23456789012345));
 19 | #endif
 20 | };
 21 | uint64_t f2 (uint64_t a, uint64_t b)
 22 | {
 23 |         return a-b;
 24 | };
 25 | ```
 26 | 
 27 | Listing 21.1: MSVC 2012 /Ox /Ob1
 28 | 
 29 | ```
 30 | _a$ = 8                                             ; size = 8
 31 | _b$ = 16                                            ; size = 8
 32 | _f1     PROC
 33 |         mov     eax, DWORD PTR _a$[esp-4]
 34 |         add     eax, DWORD PTR _b$[esp-4]
 35 |         mov     edx, DWORD PTR _a$[esp]
 36 |         adc     edx, DWORD PTR _b$[esp]
 37 |         ret     0
 38 | _f1     ENDP
 39 |  
 40 | _f1_test    PROC
 41 |         push        5461                            ; 00001555H
 42 |         push        1972608889                      ; 75939f79H
 43 |         push        2874                            ; 00000b3aH
 44 |         push        1942892530                      ; 73ce2ff2H
 45 |         call        _f1
 46 |         push        edx
 47 |         push        eax
 48 |         push        OFFSET $SG1436 ; ’%I64d’, 0aH, 00H
 49 |         call        _printf
 50 |         add         esp, 28                         ; 0000001cH
 51 |         ret     0
 52 | _f1_test    ENDP
 53 | _f2     PROC
 54 |         mov     eax, DWORD PTR _a$[esp-4]
 55 |         sub     eax, DWORD PTR _b$[esp-4]
 56 |         mov     edx, DWORD PTR _a$[esp]
 57 |         sbb     edx, DWORD PTR _b$[esp]
 58 |         ret     0
 59 | _f2     ENDP
 60 | ```
 61 | 
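 62 | In f1_test() we can see that each 64-bit value is passed as two 32-bit values, high part first, then low part. Addition and subtraction also work on such pairs.
 63 | 
 64 | In addition, the low 32-bit parts are added first. If a carry occurs, the CF flag is set. The following ADC instruction adds the high parts, plus 1 if CF is set.
 65 | 
 66 | Subtraction is done in pairs as well. The first SUB may also set the CF flag, which the subsequent SBB instruction checks: if CF is set, 1 is additionally subtracted from the result.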
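
The ADD/ADC and SUB/SBB pairs can be modelled in C roughly like this. It is a minimal sketch, not what any compiler emits: the `u64_pair` type and the function names are hypothetical, and the carry flag is simulated with an ordinary variable.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, the way EDX:EAX holds it. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* ADD on the low halves, then ADC on the high halves. */
u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;               /* ADD: wraps around modulo 2^32    */
    uint32_t carry = (r.lo < a.lo);   /* CF is set if the sum wrapped     */
    r.hi = a.hi + b.hi + carry;       /* ADC: add the halves plus CF      */
    return r;
}

/* SUB on the low halves, then SBB on the high halves. */
u64_pair sub64(u64_pair a, u64_pair b)
{
    u64_pair r;
    uint32_t borrow = (a.lo < b.lo);  /* CF is set if a borrow is needed  */
    r.lo = a.lo - b.lo;               /* SUB                              */
    r.hi = a.hi - b.hi - borrow;      /* SBB: subtract the halves and CF  */
    return r;
}
```

With the halves pushed in f1_test() (0x00000b3a:0x73ce2ff2 and 0x00001555:0x75939f79), `add64` should give the same result as the native 64-bit addition.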
 67 | 
 68 | In a 32-bit environment, functions return 64-bit values in the EDX:EAX register pair. It is easy to see how the result of f1() is then passed on to printf().
 69 | 
 70 | Listing 21.2: GCC 4.8.1 -O1 -fno-inline
 71 | 
 72 | ```
 73 | _f1:
 74 |         mov     eax, DWORD PTR [esp+12]
 75 |         mov     edx, DWORD PTR [esp+16]
 76 |         add     eax, DWORD PTR [esp+4]
 77 |         adc     edx, DWORD PTR [esp+8]
 78 |         ret
 79 | 
 80 | _f1_test:
 81 |         sub     esp, 28
 82 |         mov     DWORD PTR [esp+8], 1972608889           ; 75939f79H
 83 |         mov     DWORD PTR [esp+12], 5461                ; 00001555H
 84 |         mov     DWORD PTR [esp], 1942892530             ; 73ce2ff2H
 85 |         mov     DWORD PTR [esp+4], 2874                 ; 00000b3aH
 86 |         call    _f1
 87 |         mov     DWORD PTR [esp+4], eax
 88 |         mov     DWORD PTR [esp+8], edx
 89 |         mov     DWORD PTR [esp], OFFSET FLAT:LC0        ; "%lld\12\0"
 90 |         call    _printf
 91 |         add     esp, 28
 92 |         ret
 93 | 
 94 | _f2:
 95 |         mov     eax, DWORD PTR [esp+4]
 96 |         mov     edx, DWORD PTR [esp+8]
 97 |         sub     eax, DWORD PTR [esp+12]
 98 |         sbb     edx, DWORD PTR [esp+16]
 99 |         ret
100 | ```
101 | GCC code is the same.
102 | 
103 | ## 21.2 Multiplication, division
104 | 
105 | ```
106 | #include <stdint.h>
107 | uint64_t f3 (uint64_t a, uint64_t b)
108 | {
109 |         return a*b;
110 | };
111 | uint64_t f4 (uint64_t a, uint64_t b)
112 | {
113 |         return a/b;
114 | };
115 | uint64_t f5 (uint64_t a, uint64_t b)
116 | {
117 |         return a % b;
118 | };
119 | ```
120 | 
121 | Listing 21.3: MSVC 2012 /Ox /Ob1
122 | 
123 | ```
124 | _a$ = 8                                     ; size = 8
125 | _b$ = 16                                    ; size = 8
126 | _f3     PROC
127 |         push        DWORD PTR _b$[esp]
128 |         push        DWORD PTR _b$[esp]
129 |         push        DWORD PTR _a$[esp+8]
130 |         push        DWORD PTR _a$[esp+8]
131 |         call        __allmul ; long long multiplication
132 |         ret         0
133 | _f3     ENDP
134 | _a$ = 8                                     ; size = 8
135 | _b$ = 16                                    ; size = 8
136 | _f4     PROC
137 |         push        DWORD PTR _b$[esp]
138 |         push        DWORD PTR _b$[esp]
139 |         push        DWORD PTR _a$[esp+8]
140 |         push        DWORD PTR _a$[esp+8]
141 |         call        __aulldiv ; unsigned long long division
142 |         ret         0
143 | _f4     ENDP
144 | _a$ = 8                                     ; size = 8
145 | _b$ = 16                                    ; size = 8
146 | _f5     PROC
147 |         push        DWORD PTR _b$[esp]
148 |         push        DWORD PTR _b$[esp]
149 |         push        DWORD PTR _a$[esp+8]
150 |         push        DWORD PTR _a$[esp+8]
151 |         call        __aullrem ; unsigned long long remainder
152 |         ret         0
153 | _f5     ENDP
154 | ```
155 | 
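156 | Multiplication and division are more complex operations, so the compiler usually emits calls to library functions that perform them.
157 | 
158 | The meaning of these functions is explained in Appendix E.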
159 | 
160 | Listing 21.4: GCC 4.8.1 -O3 -fno-inline
161 | 
162 | ```
163 | _f3:
164 |         push        ebx
165 |         mov         edx, DWORD PTR [esp+8]
166 |         mov         eax, DWORD PTR [esp+16]
167 |         mov         ebx, DWORD PTR [esp+12]
168 |         mov         ecx, DWORD PTR [esp+20]
169 |         imul        ebx, eax
170 |         imul        ecx, edx
171 |         mul         edx
172 |         add         ecx, ebx
173 |         add         edx, ecx
174 |         pop         ebx
175 |         ret
176 | _f4:
177 |         sub         esp, 28
178 |         mov         eax, DWORD PTR [esp+40]
179 |         mov         edx, DWORD PTR [esp+44]
180 |         mov         DWORD PTR [esp+8], eax
181 |         mov         eax, DWORD PTR [esp+32]
182 |         mov         DWORD PTR [esp+12], edx
183 |         mov         edx, DWORD PTR [esp+36]
184 |         mov         DWORD PTR [esp], eax
185 |         mov         DWORD PTR [esp+4], edx
186 |         call        ___udivdi3 ; unsigned division
187 |         add         esp, 28
188 |         ret
189 | _f5:
190 |         sub         esp, 28
191 |         mov         eax, DWORD PTR [esp+40]
192 |         mov         edx, DWORD PTR [esp+44]
193 |         mov         DWORD PTR [esp+8], eax
194 |         mov         eax, DWORD PTR [esp+32]
195 |         mov         DWORD PTR [esp+12], edx
196 |         mov         edx, DWORD PTR [esp+36]
197 |         mov         DWORD PTR [esp], eax
198 |         mov         DWORD PTR [esp+4], edx
199 |         call        ___umoddi3 ; unsigned modulo
200 |         add         esp, 28
201 |         ret
202 | ```
203 | 
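204 | GCC does almost the same, but the multiplication code is inlined right into the function, presumably because that is more efficient.
205 | 
206 | GCC uses different library function names: see Appendix D.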
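
The inlined multiplication in `_f3` can be read as the following C sketch. It only illustrates the arithmetic (one full 32x32 MUL plus two truncated IMULs); the function name `mul64_from_halves` is hypothetical.

```c
#include <stdint.h>

/* 64x64 -> 64-bit multiplication built from 32-bit pieces. */
uint64_t mul64_from_halves(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* MUL: full 32x32 -> 64-bit product of the low halves (EDX:EAX). */
    uint64_t low_product = (uint64_t)a_lo * b_lo;

    /* Two IMULs: only the low 32 bits of the cross products matter,
       because they end up in the high half of the result. */
    uint32_t cross = a_hi * b_lo + b_hi * a_lo;

    uint32_t res_lo = (uint32_t)low_product;
    uint32_t res_hi = (uint32_t)(low_product >> 32) + cross;  /* ADD EDX, ECX */
    return ((uint64_t)res_hi << 32) | res_lo;
}
```

The a_hi*b_hi term is never computed: it would only affect bits 64 and above, which do not fit into the 64-bit result anyway.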
207 | 
208 | ## 21.3 Shifting right
209 | 
210 | ```
211 | #include <stdint.h>
212 | uint64_t f6 (uint64_t a)
213 | {
214 |         return a>>7;
215 | };
216 | ```
217 | 
218 | Listing 21.5: MSVC 2012 /Ox /Ob1
219 | 
220 | ```
221 | _a$ = 8                                     ; size = 8
222 | _f6     PROC
223 |         mov     eax, DWORD PTR _a$[esp-4]
224 |         mov     edx, DWORD PTR _a$[esp]
225 |         shrd    eax, edx, 7
226 |         shr     edx, 7
227 |         ret     0
228 | _f6     ENDP
229 | ```
230 | 
231 | Listing 21.6: GCC 4.8.1 -O3 -fno-inline
232 | 
233 | ```
234 | _f6:
235 |         mov     edx, DWORD PTR [esp+8]
236 |         mov     eax, DWORD PTR [esp+4]
237 |         shrd        eax, edx, 7
238 |         shr     edx, 7
239 |         ret
240 | ```
241 | 
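242 | Shifting right is also done in two steps: the low part is shifted first, then the high part. The low part is shifted with the SHRD instruction: it shifts EAX right by 7 bits and fills the vacated top bits with the low 7 bits of EDX, i.e., with bits taken from the high part. The high part is shifted with the more common SHR instruction: indeed, the bits freed in the high part have to be filled with zeros.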
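
As a rough C model of the SHRD/SHR pair (the function name `shr64_by_7` is hypothetical, and the shift count is fixed at 7 as in the example):

```c
#include <stdint.h>

/* Shift the 64-bit value hi:lo right by 7 using only 32-bit operations. */
uint64_t shr64_by_7(uint32_t lo, uint32_t hi)
{
    /* SHRD EAX, EDX, 7: shift the low half right and fill its 7
       vacated top bits with the low 7 bits of the high half. */
    uint32_t new_lo = (lo >> 7) | (hi << (32 - 7));

    /* SHR EDX, 7: shift the high half right, filling with zeros. */
    uint32_t new_hi = hi >> 7;

    return ((uint64_t)new_hi << 32) | new_lo;
}
```

For a signed 64-bit value the second step would presumably be an arithmetic shift (SAR) instead, so that the sign is preserved.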
243 | 
244 | ## 21.4 Converting a 32-bit value into a 64-bit one
245 | 
246 | ```
247 | #include <stdint.h>
248 | int64_t f7 (int64_t a, int64_t b, int32_t c)
249 | {
250 |         return a*b+c;
251 | };
252 | 
253 | int64_t f7_main ()
254 | {
255 |         return f7(12345678901234, 23456789012345, 12345);
256 | };
257 | ```
258 | 
259 | Listing 21.7: MSVC 2012 /Ox /Ob1
260 | 
261 | ```
262 | _a$ = 8                                 ; size = 8
263 | _b$ = 16                                ; size = 8
264 | _c$ = 24                                ; size = 4
265 | _f7     PROC
266 |         push        esi
267 |         push        DWORD PTR _b$[esp+4]
268 |         push        DWORD PTR _b$[esp+4]
269 |         push        DWORD PTR _a$[esp+12]
270 |         push        DWORD PTR _a$[esp+12]
271 |         call        __allmul ; long long multiplication
272 |         mov         ecx, eax
273 |         mov         eax, DWORD PTR _c$[esp]
274 |         mov         esi, edx
275 |         cdq                 ; input: 32-bit value in EAX; output: 64-bit value in EDX:EAX
276 |         add         eax, ecx
277 |         adc         edx, esi
278 |         pop         esi
279 |         ret         0
280 | _f7     ENDP
281 |  
282 | _f7_main PROC
283 |         push        12345               ; 00003039H
284 |         push        5461                ; 00001555H
285 |         push        1972608889          ; 75939f79H
286 |         push        2874                ; 00000b3aH
287 |         push        1942892530          ; 73ce2ff2H
288 |         call        _f7
289 |         add     esp, 20                 ; 00000014H
290 |         ret     0
291 | _f7_main ENDP
292 | ```
293 | 
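294 | Here we also need to extend the signed 32-bit value c to a signed 64-bit one. Unsigned values are converted in a straightforward way: all bits of the high part are set to 0. That does not work for signed data types: the sign has to be copied into the high part of the result. The CDQ instruction does this: it takes its input value from EAX, extends it to 64 bits and leaves it in the EDX:EAX register pair. In other words, CDQ takes the sign of the number in EAX (its most significant bit) and, depending on it, sets all 32 bits of EDX to 0 or to 1. Its operation is somewhat similar to the MOVSX instruction (13.1.1).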
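
What CDQ does can be written down in C as a small sketch. The `reg_pair` type and both function names are hypothetical; the final conversion to `int64_t` assumes the usual two's complement behaviour.

```c
#include <stdint.h>

/* Models the EDX:EAX register pair. */
typedef struct {
    uint32_t eax;
    uint32_t edx;
} reg_pair;

/* CDQ: sign-extend the 32-bit value in EAX into EDX:EAX. */
reg_pair cdq(int32_t value)
{
    reg_pair r;
    r.eax = (uint32_t)value;
    /* Copy the sign bit (bit 31 of EAX) into every bit of EDX. */
    r.edx = (value < 0) ? 0xFFFFFFFFu : 0u;
    return r;
}

/* Glue the pair back together to check the result. */
int64_t pair_to_int64(reg_pair r)
{
    /* Assumption: converting to int64_t wraps in two's complement. */
    return (int64_t)(((uint64_t)r.edx << 32) | r.eax);
}
```

For an unsigned 32-bit value the high half would simply be zero, which is why no special instruction is needed in that case.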
295 | 
296 | Listing 21.8: GCC 4.8.1 -O3 -fno-inline
297 | 
298 | ```
299 | _f7:
300 |         push        edi
301 |         push        esi
302 |         push        ebx
303 |         mov         esi, DWORD PTR [esp+16]
304 |         mov         edi, DWORD PTR [esp+24]
305 |         mov         ebx, DWORD PTR [esp+20]
306 |         mov         ecx, DWORD PTR [esp+28]
307 |         mov         eax, esi
308 |         mul         edi
309 |         imul        ebx, edi
310 |         imul        ecx, esi
311 |         mov         esi, edx
312 |         add         ecx, ebx
313 |         mov         ebx, eax
314 |         mov         eax, DWORD PTR [esp+32]
315 |         add         esi, ecx
316 |         cdq             ; input: 32-bit value in EAX; output: 64-bit value in EDX:EAX
317 |         add         eax, ebx
318 |         adc         edx, esi
319 |         pop         ebx
320 |         pop         esi
321 |         pop         edi
322 |         ret
323 | _f7_main:
324 |         sub         esp, 28
325 |         mov         DWORD PTR [esp+16], 12345               ; 00003039H
326 |         mov         DWORD PTR [esp+8], 1972608889           ; 75939f79H
327 |         mov         DWORD PTR [esp+12], 5461                ; 00001555H
328 |         mov         DWORD PTR [esp], 1942892530             ; 73ce2ff2H
329 |         mov         DWORD PTR [esp+4], 2874                 ; 00000b3aH
330 |         call        _f7
332 |         add         esp, 28
333 |         ret
334 | ```
335 | 
336 | GCC generates the same code as MSVC, but inlines the multiplication code into the function. See also: 32-bit values in a 16-bit environment (30.4).


--------------------------------------------------------------------------------
/Chapter-24/Chapter-24.md:
--------------------------------------------------------------------------------
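  1 | # Working with floating-point numbers using SIMD on x64
  2 | 
  3 | Of course, the FPU remained in x86-compatible processors when the x64 extensions were added. But at the same time, the SIMD extensions (SSE, SSE2, etc.) were already there, and they can work with floating-point numbers too. The number format stays the same (IEEE 754).
  4 | 
  5 | So x86-64 compilers usually use SIMD instructions. This can be called good news, because they are easier to work with.
    | 
    | ## 24.1 Simple example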
  6 | 
  7 | ```
  8 | double f (double a, double b)
  9 | {
 10 |     return a/3.14 + b*4.1;
 11 | };
 12 | ```
 13 | 
 14 | Listing 24.1: MSVC 2012 x64 /Ox
 15 | 
 16 | ```
 17 | __real@4010666666666666 DQ 04010666666666666r ; 4.1
 18 | __real@40091eb851eb851f DQ 040091eb851eb851fr ; 3.14
 19 | a$ = 8
 20 | b$ = 16
 21 | f PROC
 22 |     divsd xmm0, QWORD PTR __real@40091eb851eb851f
 23 |     mulsd xmm1, QWORD PTR __real@4010666666666666
 24 |     addsd xmm0, xmm1
 25 |     ret 0
 26 | f ENDP
 27 | ```
 28 | 
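 29 | Floating-point arguments are passed in registers XMM0-XMM3, all the rest via the stack. a is passed in XMM0, b in XMM1. XMM registers are 128 bits wide (see the SIMD section, 22), but our values are of double type, so only the low half of each register is used.
 30 | 
 31 | DIVSD is an SSE instruction whose name means "Divide Scalar Double-Precision Floating-Point Values": it simply divides one double by another and stores the result in the low half of the destination operand. The constants are encoded by the compiler in IEEE 754 format. MULSD and ADDSD work the same way, but perform multiplication and addition. The double result of the function is left in the XMM0 register.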
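
To check how the constants in the listing are encoded, the 64-bit patterns can be reinterpreted as doubles. A minimal sketch, assuming `double` is an IEEE 754 binary64 value on the target; the helper name `double_from_bits` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double (no numeric conversion). */
double double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Under that assumption, `double_from_bits(0x40091eb851eb851fULL)` should compare equal to the literal 3.14 and `double_from_bits(0x4010666666666666ULL)` to 4.1, matching the comments next to the `__real@...` symbols.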
 32 | 
 33 | This is how non-optimizing MSVC works:
 34 | 
 35 | Listing 24.2: MSVC 2012 x64
 36 | 
 37 | ```
 38 | __real@4010666666666666 DQ 04010666666666666r ; 4.1
 39 | __real@40091eb851eb851f DQ 040091eb851eb851fr ; 3.14
 40 | a$ = 8
 41 | b$ = 16
 42 | f PROC
 43 |     movsdx QWORD PTR [rsp+16], xmm1
 44 |     movsdx QWORD PTR [rsp+8], xmm0
 45 |     movsdx xmm0, QWORD PTR a$[rsp]
 46 |     divsd xmm0, QWORD PTR __real@40091eb851eb851f
 47 |     movsdx xmm1, QWORD PTR b$[rsp]
 48 |     mulsd xmm1, QWORD PTR __real@4010666666666666
 49 |     addsd xmm0, xmm1
 50 |     ret 0
 51 | f ENDP
 52 | ```
 53 | 
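 54 | Slightly redundant. The input arguments are saved in the "shadow space" (7.2.1), but only the low halves of the registers, i.e., only the 64-bit values of double type.
 55 | 
 56 | GCC produces almost the same code.
 57 | 
 58 | ## 24.2 Passing floating-point numbers via arguments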
 59 | 
 60 | ```
 61 | #include <math.h>
 62 | #include <stdio.h>
 63 | int main ()
 64 | {
 65 |     printf ("32.01 ^ 1.54 = %lf\n", pow (32.01,1.54));
 66 |     return 0;
 67 | }
 68 | ```
 69 | 
 70 | They are passed in the low halves of the XMM0-XMM3 registers.
 71 | 
 72 | Listing 24.3: MSVC 2012 x64 /Ox
 73 | 
 74 | ```
 75 | $SG1354 DB ’32.01 ^ 1.54 = %lf’, 0aH, 00H
 76 | __real@40400147ae147ae1 DQ 040400147ae147ae1r ; 32.01
 77 | __real@3ff8a3d70a3d70a4 DQ 03ff8a3d70a3d70a4r ; 1.54
 78 | main PROC
 79 |     sub rsp, 40 ; 00000028H
 80 |     movsdx xmm1, QWORD PTR __real@3ff8a3d70a3d70a4
 81 |     movsdx xmm0, QWORD PTR __real@40400147ae147ae1
 82 |     call pow
 83 |     lea rcx, OFFSET FLAT:$SG1354
 84 |     movaps xmm1, xmm0
 85 |     movd rdx, xmm1
 86 |     call printf
 87 |     xor eax, eax
 88 |     add rsp, 40 ; 00000028H
 89 |     ret 0
 90 | main ENDP
 91 | ```
 92 | 
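 93 | The Intel [14] and AMD [1] manuals have no MOVSDX instruction; there it is called just MOVSD. So two x86 instructions share the same name (about the other one, see B.6.2). Apparently, Microsoft developers wanted to get rid of this mess, so they renamed it MOVSDX. It simply loads a value into the low half of an XMM register. pow() takes its arguments from XMM0 and XMM1 and returns the result in XMM0. The result is then moved to RDX for printf(). Why? Honestly, I do not know; maybe because printf() is a function with a variable number of arguments?
 94 | 
 95 | Listing 24.4: GCC 4.4.6 x64 -O3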
 96 | 
 97 | ```
 98 | .LC2:
 99 | .string "32.01 ^ 1.54 = %lf\n"
100 | main:
101 |     sub rsp, 8
102 |     movsd xmm1, QWORD PTR .LC0[rip]
103 |     movsd xmm0, QWORD PTR .LC1[rip]
104 |     call pow
105 |     ; result is now in XMM0
106 |     mov edi, OFFSET FLAT:.LC2
107 |     mov eax, 1 ; number of vector registers passed
108 |     call printf
109 |     xor eax, eax
110 |     add rsp, 8
111 |     ret
112 | .LC0:
113 |     .long 171798692
114 |     .long 1073259479
115 | .LC1:
116 |     .long 2920577761
117 |     .long 1077936455
118 | ```
119 | 
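120 | GCC produces clearer output. The value for printf() is passed in XMM0. By the way, this is a case where 1 is written to EAX before calling printf(): it means that one argument is passed in vector registers, as the standard requires [21].
121 | 
122 | ## 24.3 Comparison example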
123 | 
124 | ```
125 | double d_max (double a, double b)
126 | {
127 |     if (a>b)
128 |         return a;
129 |     return b;
130 | };
131 | ```
132 | 
133 | Listing 24.5: MSVC 2012 x64 /Ox
134 | 
135 | ```
136 | a$ = 8
137 | b$ = 16
138 | d_max PROC
139 |     comisd xmm0, xmm1
140 |     ja SHORT $LN2@d_max
141 |     movaps xmm0, xmm1
142 | $LN2@d_max:
143 |     fatret 0
144 | d_max ENDP
145 | ```
146 | 
147 | Optimizing MSVC generates code that is very easy to understand. COMISD stands for "Compare Scalar Ordered Double-Precision Floating-Point Values and Set EFLAGS", and that is exactly what it does. Non-optimizing MSVC generates more redundant code, but it is still not hard to understand:
148 | 
149 | Listing 24.6: MSVC 2012 x64
150 | 
151 | ```
152 | a$ = 8
153 | b$ = 16
154 | d_max PROC
155 |     movsdx QWORD PTR [rsp+16], xmm1
156 |     movsdx QWORD PTR [rsp+8], xmm0
157 |     movsdx xmm0, QWORD PTR a$[rsp]
158 |     comisd xmm0, QWORD PTR b$[rsp]
159 |     jbe SHORT $LN1@d_max
160 |     movsdx xmm0, QWORD PTR a$[rsp]
161 |     jmp SHORT $LN2@d_max
162 | $LN1@d_max:
163 |     movsdx xmm0, QWORD PTR b$[rsp]
164 | $LN2@d_max:
165 |     fatret 0
166 | d_max ENDP
167 | ```
168 | 
169 | GCC 4.4.6, however, optimizes further and uses the MAXSD instruction ("Return Maximum Scalar Double-Precision Floating-Point Value"), which simply selects the larger of the two values.
170 | 
171 | Listing 24.7: GCC 4.4.6 x64 -O3
172 | 
173 | ```
174 | d_max:
175 |     maxsd xmm0, xmm1
176 |     ret
177 | ```
184 | 
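185 | ## 24.4 Summary
186 | 
187 | Only the low half of the XMM registers is used in all the examples here; a number in IEEE 754 format is stored there. Essentially, all instructions with the -SD suffix ("Scalar Double-Precision") work with IEEE 754 floating-point numbers stored in the low 64 bits of an XMM register. This is easier than the FPU, apparently because the SIMD extensions did not evolve as chaotically as the FPU did in the past. The stack register model is not used either. If you try to replace double with float in these examples, the same instructions are used, but with the -SS suffix ("Scalar Single-Precision"), for example MOVSS, COMISS, ADDSS, etc. "Scalar" means that the SIMD register contains only one value instead of several. Instructions that work with several values in a register simultaneously have "Packed" in their names.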


--------------------------------------------------------------------------------
/Chapter-25/Chapter-25.md:
--------------------------------------------------------------------------------
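  1 | # Temperature conversion
  2 | 
  3 | Another very popular example in programming books for beginners is a small program that converts Fahrenheit temperature to Celsius, or the other way round.
  4 | 
  5 | I also added simple error handling: 1) we should check that the user entered a correct number; 2) we should check that the Celsius temperature is not below -273 °C, which is below absolute zero, as we may remember from school physics lessons. The exit() function terminates the program instantly, without returning to the caller function.
  6 | 
  7 | ## 25.1 Integer values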
  8 | 
  9 | ```
 10 | #include <stdio.h>
 11 | #include <stdlib.h>
 12 | int main()
 13 | {
 14 |     int celsius, fahr;
 15 |     printf ("Enter temperature in Fahrenheit:\n");
 16 |     if (scanf ("%d", &fahr)!=1)
 17 |     {
 18 |         printf ("Error while parsing your input\n");
 19 |         exit(0);
 20 |     };
 21 |     celsius = 5 * (fahr-32) / 9;
 22 |     if (celsius<-273)
 23 |     {
 24 |         printf ("Error: incorrect temperature!\n");
 25 |         exit(0);
 26 |     };
 27 |     printf ("Celsius: %d\n", celsius);
 28 | };
 29 | ```
 30 | 
 31 | ### 25.1.1 MSVC 2012 x86 /Ox
 32 | 
 33 | Listing 25.1: MSVC 2012 x86 /Ox
 34 | 
 35 | ```
 36 | $SG4228 DB ’Enter temperature in Fahrenheit:’, 0aH, 00H
 37 | $SG4230 DB ’%d’, 00H
 38 | $SG4231 DB ’Error while parsing your input’, 0aH, 00H
 39 | $SG4233 DB ’Error: incorrect temperature!’, 0aH, 00H
 40 | $SG4234 DB ’Celsius: %d’, 0aH, 00H
 41 | _fahr$ = -4 ; size = 4
 42 | _main PROC
 43 |     push ecx
 44 |     push esi
 45 |     mov esi, DWORD PTR __imp__printf
 46 |     push OFFSET $SG4228 ; ’Enter temperature in Fahrenheit:’
 47 |     call esi ; call printf()
 48 |     lea eax, DWORD PTR _fahr$[esp+12]
 49 |     push eax
 50 |     push OFFSET $SG4230 ; ’%d’
 51 |     call DWORD PTR __imp__scanf
 52 |     add esp, 12 ; 0000000cH
 53 |     cmp eax, 1
 54 |     je SHORT $LN2@main
 55 |     push OFFSET $SG4231 ; ’Error while parsing your input’
 56 |     call esi ; call printf()
 57 |     add esp, 4
 58 |     push 0
 59 |     call DWORD PTR __imp__exit
 60 |     $LN9@main:
 61 |     $LN2@main:
 62 |     mov eax, DWORD PTR _fahr$[esp+8]
 63 |     add eax, -32 ; ffffffe0H
 64 |     lea ecx, DWORD PTR [eax+eax*4]
 65 |     mov eax, 954437177 ; 38e38e39H
 66 |     imul ecx
 67 |     sar edx, 1
 68 |     mov eax, edx
 69 |     shr eax, 31 ; 0000001fH
 70 |     add eax, edx
 71 |     cmp eax, -273 ; fffffeefH
 72 |     jge SHORT $LN1@main
 73 |     push OFFSET $SG4233 ; ’Error: incorrect temperature!’
 74 |     call esi ; call printf()
 75 |     add esp, 4
 76 |     push 0
 77 |     call DWORD PTR __imp__exit
 78 |     $LN10@main:
 79 |     $LN1@main:
 80 |     push eax
 81 |     push OFFSET $SG4234 ; ’Celsius: %d’
 82 |     call esi ; call printf()
 83 |     add esp, 8
 84 |     ; return 0 - at least by C99 standard
 85 |     xor eax, eax
 86 |     pop esi
 87 |     pop ecx
 88 |     ret 0
 89 | $LN8@main:
 90 | _main ENDP
 91 | ```
 92 | 
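 93 | What we can say about it:
 94 | 
 95 | - The address of printf() is first loaded into the ESI register, so the subsequent printf() calls are made with just a CALL ESI instruction. This is a very popular compiler technique, used when the code calls the same function several times and/or when there is a free register available for it.
 96 | - We see the ADD EAX, -32 instruction where 32 has to be subtracted from the value. EAX = EAX + (-32) is equivalent to EAX = EAX - 32, and for some reason the compiler chose ADD instead of SUB. Perhaps it is faster.
 97 | - The LEA instruction is used where the value has to be multiplied by 5: lea ecx, DWORD PTR [eax+eax*4]. Yes, i + i*4 is equivalent to i*5, and LEA works faster than IMUL. By the way, a MOV ECX, EAX / SHL ECX, 2 / ADD ECX, EAX sequence could be used here instead, and some compilers do it that way.
 98 | - The division-by-multiplication trick is used here as well.
 99 | - main() returns 0 although we did not specify it. The C99 standard tells us [15, 5.1.2.2.3] that main() returns 0 if there is no return statement. This rule applies to main() only. MSVC does not officially support C99, but perhaps it supports it partially?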
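
The arithmetic part of the listing (ADD -32, LEA, and the division by 9 done via multiplication) can be replayed in C. This is a sketch under one stated assumption: right-shifting a negative signed value is an arithmetic shift, which holds on mainstream compilers but is implementation-defined in C. The function names are hypothetical.

```c
#include <stdint.h>

/* Signed division by 9 as done in the listing.
   Assumption: >> on negative signed values is an arithmetic shift. */
int32_t div9_by_multiplication(int32_t x)
{
    /* IMUL ECX: 64-bit signed product, EDX receives the high 32 bits.
       954437177 (0x38E38E39) is 2^33 / 9 rounded up. */
    int64_t product = (int64_t)x * 954437177;
    int32_t edx = (int32_t)(product >> 32);

    edx >>= 1;                               /* SAR EDX, 1              */
    uint32_t sign = (uint32_t)edx >> 31;     /* SHR EAX, 31: sign bit   */
    return edx + (int32_t)sign;              /* ADD EAX, EDX: round toward zero */
}

int32_t fahrenheit_to_celsius(int32_t fahr)
{
    int32_t t = fahr - 32;          /* ADD EAX, -32                     */
    int32_t times5 = t + t * 4;     /* LEA ECX, [EAX+EAX*4]             */
    return div9_by_multiplication(times5);
}
```

The final addition of the sign bit is what makes negative results round toward zero, as C integer division requires.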
100 | 
101 | ### 25.1.2 MSVC 2012 x64 /Ox
102 | 
103 | The code is almost the same, but I found an INT 3 instruction after each exit() call.
104 | 
105 | ```
106 | xor ecx, ecx
107 | call QWORD PTR __imp_exit
108 | int 3
109 | ```
110 | 
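111 | INT 3 is a debugger breakpoint. exit() is known to be one of the functions that never return, so if it does "return", something really odd has happened and it is time to load the debugger.
112 | 
113 | ## 25.2 Floating-point values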
114 | 
117 | ```
118 | #include <stdio.h>
119 | #include <stdlib.h>
120 | int main()
121 | {
122 |     double celsius, fahr;
123 |     printf ("Enter temperature in Fahrenheit:\n");
124 |     if (scanf ("%lf", &fahr)!=1)
125 |     {
126 |         printf ("Error while parsing your input\n");
127 |         exit(0);
128 |     };
129 |     celsius = 5 * (fahr-32) / 9;
130 |     if (celsius<-273)
131 |     {
132 |         printf ("Error: incorrect temperature!\n");
133 |         exit(0);
134 |     };
135 |     printf ("Celsius: %lf\n", celsius);
136 | };
137 | ```
138 | 
139 | MSVC 2010 x86 uses FPU instructions...
140 | 
141 | Listing 25.2: MSVC 2010 x86 /Ox
142 | 
143 | ```
144 | $SG4038 DB ’Enter temperature in Fahrenheit:’, 0aH, 00H
145 | $SG4040 DB ’%lf’, 00H
146 | $SG4041 DB ’Error while parsing your input’, 0aH, 00H
147 | $SG4043 DB ’Error: incorrect temperature!’, 0aH, 00H
148 | $SG4044 DB ’Celsius: %lf’, 0aH, 00H
149 | __real@c071100000000000 DQ 0c071100000000000r ; -273
150 | __real@4022000000000000 DQ 04022000000000000r ; 9
151 | __real@4014000000000000 DQ 04014000000000000r ; 5
152 | __real@4040000000000000 DQ 04040000000000000r ; 32
153 | _fahr$ = -8 ; size = 8
154 | _main PROC
155 |     sub esp, 8
156 |     push esi
157 |     mov esi, DWORD PTR __imp__printf
158 |     push OFFSET $SG4038 ; ’Enter temperature in Fahrenheit:’
159 |     call esi ; call printf
160 |     lea eax, DWORD PTR _fahr$[esp+16]
161 |     push eax
162 |     push OFFSET $SG4040 ; ’%lf’
163 |     call DWORD PTR __imp__scanf
164 |     add esp, 12 ; 0000000cH
165 |     cmp eax, 1
166 |     je SHORT $LN2@main
167 |     push OFFSET $SG4041 ; ’Error while parsing your input’
168 |     call esi ; call printf
169 |     add esp, 4
170 |     push 0
171 |     call DWORD PTR __imp__exit
172 |     $LN2@main:
173 |     fld QWORD PTR _fahr$[esp+12]
174 |     fsub QWORD PTR __real@4040000000000000 ; 32
175 |     fmul QWORD PTR __real@4014000000000000 ; 5
176 |     fdiv QWORD PTR __real@4022000000000000 ; 9
177 |     fld QWORD PTR __real@c071100000000000 ; -273
178 |     fcomp ST(1)
179 |     fnstsw ax
180 |     test ah, 65 ; 00000041H
181 |     jne SHORT $LN1@main
182 |     push OFFSET $SG4043 ; ’Error: incorrect temperature!’
183 |     fstp ST(0)
184 |     call esi ; call printf
185 |     add esp, 4
186 |     push 0
187 |     call DWORD PTR __imp__exit
188 |     $LN1@main:
189 |     sub esp, 8
190 |     fstp QWORD PTR [esp]
191 |     push OFFSET $SG4044 ; ’Celsius: %lf’
192 |     call esi
193 |     add esp, 12 ; 0000000cH
194 |     ; return 0
195 |     xor eax, eax
196 |     pop esi
197 |     add esp, 8
198 |     ret 0
199 | $LN10@main:
200 | _main ENDP
201 | ```
202 | 
203 | ...but starting with version 2012, MSVC uses SIMD instructions instead:
204 | 
205 | Listing 25.3: MSVC 2012 x86 /Ox
206 | 
207 | ```
208 | $SG4228 DB ’Enter temperature in Fahrenheit:’, 0aH, 00H
209 | $SG4230 DB ’%lf’, 00H
210 | $SG4231 DB ’Error while parsing your input’, 0aH, 00H
211 | $SG4233 DB ’Error: incorrect temperature!’, 0aH, 00H
212 | $SG4234 DB ’Celsius: %lf’, 0aH, 00H
213 | __real@c071100000000000 DQ 0c071100000000000r ; -273
214 | __real@4040000000000000 DQ 04040000000000000r ; 32
215 | __real@4022000000000000 DQ 04022000000000000r ; 9
216 | __real@4014000000000000 DQ 04014000000000000r ; 5
217 | _fahr$ = -8 ; size = 8
218 | _main PROC
219 |     sub esp, 8
220 |     push esi
221 |     mov esi, DWORD PTR __imp__printf
222 |     push OFFSET $SG4228 ; ’Enter temperature in Fahrenheit:’
223 |     call esi ; call printf
224 |     lea eax, DWORD PTR _fahr$[esp+16]
225 |     push eax
226 |     push OFFSET $SG4230 ; ’%lf’
227 |     call DWORD PTR __imp__scanf
228 |     add esp, 12 ; 0000000cH
229 |     cmp eax, 1
230 |     je SHORT $LN2@main
231 |     push OFFSET $SG4231 ; ’Error while parsing your input’
232 |     call esi ; call printf
233 |     add esp, 4
234 |     push 0
235 |     call DWORD PTR __imp__exit
236 |     $LN9@main:
237 |     $LN2@main:
238 |     movsd xmm1, QWORD PTR _fahr$[esp+12]
239 |     subsd xmm1, QWORD PTR __real@4040000000000000 ; 32
240 |     movsd xmm0, QWORD PTR __real@c071100000000000 ; -273
241 |     mulsd xmm1, QWORD PTR __real@4014000000000000 ; 5
242 |     divsd xmm1, QWORD PTR __real@4022000000000000 ; 9
243 |     comisd xmm0, xmm1
244 |     jbe SHORT $LN1@main
245 |     push OFFSET $SG4233 ; ’Error: incorrect temperature!’
246 |     call esi ; call printf
247 |     add esp, 4
248 |     push 0
249 |     call DWORD PTR __imp__exit
250 |     $LN10@main:
251 |     $LN1@main:
252 |     sub esp, 8
253 |     movsd QWORD PTR [esp], xmm1
254 |     push OFFSET $SG4234 ; ’Celsius: %lf’
255 |     call esi ; call printf
256 |     add esp, 12 ; 0000000cH
257 |     ; return 0
258 |     xor eax, eax
259 |     pop esi
260 |     add esp, 8
261 |     ret 0
262 | $LN8@main:
263 | _main ENDP
264 | ```
265 | 
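266 | Of course, SIMD instructions are available in x86 mode too, including those that work with floating-point numbers. They are somewhat easier to use for calculations, so the newer Microsoft compiler uses them. We may also notice that the -273 value is loaded into the XMM0 register quite early. That is fine, because the compiler may emit instructions in a different order than they appear in the source code.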


--------------------------------------------------------------------------------
/Chapter-26/Chapter-26.md:
--------------------------------------------------------------------------------
  1 | # C99 restrict
  2 | 
  3 | Here is an example of why, in some situations, FORTRAN programs run faster than C/C++ ones.
  4 | 
  5 | ```
  6 | void f1 (int* x, int* y, int* sum, int* product, int* sum_product, int* update_me, size_t s)
  7 | {
  8 |     for (int i=0; i<s; i++)
  9 |     {
 10 |         sum[i]=x[i]+y[i];
 11 |         product[i]=x[i]*y[i];
 12 |         update_me[i]=i*123; // some dummy value
 13 |         sum_product[i]=sum[i]+product[i];
 14 |     };
 15 | };
 16 | ```
 17 | 
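 18 | This is a very simple example with one specific thing in it: the pointer to the update_me array could also point to the sum array, the product array, or even the sum_product array. After all, there is nothing criminal in that, right? The compiler is fully aware of this, so it generates code with four stages in the loop body:
    | 
    | 1. calculate the next sum[i];
    | 2. calculate the next product[i];
    | 3. calculate the next update_me[i];
    | 4. calculate the next sum_product[i]; at this stage, the already calculated sum[i] and product[i] have to be loaded from memory again.
 19 | 
 20 | Can the last stage be optimized? Since sum[i] and product[i] have already been calculated, there should be no need to load them from memory again. Yes, but the compiler cannot be sure that nothing was overwritten at the third stage! This is called "pointer aliasing": a situation in which the compiler cannot be sure that the memory a pointer points to has not been changed.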
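
The effect of aliasing can be observed directly. The sketch below repeats the loop of f1() under the hypothetical name `f1_copy` (with a `size_t` index) so that it compiles on its own; passing the same array as both `sum` and `update_me` is legal here because no pointer is marked restrict.

```c
#include <stddef.h>

/* Same loop body as f1() in this chapter; only the index type differs. */
void f1_copy(int *x, int *y, int *sum, int *product,
             int *sum_product, int *update_me, size_t s)
{
    for (size_t i = 0; i < s; i++) {
        sum[i] = x[i] + y[i];
        product[i] = x[i] * y[i];
        update_me[i] = (int)i * 123;   /* may overwrite sum[i] or product[i] */
        sum_product[i] = sum[i] + product[i];   /* so both must be re-read */
    }
}
```

When called with `update_me` pointing to the same array as `sum`, `sum_product[i]` ends up as `i*123 + x[i]*y[i]` rather than `(x[i]+y[i]) + x[i]*y[i]`, which is exactly why the compiler has to reload the values in the fourth stage.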
 21 | 
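 22 | The restrict keyword of the C99 standard addresses this problem. It is a promise given by the programmer to the compiler that the function arguments marked with this keyword always point to different memory locations and never overlap. To be more precise, restrict states that the object is accessed only through this pointer, and that no other pointer is used for it. In other words, an object is accessed via one single pointer only if that pointer is marked as restrict. Let's add this keyword to each pointer argument: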
 23 | 
 24 | ```
 25 | void f2 (int* restrict x, int* restrict y, int* restrict sum,
 26 |          int* restrict product, int* restrict sum_product,
 27 |          int* restrict update_me, size_t s)
 28 | {
 29 |     for (int i=0; i<s; i++)
 30 |     {
 31 |         sum[i]=x[i]+y[i];
 32 |         product[i]=x[i]*y[i];
 33 |         update_me[i]=i*123; // some dummy value
 34 |         sum_product[i]=sum[i]+product[i];
 35 |     };
 36 | };
 37 | ```
 38 | 
 39 | Let's see the results:
 40 | 
 41 | Listing 26.1: GCC x64: f1()
 42 | 
 43 | ```
 44 | f1:
 45 |     push r15 r14 r13 r12 rbp rdi rsi rbx
 46 |     mov r13, QWORD PTR 120[rsp]
 47 |     mov rbp, QWORD PTR 104[rsp]
 48 |     mov r12, QWORD PTR 112[rsp]
 49 |     test r13, r13
 50 |     je .L1
 51 |     add r13, 1
 52 |     xor ebx, ebx
 53 |     mov edi, 1
 54 |     xor r11d, r11d
 55 |     jmp .L4
 56 |     .L6:
 57 |     mov r11, rdi
 58 |     mov rdi, rax
 59 |     .L4:
 60 |     lea rax, 0[0+r11*4]
 61 |     lea r10, [rcx+rax]
 62 |     lea r14, [rdx+rax]
 63 |     lea rsi, [r8+rax]
 64 |     add rax, r9
 65 |     mov r15d, DWORD PTR [r10]
 66 |     add r15d, DWORD PTR [r14]
 67 |     mov DWORD PTR [rsi], r15d ; store to sum[]
 68 |     mov r10d, DWORD PTR [r10]
 69 |     imul r10d, DWORD PTR [r14]
 70 |     mov DWORD PTR [rax], r10d ; store to product[]
 71 |     mov DWORD PTR [r12+r11*4], ebx ; store to update_me[]
 72 |     add ebx, 123
 73 |     mov r10d, DWORD PTR [rsi] ; reload sum[i]
 74 |     add r10d, DWORD PTR [rax] ; reload product[i]
 75 |     lea rax, 1[rdi]
 76 |     cmp rax, r13
 77 |     mov DWORD PTR 0[rbp+r11*4], r10d ; store to sum_product[]
 78 |     jne .L6
 79 |     .L1:
 80 |     pop rbx rsi rdi rbp r12 r13 r14 r15
 81 |     ret
 82 | ```
 83 | 
 84 | Listing 26.2: GCC x64: f2()
 85 | 
 86 | ```
 87 | f2:
 88 |     push r13 r12 rbp rdi rsi rbx
 89 |     mov r13, QWORD PTR 104[rsp]
 90 |     mov rbp, QWORD PTR 88[rsp]
 91 |     mov r12, QWORD PTR 96[rsp]
 92 |     test r13, r13
 93 |     je .L7
 94 |     add r13, 1
 95 |     xor r10d, r10d
 96 |     mov edi, 1
 97 |     xor eax, eax
 98 |     jmp .L10
 99 |     .L11:
100 |     mov rax, rdi
101 |     mov rdi, r11
102 |     .L10:
103 |     mov esi, DWORD PTR [rcx+rax*4]
104 |     mov r11d, DWORD PTR [rdx+rax*4]
105 |     mov DWORD PTR [r12+rax*4], r10d ; store to update_me[]
106 |     add r10d, 123
107 |     lea ebx, [rsi+r11]
108 |     imul r11d, esi
109 |     mov DWORD PTR [r8+rax*4], ebx ; store to sum[]
110 |     mov DWORD PTR [r9+rax*4], r11d ; store to product[]
111 |     add r11d, ebx
112 |     mov DWORD PTR 0[rbp+rax*4], r11d ; store to sum_product[]
113 |     lea r11, 1[rdi]
114 |     cmp r11, r13
115 |     jne .L11
116 |     .L7:
117 |     pop rbx rsi rdi rbp r12 r13
118 |     ret
119 | ```
120 | 
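121 | The difference between the compiled f1() and f2() is as follows: in f1(), sum[i] and product[i] are reloaded in the middle of the loop, while in f2() there is no such thing. The already calculated values are used, since we "promised" the compiler that nothing would change the values of sum[i] and product[i] during the execution of the loop body, so it is "sure" that the values do not need to be loaded from memory again. Obviously, the second example works faster. But what if the pointers in the function arguments do overlap somehow? That is on the programmer's conscience, and the results will be incorrect. Let's go back to FORTRAN. Compilers of this language treat all pointers as if they were restrict, so in the days when it was not possible to specify restrict in C, FORTRAN could generate faster code in such cases.
122 | 
123 | How practical is this? It matters when a function works with several big blocks in memory, for example in linear algebra on supercomputers. Perhaps that is why FORTRAN is still used in this area. But when the number of iterations is not very large, the speed gain is not significant.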


--------------------------------------------------------------------------------
/Chapter-27/Chapter-27.md:
--------------------------------------------------------------------------------
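  1 | # Inline functions
  2 | 
  3 | Inlined code is when the compiler, instead of emitting a call to a small function, places the function's body directly at the call site.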
  4 | 
  5 | ```
  6 | #include <stdio.h>
  7 | int celsius_to_fahrenheit (int celsius)
  8 | {
  9 |     return celsius * 9 / 5 + 32;
 10 | };
 11 | int main(int argc, char *argv[])
 12 | {
 13 |     int celsius=atol(argv[1]);
 14 |     printf ("%d\n", celsius_to_fahrenheit (celsius));
 15 | };
 16 | ```
 17 | 
 18 | Without optimization this compiles in a predictable way, but if we turn on GCC optimization (-O3), we see:
 19 | 
 20 | Listing 27.2: GCC 4.8.1 -O3
 21 | 
 22 | ```
 23 | _main:
 24 |     push ebp
 25 |     mov ebp, esp
 26 |     and esp, -16
 27 |     sub esp, 16
 28 |     call ___main
 29 |     mov eax, DWORD PTR [ebp+12]
 30 |     mov eax, DWORD PTR [eax+4]
 31 |     mov DWORD PTR [esp], eax
 32 |     call _atol
 33 |     mov edx, 1717986919
 34 |     mov DWORD PTR [esp], OFFSET FLAT:LC2 ; "%d\12\0"
 35 |     lea ecx, [eax+eax*8]
 36 |     mov eax, ecx
 37 |     imul edx
 38 |     sar ecx, 31
 39 |     sar edx
 40 |     sub edx, ecx
 41 |     add edx, 32
 42 |     mov DWORD PTR [esp+4], edx
 43 |     call _printf
 44 |     leave
 45 |     ret
 46 | ```
 47 | 
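 48 | The division here is done via multiplication. And yes, our small function was simply placed right before the printf() call. Why? Because this can be faster than executing the function's code plus the call/return overhead. In the past, such a function had to be marked with the "inline" keyword in its declaration; modern compilers pick such functions automatically. Another very common automatic optimization is inlining of string functions such as strcpy(), strcmp(), etc.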
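
The arithmetic in the listing can be checked outside of assembly. The following Java sketch (class and method names are hypothetical) mirrors the instruction sequence step by step: 1717986919 is 0x66666667, and taking the high 32 bits of the product, shifting right by one and correcting for the sign gives the same result as a truncating division by 5.

```java
class DivideByMultiply {
    // The straightforward formula from the C source.
    static int direct(int celsius) {
        return celsius * 9 / 5 + 32;
    }

    // The same computation, following the optimized instruction sequence.
    static int viaMultiplication(int celsius) {
        int n = celsius * 9;                                // lea ecx, [eax+eax*8]
        int high = (int) (((long) n * 1717986919L) >> 32);  // imul edx: high half ends up in EDX
        int quotient = (high >> 1) - (n >> 31);             // sar edx / sar ecx, 31 / sub edx, ecx
        return quotient + 32;                               // add edx, 32
    }

    public static void main(String[] args) {
        for (int c : new int[]{-40, -1, 0, 5, 37, 100}) {
            System.out.println(c + " -> " + direct(c) + " / " + viaMultiplication(c));
        }
    }
}
```

Both methods print the same value for every input in the loop.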
 49 | 
 50 | Listing 27.3: Another simple example
 51 | 
 52 | ```
 53 | bool is_bool (char *s)
 54 | {
 55 |     if (strcmp (s, "true")==0)
 56 |         return true;
 57 |     if (strcmp (s, "false")==0)
 58 |         return false;
 59 |     assert(0);
 60 | };
 61 | ```
 62 | 
 63 | Listing 27.4: GCC 4.8.1 -O3
 64 | 
 65 | ```
 66 | _is_bool:
 67 |     push edi
 68 |     mov ecx, 5
 69 |     push esi
 70 |     mov edi, OFFSET FLAT:LC0 ; "true\0"
 71 |     sub esp, 20
 72 |     mov esi, DWORD PTR [esp+32]
 73 |     repz cmpsb
 74 |     je L3
 75 |     mov esi, DWORD PTR [esp+32]
 76 |     mov ecx, 6
 77 |     mov edi, OFFSET FLAT:LC1 ; "false\0"
 78 |     repz cmpsb
 79 |     seta cl
 80 |     setb dl
 81 |     xor eax, eax
 82 |     cmp cl, dl
 83 |     jne L8
 84 |     add esp, 20
 85 |     pop esi
 86 |     pop edi
 87 |     ret
 88 | ```
 89 | 
 90 | Here is a very frequently seen piece of inlined strcmp() code generated by MSVC:
 91 | 
 92 | Listing 27.5: MSVC
 93 | 
 94 | ```
 95 |     mov dl, [eax]
 96 |     cmp dl, [ecx]
 97 |     jnz short loc_10027FA0
 98 |     test dl, dl
 99 |     jz short loc_10027F9C
100 |     mov dl, [eax+1]
101 |     cmp dl, [ecx+1]
102 |     jnz short loc_10027FA0
103 |     add eax, 2
104 |     add ecx, 2
105 |     test dl, dl
106 |     jnz short loc_10027F80
107 |     loc_10027F9C: ; CODE XREF: f1+448
108 |     xor eax, eax
109 |     jmp short loc_10027FA5
110 | ; ---------------------------------------------------------------------------
111 |     loc_10027FA0: ; CODE XREF: f1+444
112 | ; f1+450
113 |     sbb eax, eax
114 |     sbb eax, 0FFFFFFFFh
115 | ```
116 | 
117 | I wrote a small IDA script for finding and folding such frequently seen pieces of inline code: [IDA_scripts](https://github.com/yurichev/IDA_scripts).
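
To make the MSVC pattern easier to recognize, here is a rough Java rendering of the same loop. It is a sketch under stated assumptions: the names are hypothetical, and strings are modeled as NUL-terminated byte arrays to imitate C strings. Two bytes are compared per iteration, and the final sbb pair is modeled as returning -1 or 1 from an unsigned comparison.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class InlineStrcmp {
    // Appends the terminating zero byte that C strings carry.
    static byte[] cString(String s) {
        byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
        return Arrays.copyOf(raw, raw.length + 1);
    }

    // -1 or 1 depending on the unsigned comparison, like sbb eax, eax / sbb eax, -1.
    static int differ(int x, int y) {
        return x < y ? -1 : 1;
    }

    static int compare(byte[] a, byte[] b) {
        int i = 0;
        while (true) {
            int c = a[i] & 0xFF;                        // mov dl, [eax]
            if (c != (b[i] & 0xFF)) return differ(c, b[i] & 0xFF);
            if (c == 0) return 0;                       // test dl, dl / jz
            c = a[i + 1] & 0xFF;                        // mov dl, [eax+1]
            if (c != (b[i + 1] & 0xFF)) return differ(c, b[i + 1] & 0xFF);
            i += 2;                                     // add eax, 2 / add ecx, 2
            if (c == 0) return 0;                       // test dl, dl / jnz loop
        }
    }

    public static void main(String[] args) {
        System.out.println(compare(cString("true"), cString("true")));
        System.out.println(compare(cString("true"), cString("trve")));
    }
}
```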


--------------------------------------------------------------------------------
/Chapter-29/Chapter-29.md:
--------------------------------------------------------------------------------
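  1 | # Obfuscation
  2 | 
  3 | Obfuscation is an attempt to hide code (or its meaning) from a reverse engineer.
  4 | 
  5 | ## 29.1 Text strings
  6 | 
  7 | Text strings can be very helpful to a reverse engineer. A programmer who knows this may try to hide them so that they cannot be found in IDA or in any hex editor. Here is the simplest method, in which the string is constructed piece by piece: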
  8 | 
  9 | ```
 10 | mov byte ptr [ebx], 'h'
 11 | mov byte ptr [ebx+1], 'e'
 12 | mov byte ptr [ebx+2], 'l'
 13 | mov byte ptr [ebx+3], 'l'
 14 | mov byte ptr [ebx+4], 'o'
 15 | mov byte ptr [ebx+5], ' '
 16 | mov byte ptr [ebx+6], 'w'
 17 | mov byte ptr [ebx+7], 'o'
 18 | mov byte ptr [ebx+8], 'r'
 19 | mov byte ptr [ebx+9], 'l'
 20 | mov byte ptr [ebx+10], 'd'
 21 | ```
 22 | 
 23 | A string can also be compared with another one like this:
 24 | 
 25 | ```
 26 | mov ebx, offset username
 27 | cmp byte ptr [ebx], 'j'
 28 | jnz fail
 29 | cmp byte ptr [ebx+1], 'o'
 30 | jnz fail
 31 | cmp byte ptr [ebx+2], 'h'
 32 | jnz fail
 33 | cmp byte ptr [ebx+3], 'n'
 34 | jnz fail
 35 | jz it_is_john
 36 | ```
 37 | 
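 38 | In both cases it is impossible to find these strings directly in a hex editor.
 39 | 
 40 | Incidentally, this is also a way to work with strings when they cannot be placed in the data segment, for example in PIC or in shellcode.
 41 | 
 42 | Another method I have seen is to construct the string with sprintf():
 43 | 
 44 | `sprintf(buf, "%s%c%s%c%s", "hel",'l',"o w",'o',"rld");`
 45 | 
 46 | The code looks weird, but it can be useful as a very simple anti-reversing measure. Text strings may also be stored in encrypted form, in which case every string has to be decrypted before it is used.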
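
The three ideas above (piecewise construction, formatting from fragments, decrypting before use) can be sketched as follows. This is an illustration only: the class, the method names and the single-byte XOR key are assumptions made for the example, not a scheme taken from any real program.

```java
import java.nio.charset.StandardCharsets;

class HiddenStrings {
    // Built character by character, like the series of mov byte ptr instructions.
    static String buildPiecewise() {
        StringBuilder sb = new StringBuilder();
        sb.append('h').append('e').append('l').append('l').append('o');
        sb.append(' ');
        sb.append('w').append('o').append('r').append('l').append('d');
        return sb.toString();
    }

    // Assembled from fragments, like the sprintf() call.
    static String viaFormat() {
        return String.format("%s%c%s%c%s", "hel", 'l', "o w", 'o', "rld");
    }

    // Hypothetical single-byte XOR "encryption": applying it twice restores the data.
    static byte[] xor(byte[] data, int key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    static String decrypt(byte[] encrypted, int key) {
        return new String(xor(encrypted, key), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] stored = xor("hello world".getBytes(StandardCharsets.US_ASCII), 0x5A);
        System.out.println(buildPiecewise());
        System.out.println(viaFormat());
        System.out.println(decrypt(stored, 0x5A));
    }
}
```

A real program would store only the encrypted bytes; the sketch encrypts at run time just to stay self-contained.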
 47 | 
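 48 | ## 29.2 Executable code
 49 | 
 50 | ### 29.2.1 Inserting garbage
 51 | 
 52 | Obfuscating executable code means inserting garbage instructions among the real ones while keeping the original program's behavior intact.
 53 | 
 54 | A simple example: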
 55 | 
 56 | ```
 57 | add eax, ebx
 58 | mul ecx
 59 | ```
 60 | 
 61 | Listing 29.1: Obfuscated code
 62 | 
 63 | ```
 64 | xor esi, 011223344h ; garbage
 65 | add esi, eax ; garbage
 66 | add eax, ebx
 67 | mov edx, eax ; garbage
 68 | shl edx, 4 ; garbage
 69 | mul ecx
 70 | xor esi, ecx ; garbage
 71 | ```
 72 | 
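 73 | The garbage code here uses registers that the real code does not read (ESI and EDX). Either way, once the garbage is inserted the original assembly becomes harder to read, and therefore harder to reverse engineer.
 74 | 
 75 | ### 29.2.2 Replacing instructions with equivalent sequences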
 76 | 
 77 | ```
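 78 | mov op1, op2 can be replaced by the pair push op2 / pop op1.
 79 | jmp label can be replaced by the pair push label / ret; IDA will then show no references to the label.
 80 | call label can be replaced by the triplet push label_after_call_instruction / push label / ret.
 81 | push op can be replaced by the pair sub esp, 4 (or 8) / mov [esp], op.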
 82 | ```
 83 | 
 84 | ### 29.2.3 Always executed and never executed code
 85 | 
 86 | If the developer is sure that ESI is always 0 at a given point:
 87 | 
 88 | ```
 89 |     mov esi, 1
 90 |     ... ; some code not touching ESI
 91 |     dec esi
 92 |     ... ; some code not touching ESI
 93 |     cmp esi, 0
 94 |     jz real_code
 95 |     ; fake luggage
 96 | real_code:
 97 | ```
 98 | 
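 99 | A reverse engineer will need some time to figure out that execution always reaches real_code. This is also called an opaque predicate. Another example (again, assuming ESI is always zero):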
100 | 
101 | ```
102 | add eax, ebx ; real code
103 | mul ecx ; real code
104 | add eax, esi ; opaque predicate. XOR, AND or SHL, etc, can be here instead of ADD.
105 | ```
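
The same two tricks can be written in a high-level language. In this hypothetical Java sketch the variable k plays the role of ESI: it is always zero where it is tested, so the branch is never taken, and adding it to the result changes nothing.

```java
class OpaquePredicate {
    static int compute(int a, int b, int c) {
        int k = 1;
        k--;                    // k is now always 0, though that is not obvious at a glance
        if (k != 0) {
            return -1;          // fake luggage: never executed
        }
        int r = (a + b) * c;    // real code
        r += k;                 // looks meaningful, but adds zero
        return r;
    }

    public static void main(String[] args) {
        System.out.println(compute(2, 3, 4));
    }
}
```

An optimizing compiler may simply fold such code away, which is why real obfuscators work on the emitted instructions or use values the compiler cannot prove constant.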
106 | 
107 | ### 29.2.4 Shuffling the execution flow
108 | 
109 | For example, these three instructions:
110 | 
111 | ```
112 | instruction 1
113 | instruction 2
114 | instruction 3
115 | ```
116 | 
117 | can be replaced with:
118 | 
119 | ```
120 | begin: 
121 |     jmp ins1_label
122 | ins2_label: 
123 |     instruction 2
124 |     jmp ins3_label
125 | ins3_label: 
126 |     instruction 3
127 |     jmp exit
128 | ins1_label: 
129 |     instruction 1
130 |     jmp ins2_label
131 | exit:
132 | ```
133 | 
134 | ### 29.2.5 Using indirect pointers
135 | 
136 | ```
137 | dummy_data1 db 100h dup (0)
138 | message1 db 'hello world',0
139 | 
140 | dummy_data2 db 200h dup (0)
141 | message2 db 'another message',0
142 | 
143 | func proc
144 |     ...
145 |     mov eax, offset dummy_data1 ; PE or ELF reloc here
146 |     add eax, 100h
147 |     push eax
148 |     call dump_string
149 |     ...
150 |     mov eax, offset dummy_data2 ; PE or ELF reloc here
151 |     add eax, 200h
152 |     push eax
153 |     call dump_string
154 |     ...
155 | func endp
156 | ```
157 | 
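158 | IDA will show references only to dummy_data1 and dummy_data2, not to the text strings. Global variables and even functions can be accessed this way to obscure the code.
159 | 
160 | ## 29.3 Virtual machine / pseudo-code
161 | 
162 | A programmer may design their own PL or ISA and write an interpreter for it (as in Visual Basic before 5.0, .NET, or the Java machine). The reverse engineer then has to spend time understanding the meaning and details of every ISA instruction, and may even need to write a disassembler or decompiler of some sort.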
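
To give a feeling for what such an interpreter looks like, here is a deliberately tiny stack machine. Everything in it is hypothetical: the opcode numbering and the four instructions were invented for this sketch and do not correspond to any real protected program.

```java
class TinyVm {
    // Hypothetical opcodes of an invented ISA.
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0;   // stack pointer
        int pc = 0;   // program counter
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case PUSH:
                    stack[sp++] = code[pc++];   // operand follows the opcode
                    break;
                case ADD: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // (2 + 3) * 4 expressed in the invented bytecode.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program));
    }
}
```

Someone looking at the binary sees only the interpreter loop and an opaque array of numbers; the real logic lives in that array.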
163 | 
164 | ## 29.4 Other
165 | 
166 | I wrote a patch for TCC (Tiny C compiler) that makes it generate obfuscated code: http://blog.yurichev.com/node/58.


--------------------------------------------------------------------------------
/Chapter-31/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-31/1.png


--------------------------------------------------------------------------------
/Chapter-31/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-31/2.png


--------------------------------------------------------------------------------
/Chapter-31/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-31/3.png


--------------------------------------------------------------------------------
/Chapter-31/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-31/4.png


--------------------------------------------------------------------------------
/Chapter-31/5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-31/5.png


--------------------------------------------------------------------------------
/Chapter-31/6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-31/6.png


--------------------------------------------------------------------------------
/Chapter-31/7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-31/7.png


--------------------------------------------------------------------------------
/Chapter-32/Chapter-32.md:
--------------------------------------------------------------------------------
 1 | # ostream
 2 | 
 3 | Let's continue with a hello world example, this time using ostream:
 4 | 
 5 | ```
 6 | #include <iostream>
 7 | int main()
 8 | {
 9 |     std::cout << "Hello, world!\n";
10 | }
11 | ```
12 | 
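13 | Almost every C++ book mentions that operator << can be overloaded for many data types; that is what ostream does. In the disassembly we can see operator<< for ostream being called: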
14 | 
15 | ```
16 | $SG37112 DB 'Hello, world!', 0aH, 00H
17 | _main PROC
18 | push OFFSET $SG37112
19 | push OFFSET ?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::cout
20 | call ??$?6U?$char_traits@D@std@@@std@@YAAAV?$basic_ostream@DU?$char_traits@D@std@@@0@AAV10@PBD@Z ; std::operator<<<std::char_traits<char> >
22 | add esp, 8
23 | xor eax, eax
24 | ret 0
25 | _main ENDP
26 | ```
27 | 
28 | Let's modify the example:
29 | 
30 | ```
31 | #include <iostream>
32 | int main()
33 | {
34 |     std::cout << "Hello, " << "world!\n";
35 | }
36 | ```
37 | 
38 | Likewise, many C++ books explain that the result of each ostream output operation can be used for the next one, since operator<< returns the ostream object:
39 | 
40 | ```
41 | $SG37112 DB 'world!', 0aH, 00H
42 | $SG37113 DB 'Hello, ', 00H
43 | _main PROC
44 | push OFFSET $SG37113 ; 'Hello, '
45 | push OFFSET ?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::cout
46 | call ??$?6U?$char_traits@D@std@@@std@@YAAAV?$basic_ostream@DU?$char_traits@D@std@@@0@AAV10@PBD@Z ; std::operator<<<std::char_traits<char> >
48 | add esp, 8
49 | push OFFSET $SG37112 ; 'world!'
50 | push eax ; result of previous function
51 | call ??$?6U?$char_traits@D@std@@@std@@YAAAV?$basic_ostream@DU?$char_traits@D@std@@@0@AAV10@PBD@Z ; std::operator<<<std::char_traits<char> >
53 | add esp, 8
54 | xor eax, eax
55 | ret 0
56 | _main ENDP
57 | ```
58 | 
59 | If we replace operator<< with a function f(), the code is equivalent to:
60 | 
61 | `f(f(std::cout, "Hello, "), "world!")`
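
The same shape is easy to try out. In this Java sketch, f() is a hypothetical stand-in for operator<<, with a StringBuilder playing the role of the stream: because f() returns its first argument, the result of the inner call becomes the target of the outer one, just as EAX is pushed for the second call in the listing.

```java
class Chaining {
    // Writes text to out and returns out, so calls can be nested.
    static StringBuilder f(StringBuilder out, String text) {
        out.append(text);
        return out;
    }

    public static void main(String[] args) {
        StringBuilder cout = new StringBuilder();
        f(f(cout, "Hello, "), "world!\n");
        System.out.print(cout);
    }
}
```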
62 | 
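63 | GCC generates almost the same code as MSVC.
64 | 
65 | References: in C++, references are pointers as well, but they are considered safer because it is harder to make a mistake with them. For example, a reference must always point to an object of the corresponding type and cannot be NULL. Moreover, a reference cannot be changed: it is impossible to reseat it so that it refers to another object. If we take the pointers example (9) and use references instead of pointers: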
66 | 
67 | ```
68 | void f2 (int x, int y, int & sum, int & product)
69 | {
70 |     sum=x+y;
71 |     product=x*y;
72 | };
73 | ```
74 | 
75 | As expected, the compiled code is the same as in the pointer version:
76 | 
77 | ```
78 | _x$ = 8 ; size = 4
79 | _y$ = 12 ; size = 4
80 | _sum$ = 16 ; size = 4
81 | _product$ = 20 ; size = 4
82 | ?f2@@YAXHHAAH0@Z PROC ; f2
83 | mov ecx, DWORD PTR _y$[esp-4]
84 | mov eax, DWORD PTR _x$[esp-4]
85 | lea edx, DWORD PTR [eax+ecx]
86 | imul eax, ecx
87 | mov ecx, DWORD PTR _product$[esp-4]
88 | push esi
89 | mov esi, DWORD PTR _sum$[esp]
90 | mov DWORD PTR [esi], edx
91 | mov DWORD PTR [ecx], eax
92 | pop esi
93 | ret 0
94 | ?f2@@YAXHHAAH0@Z ENDP ; f2
95 | ```


--------------------------------------------------------------------------------
/Chapter-54/54.10位.md:
--------------------------------------------------------------------------------
 1 | 54.10 Bit operations
 2 | 
 3 | All bitwise operations work just as in any other ISA (instruction set architecture):
 4 | 
 5 |     public static int set (int a, int b)
 6 |     {
 7 |     return a | 1<<b;
 8 |     }
 9 |     public static int clear (int a, int b)
10 |     {
11 |     return a & (~(1<<b));
12 |     }
13 |     public static int set(int, int);
14 |     flags: ACC_PUBLIC, ACC_STATIC
15 |     Code:
16 |     stack=3, locals=2, args_size=2
17 |     0: iload_0
18 |     1: iconst_1
19 |     2: iload_1
20 |     3: ishl
21 |     4: ior
22 |     5: ireturn
23 |     public static int clear(int, int);
24 |     flags: ACC_PUBLIC, ACC_STATIC
25 |     Code:
26 |     stack=3, locals=2, args_size=2
27 |     0: iload_0
28 |     1: iconst_1
29 |     2: iload_1
30 |     3: ishl
31 |     4: iconst_m1
32 |     5: ixor
33 |     6: iand
34 |     7: ireturn
35 | 
36 | 
38 | iconst_m1 pushes -1 onto the stack, which is the same as 0xFFFFFFFF. XORing a value with 0xFFFFFFFF inverts all of its bits (A.6.2).
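
The equivalence the compiler relies on is easy to confirm. The sketch below (hypothetical class name) writes clear() the way the bytecode computes it, with an explicit XOR against -1 in place of the ~ operator.

```java
class InvertBits {
    // Same as a & (~(1 << b)), spelled the way the bytecode does it.
    static int clear(int a, int b) {
        return a & ((1 << b) ^ -1);   // ishl, iconst_m1, ixor, iand
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(0x1234 ^ -1));
        System.out.println(Integer.toHexString(~0x1234));
        System.out.println(Integer.toHexString(clear(0xFF, 3)));
    }
}
```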
39 | 
40 | Let's extend all data types to 64-bit long:
41 | 
42 |     public static long lset (long a, int b)
43 |     {
44 |     return a | 1<<b;
45 |     }
46 |     public static long lclear (long a, int b)
47 |     {
48 |     return a & (~(1<<b));
49 |     }
50 |     public static long lset(long, int);
51 |     flags: ACC_PUBLIC, ACC_STATIC
52 |     Code:
53 |     stack=4, locals=3, args_size=2
54 |     0: lload_0
55 |     1: iconst_1
56 |     2: iload_2
57 |     3: ishl
58 |     4: i2l
59 |     5: lor
60 |     6: lreturn
61 |     public static long lclear(long, int);
62 |     flags: ACC_PUBLIC, ACC_STATIC
63 |     Code:
64 |     stack=4, locals=3, args_size=2
65 |     0: lload_0
66 |     1: iconst_1
67 |     2: iload_2
68 |     3: ishl
69 |     4: iconst_m1
70 |     5: ixor
71 |     6: i2l
72 |     7: land
73 |     8: lreturn
74 |     
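75 | The code is the same, but instructions with the l prefix are used, which operate on 64-bit values. The second argument of each function is still an int, and where its 32-bit value has to be promoted to 64 bits, the i2l instruction is used, which essentially extends an int to a long.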
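
One consequence of the ishl followed by i2l sequence is worth spelling out: the shift itself is still a 32-bit operation, and only its result is widened. The sketch below (hypothetical names) compares lset() as written in the listing with a variant that uses a long constant, so that the shift happens in 64 bits.

```java
class LongShift {
    // As in the listing: 1 << b is an int shift (ishl), widened afterwards (i2l).
    static long lset(long a, int b) {
        return a | 1 << b;
    }

    // Variant with a long constant: the shift is done in 64 bits (lshl).
    static long lsetWide(long a, int b) {
        return a | 1L << b;
    }

    public static void main(String[] args) {
        System.out.println(lset(0L, 5) + " " + lsetWide(0L, 5));
        System.out.println(lset(0L, 40) + " " + lsetWide(0L, 40));
    }
}
```

For bit numbers below 31 the two agree. From 31 upwards they differ, because an int shift uses only the low five bits of the shift count and the int result is sign-extended when widened.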
77 | 
79 | 


--------------------------------------------------------------------------------
/Chapter-54/54.11循环.md:
--------------------------------------------------------------------------------
  1 | 54.11 Loops
  2 | 
  3 |     
  4 |     public class Loop
  5 |     {
  6 |     public static void main(String[] args)
  7 |     {
  8 |     for (int i = 1; i <= 10; i++)
  9 |     {
 10 |     System.out.println(i);
 11 |     }
 12 |     }
 13 |     }
 14 |     public static void main(java.lang.String[]);
 15 |     flags: ACC_PUBLIC, ACC_STATIC
 16 |     Code:
 17 |     stack=2, locals=2, args_size=1
 18 |     0: iconst_1
 19 |     1: istore_1
 20 |     2: iload_1
 21 |     3: bipush 10
 22 |     5: if_icmpgt 21
 23 |     8: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
 25 |     11: iload_1
 26 |     12: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
 28 |     15: iinc 1, 1
 29 |     18: goto 2
 30 |     21: return
 31 | 
 32 | 
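 33 | iconst_1 pushes 1 onto TOS, and istore_1 stores it in slot 1 of the LVA. Why not slot 0? Because main() has one argument (an array of String), and the reference to it occupies slot 0.
 34 | 
 35 | So the local variable i always lives in slot 1.
 36 | The instructions at offsets 2, 3 and 5 load i, push 10, and compare the two. If i is greater, control passes to offset 21, where the function ends. Otherwise println is called: i is reloaded at offset 11 to be passed to println.
 37 | 
 38 | By the way, we call the println method for an integer, and the comment shows this as "(I)V": I means integer and V means the return type is void.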
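
Descriptor strings like "(I)V" can be produced from Java itself, which is a handy way to get used to the notation. The sketch below uses `java.lang.invoke.MethodType` from the standard library; the wrapper class and method name are hypothetical.

```java
import java.lang.invoke.MethodType;

class Descriptors {
    // Builds the JVM descriptor for a method with the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(void.class, int.class));      // println(int)
        System.out.println(descriptor(void.class, String.class));   // println(String)
        System.out.println(descriptor(char.class, int.class));      // charAt(int)
    }
}
```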
 39 | 
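 40 | When println finishes, i is incremented at offset 15. The first operand of iinc is the slot number (1), the second is the number (1) to add to the variable.
 41 | 
 42 | goto is just a jump: it goes back to the start of the loop body at offset 2.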
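
Read back as source, the bytecode has the following shape. This is a hand-made sketch rather than decompiler output: the class name is hypothetical, and a list stands in for System.out so that the result can be inspected. Each statement is annotated with the offsets it corresponds to.

```java
import java.util.ArrayList;
import java.util.List;

class LoopSteps {
    static List<Integer> run() {
        List<Integer> printed = new ArrayList<>();
        int i = 1;                    // 0: iconst_1, 1: istore_1
        while (true) {
            if (i > 10) {             // 2: iload_1, 3: bipush 10, 5: if_icmpgt 21
                break;
            }
            printed.add(i);           // 8..12: stands in for println(i)
            i += 1;                   // 15: iinc 1, 1
        }                             // 18: goto 2
        return printed;               // 21: return
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```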
 43 | 
 45 | 
 46 | Let's move on to a more complex example:
 47 | 
 48 |     public class Fibonacci
 49 |     {
 50 |     public static void main(String[] args)
 51 |     {
 52 |     int limit = 20, f = 0, g = 1;
 53 |     for (int i = 1; i <= limit; i++)
 54 |     {
 55 |     f = f + g;
 56 |     g = f - g;
 57 |     System.out.println(f);
 58 |     }
 59 |     }
 60 |     }
 61 |     public static void main(java.lang.String[]);
 62 |     flags: ACC_PUBLIC, ACC_STATIC
 63 |     Code:
 64 |     stack=2, locals=5, args_size=1
 65 |     0: bipush 20
 66 |     2: istore_1
 67 |     3: iconst_0
 68 |     4: istore_2
 69 |     5: iconst_1
 70 |     6: istore_3
 71 |     7: iconst_1
 72 |     8: istore 4
 73 |     10: iload 4
 74 |     12: iload_1
 75 |     13: if_icmpgt 37
 76 |     16: iload_2
 77 |     17: iload_3
 78 |     18: iadd
 79 |     19: istore_2
 80 |     20: iload_2
 81 |     21: iload_3
 82 |     22: isub
 83 |     23: istore_3
 84 |     24: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
 86 |     27: iload_2
 87 |     28: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
 89 |     31: iinc 4, 1
 90 |     34: goto 10
 91 |     37: return
 92 | 
 93 | 
 94 | 
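 96 | Here is the map of LVA slots:
 97 | 0 - the sole argument of main()
 98 | 1 - limit, always holding 20
 99 | 2 - f
100 | 3 - g
101 | 4 - i
102 | 
103 | We can see that the Java compiler allocates variables in LVA slots in the same order in which they are declared in the source code.
104 | 
105 | There are separate istore instructions for slots 0, 1, 2 and 3, but not for slot 4 and above, so at offset 8 there is an istore with an additional operand that gives the slot number. The same goes for the iload at offset 10.
106 | 
107 | Still, isn't it dubious to allocate a separate slot for limit, which always holds 20 (so it is essentially a constant), and to reload its value so often?
108 | 
109 | The JVM JIT compiler is usually good enough to optimize such things. Manual intervention in the code is probably not worth the effort.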
110 | 
111 | 
112 | 


--------------------------------------------------------------------------------
/Chapter-54/54.12switch函数.md:
--------------------------------------------------------------------------------
 1 | 54.12 switch()
 2 | 
 3 | The switch() statement is implemented with the tableswitch instruction:
 4 | 
 5 |     public static void f(int a)
 6 |     {
 7 |         switch (a)
 8 |         {
 9 |             case 0: System.out.println("zero"); break;
10 |             case 1: System.out.println("one\n"); break;
11 |             case 2: System.out.println("two\n"); break;
12 |             case 3: System.out.println("three\n"); break;
13 |             case 4: System.out.println("four\n"); break;
14 |             default: System.out.println("something unknown\n"); break;
15 |         };
16 |     }
17 | 
18 | This is about as simple as it gets:
19 | 
20 | 
21 |     public static void f(int);
22 |     flags: ACC_PUBLIC, ACC_STATIC
23 |     Code:
24 |     stack=2, locals=1, args_size=1
25 |     0: iload_0
26 |     1: tableswitch { // 0 to 4
27 |     0: 36
28 |     1: 47
29 |     2: 58
30 |     3: 69
31 |     4: 80
32 |     default: 91
33 |     }
34 |     36: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
36 |     39: ldc #3 // String zero
37 |     41: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
39 |     44: goto 99
40 |     47: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
42 |     50: ldc #5 // String one\n
43 |     52: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
45 |     55: goto 99
46 |     58: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
48 |     61: ldc #6 // String two\n
49 |     63: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
51 |     66: goto 99
52 |     69: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
54 |     72: ldc #7 // String three\n
55 |     74: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
57 |     77: goto 99
58 |     80: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
60 |     83: ldc #8 // String four\n
61 |     85: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
63 |     88: goto 99
64 |     91: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
66 |     94: ldc #9 // String something unknown\n
70 |     96: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
72 |     99: return
73 |     
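
What tableswitch does can be modeled as a bounds check followed by an indexed lookup. The sketch below is a simplified model with hypothetical names: the array plays the role of the jump table (holding the strings each target would print instead of code offsets), and anything outside the 0 to 4 range goes to the default entry.

```java
class TableSwitchSketch {
    // One entry per case value from low to high, like the table in the listing.
    static final String[] TARGETS = {"zero", "one\n", "two\n", "three\n", "four\n"};
    static final String DEFAULT_TARGET = "something unknown\n";

    static String select(int a) {
        int low = 0;
        int high = TARGETS.length - 1;
        if (a < low || a > high) {
            return DEFAULT_TARGET;      // default: 91
        }
        return TARGETS[a - low];        // direct index, no chain of comparisons
    }

    public static void main(String[] args) {
        System.out.print(select(2));
        System.out.print(select(100));
    }
}
```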
77 | 


--------------------------------------------------------------------------------
/Chapter-54/54.14字符串.md:
--------------------------------------------------------------------------------
  1 | 54.14 Strings
  2 | 54.14.1 First example
  3 | 
  4 | Strings are objects and are constructed in the same way as other objects (and arrays).
  5 | 
  6 | 
  7 |     public static void main(String[] args)
  8 |     {
  9 |     System.out.println("What is your name?");
 10 |     String input = System.console().readLine();
 11 |     System.out.println("Hello, "+input);
 12 |     }
 13 |     public static void main(java.lang.String[]);
 14 |     flags: ACC_PUBLIC, ACC_STATIC
 15 |     Code:
 16 |     stack=3, locals=2, args_size=1
 17 |     0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
 19 |     3: ldc #3 // String What is your name?
 21 |     5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
 23 |     8: invokestatic #5 // Method java/lang/System.console:()Ljava/io/Console;
 25 |     11: invokevirtual #6 // Method java/io/Console.readLine:()Ljava/lang/String;
 27 |     14: astore_1
 28 |     15: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
 30 |     18: new #7 // class java/lang/StringBuilder
 32 |     21: dup
 33 |     22: invokespecial #8 // Method java/lang/StringBuilder."<init>":()V
 35 |     25: ldc #9 // String Hello,
 36 |     27: invokevirtual #10 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
 39 |     30: aload_1
 40 |     31: invokevirtual #10 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
 43 |     34: invokevirtual #11 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
 45 |     37: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
 47 |     40: return
 48 | 
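 50 | The readLine() method is called at offset 11, leaving a reference to the string (supplied by the user) on TOS. At offset 14 that reference is stored in slot 1 of the LVA.
 51 | 
 52 | 
 53 | The string the user entered is reloaded at offset 30 and concatenated with the "Hello, " string using the StringBuilder class. The constructed string is then printed by println at offset 37.
 54 | 
 55 | 54.14.2 Second example
 56 | Another example: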
 57 | 
 58 |     public class strings
 59 |     {
 60 |     public static char test (String a)
 61 |     {
 62 |     return a.charAt(3);
 63 |     };
 64 |     public static String concat (String a, String b)
 65 |     {
 66 |     return a+b;
 67 |     }
 68 |     }
 69 |     public static char test(java.lang.String);
 70 |     flags: ACC_PUBLIC, ACC_STATIC
 71 |     Code:
 72 |     stack=2, locals=1, args_size=1
 73 |     0: aload_0
 74 |     1: iconst_3
 75 |     2: invokevirtual #2 // Method java/lang/String.charAt:(I)C
 77 |     5: ireturn
 78 | 
 80 | 
 81 | String concatenation is done with the StringBuilder class:
 82 | 
 83 | 
 84 |     public static java.lang.String concat(java.lang.String, java.lang.String);
 86 |     flags: ACC_PUBLIC, ACC_STATIC
 87 |     Code:
 88 |     stack=2, locals=2, args_size=2
 89 |     0: new #3 // class java/lang/StringBuilder
 91 |     3: dup
 92 |     4: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
 94 |     7: aload_0
 95 |     8: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
 98 |     11: aload_1
 99 |     12: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
102 |     15: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
104 |     18: areturn
105 | 
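
Written out by hand, the bytecode of concat() corresponds to the source below (the class name is hypothetical). Note that this describes the compiler output shown in the listing; as far as I know, javac from JDK 9 onwards compiles `a+b` to an invokedynamic call instead, so a class file built with a recent compiler will look different.

```java
class ConcatSketch {
    // What "return a+b;" turns into in the listing above.
    static String concat(String a, String b) {
        return new StringBuilder()   // new, dup, invokespecial <init>
                .append(a)           // aload_0, invokevirtual append
                .append(b)           // aload_1, invokevirtual append
                .toString();         // invokevirtual toString, areturn
    }

    public static void main(String[] args) {
        System.out.println(concat("Hello, ", "world"));
    }
}
```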
106 | Another example:
107 |     
108 |     public static void main(String[] args)
109 |     {
110 |     String s="Hello!";
111 |     int n=123;
112 |     System.out.println("s=" + s + " n=" + n);
113 |     }
114 | 
115 | Again, the string is built with the StringBuilder class and its append method, and the constructed string is then passed to println:
116 | 
117 |     
118 |     public static void main(java.lang.String[]);
119 |     flags: ACC_PUBLIC, ACC_STATIC
120 |     Code:
121 |     stack=3, locals=3, args_size=1
122 |     0: ldc #2 // String Hello!
123 |     2: astore_1
124 |     3: bipush 123
125 |     5: istore_2
126 |     6: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
128 |     9: new #4 // class java/lang/StringBuilder
130 |     12: dup
131 |     13: invokespecial #5 // Method java/lang/StringBuilder."<init>":()V
133 |     16: ldc #6 // String s=
134 |     18: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
137 |     21: aload_1
138 |     22: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
141 |     25: ldc #8 // String n=
142 |     27: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
145 |     30: iload_2
146 |     31: invokevirtual #9 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
148 |     34: invokevirtual #10 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
150 |     37: invokevirtual #11 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
152 |     40: return
153 | 
154 | 
156 | 
157 | 


--------------------------------------------------------------------------------
/Chapter-54/54.15异常.md:
--------------------------------------------------------------------------------
  1 | 54.15 Exceptions
  2 | Let's rework the Month example (54.13.4) a bit:
  3 | 
  4 | Listing 54.10: IncorrectMonthException.java
  5 |     
  6 |     public class IncorrectMonthException extends Exception
  7 |     {
  8 |     private int index;
  9 |     public IncorrectMonthException(int index)
 10 |     {
 11 |     this.index = index;
 12 |     }
 13 |     public int getIndex()
 14 |     {
 15 |     return index;
 16 |     }
 17 |     }
 18 | 
 19 | Listing 54.11: Month2.java
 20 | 
 21 | 
 22 |     class Month2
 23 |     {
 24 |     public static String[] months =
 25 |     {
 26 |     "January",
 27 |     "February",
 28 |     "March",
 29 |     "April",
 30 |     "May",
 31 |     "June",
 32 |     "July",
 33 |     "August",
 34 |     "September",
 35 |     "October",
 36 |     "November",
 37 |     "December"
 38 |     };
 39 |     public static String get_month (int i) throws IncorrectMonthException
 41 |     {
 42 |     if (i<0 || i>11)
 43 |     throw new IncorrectMonthException(i);
 44 |     return months[i];
 45 |     };
 46 |     public static void main (String[] args)
 47 |     {
 48 |     try
 49 |     {
 50 |     System.out.println(get_month(100));
 51 |     }
 52 |     catch(IncorrectMonthException e)
 53 |     {
 54 |     System.out.println("incorrect month index: "+ e.getIndex());
 56 |     e.printStackTrace();
 57 |     }
 58 |     };
 59 |     }
 60 |     
 61 | 
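 62 | Essentially, IncorrectMonthException.class has only a constructor and one getter method.
 63 | The IncorrectMonthException class inherits from Exception, so its constructor first calls the constructor of the parent Exception class and then stores the incoming integer in the only field of IncorrectMonthException: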
 64 | 
 65 | 
 66 |     public IncorrectMonthException(int);
 67 |     flags: ACC_PUBLIC
 68 |     Code:
 69 |     stack=2, locals=2, args_size=2
 70 |     0: aload_0
 71 |     1: invokespecial #1 // Method java/lang/Exception."<init>":()V
 73 |     4: aload_0
 74 |     5: iload_1
 75 |     6: putfield #2 // Field index:I
 76 |     9: return
 77 | 
 78 | getIndex() is just a getter. A reference to the IncorrectMonthException object is passed in slot 0 of the LVA (this); aload_0 takes it, getfield loads the integer value from the object, and ireturn returns it.
 79 | 
 80 |     public int getIndex();
 81 |     flags: ACC_PUBLIC
 82 |     Code:
 83 |     stack=1, locals=1, args_size=1
 84 |     0: aload_0
 85 |     1: getfield #2 // Field index:I
 86 |     4: ireturn
 87 | 
 88 | Now let's look at get_month() in Month2.class:
 89 | 
 90 | Listing 54.12: Month2.class
 91 |     
 92 |     public static java.lang.String get_month(int) throws IncorrectMonthException;
 94 |     flags: ACC_PUBLIC, ACC_STATIC
 95 |     Code:
 96 |     stack=3, locals=1, args_size=1
 97 |     0: iload_0
 98 |     1: iflt 10
 99 |     4: iload_0
100 |     5: bipush 11
101 |     7: if_icmple 19
102 |     10: new #2 // class IncorrectMonthException
104 |     13: dup
105 |     14: iload_0
106 |     15: invokespecial #3 // Method IncorrectMonthException."<init>":(I)V
108 |     18: athrow
109 |     19: getstatic #4 // Field months:[Ljava/lang/String;
111 |     22: iload_0
112 |     23: aaload
113 |     24: areturn
114 | 
116 | 
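117 | iflt at offset 1 means "if less than" (zero).
118 | 
119 | If the index is invalid, a new object is created by the new instruction at offset 10. The object type (IncorrectMonthException) is passed to the instruction as an operand. Then its constructor is called, with the index passed via TOS (offset 15).
120 | By the time control reaches offset 18 the object is already constructed, so the athrow instruction takes the reference to the newly constructed object and signals the JVM to find an appropriate exception handler.
121 | 
122 | athrow does not return control flow here, so offset 19 starts another basic block that has nothing to do with exceptions; we get there from offset 7.
123 | How does the handler work? Let's look at main() in Month2.class:
124 | 
125 | Listing 54.13: Month2.class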
126 | 
127 |     public static void main(java.lang.String[]);
128 |     flags: ACC_PUBLIC, ACC_STATIC
129 |     Code:
130 |     stack=3, locals=2, args_size=1
131 |     0: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
133 |     3: bipush 100
134 |     5: invokestatic #6 // Method get_month:(I)Ljava/lang/String;
136 |     8: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
138 |     11: goto 47
139 |     14: astore_1
140 |     15: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
142 |     18: new #8 // class java/lang/StringBuilder
144 |     21: dup
145 |     22: invokespecial #9 // Method java/lang/StringBuilder."<init>":()V
147 |     25: ldc #10 // String incorrect month index:
149 |     27: invokevirtual #11 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
152 |     30: aload_1
153 |     31: invokevirtual #12 // Method IncorrectMonthException.getIndex:()I
155 |     34: invokevirtual #13 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
157 |     37: invokevirtual #14 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
159 |     40: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
161 |     43: aload_1
162 |     44: invokevirtual #15 // Method IncorrectMonthException.printStackTrace:()V
164 |     47: return
165 |     Exception table:
166 |     from to target type
167 |     0 11 14 Class IncorrectMonthException
168 | 
169 | 
170 | 
171 | 
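173 | This is the exception table. It says that an IncorrectMonthException may occur between offset 0 and offset 11 (the end of the range is exclusive), and that if it does, control passes to offset 14. Indeed, the main program ends at offset 11 and the handler starts at offset 14.
174 | No conditional or unconditional jump leads into this area, so it cannot be reached by ordinary control flow. (In other words, this code runs only when the JVM dispatches an exception to it.)
175 | 
176 | 
177 | But the JVM transfers execution here when the exception occurs.
178 | The very first astore_1 (at offset 14) takes the incoming reference to the exception object and stores it in slot 1 of the LVA. Later, the getIndex() method of this exception object is called at offset 31; the reference to the current exception object is loaded right before that, at offset 30.
180 | The rest of the code just manipulates strings: the integer returned by getIndex() is appended to the "incorrect month index: " text with StringBuilder (as we saw before), and toString() produces the final string.
181 | Then println() and printStackTrace() are called. Once printStackTrace() finishes, the exception has been handled and the function continues normally. At offset 47 a return ends main(), but any other code could be placed there and would execute as if no exception had been raised.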
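
How the JVM picks a handler from such a table can be modeled in a few lines. The sketch below is a strong simplification with hypothetical names: it searches a single method's table for the first entry whose range covers the current offset and whose type matches the thrown object. A real JVM, when it finds nothing, goes on to unwind to the calling method, which is not modeled here. IllegalStateException merely stands in for IncorrectMonthException.

```java
class ExceptionTableSketch {
    // One row of the table: from (inclusive), to (exclusive), target, type.
    static final class Entry {
        final int from, to, target;
        final Class<? extends Throwable> type;

        Entry(int from, int to, int target, Class<? extends Throwable> type) {
            this.from = from;
            this.to = to;
            this.target = target;
            this.type = type;
        }
    }

    // Returns the handler offset, or -1 if this method has no matching handler.
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry e : table) {
            if (pc >= e.from && pc < e.to && e.type.isInstance(thrown)) {
                return e.target;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Entry[] table = { new Entry(0, 11, 14, IllegalStateException.class) };
        // An exception raised by the call at offset 5 is routed to offset 14.
        System.out.println(findHandler(table, 5, new IllegalStateException()));
    }
}
```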
183 | 
184 | 
185 | Here is an example of how IDA shows exception ranges:
186 | 
187 | Listing 54.14: from a random .class file found on my computer
189 | 
190 |     
191 |     .catch java/io/FileNotFoundException from met001_335 to met001_360\
193 |     using met001_360
194 |     .catch java/io/FileNotFoundException from met001_185 to met001_214\
196 |     using met001_214
197 |     .catch java/io/FileNotFoundException from met001_181 to met001_192\
199 |     using met001_195
202 |     .catch java/io/FileNotFoundException from met001_155 to met001_176\
204 |     using met001_176
205 |     .catch java/io/FileNotFoundException from met001_83 to met001_129 using \
207 |     met001_129
208 |     .catch java/io/FileNotFoundException from met001_42 to met001_66 using \
210 |     met001_69
211 |     .catch java/io/FileNotFoundException from met001_begin to met001_37\
213 |     using met001_37
214 | 
215 | 
217 | 
218 | 
219 | 


--------------------------------------------------------------------------------
/Chapter-54/54.16类.md:
--------------------------------------------------------------------------------
  1 | 54.16 Classes
  2 | Simple class:
  3 | 
  4 | Listing 54.15: test.java
  5 |     
  6 |     public class test
  7 |     {
  8 |         public static int a;
  9 |         private static int b;
 10 |         public test()
 11 |         {
 12 |             a=0;
 13 |             b=0;
 14 |         }
 15 |         public static void set_a (int input)
 16 |         {
 17 |             a=input;
 18 |         }
 19 |         public static int get_a ()
 20 |         {
 21 |             return a;
 22 |         }
 23 |         public static void set_b (int input)
 24 |         {
 25 |             b=input;
 26 |         }
 27 |         public static int get_b ()
 28 |         {
 29 |             return b;
 30 |         }
 31 |     }
 32 | 
 34 | 
 35 | The constructor just sets both fields to zero:
 36 | 
 37 |     public test();
 38 |     flags: ACC_PUBLIC
 39 |     Code:
 40 |     stack=1, locals=1, args_size=1
 41 |     0: aload_0
 42 |     1: invokespecial #1 // Method java/lang/Object."<init>":()V
 44 |     4: iconst_0
 45 |     5: putstatic #2 // Field a:I
 46 |     8: iconst_0
 47 |     9: putstatic #3 // Field b:I
 48 |     12: return
 49 | 
 50 | The setter of a:
 51 | 
 52 |     public static void set_a(int);
 53 |     flags: ACC_PUBLIC, ACC_STATIC
 54 |     Code:
 55 |     stack=1, locals=1, args_size=1
 56 |     0: iload_0
 57 |     1: putstatic #2 // Field a:I
 58 |     4: return
 59 | 
 60 | The getter of a:
 61 | 
 62 |     public static int get_a();
 63 |     flags: ACC_PUBLIC, ACC_STATIC
 64 |     Code:
 65 |     stack=1, locals=0, args_size=0
 66 |     0: getstatic #2 // Field a:I
 67 |     3: ireturn
 68 | 
 69 | The setter of b:
 70 | 
 71 |     public static void set_b(int);
 72 |     flags: ACC_PUBLIC, ACC_STATIC
 73 |     Code:
 74 |     stack=1, locals=1, args_size=1
 75 |     0: iload_0
 76 |     1: putstatic #3 // Field b:I
 77 |     4: return
 78 | 
 79 | The getter of b:
 80 | 
 81 |     public static int get_b();
 82 |     flags: ACC_PUBLIC, ACC_STATIC
 83 |     Code:
 84 |     stack=1, locals=0, args_size=0
 85 |     0: getstatic #3 // Field b:I
 86 |     3: ireturn
 87 | 
 88 | 
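 90 | There is no difference in the code that works with public and private fields. But the access information is present in the .class file, and private fields cannot be accessed from outside the class.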
 91 | 
 92 | Let's create an object and call its method:
 93 | 
 94 | Listing 54.16: ex1.java
 95 | 
 99 |     public class ex1
100 |     {
101 |         public static void main(String[] args)
102 |         {
103 |             test obj=new test();
104 |             obj.set_a (1234);
105 |             System.out.println(obj.a);
106 |         }
107 |     }
108 |     
109 |     public static void main(java.lang.String[]);
110 |     flags: ACC_PUBLIC, ACC_STATIC
111 |     Code:
112 |     stack=2, locals=2, args_size=1
113 |     0: new #2 // class test
114 |     3: dup
115 |     4: invokespecial #3 // Method test."<init>":()V
116 |     7: astore_1
117 |     8: aload_1
118 |     9: pop
119 |     10: sipush 1234
120 |     13: invokestatic #4 // Method test.set_a:(I)V
122 |     16: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
124 |     19: aload_1
125 |     20: pop
126 |     21: getstatic #6 // Field test.a:I
127 |     24: invokevirtual #7 // Method java/io/PrintStream.println:(I)V
129 |     27: return
130 | 
131 | The new instruction creates the object but does not call the constructor (it is called at offset 4). The set_a() method is called at offset 13. The a field is read with the getstatic instruction at offset 21.
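
The aload_1/pop pairs in the listing show that the object reference is loaded and immediately thrown away, because both set_a() and a are static. A small sketch makes the consequence visible. The Counter class is a hypothetical stand-in for the test class above, and StaticViaInstance is my own helper name; the behaviour relies only on standard Java rules for static members.

```java
// Hypothetical stand-in for the "test" class from the listing.
class Counter {
    public static int a;

    public static void set_a(int input) {
        a = input;
    }
}

class StaticViaInstance {
    static int demo() {
        Counter obj = null;   // there is no object at all
        obj.set_a(1234);      // static call: the reference is evaluated and discarded
        return obj.a;         // static field read: again, the reference is not used
    }
}
```

No NullPointerException is thrown here, which matches the bytecode: the reference never reaches invokestatic or getstatic.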


--------------------------------------------------------------------------------
/Chapter-54/54.17简单的补丁.md:
--------------------------------------------------------------------------------
  1 | 54.17 Simple patching
  2 | 
  3 | 54.17.1 First example
  4 | 
  5 | Let's proceed with a simple code patching task.
  6 | 
  7 | 
  8 |     public class nag
  9 |     {
 10 |         public static void nag_screen()
 11 |         {
 12 |             System.out.println("This program is not registered");
 14 |         };
 15 |         public static void main(String[] args)
 16 |         {
 17 |             System.out.println("Greetings from the mega-software");
 19 |             nag_screen();
 20 |         }
 21 |     }
 22 | 
 23 | 
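 24 | How do we remove the printing of the "This program is not registered" string?
 25 | 
 26 | First, let's load the .class file into IDA:
 27 | 
 31 | Figure 54.1: IDA
 32 | 
 34 | Let's patch the first byte of the function to 177 (the opcode of the return instruction):
 35 | 
 36 | Figure 54.2: IDA
 37 | 
 39 | But that does not work with JDK 1.7: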
 40 | 
 41 |     Exception in thread "main" java.lang.VerifyError: Expecting a stack map frame
 43 |     Exception Details:
 44 |     Location:
 45 |     nag.nag_screen()V @1: nop
 46 |     Reason:
 47 |     Error exists in the bytecode
 48 |     Bytecode:
 49 |     0000000: b100 0212 03b6 0004 b1
 50 |     at java.lang.Class.getDeclaredMethods0(Native Method)
 51 |     at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
 53 |     at java.lang.Class.getMethod0(Class.java:2856)
 54 |     at java.lang.Class.getMethod(Class.java:1668)
 55 |     at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
 57 |     at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
 59 | 
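 61 | Perhaps the JVM performs some additional checks related to stack maps.
 62 | OK, let's patch it differently, by removing the call to nag_screen():
 63 | 
 65 | Figure 54.3: IDA
 66 | 
 67 | The opcode of NOP is 0. Now that works!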
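
The patch itself is just a byte overwrite. Here is a rough sketch of the idea in Java, not the tool I actually used (that was IDA). The NopPatch class and nopOut() helper are hypothetical names, and in the usage below the constant pool indices inside the byte sequence are made up for illustration.

```java
import java.util.Arrays;

class NopPatch {
    static final byte NOP = 0x00;   // the opcode of nop is 0

    // Returns a copy of the method body with `length` bytes at `offset` replaced by nop.
    static byte[] nopOut(byte[] code, int offset, int length) {
        byte[] patched = code.clone();
        Arrays.fill(patched, offset, offset + length, NOP);
        return patched;
    }
}
```

An invokestatic instruction is 3 bytes long (opcode 0xb8 plus a two-byte constant pool index), so removing a call to a static void method without arguments means NopPatch.nopOut(code, offsetOfCall, 3). Such a call neither consumes nor leaves anything on the stack, which is presumably why the verifier has nothing to complain about after the patch.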
 68 | 
 69 | 54.17.2 Second example
 70 | 
 71 | Now another simple crackme example:
 72 | 
 73 |     public class password
 74 |     {
 75 |         public static void main(String[] args)
 76 |         {
 77 |             System.out.println("Please enter the password");
 79 |             String input = System.console().readLine();
 80 |             if (input.equals("secret"))
 81 |                 System.out.println("password is correct");
 85 |             else
 86 |                 System.out.println("password is not correct");
 88 |         }
 89 |     }
 90 | 
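 93 | Figure 54.4: IDA
 94 | 
 95 | Here we see the ifeq instruction, which does the job. Its name stands for "if equal", which is a misnomer. A better name would be ifz ("if zero"): it jumps if the value at the top of the stack is zero. In our case, it jumps if the password is not correct (the equals method returns false, which is 0).
 98 | The first idea is to patch this instruction. ifeq is three bytes long: one opcode byte followed by two bytes that encode the jump offset. To turn it into a no-op, we set the third byte to 3. The offset is added to the address of the ifeq instruction itself, and since the instruction is 3 bytes long, it then always jumps to the next instruction.
102 | 
105 | Figure 54.5: IDA
106 | 
107 | That does not work with JDK 1.7: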
108 | 
109 |     Exception in thread "main" java.lang.VerifyError: Expecting a stackmap frame at branch target 24
111 |     Exception Details:
112 |     Location:
113 |     password.main([Ljava/lang/String;)V @21: ifeq
114 |     Reason:
115 |     Expected stackmap frame at this location.
116 |     Bytecode:
117 |     0000000: b200 0212 03b6 0004 b800 05b6 0006 4c2b
118 |     0000010: 1207 b600 0899 0003 b200 0212 09b6 0004
119 |     0000020: a700 0bb2 0002 120a b600 04b1
120 |     Stackmap Table:
121 |     append_frame(@35,Object[#20])
122 |     same_frame(@43)
123 |     at java.lang.Class.getDeclaredMethods0(Native Method)
124 |     at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
126 |     at java.lang.Class.getMethod0(Class.java:2856)
127 |     at java.lang.Class.getMethod(Class.java:1668)
128 |     at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
132 |     at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
134 |     
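
The bytecode dump in the error message is enough to check the arithmetic by hand. The sketch below decodes the two-byte signed offset that follows a branch opcode and adds it to the address of the instruction. The BranchTarget class and its method names are my own, only the bytes are copied from the dump above.

```java
class BranchTarget {
    // The method body exactly as printed in the VerifyError message.
    static final String DUMP =
        "b200 0212 03b6 0004 b800 05b6 0006 4c2b "
      + "1207 b600 0899 0003 b200 0212 09b6 0004 "
      + "a700 0bb2 0002 120a b600 04b1";

    static byte[] fromHex(String hex) {
        String s = hex.replace(" ", "");
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Target of a 3-byte branch (ifeq, goto, ...) located at address pc:
    // the signed 16-bit big-endian offset is relative to the branch instruction itself.
    static int branchTarget(byte[] code, int pc) {
        int offset = (short) (((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF));
        return pc + offset;
    }
}
```

For the patched ifeq at offset 21 (bytes 99 00 03) this gives 21 + 3 = 24, the "branch target 24" from the error message. The goto at offset 32 (bytes a7 00 0b) gives 43, which does have a frame in the stackmap table.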
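135 | Needless to say, it works with JRE 1.6.
136 | I also tried replacing all 3 bytes of the ifeq instruction with zero bytes (NOP), and it still did not work. It seems JRE 1.7 performs more stack map checks.
137 | 
138 | OK, let's replace the whole call to the equals method with the iconst_1 instruction plus a pack of NOPs:
139 | 
141 | 1 is then always at the top of the stack (TOS) when the ifeq instruction is executed, so ifeq never jumps.
142 | 
143 | This works.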
144 | 
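145 | 54.18 Summary
146 | 
148 | What is missing in Java in comparison to C/C++?
149 | Structures: use classes.
150 | Unions: use class hierarchies.
151 | Unsigned data types. By the way, this makes cryptographic algorithms somewhat harder to implement in Java.
152 | Function pointers.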


--------------------------------------------------------------------------------
/Chapter-54/54.1介绍.md:
--------------------------------------------------------------------------------
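 1 | Chapter 54
 2 | 
 3 | 54.1 Introduction
 4 | There are several well-known decompilers for Java (or for JVM bytecode in general).
 5 | The reason is that JVM bytecode is somewhat easier to decompile than lower-level x86 code:
 6 | 
 7 | a). There is much more information about the data types.
 8 | b). The JVM (Java virtual machine) memory model is much more rigorous and outlined.
 9 | c). The Java compiler does not do any optimization (the JVM JIT does it at runtime), so the bytecode in class files is usually quite readable.
10 | 
11 | When is knowledge of JVM bytecode useful?
12 | 
13 | a). Quick-and-dirty patching of class files, without the need to recompile the decompiler's output.
14 | b). Analysing obfuscated code.
15 | c). Building your own obfuscator.
16 | d). Building a compiler code generator (back end) that targets the JVM.
17 | 
18 | Let's start with some simple pieces of code. JDK 1.7 is used everywhere unless mentioned otherwise.
19 | 
20 | This command is used everywhere to decompile class files: javap -c -verbose.
21 | 
22 | Many of the examples in this chapter were produced with it.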
23 | 
24 | 
25 | 


--------------------------------------------------------------------------------
/Chapter-54/54.2返回一个值.md:
--------------------------------------------------------------------------------
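  1 | 54.2 Returning a value
  2 | 
  3 | Probably the simplest Java function is one that returns some value. Oh, and we must keep in mind that there are no "free" functions in Java in the usual sense: they are "methods". Each method belongs to some class, so it is not possible to define a method outside of a class. For simplicity, I'll call them "functions" anyway.
  5 | 
  7 |     public class ret
  8 |     {
  9 |         public static int main(String[] args)
 10 |         {
 11 |             return 0;
 12 |         }
 13 |     }
 14 | 
 16 | Let's compile it:
 17 | 
 18 |     javac ret.java
 19 | 
 20 | ...and decompile it using the standard Java utility:
 21 | 
 22 |     javap -c -verbose ret.class
 23 | 
 24 | And we get: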
 25 | 
 26 |     public static int main(java.lang.String[]);
 27 |     flags: ACC_PUBLIC, ACC_STATIC
 28 |     Code:
 29 |     stack=1, locals=1, args_size=1
 30 |     0: iconst_0
 31 |     1: ireturn
 32 | 
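 33 | The Java developers decided that 0 is one of the busiest constants in programming, so there is a separate short one-byte iconst_0 instruction which pushes 0.
 34 | There are also iconst_1 (which pushes 1), iconst_2, and so on, up to iconst_5. There is also iconst_m1, which pushes -1.
 35 | 
 37 | This is similar to MIPS, where a separate register is dedicated to the zero constant: 3.5.2 on page 3.
 38 | 
 39 | The stack is used in the JVM for passing data to called functions and for return values. So iconst_0 pushes 0 onto the stack, and ireturn (the i stands for integer) returns the integer value from the top of the stack (TOS).
 40 | 
 43 | Let's rewrite our example slightly. Now we return 1234:
 44 | 
 45 |     public class ret
 46 |     {
 47 |         public static int main(String[] args)
 48 |         {
 49 |             return 1234;
 50 |         }
 51 |     }
 52 | 
 53 | We get:
 54 | 
 55 | Listing 54.2: JDK 1.7 (excerpt)
 55 | 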
 56 |     public static int main(java.lang.String[]);
 57 |     flags: ACC_PUBLIC, ACC_STATIC
 58 |     Code:
 59 |     stack=1, locals=1, args_size=1
 60 |     0: sipush 1234
 61 |     3: ireturn
 62 |     
 63 | sipush (short integer) pushes the value 1234 onto the stack. The "short" in the name implies that a 16-bit value is pushed.
 65 | The number 1234 indeed fits in a 16-bit value.
 66 | 
 67 | What about larger values?
 68 | 
 69 |     public class ret
 70 |     {
 71 |         public static int main(String[] args)
 72 |         {
 73 |             return 12345678;
 74 |         }
 75 |     }
 76 | 
 77 | Listing 54.3: Constant pool
 78 | 
 80 |     ...
 81 |     #2 = Integer 12345678
 82 |     ...
 83 | 
 86 |     public static int main(java.lang.String[]);
 87 |     flags: ACC_PUBLIC, ACC_STATIC
 88 |     Code:
 89 |     stack=1, locals=1, args_size=1
 90 |     0: ldc #2 // int 12345678
 91 |     2: ireturn
 92 | 
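 95 | It is not possible to encode a 32-bit number in a JVM instruction opcode; the designers did not leave such a possibility. So the 32-bit number 12345678 is stored in the so-called "constant pool", which is, let's say, a library of the most used constants (including strings, objects, etc.).
 98 | 
 99 | This way of passing constants is not unique to the JVM. MIPS, ARM and other RISC CPUs also cannot encode a 32-bit number in a 32-bit opcode, so RISC CPU code (including MIPS and ARM) has to construct the value in several steps or keep it in the data segment: 28.3 on page 654, 29.1 on page 695.
101 | 
102 | MIPS code also traditionally has a constant pool, called the "literal pool". Its segments are named ".lit4" (for 32-bit single-precision floating-point constants) and ".lit8" (for 64-bit double-precision floating-point constants).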
105 | 
106 | Boolean:
107 | 
108 |     public class ret
109 |     {
110 |         public static boolean main(String[] args)
111 |         {
112 |             return true;
113 |         }
114 |     }
115 | 
118 |     public static boolean main(java.lang.String[]);
119 |     flags: ACC_PUBLIC, ACC_STATIC
120 |     Code:
121 |     stack=1, locals=1, args_size=1
122 |     0: iconst_1
123 |     1: ireturn
124 | 
125 | This JVM bytecode is no different from one returning the integer 1. 32-bit data slots on the stack are also used for boolean values here, as in C/C++. But a returned boolean value cannot be used as an integer or vice versa: the type information is stored in the class file and checked at runtime.
126 | 
127 | It is the same story with a 16-bit short:
130 | 
131 |     public class ret
132 |     {
134 |         public static short main(String[] args)
135 |         {
136 |             return 1234;
137 |         }
138 |     }
138 |     
139 |     public static short main(java.lang.String[]);
140 |     flags: ACC_PUBLIC, ACC_STATIC
141 |     Code:
142 |     stack=1, locals=1, args_size=1
143 |     0: sipush 1234
144 |     3: ireturn
145 | 
146 | ...and char:
147 | 
148 |     public class ret
149 |     {
150 |         public static char main(String[] args)
151 |         {
152 |             return 'A';
153 |         }
154 |     }
154 |     
155 |     public static char main(java.lang.String[]);
156 |     flags: ACC_PUBLIC, ACC_STATIC
157 |     Code:
158 |     stack=1, locals=1, args_size=1
159 |     0: bipush 65
160 |     2: ireturn
161 | 
163 | bipush means "push byte". Needless to say, a char in Java is a 16-bit UTF-16 character, equivalent in size to a short. But the ASCII code of the "A" character is 65, so the instruction for pushing a byte onto the stack can be used.
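
Putting together the instructions seen so far (iconst_*, bipush, sipush, ldc), the choice depends only on how large the constant is. The following sketch mirrors what javac is seen to emit in these listings. The PushSelector class is a hypothetical helper, not part of any JDK tool, and a real compiler may also pick ldc_w when the constant pool index does not fit in one byte.

```java
class PushSelector {
    // Names the instruction family used to push an int constant, smallest encoding first.
    static String pushInstruction(int v) {
        if (v == -1) {
            return "iconst_m1";
        }
        if (v >= 0 && v <= 5) {
            return "iconst_" + v;                      // 1 byte, value is part of the opcode
        }
        if (v >= Byte.MIN_VALUE && v <= Byte.MAX_VALUE) {
            return "bipush " + v;                      // opcode + 1-byte operand
        }
        if (v >= Short.MIN_VALUE && v <= Short.MAX_VALUE) {
            return "sipush " + v;                      // opcode + 2-byte operand
        }
        return "ldc (constant pool entry holding " + v + ")";
    }
}
```

This matches the examples: 0 gives iconst_0, 65 gives bipush 65, 1234 gives sipush 1234, and 12345678 has to go through the constant pool.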
164 | 
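165 | Let's also try a byte:
166 | 
167 |     public class retc
168 |     {
169 |         public static byte main(String[] args)
170 |         {
171 |             return 123;
172 |         }
173 |     }
173 |     
174 |     public static byte main(java.lang.String[]);
175 |     flags: ACC_PUBLIC, ACC_STATIC
177 |     Code:
178 |     stack=1, locals=1, args_size=1
179 |     0: bipush 123
180 |     2: ireturn
181 | 
184 | One may ask: why bother with a 16-bit short data type that internally works as a 32-bit integer? And why use a char data type if it is the same as a short?
185 | 
186 | The answer is simple: for data type control and source code readability. A char may essentially be the same as a short, but we quickly grasp that it is a placeholder for a UTF-16 character and not for some other integer value. When using short, we show everyone that the variable's range is limited to 16 bits. It is also a very good idea to use the boolean type where needed, instead of a C-style int.
187 | 
188 | There is also a 64-bit integer data type in Java: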
189 | 
190 |     public class ret3
191 |     {
192 |         public static long main(String[] args)
193 |         {
194 |             return 1234567890123456789L;
195 |         }
196 |     }
197 | 
198 | Listing 54.4: Constant pool
199 | 
200 |     ...
201 |     #2 = Long 1234567890123456789l
202 |     ...
202 |     
203 |     public static long main(java.lang.String[]);
204 |     flags: ACC_PUBLIC, ACC_STATIC
205 |     Code:
206 |     stack=2, locals=1, args_size=1
207 |     0: ldc2_w #2 // long 1234567890123456789l
209 |     3: lreturn
210 | 
212 | The 64-bit number is also stored in the constant pool; ldc2_w loads it and lreturn (long return) returns it. The ldc2_w instruction is also used to load double-precision floating-point numbers (which also occupy 64 bits) from the constant pool:
213 | 
214 | 
215 |     public class ret
216 |     {
217 |         public static double main(String[] args)
218 |         {
219 |             return 123.456d;
220 |         }
221 |     }
222 | 
223 | Listing 54.5: Constant pool
224 | 
225 |     ...
226 |     #2 = Double 123.456d
227 |     ...
227 |     
228 |     public static double main(java.lang.String[]);
229 |     flags: ACC_PUBLIC, ACC_STATIC
230 |     Code:
231 |     stack=2, locals=1, args_size=1
232 |     0: ldc2_w #2 // double 123.456d
234 |     3: dreturn
235 | 
237 | dreturn stands for "return double".
238 | 
239 | And finally, a single-precision floating-point number:
240 | 
241 |     public class ret
242 |     {
243 |         public static float main(String[] args)
244 |         {
245 |             return 123.456f;
246 |         }
247 |     }
248 | 
249 | Listing 54.6: Constant pool
250 | 
251 |     ...
252 |     #2 = Float 123.456f
253 |     ...
253 |     
254 |     public static float main(java.lang.String[]);
255 |     flags: ACC_PUBLIC, ACC_STATIC
256 |     Code:
257 |     stack=1, locals=1, args_size=1
258 |     0: ldc #2 // float 123.456f
259 |     2: freturn
260 | 
261 | The ldc instruction used here is the same one used to load 32-bit integer numbers from the constant pool. freturn stands for "return float".
262 | 
266 | Now, what about a function that returns nothing?
267 | 
269 |     
270 |     public class ret
271 |     {
272 |         public static void main(String[] args)
273 |         {
274 |             return;
275 |         }
276 |     }
276 |     
277 |     public static void main(java.lang.String[]);
278 |     flags: ACC_PUBLIC, ACC_STATIC
279 |     Code:
280 |     stack=0, locals=1, args_size=1
281 |     0: return
282 | 
284 | This means that the return instruction returns control without returning an actual value. Knowing all this, it is very easy to deduce a function's (or method's) return type from its last instruction.
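
That deduction can be written down as a small lookup. This is only a sketch: the ReturnKind class is a made-up name, and areturn (used for object references) does not appear in the examples above, I add it for completeness.

```java
class ReturnKind {
    // Maps the final return instruction of a method to the possible Java return types.
    static String returnTypeOf(String lastInstruction) {
        switch (lastInstruction) {
            case "ireturn": return "boolean, byte, char, short or int";
            case "lreturn": return "long";
            case "freturn": return "float";
            case "dreturn": return "double";
            case "areturn": return "object reference"; // not shown in this section
            case "return":  return "void";
            default:
                throw new IllegalArgumentException("not a return instruction: " + lastInstruction);
        }
    }
}
```

Note that ireturn alone cannot tell boolean, byte, char, short and int apart. For that, one needs the method descriptor stored in the class file.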
285 | 


--------------------------------------------------------------------------------
/Chapter-54/54.3简单的计算函数.md:
--------------------------------------------------------------------------------
  1 | 54.3 Simple calculating functions
  2 | 
  3 | Let's continue with simple calculating functions.
  4 | 
  5 |     public class calc
  6 |     {
  7 |         public static int half(int a)
  8 |         {
  9 |             return a/2;
 10 |         }
 11 |     }
 12 | 
 13 | Here the iconst_2 instruction is used:
 14 | 
 15 |     public static int half(int);
 16 |     flags: ACC_PUBLIC, ACC_STATIC
 17 |     Code:
 18 |     stack=2, locals=1, args_size=1
 19 |     0: iload_0
 20 |     1: iconst_2
 21 |     2: idiv
 22 |     3: ireturn
 23 | 
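 24 | iload_0 takes the zeroth function argument and pushes it onto the stack. iconst_2 pushes 2 onto the stack. After these two instructions have executed, the stack looks like this:
 25 | 
 26 |           +---+
 27 |     TOS ->| 2 |
 28 |           +---+
 29 |           | a |
 30 |           +---+
 31 | 
 33 | idiv takes the two values at the top of the stack,
 34 | divides one by the other and leaves the result at the TOS:
 35 | 
 36 |           +--------+
 37 |     TOS ->| result |
 38 |           +--------+
 39 | 
 40 | ireturn takes it and returns it.
 41 | Let's proceed with double-precision floating-point numbers: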
 42 | 
 43 |     public class calc
 44 |     {
 45 |         public static double half_double(double a)
 46 |         {
 47 |             return a/2.0;
 48 |         }
 49 |     }
 50 | 
 52 | Listing 54.7: Constant pool
 53 | 
 54 |     ...
 55 |     #2 = Double 2.0d
 56 |     ...
 56 |     
 57 |     public static double half_double(double);
 58 |     flags: ACC_PUBLIC, ACC_STATIC
 59 |     Code:
 60 |     stack=4, locals=2, args_size=1
 61 |     0: dload_0
 62 |     1: ldc2_w #2 // double 2.0d
 63 |     4: ddiv
 64 |     5: dreturn
 65 | 
 67 | It is the same, except that the ldc2_w instruction loads the constant 2.0 from the constant pool. Also, the other three instructions have the d prefix, meaning that they work with values of the double data type.
 68 | 
 69 | Let's now use a function with two arguments:
 70 |     
 71 |     public class calc
 72 |     {
 73 |         public static int sum(int a, int b)
 74 |         {
 75 |             return a+b;
 76 |         }
 77 |     }
 77 |     
 78 |     public static int sum(int, int);
 79 |     flags: ACC_PUBLIC, ACC_STATIC
 80 |     Code:
 81 |     stack=2, locals=2, args_size=2
 82 |     0: iload_0
 83 |     1: iload_1
 84 |     2: iadd
 85 |     3: ireturn
 86 | 
 87 | 
 88 | iload_0 loads the first function argument (a), and iload_1 loads the second (b). After both instructions have executed, the stack looks like this:
 89 | 
 90 |           +---+
 91 |     TOS ->| b |
 92 |           +---+
 93 |           | a |
 94 |           +---+
 95 | 
 97 | iadd adds the two values and leaves the result at the TOS:
 97 | 
 98 |           +--------+
 99 |     TOS ->| result |
100 |           +--------+
101 | 
103 | Let's extend this example to the long data type:
104 | 
105 |     public static long lsum(long a, long b)
106 |     {
107 |         return a+b;
108 |     }
109 | 
110 | ...we get:
111 | 
112 |     public static long lsum(long, long);
113 |     flags: ACC_PUBLIC, ACC_STATIC
114 |     Code:
115 |     stack=4, locals=4, args_size=2
116 |     0: lload_0
117 |     1: lload_2
118 |     2: ladd
119 |     3: lreturn
120 | 
121 | The second lload instruction takes the second argument from slot 2. That is because a 64-bit long value occupies exactly two 32-bit slots.
122 | 
123 | A slightly more complex example:
124 | 
125 |     public class calc
126 |     {
127 |         public static int mult_add(int a, int b, int c)
128 |         {
129 |             return a*b+c;
130 |         }
131 |     }
131 |     
132 |     public static int mult_add(int, int, int);
133 |     flags: ACC_PUBLIC, ACC_STATIC
134 |     Code:
135 |     stack=2, locals=3, args_size=3
136 |     0: iload_0
137 |     1: iload_1
138 |     2: imul
139 |     3: iload_2
140 |     4: iadd
141 |     5: ireturn
142 | 
143 | The first step is the multiplication. The product is left at the TOS:
144 | 
145 |           +---------+
146 |     TOS ->| product |
147 |           +---------+
147 | 
148 | iload_2 loads the third argument (c) onto the stack:
149 | 
150 |           +---------+
151 |     TOS ->|    c    |
152 |           +---------+
153 |           | product |
154 |           +---------+
155 | 
156 | Now the iadd instruction can add the two values.
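
To see the stack discipline in motion, here is a toy interpreter for the handful of instructions used in this section. It is only a sketch for illustration: the StackSim class is my own invention, instructions are given as plain strings, and only the opcodes that appear in the listings above are supported.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackSim {
    // Executes a straight-line program; lva plays the role of the local variable array.
    static int run(String[] program, int[] lva) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : program) {
            switch (insn) {
                case "iload_0": stack.push(lva[0]); break;
                case "iload_1": stack.push(lva[1]); break;
                case "iload_2": stack.push(lva[2]); break;
                case "iconst_2": stack.push(2); break;
                case "iadd": { int b = stack.pop(); int a = stack.pop(); stack.push(a + b); break; }
                case "imul": { int b = stack.pop(); int a = stack.pop(); stack.push(a * b); break; }
                case "idiv": { int b = stack.pop(); int a = stack.pop(); stack.push(a / b); break; }
                case "ireturn": return stack.pop();   // the value at TOS is the result
                default: throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        throw new IllegalStateException("program ended without ireturn");
    }
}
```

Feeding it the body of mult_add() with a=3, b=4, c=5, that is run(new String[]{"iload_0", "iload_1", "imul", "iload_2", "iadd", "ireturn"}, new int[]{3, 4, 5}), walks through exactly the stack states drawn above.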
157 | 
159 | 
160 | 


--------------------------------------------------------------------------------
/Chapter-54/54.4JVM内存模型.md:
--------------------------------------------------------------------------------
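 1 | 54.4 JVM memory model
 2 | 
 3 | x86 and other low-level environments use the stack both for passing arguments and for storing local variables. The JVM is slightly different.
 4 | 
 5 | It has:
 6 | The local variable array (LVA). It is used to store incoming function arguments and local variables. Instructions like iload_0 load values from it, and istore stores values in it. The function arguments come first, starting at slot 0, or at slot 1 if slot 0 is occupied by the this pointer. The local variables are allocated after them.
 7 | 
 8 | Each slot is 32 bits in size, so values of the long and double data types occupy two slots.
 9 | 
10 | The operand stack (or just "the stack"). It is used for computations and for passing arguments when calling other functions. Unlike in low-level environments such as x86, the stack cannot be accessed without instructions that explicitly push values to it or pop values from it.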
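
The slot numbering rule is easy to express in code. The sketch below computes which LVA slot each argument lands in. The LvaSlots class and the idea of passing type names as strings are assumptions made just for this illustration.

```java
class LvaSlots {
    // Returns the LVA slot index of every argument.
    // types: Java type names; "long" and "double" take two slots, everything else one.
    static int[] argumentSlots(boolean isStatic, String... types) {
        int[] slots = new int[types.length];
        int next = isStatic ? 0 : 1;   // slot 0 holds "this" in non-static methods
        for (int i = 0; i < types.length; i++) {
            slots[i] = next;
            boolean wide = types[i].equals("long") || types[i].equals("double");
            next += wide ? 2 : 1;
        }
        return slots;
    }
}
```

For a static method taking (long, long) this yields slots 0 and 2, which is why lsum() above uses lload_0 and lload_2. For a non-static method taking a single int, the argument is in slot 1.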
11 | 


--------------------------------------------------------------------------------
/Chapter-54/54.5简单的函数调用.md:
--------------------------------------------------------------------------------
  1 | 54.5 Simple function calls
  2 | Math.random() returns a pseudorandom number in the range [0.0 ... 1.0). Let's say that, for some reason, we need a function that returns a number in the range [0.0 ... 0.5):
  3 | 
  5 |     public class HalfRandom
  6 |     {
  7 |         public static double f()
  8 |         {
  9 |             return Math.random()/2;
 10 |         }
 11 |     }
 12 | 
 15 | Listing 54.8: Constant pool
 16 | 
 17 |     ...
 18 |     #2 = Methodref #18.#19 // java/lang/Math.random:()D
 22 |     #3 = Double 2.0d
 23 |     ...
 24 |     #12 = Utf8 ()D
 25 |     ...
 26 |     #18 = Class #22 // java/lang/Math
 27 |     #19 = NameAndType #23:#12 // random:()D
 28 |     #22 = Utf8 java/lang/Math
 29 |     #23 = Utf8 random
 29 |     
 30 |     public static double f();
 31 |     flags: ACC_PUBLIC, ACC_STATIC
 32 |     Code:
 33 |     stack=4, locals=0, args_size=0
 34 |     0: invokestatic #2 // Method java/lang/Math.random:()D
 36 |     3: ldc2_w #3 // double 2.0d
 37 |     6: ddiv
 38 |     7: dreturn
 39 | 
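 42 | invokestatic calls the Math.random() function and leaves the result at the TOS. The result is then divided by 2.0 and returned. But how is the function name encoded?
 43 | It is encoded in the constant pool using a Methodref expression, which defines the class and method names. The first field of the Methodref points to a Class expression which, in turn, points to an ordinary text string ("java/lang/Math").
 44 | The second field of the Methodref points to a NameAndType expression, which also links to two strings. The first string is "random", the name of the method. The second string is "()D", which encodes the function's type: it takes no arguments and returns a double value (hence the D in the string). This way, 1) the JVM can check data types for correctness; 2) Java decompilers can restore data types from a compiled class file.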
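
Descriptor strings like "()D" can also be decoded programmatically. The standard java.lang.invoke.MethodType class has a parser for them, so here is a short sketch. The DescriptorDemo wrapper class is a name of my own, and passing null as the class loader simply means "use the system class loader".

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Parses a JVM method descriptor such as "()D" or "(Ljava/lang/String;)V".
    static MethodType parse(String descriptor) {
        // A null loader makes the lookup go through the system class loader.
        return MethodType.fromMethodDescriptorString(descriptor, null);
    }
}
```

For "()D", parse(...).returnType() is double.class and parameterCount() is 0. A descriptor such as "(Ljava/lang/String;)V" describes a method that takes one String and returns nothing.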
 45 | 
 46 | Finally, let's try the "Hello, World" example:
 47 | 
 48 |     public class HelloWorld
 49 |     {
 50 |         public static void main(String[] args)
 51 |         {
 52 |             System.out.println("Hello, World");
 53 |         }
 54 |     }
 55 | 
 57 | Listing 54.9: Constant pool
 58 | 
 65 |     ...
 66 |     #2 = Fieldref #16.#17 // java/lang/System.out:Ljava/io/PrintStream;
 68 |     #3 = String #18 // Hello, World
 69 |     #4 = Methodref #19.#20 // java/io/PrintStream.println:(Ljava/lang/String;)V
 71 |     ...
 72 |     #16 = Class #23 // java/lang/System
 73 |     #17 = NameAndType #24:#25 // out:Ljava/io/PrintStream;
 75 |     #18 = Utf8 Hello, World
 76 |     #19 = Class #26 // java/io/PrintStream
 78 |     #20 = NameAndType #27:#28 // println:(Ljava/lang/String;)V
 80 |     ...
 81 |     #23 = Utf8 java/lang/System
 82 |     #24 = Utf8 out
 83 |     #25 = Utf8 Ljava/io/PrintStream;
 84 |     #26 = Utf8 java/io/PrintStream
 85 |     #27 = Utf8 println
 86 |     #28 = Utf8 (Ljava/lang/String;)V
 87 |     ...
 87 |     
 88 |     public static void main(java.lang.String[]);
 89 |     flags: ACC_PUBLIC, ACC_STATIC
 90 |     Code:
 91 |     stack=2, locals=1, args_size=1
 92 |     0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
 94 |     3: ldc #3 // String Hello, World
 96 |     5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
 98 |     8: return
 99 | 
 99 | The ldc at offset 3 takes a pointer to the "Hello, World" string in the constant pool and pushes it onto the stack. In the Java world it is called a reference, but it is really a pointer, or an address.
 99 | 
100 | The familiar invokevirtual instruction takes the information about the println function (or method) from the constant pool and calls it. As we know, there are several println methods, one for each data type. Here it is the version of println intended for the String data type.
101 | 
102 | But what about the first getstatic instruction? It takes a reference (or address) to the out field of the System class and pushes it onto the stack. This value acts as the this pointer for the println method. So, internally, println takes two arguments: 1) this, i.e., a pointer to the object; 2) the address of the "Hello, World" string. Indeed, println() is called as a method of the initialized System.out object. For convenience, the javap utility writes all this information in the comments.
103 | 
104 | 


--------------------------------------------------------------------------------
/Chapter-54/54.6调用beep函数.md:
--------------------------------------------------------------------------------
 1 | 54.6 Calling beep()
 2 | This is probably the simplest possible call of two functions without arguments:
 3 | 
 5 |     public static void main(String[] args)
 6 |     {
 7 |         java.awt.Toolkit.getDefaultToolkit().beep();
 8 |     };
 8 |     
 9 |     public static void main(java.lang.String[]);
10 |     flags: ACC_PUBLIC, ACC_STATIC
11 |     Code:
12 |     stack=1, locals=1, args_size=1
13 |     0: invokestatic #2 // Method java/awt/Toolkit.getDefaultToolkit:()Ljava/awt/Toolkit;
15 |     3: invokevirtual #3 // Method java/awt/Toolkit.beep:()V
17 |     6: return
18 | 
20 | First, invokestatic at offset 0 calls java.awt.Toolkit.getDefaultToolkit(), which returns a reference to an object of the Toolkit class. The invokevirtual instruction at offset 3 then calls the beep() method of this class.
21 | 
22 | 


--------------------------------------------------------------------------------
/Chapter-54/54.7线性同余随机生成器.md:
--------------------------------------------------------------------------------
 1 | 54.7 Linear congruential PRNG
 2 | Let's try a simple pseudorandom number generator, which we have already considered once in this book (20 on page 500):
 3 | 
 6 |     public class LCG
 7 |     {
 8 |         public static int rand_state;
 9 |         public void my_srand (int init)
10 |         {
11 |             rand_state=init;
12 |         }
13 |         public static int RNG_a=1664525;
14 |         public static int RNG_c=1013904223;
15 |     
16 |         public int my_rand ()
17 |         {
18 |             rand_state=rand_state*RNG_a;
19 |             rand_state=rand_state+RNG_c;
20 |             return rand_state & 0x7fff;
21 |         }
22 |     }
23 | 
25 | There are a couple of class fields which are initialized at start. But how? In the javap output we can find the class constructor:
26 | 
27 |     static {};
28 |     flags: ACC_STATIC
29 |     Code:
30 |     stack=1, locals=0, args_size=0
31 |     0: ldc #5 // int 1664525
32 |     2: putstatic #3 // Field RNG_a:I
33 |     5: ldc #6 // int 1013904223
34 |     7: putstatic #4 // Field RNG_c:I
35 |     10: return
36 | 
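37 | This is how the variables are initialized. In the javap output, RNG_a is referenced through constant pool entry #3 and RNG_c through entry #4, and putstatic stores the constants into these static fields.
38 | 
39 | The my_srand() function just stores the input value in rand_state:
40 | 
41 |     public void my_srand(int);
42 |     flags: ACC_PUBLIC
43 |     Code:
44 |     stack=1, locals=2, args_size=2
45 |     0: iload_1
46 |     1: putstatic #2 // Field rand_state:I
48 |     4: return
49 | 
50 | iload_1 takes the input value and pushes it onto the stack. But why not iload_0? Because my_srand() is a non-static method that may use the fields of the class, so this is passed to it as the zeroth argument. The rand_state field is referenced through constant pool entry #2, and putstatic copies the value from the TOS into it.
51 | 
52 | Now my_rand():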
53 | 
54 |     public int my_rand();
55 |     flags: ACC_PUBLIC
56 |     Code:
57 |     stack=2, locals=1, args_size=1
58 |     0: getstatic #2 // Field rand_state:I
60 |     3: getstatic #3 // Field RNG_a:I
61 |     6: imul
62 |     7: putstatic #2 // Field rand_state:I
64 |     10: getstatic #2 // Field rand_state:I
66 |     13: getstatic #4 // Field RNG_c:I
67 |     16: iadd
68 |     17: putstatic #2 // Field rand_state:I
70 |     20: getstatic #2 // Field rand_state:I
72 |     23: sipush 32767
73 |     26: iand
74 |     27: ireturn
75 | 
76 | It just loads the values of the fields, performs the operations and updates rand_state using the putstatic instruction.
77 | 
78 | At offset 20, rand_state is loaded again (because putstatic had removed it from the stack). This looks like inefficient code, but rest assured that the JVM is usually good enough to optimize such things really well.
79 | 
80 | 


--------------------------------------------------------------------------------
/Chapter-54/54.8条件跳转.md:
--------------------------------------------------------------------------------
  1 | 54.8 Conditional jumps
  2 | Now let's proceed to conditional jumps.
  3 | 
  4 |     public class abs
  5 |     {
  6 |         public static int abs(int a)
  7 |         {
  8 |             if (a<0)
  9 |                 return -a;
 10 |             return a;
 11 |         }
 12 |     }
 12 |     
 13 |     public static int abs(int);
 14 |     flags: ACC_PUBLIC, ACC_STATIC
 15 |     Code:
 16 |     stack=1, locals=1, args_size=1
 17 |     0: iload_0
 18 |     1: ifge 7
 19 |     4: iload_0
 20 |     5: ineg
 21 |     6: ireturn
 22 |     7: iload_0
 23 |     8: ireturn
 24 | 
 27 | ifge jumps to offset 7 if the value at the TOS is greater than or equal to 0. Don't forget that any ifXX instruction pops the value (to be compared) from the stack.
 28 | 
 29 | Another example:
 30 | 
 31 |     public static int min (int a, int b)
 32 |     {
 33 |         if (a>b)
 34 |             return b;
 35 |         return a;
 36 |     }
 37 | 
 39 | We get:
 40 | 
 41 |     public static int min(int, int);
 42 |     flags: ACC_PUBLIC, ACC_STATIC
 43 |     Code:
 44 |     stack=2, locals=2, args_size=2
 45 |     0: iload_0
 46 |     1: iload_1
 47 |     2: if_icmple 7
 48 |     5: iload_1
 49 |     6: ireturn
 50 |     7: iload_0
 51 |     8: ireturn
 52 | 
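 53 | if_icmple pops two values and compares them. If the value pushed first (a) is less than or equal to the value pushed second (b), it jumps to offset 7.
 54 | 
 55 | When we define the max() function...
 56 | 
 57 |     public static int max (int a, int b)
 58 |     {
 59 |         if (a>b)
 60 |             return a;
 61 |         return b;
 62 |     }
 63 | 
 64 | ...the resulting code is the same, but the last two iload instructions (at offsets 5 and 7) are swapped: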
 65 | 
 66 |     public static int max(int, int);
 67 |     flags: ACC_PUBLIC, ACC_STATIC
 68 |     Code:
 69 |     stack=2, locals=2, args_size=2
 70 |     0: iload_0
 71 |     1: iload_1
 72 |     2: if_icmple 7
 73 |     5: iload_0
 74 |     6: ireturn
 75 |     7: iload_1
 76 |     8: ireturn
 77 | 
 79 | A more complex example:
 80 | 
 81 |     public class cond
 82 |     {
 83 |         public static void f(int i)
 84 |         {
 85 |             if (i<100)
 86 |                 System.out.print("<100");
 87 |             if (i==100)
 88 |                 System.out.print("==100");
 89 |             if (i>100)
 90 |                 System.out.print(">100");
 91 |             if (i==0)
 92 |                 System.out.print("==0");
 93 |         }
 94 |     }
 94 |     
 95 |     public static void f(int);
 96 |     flags: ACC_PUBLIC, ACC_STATIC
 97 |     Code:
 98 |     stack=2, locals=1, args_size=1
 99 |     0: iload_0
100 |     1: bipush 100
101 |     3: if_icmpge 14
102 |     6: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
104 |     9: ldc #3 // String <100
105 |     11: invokevirtual #4 // Method java/io/PrintStream.print:(Ljava/lang/String;)V
107 |     14: iload_0
108 |     15: bipush 100
109 |     17: if_icmpne 28
110 |     20: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
112 |     23: ldc #5 // String ==100
113 |     25: invokevirtual #4 // Method java/io/PrintStream.print:(Ljava/lang/String;)V
115 |     28: iload_0
116 |     29: bipush 100
117 |     31: if_icmple 42
118 |     34: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
120 |     37: ldc #6 // String >100
121 |     39: invokevirtual #4 // Method java/io/PrintStream.print:(Ljava/lang/String;)V
123 |     42: iload_0
124 |     43: ifne 54
125 |     46: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
127 |     49: ldc #7 // String ==0
128 |     51: invokevirtual #4 // Method java/io/PrintStream.print:(Ljava/lang/String;)V
130 |     54: return
131 | 
133 | if_icmpge pops two values and compares them. If the value pushed first (i) is greater than or equal to the value pushed second (100), it jumps to offset 14. if_icmpne and if_icmple work the same way but test different conditions.
134 | 
135 | There is also an ifne instruction at offset 43. Its name is a misnomer; I would rather call it ifnz.
136 | 
137 | It jumps if the value at the TOS is not zero. And that is what it does here: it jumps to offset 54 if the input value is not zero. If it is zero, execution continues at offset 46, where the "==0" string is printed.
138 | 
139 | N.B.: The JVM has no unsigned data types, so the comparison instructions operate only on signed integer values.
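
When an unsigned comparison is needed anyway, it has to be emulated on top of the signed one. The sketch below shows two common ways. The UnsignedCompare class is a made-up name, and Integer.compareUnsigned() is only available starting with Java 8, so it does not apply to the JDK 1.7 used for the listings in this chapter.

```java
class UnsignedCompare {
    // Library way (Java 8 and later).
    static boolean lessThanUnsigned(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    // Manual way: flipping the sign bit maps unsigned order onto signed order.
    static boolean lessThanUnsignedManual(int a, int b) {
        return (a ^ Integer.MIN_VALUE) < (b ^ Integer.MIN_VALUE);
    }
}
```

For example, -1 is 0xFFFFFFFF when viewed as unsigned, so 1 is less than -1 in the unsigned sense, although the signed if_icmpXX instructions see it the other way round.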
140 | 
141 | 


--------------------------------------------------------------------------------
/Chapter-54/54.9传递参数值.md:
--------------------------------------------------------------------------------
 1 | 54.9 传递参数值
 2 | 
 3 | 我们来扩展一下min()/max()这个例子。
 4 | 
 5 | 
 6 |     public class minmax
 7 |     {
 8 |     public static int min (int a, int b)
 9 |     {
10 |     if (a>b)
11 |     return b;
12 |     return a;
13 |     }
14 |     public static int max (int a, int b)
15 |     {
16 |     if (a>b)
17 |     return a;
18 |     return b;
19 |     }
20 |     public static void main(String[] args)
21 |     {
22 |     int a=123, b=456;
23 |     int max_value=max(a, b);
24 |     int min_value=min(a, b);
25 |     System.out.println(min_value);
26 |     System.out.println(max_value);
27 |     }
28 |     }
29 | 
30 | 924
31 | 这是main()函数的代码。
32 | 
33 |     public static void main(java.lang.String[]);
34 |     flags: ACC_PUBLIC, ACC_STATIC
35 |     Code:
36 |     stack=2, locals=5, args_size=1
37 |     0: bipush 123
38 |     2: istore_1
39 |     3: sipush 456
40 |     6: istore_2
41 |     7: iload_1
42 |     8: iload_2
43 |     9: invokestatic #2 // Method max:(II)I
45 |     12: istore_3
46 |     13: iload_1
47 |     14: iload_2
48 |     15: invokestatic #3 // Method min:(II)I
50 |     18: istore 4
51 |     20: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
53 |     23: iload 4
54 |     25: invokevirtual #5 // Method java/io/PrintStream.println:(I)V
56 |     28: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
58 |     31: iload_3
59 |     32: invokevirtual #5 // Method java/io/PrintStream.println:(I)V
61 |     35: return
62 | 
63 | 
64 | 
66 | Arguments are passed to the other function on the stack, and the return value is left at the top of the stack (TOS).
67 | 
68 | 


--------------------------------------------------------------------------------
/Chapter-55/55.1_MicrosoftVisualC++.md:
--------------------------------------------------------------------------------
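  1 | #V Finding important or interesting parts of the code
  2 | 
  3 | Minimalism is not a prominent feature of modern software.
  4 | 
  5 | This is not because programmers write a lot of code, but because many libraries are commonly linked statically into executable files. If all external libraries were moved into external DLL files, the picture would be different. (In C++, the STL and other template libraries are another source of bloat.)
  6 | 
  7 | Thus, it is very important to determine the origin of a function: whether it comes from the standard library or another well-known library (such as [Boost](http://go.yurichev.com/17036) or [libpng](http://go.yurichev.com/17037)), or whether it is related to what we are trying to find in the code.
  8 | 
  9 | It is unrealistic to rewrite all the code in C/C++ just to find what we are looking for.
 10 | 
 11 | One of the primary tasks of a reverse engineer is to find the code of interest quickly.
 12 | 
 13 | The IDA disassembler allows us to search among text strings, byte sequences and constants. It is even possible to export the code to .lst or .asm text files and then process them with grep, awk and similar tools.
 14 | 
 15 | When you try to understand what some code does, it may well belong to an easily recognizable open-source library such as libpng. So when some constants or text strings look familiar, it is always worth googling them. If you find that an open-source project is used somewhere, it is enough to compare the functions. This can solve part of the problem.
 16 | 
 17 | For example, if a program uses XML files, the first step may be to determine which XML library is used for processing, since a standard (or well-known) library is usually used instead of a self-made one.
 18 | 
 19 | As another example, I once tried to understand how network packets are compressed and decompressed in SAP 6.0. It is a huge piece of software, but a detailed .PDB file with debugging information was at hand, which was convenient. I eventually found a function called CsDecomprLZC that decompressed network packets. I immediately googled its name and found that the function was also used in MaxDB (an open-source SAP project): [http://www.google.com/search?q=CsDecomprLZC](http://www.google.com/search?q=CsDecomprLZC)
 20 | 
 21 | Astoundingly, MaxDB and SAP 6.0 share the same code for compressing and decompressing network packets.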
 22 | 
 23 | 
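 24 | #Chapter 55
 25 | ##Identification of executable files
 26 | 
 27 | ###55.1 Microsoft Visual C++
 28 | 
 29 | The MSVC versions and the DLLs that can be imported are shown below:
 30 | 
 31 | ![](img/55-1.png)
 32 | 
 33 | 
 34 | msvcp*.dll contains C++-related functions, so if it is imported, this is probably a C++ program.
 35 | 
 36 | 
 37 | ####55.1.1 Name mangling
 38 | 
 39 | Names usually start with the ? symbol.
 40 | 
 41 | More about MSVC name mangling: section 51.1.1
 42 | 
 43 | ###55.2 GCC
 44 | 
 45 | Aside from *NIX targets, GCC is also present in the win32 environment, in the form of Cygwin and MinGW.
 46 | 
 47 | ####55.2.1 Name mangling
 48 | 
 49 | Names usually start with the _Z symbols.
 50 | 
 51 | More about GCC name mangling: section 51.1.1
 52 | 
 53 | ####55.2.2 Cygwin
 54 | 
 55 | cygwin1.dll is often imported.
 56 | 
 57 | ####55.2.3 MinGW
 58 | 
 59 | msvcrt.dll may be imported.
 60 | 
 61 | ###55.3 Intel FORTRAN
 62 | 
 63 | libifcoremd.dll, libifportmd.dll and libiomp5md.dll (OpenMP support) may be imported.
 64 | 
 65 | libifcoremd.dll has a lot of functions prefixed with for_, which stands for FORTRAN.
 66 | 
 67 | 
 68 | ###55.4 Watcom, OpenWatcom
 69 | ####55.4.1 Name mangling
 70 | 
 71 | Names usually start with the W symbol.
 72 | 
 73 | For example, this is how the method named "method" of the class "class", which takes no arguments and returns void, is encoded: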
 74 | 
 75 | ```
 76 | W?method$_class$n__v
 77 | ```								
 78 | 
 79 | ###55.5 Borland 
 80 | Here is an example of name mangling from Borland Delphi and C++Builder:
 81 | 
 82 | ```
 83 | @TApplication@IdleAction$qv
    | @TApplication@ProcessMDIAccels$qp6tagMSG
    | @TModule@$bctr$qpcpvt1
    | @TModule@$bdtr$qv
    | @TModule@ValidWindow$qp14TWindowsObject
    | @TrueColorTo8BitN$qpviiiiiit1iiiiii
    | @TrueColorTo16BitN$qpviiiiiit1iiiiii
    | @DIB24BitTo8BitBitmap$qpviiiiiit1iiiii
    | @TrueBitmap@$bctr$qpcl
    | @TrueBitmap@$bctr$qpvl
    | @TrueBitmap@$bctr$qiilll
 84 | ```
 85 | 
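 86 | Names always start with the @ symbol, followed by the class name, the method name and the encoded types of the method's arguments.
 87 | 
 88 | These names can be found in .exe imports, .dll exports, debug data and so on.
 89 | 
 90 | The Borland Visual Component Library (VCL) is stored in .bpl files instead of .dll ones, for example vcl50.bpl and rtl60.bpl.
 91 | 
 92 | Another DLL that might be imported: BORLNDMM.DLL.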
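
The prefixes listed in this chapter can be turned into a rough first-pass classifier for exported or imported symbol names. The sketch below is only a heuristic under that assumption: `guess_compiler` is a hypothetical helper, the MSVC and GCC sample names are illustrative, and a matching prefix is a hint rather than proof (other decorations may also begin with @, for instance).

```python
def guess_compiler(symbol: str) -> str:
    """Guess the toolchain from the first characters of a mangled name."""
    # "W?" must be tested before "?" so that Watcom names are not reported as MSVC.
    if symbol.startswith("W?"):
        return "Watcom/OpenWatcom"
    if symbol.startswith("?"):
        return "MSVC"
    if symbol.startswith("_Z"):
        return "GCC"
    if symbol.startswith("@"):
        return "Borland"
    return "unknown"


for name in ("W?method$_class$n__v",
             "@TApplication@IdleAction$qv",
             "_ZN1c6methodEv",        # illustrative GCC-style name
             "?method@c@@QAEXXZ"):    # illustrative MSVC-style name
    print(name, "->", guess_compiler(name))
```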
 93 | 
 94 | 
 95 | ####55.5.1 Delphi
 96 | 
 97 | 
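 98 | Almost all Delphi executables have the "Boolean" text string at the beginning of the code segment, along with other type names.
 99 | Here is a very typical beginning of the CODE segment of a Delphi program; this block comes right after the win32 PE file header: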
100 | 
101 | 
102 | ```
103 | 00000400  04 10 40 00 03 07 42 6f  6f 6c 65 61 6e 01 00 00  |..@...Boolean...|
    | 00000410  00 00 01 00 00 00 00 10  40 00 05 46 61 6c 73 65  |........@..False|
    | 00000420  04 54 72 75 65 8d 40 00  2c 10 40 00 09 08 57 69  |.True.@.,.@...Wi|
    | 00000430  64 65 43 68 61 72 03 00  00 00 00 ff ff 00 00 90  |deChar..........|
    | 00000440  44 10 40 00 02 04 43 68  61 72 01 00 00 00 00 ff  |D.@...Char......|
    | 00000450  00 00 00 90 58 10 40 00  01 08 53 6d 61 6c 6c 69  |....X.@...Smalli|
    | 00000460  6e 74 02 00 80 ff ff ff  7f 00 00 90 70 10 40 00  |nt..........p.@.|
    | 00000470  01 07 49 6e 74 65 67 65  72 04 00 00 00 80 ff ff  |..Integer.......|
    | 00000480  ff 7f 8b c0 88 10 40 00  01 04 42 79 74 65 01 00  |......@...Byte..|
    | 00000490  00 00 00 ff 00 00 00 90  9c 10 40 00 01 04 57 6f  |..........@...Wo|
    | 000004a0  72 64 03 00 00 00 00 ff  ff 00 00 90 b0 10 40 00  |rd............@.|
    | 000004b0  01 08 43 61 72 64 69 6e  61 6c 05 00 00 00 00 ff  |..Cardinal......|
    | 000004c0  ff ff ff 90 c8 10 40 00  10 05 49 6e 74 36 34 00  |......@...Int64.|
    | 000004d0  00 00 00 00 00 00 80 ff  ff ff ff ff ff ff 7f 90  |................|
104 | 000004e0  e4 10 40 00 04 08 45 78  74 65 6e 64 65 64 02 90  |..@...Extended..|
    | 000004f0  f4 10 40 00 04 06 44 6f  75 62 6c 65 01 8d 40 00  |..@...Double..@.|
    | 00000500  04 11 40 00 04 08 43 75  72 72 65 6e 63 79 04 90  |..@...Currency..|
    | 00000510  14 11 40 00 0a 06 73 74  72 69 6e 67 20 11 40 00  |..@...string .@.|
    | 00000520  0b 0a 57 69 64 65 53 74  72 69 6e 67 30 11 40 00  |..WideString0.@.|
    | 00000530  0c 07 56 61 72 69 61 6e  74 8d 40 00 40 11 40 00  |..Variant.@.@.@.|
    | 00000540  0c 0a 4f 6c 65 56 61 72  69 61 6e 74 98 11 40 00  |..OleVariant..@.|
    | 00000550  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    | 00000560  00 00 00 00 00 00 00 00  00 00 00 00 98 11 40 00  |..............@.|
    | 00000570  04 00 00 00 00 00 00 00  18 4d 40 00 24 4d 40 00  |.........M@.$M@.|
    | 00000580  28 4d 40 00 2c 4d 40 00  20 4d 40 00 68 4a 40 00  |(M@.,M@. M@.hJ@.|
    | 00000590  84 4a 40 00 c0 4a 40 00  07 54 4f 62 6a 65 63 74  |.J@..J@..TObject|
    | 000005a0  a4 11 40 00 07 07 54 4f  62 6a 65 63 74 98 11 40  |..@...TObject..@|
    | 000005b0  00 00 00 00 00 00 00 06  53 79 73 74 65 6d 00 00  |........System..|
    | 000005c0  c4 11 40 00 0f 0a 49 49  6e 74 65 72 66 61 63 65  |..@...IInterface|
    | 000005d0  00 00 00 00 01 00 00 00  00 00 00 00 00 c0 00 00  |................|
    | 000005e0  00 00 00 00 46 06 53 79  73 74 65 6d 03 00 ff ff  |....F.System....|
    | 000005f0  f4 11 40 00 0f 09 49 44  69 73 70 61 74 63 68 c0  |..@...IDispatch.|
    | 00000600  11 40 00 01 00 04 02 00  00 00 00 00 c0 00 00 00  |.@..............|
    | 00000610  00 00 00 46 06 53 79 73  74 65 6d 04 00 ff ff 90  |...F.System.....|
    | 00000620  cc 83 44 24 04 f8 e9 51  6c 00 00 83 44 24 04 f8  |..D$...Ql...D$..|
    | 00000630  e9 6f 6c 00 00 83 44 24  04 f8 e9 79 6c 00 00 cc  |.ol...D$...yl...|
    | 00000640  cc 21 12 40 00 2b 12 40  00 35 12 40 00 01 00 00  |.!.@.+.@.5.@....|
    | 00000650  00 00 00 00 00 00 00 00  00 c0 00 00 00 00 00 00  |................|
    | 00000660  46 41 12 40 00 08 00 00  00 00 00 00 00 8d 40 00  |FA.@..........@.|
    | 00000670  bc 12 40 00 4d 12 40 00  00 00 00 00 00 00 00 00  |..@.M.@.........|
    | 00000680  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    | 00000690  bc 12 40 00 0c 00 00 00  4c 11 40 00 18 4d 40 00  |..@.....L.@..M@.|
    | 000006a0  50 7e 40 00 5c 7e 40 00  2c 4d 40 00 20 4d 40 00  |P~@.\~@.,M@. M@.|
    | 000006b0  6c 7e 40 00 84 4a 40 00  c0 4a 40 00 11 54 49 6e  |l~@..J@..J@..TIn|
    | 000006c0  74 65 72 66 61 63 65 64  4f 62 6a 65 63 74 8b c0  |terfacedObject..|
    | 000006d0  d4 12 40 00 07 11 54 49  6e 74 65 72 66 61 63 65  |..@...TInterface|
    | 000006e0  64 4f 62 6a 65 63 74 bc  12 40 00 a0 11 40 00 00  |dObject..@...@..|
    | 000006f0  00 06 53 79 73 74 65 6d  00 00 8b c0 00 13 40 00  |..System......@.|
    | 00000700  11 0b 54 42 6f 75 6e 64  41 72 72 61 79 04 00 00  |..TBoundArray...|
    | 00000710  00 00 00 00 00 03 00 00  00 6c 10 40 00 06 53 79  |.........l.@..Sy|
    | 00000720  73 74 65 6d 28 13 40 00  04 09 54 44 61 74 65 54  |stem(.@...TDateT|
    | 00000730  69 6d 65 01 ff 25 48 e0  c4 00 8b c0 ff 25 44 e0  |ime..%H......%D.|
105 | ```
106 | 
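107 | The first 4 bytes of the data segment (DATA) can be 00 00 00 00, 32 13 8B C0 or FF FF FF FF. This information can be useful when dealing with packed/encrypted Delphi executables.
108 | 
109 | 
110 | ###55.6 Other known DLLs
111 | 
112 | *	vcomp*.dll: Microsoft's implementation of OpenMP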
113 | 
114 | 
115 | 


--------------------------------------------------------------------------------
/Chapter-55/img/55-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-55/img/55-1.png


--------------------------------------------------------------------------------
/Chapter-56/56_communication_with_the_outer_world_(win32).md:
--------------------------------------------------------------------------------
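 1 | #Chapter 56 
 2 | ##Communication with the outer world (win32)
 3 | 
 4 | 
 5 | Sometimes it is enough to observe a function's inputs and outputs in order to understand what it does. This can save time.
 6 | 
 7 | File and registry access: for very basic analysis, the [Process Monitor](http://go.yurichev.com/17301) utility from SysInternals can help.
 8 | 
 9 | For basic analysis of network access, Wireshark can be useful.
10 | 
11 | But then you will have to look inside anyway.
12 | 
13 | The first thing to look for is which functions from the OS API and from the standard libraries are used.
14 | 
15 | If the program is divided into a main executable file and a group of DLL files, the names of the functions in these DLLs can sometimes help.
16 | 
17 | If we are interested in exactly what can lead to a call to MessageBox() with a specific text, we can try to find this text in the data segment, find the references to it and find the points from which control may be passed to the MessageBox() call we are interested in.
18 | 
19 | If we are talking about a video game and are interested in which events are more or less random in it, we may try to find the rand() function or its replacements (such as the Mersenne twister algorithm), find the places from which those functions are called and, more importantly, see how the results are used.
20 | 
21 | But if it is not a game and rand() is still used, it is also interesting to find out why. There are cases of unexpected rand() usage in data compression algorithms (for encryption imitation): [blog.yurichev.com](blog.yurichev.com)
22 | 
23 | 
24 | ###56.1 Often used functions in the Windows API
25 | 
26 | These functions may be among the imported ones. It is worth noting that not every function is necessarily used in code written by the programmer: many may be called from library functions and CRT code.
27 | 
28 | *	Registry access (advapi32.dll): RegEnumKeyEx, RegEnumValue, RegGetValue, RegOpenKeyEx, RegQueryValueEx
29 | *	Access to text .ini files (kernel32.dll): GetPrivateProfileString
30 | *	Resource access (68.2.8) (user32.dll): LoadMenu
31 | *	TCP/IP networking (ws2_32.dll): WSARecv, WSASend
32 | *	File access (kernel32.dll): CreateFile, ReadFile, ReadFileEx, WriteFile, WriteFileEx
33 | *	High-level Internet access (winhttp.dll): WinHttpOpen
34 | *	Checking the digital signature of an executable file (wintrust.dll): WinVerifyTrust
35 | *	The standard MSVC library (if it is linked dynamically) (msvcr*.dll): assert, itoa, ltoa, open, printf, read, strcmp, atol, atoi, fopen, fread, fwrite, memcmp, rand, strlen, strstr, strchr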
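
A list like the one above lends itself to a quick triage of an import table. The following sketch assumes the import names have already been extracted by some PE tool; `API_GROUPS` and `group_imports` are hypothetical names, and the table covers only part of the list rather than the whole Windows API.

```python
# Categories and names are taken from the list above; this is not a complete map.
API_GROUPS = {
    "registry access": {"RegEnumKeyEx", "RegEnumValue", "RegGetValue",
                        "RegOpenKeyEx", "RegQueryValueEx"},
    ".ini file access": {"GetPrivateProfileString"},
    "TCP/IP networking": {"WSARecv", "WSASend"},
    "file access": {"CreateFile", "ReadFile", "ReadFileEx",
                    "WriteFile", "WriteFileEx"},
    "digital signature check": {"WinVerifyTrust"},
}


def group_imports(import_names):
    """Return {category: [names]} for the imports that match a known group."""
    grouped = {}
    for name in import_names:
        # Many win32 functions exist as NameA/NameW pairs; try the bare name too.
        candidates = [name]
        if name.endswith(("A", "W")):
            candidates.append(name[:-1])
        for category, known in API_GROUPS.items():
            if any(candidate in known for candidate in candidates):
                grouped.setdefault(category, []).append(name)
                break
    return grouped


print(group_imports(["RegOpenKeyExW", "CreateFileA", "WSASend", "Sleep"]))
```

Names that match no group (Sleep here) are simply left out, so the result shows at a glance which kinds of outside-world access the module is likely to perform.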
36 | 
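37 | ###56.2 tracer: Intercepting all functions in a specific module
38 | 
39 | tracer supports INT3 breakpoints that are triggered only once but can be set on all functions in a specific DLL.
40 | 
41 | ```
42 | --one-time-INT3-bp:somedll.dll!.*
43 | ```
44 | 
45 | Or, let's set INT3 breakpoints on all functions with the xml prefix in their name:
46 | 
47 | ```
48 | --one-time-INT3-bp:somedll.dll!xml.*
49 | ```
50 | 
51 | On the other side of the coin, such breakpoints are triggered only once.
52 | 
53 | Tracer will show the call to a function if it happens, but only once. Another drawback is that it is impossible to see the function's arguments.
54 | 
55 | Nevertheless, this feature is very useful when you know that the program uses a DLL but do not know which functions are actually used, and there are a lot of functions.
56 | 
57 | For example, let's see what the uptime utility from cygwin uses: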
58 | 
59 | ```
   | tracer -l:uptime.exe --one-time-INT3-bp:cygwin1.dll!.*
   | ```
60 | Thus we can see all the cygwin1.dll library functions that were called at least once, and from where:
   | 
   | ```
61 | One-time INT3 breakpoint: cygwin1.dll!__main (called from uptime.exe!OEP+0x6d (0x40106d))
   | One-time INT3 breakpoint: cygwin1.dll!_geteuid32 (called from uptime.exe!OEP+0xba3 (0x401ba3))
   | One-time INT3 breakpoint: cygwin1.dll!_getuid32 (called from uptime.exe!OEP+0xbaa (0x401baa))
   | One-time INT3 breakpoint: cygwin1.dll!_getegid32 (called from uptime.exe!OEP+0xcb7 (0x401cb7))
   | One-time INT3 breakpoint: cygwin1.dll!_getgid32 (called from uptime.exe!OEP+0xcbe (0x401cbe))
   | One-time INT3 breakpoint: cygwin1.dll!sysconf (called from uptime.exe!OEP+0x735 (0x401735))
   | One-time INT3 breakpoint: cygwin1.dll!setlocale (called from uptime.exe!OEP+0x7b2 (0x4017b2))
   | One-time INT3 breakpoint: cygwin1.dll!_open64 (called from uptime.exe!OEP+0x994 (0x401994))
   | One-time INT3 breakpoint: cygwin1.dll!_lseek64 (called from uptime.exe!OEP+0x7ea (0x4017ea))
   | One-time INT3 breakpoint: cygwin1.dll!read (called from uptime.exe!OEP+0x809 (0x401809))
   | One-time INT3 breakpoint: cygwin1.dll!sscanf (called from uptime.exe!OEP+0x839 (0x401839))
   | One-time INT3 breakpoint: cygwin1.dll!uname (called from uptime.exe!OEP+0x139 (0x401139))
   | One-time INT3 breakpoint: cygwin1.dll!time (called from uptime.exe!OEP+0x22e (0x40122e))
   | One-time INT3 breakpoint: cygwin1.dll!localtime (called from uptime.exe!OEP+0x236 (0x401236))
   | One-time INT3 breakpoint: cygwin1.dll!sprintf (called from uptime.exe!OEP+0x25a (0x40125a))
   | One-time INT3 breakpoint: cygwin1.dll!setutent (called from uptime.exe!OEP+0x3b1 (0x4013b1))
   | One-time INT3 breakpoint: cygwin1.dll!getutent (called from uptime.exe!OEP+0x3c5 (0x4013c5))
   | One-time INT3 breakpoint: cygwin1.dll!endutent (called from uptime.exe!OEP+0x3e6 (0x4013
   | ```


--------------------------------------------------------------------------------
/Chapter-57/57.1_text_strings.md:
--------------------------------------------------------------------------------
 1 | #Chapter 57 
 2 | ##Strings
 3 | 
 4 | ### 57.1 Text strings
 5 | 
 6 | #### 57.1.1 C/C++
 7 | 
 8 | Ordinary C strings are zero-terminated (ASCIIZ strings).
 9 | 
10 | The reason the C string format is the way it is (zero-terminated) is apparently historical. In [Rit79] we read:
11 | 
12 | ```
13 | A minor difference was that the unit of I/O was the word, not the byte, because the PDP-7 was a word-addressed machine. In practice this meant merely that all programs dealing with character streams ignored null characters, because null was used to pad a file to an even number of characters.
14 | ```
15 | 
16 | In Hiew or FAR Manager these strings look like this:
17 | 
18 | ```
19 | #include <stdio.h>
   | 
   | int main()
   | {
   |     printf ("Hello, world!\n");
   | }
   | ```
20 | ![](img/57-1.png)
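
Because ASCIIZ strings are just printable bytes followed by a zero byte, they are easy to pull out of a binary. The sketch below is a minimal illustration of that idea, not a replacement for the strings utility; `asciiz_strings` and the sample bytes are hypothetical.

```python
import re


def asciiz_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for runs of printable ASCII that end with a zero byte."""
    pattern = re.compile(rb"([\x20-\x7e\t\n\r]{%d,})\x00" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group(1).decode("ascii")


# Made-up bytes: two junk bytes, one real string, then a run that is too short.
sample = b"\x01\x02Hello, world!\n\x00\xff\xfeabc\x00"
for offset, text in asciiz_strings(sample):
    print(hex(offset), repr(text))
```

The minimum length filters out most accidental matches; short runs such as "abc" above are ignored.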
21 | 
22 | ####57.1.2 Borland Delphi
23 | 
24 | A string in Pascal and Borland Delphi is preceded by an 8-bit or 32-bit string length.
25 | 
26 | For example:
27 | ```
28 | CODE:00518AC8                 dd 19h
   | CODE:00518ACC aLoading___Plea db 'Loading... , please wait.',0
   | ...
   | CODE:00518AFC                 dd 10h
29 | CODE:00518B00 aPreparingRun__ db 'Preparing run...',0
   | ```
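
Reading such a string back is a matter of decoding the length field first. The sketch below assumes the 32-bit little-endian length variant shown in the listing (`dd 19h` followed by 0x19 characters); `read_delphi_string` is a hypothetical helper, and the blob is rebuilt by hand from the listing.

```python
import struct


def read_delphi_string(data: bytes, offset: int) -> str:
    """Read a string that is preceded by a 32-bit little-endian length."""
    (length,) = struct.unpack_from("<I", data, offset)
    start = offset + 4
    return data[start:start + length].decode("latin-1")


# Bytes equivalent to: dd 19h / db 'Loading... , please wait.',0
blob = struct.pack("<I", 0x19) + b"Loading... , please wait.\x00"
print(read_delphi_string(blob, 0))
```

The trailing zero byte is not counted in the length, which is why the first string in the listing is 0x19 (25) characters long.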
30 | 
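31 | ####57.1.3 Unicode
32 | 
33 | Unicode is often thought of as a special way of encoding strings in which each character occupies 2 bytes, or 16 bits. This is a common terminological mistake. Unicode is a standard that assigns a number to each character of the many writing systems of the world; it does not describe an encoding method.
34 | 
35 | The most popular encoding methods are UTF-8 (widespread on the Internet and *NIX systems) and UTF-16LE (used in Windows).
36 | 
37 | **UTF-8**
38 | 
39 | UTF-8 is one of the most successful methods of encoding characters. All Latin symbols are encoded just as in ASCII, and symbols beyond the ASCII table are encoded using several bytes. 0 is encoded as before, so all standard C string functions work with UTF-8 strings just as with any other string.
40 | 
41 | Let's see how symbols in various languages are encoded in UTF-8 and how they look in FAR, using [code page 437](http://go.yurichev.com/17304):
42 | ![](img/57-2.png)
43 | As you can see, the English string looks the same as it does in ASCII. Hungarian uses some Latin symbols plus symbols with diacritic marks. These symbols are encoded using several bytes; I underlined them in red. The same goes for Icelandic and Polish. I also used the "Euro" currency symbol at the start, which is encoded with 3 bytes. The rest of the writing systems here have no connection with Latin. At least in Russian, Arabic, Hebrew and Hindi we can see recurring bytes, which is not surprising: all the symbols of a writing system are usually located in the same Unicode table, so their code points begin with the same digits. At the very beginning, before the "How much?" string, we see 3 more bytes, which are in fact the BOM. The BOM defines the encoding system in use.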
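
These observations are easy to reproduce. The sketch below uses only Python's built-in codecs; `utf8_hex` is a hypothetical helper, and the Russian word is an arbitrary sample rather than the text from the screenshot.

```python
import codecs


def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


print(utf8_hex("How much?"))   # one byte per character, same as ASCII
print(utf8_hex("€"))           # the Euro sign takes three bytes: e2 82 ac
print(utf8_hex("Привет"))      # every Cyrillic letter starts with d0 or d1
print(" ".join(f"{byte:02x}" for byte in codecs.BOM_UTF8))  # ef bb bf
```

The recurring d0/d1 lead bytes in the Cyrillic sample are the kind of repetition visible in the screenshot for non-Latin writing systems.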
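44 | **UTF-16LE**
   | 
   | Many win32 functions in Windows have the suffixes -A and -W. The first type of function works with ordinary strings, while the second works with UTF-16LE strings (wide), in which each symbol is usually stored in a 16-bit value of type short.
45 | Latin symbols in UTF-16 strings look in Hiew or FAR as if they were interleaved with zero bytes:
46 | ```
47 | int wmain() {wprintf (L"Hello, world!\n");};
48 | ```
   | ![](img/57-3.png)
49 | We can often see this in Windows NT system files:
   | ![](img/57-4.png)
   | Strings whose characters occupy exactly 2 bytes are called "Unicode" in IDA: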
50 | 
51 | ```
52 | .data:0040E000 aHelloWorld:
   | .data:0040E000                 unicode 0, <Hello, world!>
   | .data:0040E000                 dw 0Ah, 0
53 | ```
54 | 
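55 | Here is how a Russian-language string is encoded in UTF-16LE:
56 | ![](img/57-5.png)
57 | What is easy to spot is that the symbols are interleaved with the diamond character (which has the ASCII code 4). Indeed, Cyrillic symbols occupy the Unicode range 0x400 to 0x4FF, so in UTF-16LE the high byte of every Cyrillic symbol is 4.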
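
Both effects, the zero bytes after Latin letters and the byte 4 after Cyrillic ones, can be checked directly. The sketch below relies only on Python's built-in utf-16-le codec; `utf16le_hex` is a hypothetical helper and the sample words are arbitrary.

```python
def utf16le_hex(text: str) -> str:
    """Return the UTF-16LE encoding of text as space-separated hex bytes."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-16-le"))


print(utf16le_hex("Hello"))    # 48 00 65 00 6c 00 6c 00 6f 00
print(utf16le_hex("Привет"))   # every second byte is 04

# The high byte of each 16-bit unit is the odd-indexed byte in little-endian order.
high_bytes = set("Привет".encode("utf-16-le")[1::2])
print(high_bytes)
```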
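58 | Let's go back to the example with the string written in multiple languages. Here is how it looks in UTF-16LE.
   | ![](img/57-6.png)
59 | 
60 | Here we can also see the BOM at the very beginning. All Latin characters are interleaved with a zero byte. Some characters with diacritic marks (Hungarian and Icelandic) are also underscored in red.
61 | ####57.1.4 Base64
62 | The Base64 encoding is highly popular for cases where binary data has to be transferred as a text string. In essence, the algorithm encodes 3 binary bytes into 4 printable characters drawn from the Latin letters (both lowercase and uppercase), the digits, the plus sign ("+") and the slash ("/"), 64 characters in total.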
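
The 3-bytes-to-4-characters rule, and what happens when the input length is not a multiple of 3, can be seen with Python's standard base64 module. The inputs below are arbitrary sample bytes.

```python
import base64

# 3 input bytes become exactly 4 output characters, with no padding.
print(base64.b64encode(b"abc"))   # b'YWJj'

# 2 or 1 leftover bytes still produce a 4-character group, padded with "=".
print(base64.b64encode(b"ab"))    # b'YWI='
print(base64.b64encode(b"a"))     # b'YQ=='

# Decoding reverses the process.
print(base64.b64decode("YWI="))   # b'ab'
```

This is why an encoded string ends with zero, one or two "=" characters and never more.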
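63 | One distinctive feature of base64 strings is that they often (but not always) end with 1 or 2 padding equality symbols ("="), for example:
   | ```
64 | AVjbbVSVfcUMu1xvjaMgjNtueRwBbxnyJw8dpGnLW8ZW8aKG3v4Y0icuQT+qEJAp9lAOuWs=
65 | ```
66 | ```
67 | WVjbbVSVfcUMu1xvjaMgjNtueRwBbxnyJw8dpGnLW8ZW8aKG3v4Y0icuQT+qEJAp9lAOuQ==
   | ```
   | The equality sign ("=") never occurs in the middle of a base64-encoded string.
   | ###57.2 Error/debug messages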
68 | 
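69 | Debugging messages are very helpful if present. In a sense, they report what is going on in the program right now. Often these are printf()-like functions that write to log files, or that write nothing at all in a release build even though the calls are still present. If local or global variables are dumped in debug messages, that can be helpful as well, since at least the variable names can be recovered. For example, one such function in Oracle RDBMS is ksdwrt().
70 | 
71 | Meaningful text strings are often helpful. The IDA disassembler can show from which function and from which point a specific string is used. Funny cases sometimes happen.
72 | 
73 | Error messages may help us as well. In Oracle RDBMS, errors are reported using a group of functions.
74 | 
75 | You can read more about them here: [blog.yurichev.com](blog.yurichev.com).
76 | 
77 | It is possible to find out quickly which functions report errors and under which conditions. By the way, this is often the reason copy-protection systems use inarticulate, cryptic error messages or just error numbers. No one is happy when a software cracker quickly understands why the copy protection was triggered just from the error message.
78 | 
79 | One example of encoded error messages is here: section 78.2
80 | ###57.3 Suspicious magic strings
81 | Some magic strings that are used in backdoors look pretty suspicious. For example, there was a [backdoor in the TP-Link WR740 home router](http://sekurak.pl/tp-link-httptftp-backdoor/). The backdoor was activated by visiting the following URL: [http://192.168.0.1/userRpmNatDebugRpm26525557/start_art.html](http://192.168.0.1/userRpmNatDebugRpm26525557/start_art.html).
82 | Indeed, the "userRpmNatDebugRpm26525557" string is present in the firmware. This string was not googleable until information about the backdoor was widely disclosed. You will not find it in any RFC. You will not find any computer science algorithm that uses such a strange byte sequence. And it does not look like an error or debugging message. So it is a good idea to inspect the use of such weird strings. Sometimes such strings are encoded using base64, so it is also a good idea to decode them all and scan them visually; even a glance should be enough.
83 | More precisely, this method of hiding backdoors is called "security through obscurity".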


--------------------------------------------------------------------------------
/Chapter-57/img/57-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-57/img/57-1.png


--------------------------------------------------------------------------------
/Chapter-57/img/57-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-57/img/57-2.png


--------------------------------------------------------------------------------
/Chapter-57/img/57-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-57/img/57-3.png


--------------------------------------------------------------------------------
/Chapter-57/img/57-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-57/img/57-4.png


--------------------------------------------------------------------------------
/Chapter-57/img/57-5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-57/img/57-5.png


--------------------------------------------------------------------------------
/Chapter-57/img/57-6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-57/img/57-6.png


--------------------------------------------------------------------------------
/Chapter-58/58_call_to_assert.md:
--------------------------------------------------------------------------------
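 1 | #Chapter 58 
 2 | ##Calls to assert()
 3 | 
 4 | Sometimes the presence of the assert() macro is useful too: this macro commonly leaves the source file name, the line number and the condition in the code.
 5 | 
 6 | The most useful information is contained in the assert's condition: we can deduce variable names or structure field names from it. Another useful piece of information is the file names, from which we can try to deduce what type of code is there. It is also possible to recognize well-known open-source libraries by their file names.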
 7 | 
 8 | ```
 9 | .text:107D4B29 mov  dx, [ecx+42h]
   | .text:107D4B2D cmp  edx, 1
   | .text:107D4B30 jz   short loc_107D4B4A
   | .text:107D4B32 push 1ECh
   | .text:107D4B37 push offset aWrite_c ; "write.c"
   | .text:107D4B3C push offset aTdTd_planarcon ; "td->td_planarconfig == PLANARCONFIG_CON"...
   | .text:107D4B41 call ds:_assert
   | 
   | ...
   | 
   | .text:107D52CA mov  edx, [ebp-4]
   | .text:107D52CD and  edx, 3
   | .text:107D52D0 test edx, edx
   | .text:107D52D2 jz   short loc_107D52E9
   | .text:107D52D4 push 58h
   | .text:107D52D6 push offset aDumpmode_c ; "dumpmode.c"
   | .text:107D52DB push offset aN30     ; "(n & 3) == 0"
   | .text:107D52E0 call ds:_assert
   | 
   | ...
   | 
   | .text:107D6759 mov  cx, [eax+6]
   | .text:107D675D cmp  ecx, 0Ch
   | .text:107D6760 jle  short loc_107D677A
   | .text:107D6762 push 2D8h
   | .text:107D6767 push offset aLzw_c   ; "lzw.c"
   | .text:107D676C push offset aSpLzw_nbitsBit ; "sp->lzw_nbits <= BITS_MAX"
   | .text:107D6771 call ds:_assert
10 | ```
11 | 
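12 | It is advisable to google both the conditions and the file names, which can lead us to an open-source library. For example, if we google "sp->lzw_nbits <= BITS_MAX", the results predictably include open-source code related to LZW compression.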
13 | 
14 | 
15 | 
16 | 


--------------------------------------------------------------------------------
/Chapter-59/59_constans.md:
--------------------------------------------------------------------------------
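 1 | #Chapter 59 
 2 | ##Constants
 3 | 
 4 | Humans, including programmers, often use round numbers such as 10, 100 and 1000 in real life as well as in code.
 5 | 
 6 | The practicing reverse engineer usually knows them well in hexadecimal representation: 10=0xA, 100=0x64, 1000=0x3E8, 10000=0x2710.
 7 | 
 8 | The constants 0xAAAAAAAA (10101010101010101010101010101010) and 0x55555555 (01010101010101010101010101010101) are also popular: they are composed of alternating bits. For example, the 0x55AA constant is used at least in the boot sector, the MBR and the ROM of IBM-compatible extension cards.
 9 | 
10 | Some algorithms, especially cryptographic ones, use distinctive constants that are easy to find in the code using IDA.
11 | 
12 | For example, the MD5 algorithm initializes its internal variables like this: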
13 | 
14 | var int h0 := 0x67452301
15 | var int h1 := 0xEFCDAB89
16 | var int h2 := 0x98BADCFE
17 | var int h3 := 0x10325476
18 | 
19 | If you find these four constants used in the code in a row, it is highly probable that the function is related to MD5.
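
A search for these constants can be sketched in a few lines. The sketch assumes the constants are stored as 32-bit little-endian values, as they would be in x86 code or data; `find_md5_init` and `looks_like_md5` are hypothetical helpers, and a hit is only a hint that MD5 may be present.

```python
import struct

MD5_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def find_md5_init(data: bytes) -> dict:
    """Map each MD5 initial value to the offsets where it occurs as a little-endian dword."""
    hits = {}
    for const in MD5_INIT:
        needle = struct.pack("<I", const)
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        hits[const] = offsets
    return hits


def looks_like_md5(data: bytes) -> bool:
    """True if all four initial values occur somewhere in data."""
    return all(find_md5_init(data).values())


# Made-up blob: some padding followed by the four constants in a row.
blob = b"\x90" * 3 + b"".join(struct.pack("<I", c) for c in MD5_INIT)
print(looks_like_md5(blob))
```

Note that 0x67452301 appears in the file as the bytes 01 23 45 67, so searching for the hexadecimal digits in their written order would miss it.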
20 | 
21 | Another example is the CRC16/CRC32 algorithms, whose implementations often use precomputed tables like this one:
22 | 
23 | ```
24 | /** CRC table for the CRC-16. The poly is 0x8005 (x^16 + x^15 + x^2 + 1) */
   | u16 const crc16_table[256] = {
   |         0x0000, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241,
   |         0xC601, 0x06C0, 0x0780, 0xC741, 0x0500, 0xC5C1, 0xC481, 0x0440,
   |         0xCC01, 0x0CC0, 0x0D80, 0xCD41, 0x0F00, 0xCFC1, 0xCE81, 0x0E40,
   |         ...
25 | ```
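
Such a table does not have to be copied from anywhere: it can be regenerated and then searched for in a binary. The sketch below assumes the common bit-reflected form of this CRC-16, in which the polynomial 0x8005 appears as 0xA001; `make_crc16_table` and `find_table` are hypothetical helpers, and the table is assumed to be stored as little-endian 16-bit words.

```python
import struct


def make_crc16_table(reflected_poly: int = 0xA001) -> list:
    """Build the 256-entry table for the bit-reflected CRC-16 (0xA001 is 0x8005 reversed)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ reflected_poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


def find_table(data: bytes, table: list) -> int:
    """Return the offset of the table stored as little-endian 16-bit words, or -1."""
    needle = b"".join(struct.pack("<H", value) for value in table)
    return data.find(needle)


table = make_crc16_table()
print([f"0x{value:04X}" for value in table[:4]])
```

The first generated entries match the beginning of the table above (0x0000, 0xC0C1, 0xC181, 0x0140), which is what makes such tables good signatures.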
26 | 
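27 | See also the precomputed table for CRC32: section 37
28 | 
29 | ###59.1 Magic numbers
30 | 
31 | Many file formats define a standard file header in which a magic number (or several) is used.
32 | 
33 | For example, all Win32 and MS-DOS executables start with the two characters "MZ".
34 | 
35 | At the beginning of a MIDI file the "MThd" signature must be present. If we have a program that uses MIDI files for something, it very likely has to check the file for validity by checking at least the first 4 bytes.
36 | 
37 | This could be done like this:
38 | 
39 | (buf points to the beginning of the file loaded into memory)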
40 | 
41 | ```
42 | cmp [buf], 0x6468544D ; "MThd"
43 | jnz _error_not_a_MIDI_file
44 | 
45 | ```
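46 | It could also be done by calling a function for comparing memory blocks, such as memcmp(), or by any equivalent code, down to a CMPSB instruction (section A.6.3).
47 | 
48 | When you find such a point, you can already say where the loading of the MIDI file starts. You can also see the location of the buffer with the contents of the MIDI file, what is used from the buffer and how.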
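
The relation between the "MThd" text and the 0x6468544D constant in the CMP instruction is just little-endian byte order. The sketch below shows both ways of performing the check; `is_midi` and `is_midi_dword` are hypothetical helpers and the sample header bytes are made up.

```python
import struct


def is_midi(buf: bytes) -> bool:
    """Signature check done the memcmp() way: compare the first 4 bytes."""
    return buf[:4] == b"MThd"


def is_midi_dword(buf: bytes) -> bool:
    """Signature check done the CMP way: read a little-endian dword and compare."""
    if len(buf) < 4:
        return False
    (first_dword,) = struct.unpack_from("<I", buf)
    return first_dword == 0x6468544D


header = b"MThd\x00\x00\x00\x06"
print(is_midi(header), is_midi_dword(header))
print(hex(int.from_bytes(b"MThd", "little")))   # 0x6468544d
```

When searching a disassembly for a file signature, it is therefore worth looking for the byte-swapped constant as well as for the text itself.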
49 | 
50 | 
51 | ####59.1.1 DHCP
52 | 
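53 | This applies to network protocols as well. For example, the packets of the DHCP protocol contain the so-called magic cookie: 0x63538263. Any code that generates DHCP packets must embed this constant into the packet somewhere. If we find it in the code, we may find where this happens, and not only that. Any program that can receive DHCP packets must verify the magic cookie by comparing it with the constant.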
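
The constant looks the way it does because of byte order. RFC 2131 gives the cookie as the four bytes 99, 130, 83, 99 at the start of the options field, which follows the 236-byte fixed part of the message. The sketch below reads them both ways; `has_dhcp_cookie` is a hypothetical helper and the packet is a dummy buffer.

```python
import struct

# The magic cookie as it appears on the wire (RFC 2131: 99.130.83.99).
DHCP_MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])

# Read as a little-endian dword (what x86 code compares against) ...
print(hex(struct.unpack("<I", DHCP_MAGIC_COOKIE)[0]))   # 0x63538263
# ... and in network (big-endian) order.
print(hex(struct.unpack(">I", DHCP_MAGIC_COOKIE)[0]))   # 0x63825363


def has_dhcp_cookie(packet: bytes) -> bool:
    """Check for the cookie right after the 236-byte fixed BOOTP header."""
    return packet[236:240] == DHCP_MAGIC_COOKIE


dummy_packet = bytes(236) + DHCP_MAGIC_COOKIE + b"\xff"
print(has_dhcp_cookie(dummy_packet))
```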
54 | 
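55 | For example, let's take the dhcpcore.dll file from Windows 7 x64 and search for the constant. We find it twice: it seems that the constant is used in two functions with the descriptive names DhcpExtractOptionsForValidation() and DhcpExtractFullOptions():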
56 | 
57 | ```
58 | .rdata:000007FF6483CBE8 dword_7FF6483CBE8 dd 63538263h ; DATA XREF: DhcpExtractOptionsForValidation+79
59 | .rdata:000007FF6483CBEC dword_7FF6483CBEC dd 63538263h ; DATA XREF: DhcpExtractFullOptions+97
61 | ```
62 | 
63 | And here are the places where these constants are accessed:
64 | ```
65 | .text:000007FF6480875F  mov	eax, [rsi]
   | .text:000007FF64808761  cmp	eax, cs:dword_7FF6483CBE8
   | .text:000007FF64808767  jnz	loc_7FF64817179
   | ```
66 | 
67 | And:
68 | ```
69 | .text:000007FF648082C7  mov	eax, [r12]
   | .text:000007FF648082CB  cmp	eax, cs:dword_7FF6483CBEC
   | .text:000007FF648082D1  jnz	loc_7FF648173AF
   | ```
70 | ###59.2 Searching for constants
71 | It is easy in IDA: use Alt-B or Alt-I. For searching for a constant in a large pile of files, or in non-executable files, I use a small utility of my own called [binary grep](http://go.yurichev.com/17017).


--------------------------------------------------------------------------------
/Chapter-60/finding_the_right_instructions.md:
--------------------------------------------------------------------------------
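 1 | # Chapter 60 
 2 | ##Finding the right instructions
 3 | 
 4 | If the program uses FPU instructions and there are very few of them in the code, one can try to check each of them manually in a debugger.
 5 | 
 6 | For example, we may be interested in how Microsoft Excel calculates the formulae entered by the user, for example the division operation.
 7 | 
 8 | If we load excel.exe (from Office 2010), version 14.0.4756.1000, into IDA, make a full listing and find every FDIV instruction (except those that use constants as the second operand, which obviously do not interest us):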
 9 | 
10 | ```
11 | cat EXCEL.lst | grep fdiv | grep -v dbl_ > EXCEL.fdiv
12 | ```
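13 | ...then we see that there are 144 of them.
14 | 
15 | We can enter a string like "=(1/3)" in Excel and check each instruction.
16 | 
17 | By checking each instruction in a debugger or tracer (one may check 4 instructions at a time), we get lucky and find that the sought-for instruction is just the 14th: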
18 | 
19 | ```
20 | .text:3011E919 DC 33		fdiv    qword ptr [ebx]
21 | ```
22 | ```
23 | PID=13944|TID=28744|(0) 0x2f64e919 (Excel.exe!BASE+0x11e919)
   | EAX=0x02088006 EBX=0x02088018 ECX=0x00000001 EDX=0x00000001
   | ESI=0x02088000 EDI=0x00544804 EBP=0x0274FA3C ESP=0x0274F9F8
   | EIP=0x2F64E919
   | FLAGS=PF IF
   | FPU ControlWord=IC RC=NEAR PC=64bits PM UM OM ZM DM IM
   | FPU StatusWord=
   | FPU ST(0): 1.000000
24 | ```
25 | ST(0) holds the first argument (1) and the second one is in [EBX].
26 | 
27 | The instruction after FDIV (FSTP) writes the result to memory:
28 | 
29 | ```	
30 | .text:3011E91B DD 1E		fstp    qword ptr [esi]
31 | ```
32 | 
33 | If we set a breakpoint on it, we can see the result:
34 | 
35 | ```
36 | PID=32852|TID=36488|(0) 0x2f40e91b (Excel.exe!BASE+0x11e91b)
   | EAX=0x00598006 EBX=0x00598018 ECX=0x00000001 EDX=0x00000001
   | ESI=0x00598000 EDI=0x00294804 EBP=0x026CF93C ESP=0x026CF8F8
   | EIP=0x2F40E91B
   | FLAGS=PF IF
   | FPU ControlWord=IC RC=NEAR PC=64bits PM UM OM ZM DM IM
   | FPU StatusWord=C1 P
   | FPU ST(0): 0.333333
37 | ```
38 | 
39 | Also, as a practical joke, we can modify the value on the fly:
40 | ```
41 | tracer -l:excel.exe bpx=excel.exe!BASE+0x11E91B,set(st0,666)
   | ```
42 | ```
43 | PID=36540|TID=24056|(0) 0x2f40e91b (Excel.exe!BASE+0x11e91b)
   | EAX=0x00680006 EBX=0x00680018 ECX=0x00000001 EDX=0x00000001
   | ESI=0x00680000 EDI=0x00395404 EBP=0x0290FD9C ESP=0x0290FD58
   | EIP=0x2F40E91B
   | FLAGS=PF IF
   | FPU ControlWord=IC RC=NEAR PC=64bits PM UM OM ZM DM IM
   | FPU StatusWord=C1 P
   | FPU ST(0): 0.333333
   | Set ST0 register to 666.000000
44 | ```
   | Excel shows 666 in the cell, which finally convinces us that we have found the right point.
45 | ![](img/60-1.png)
46 | 
47 | If we try the same Excel version, but x64, we find only 12 FDIV instructions there, and the one we are looking for is the third one.
   | ```
48 | tracer.exe -l:excel.exe bpx=excel.exe!BASE+0x1B7FCC,set(st0,666)
   | ```
49 | 
50 | It seems that many division operations on float and double types were replaced by the compiler with SSE instructions such as DIVSD (DIVSD is present 268 times in total).
51 | 
52 | 


--------------------------------------------------------------------------------
/Chapter-60/img/60-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-60/img/60-1.png


--------------------------------------------------------------------------------
/Chapter-61/61.1_xor_instructions.md:
--------------------------------------------------------------------------------
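 1 | #Chapter 61 
 2 | ##Suspicious code patterns
 3 | ###61.1 XOR instructions
 4 | 
 5 | Instructions like XOR op, op (for example, XOR EAX, EAX) are usually used to set a register's value to zero, but if the operands are different, the "exclusive or" operation is actually executed. This operation is rare in common programming but widespread in cryptography, including amateur cryptography. It is especially suspicious if the second operand is a big number. This may point to encryption/decryption, checksum computation and so on.
 6 | 
 7 | One exception to this observation worth noting is the "canary" (section 18.3): it is generated and checked using XOR instructions.
 8 | 
 9 | This AWK script can be used for processing IDA listing (.lst) files: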
10 | 
11 | ```
12 | gawk -e '$2=="xor" { tmp=substr($3, 0, length($3)-1); if (tmp!=$4) if($4!="esp") if ($4!="ebp") { print $1, $2, tmp, ",", $4 } }' filename.lst
13 | ```
14 | 
15 | ###61.2 Hand-written assembly code
16 | 
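17 | Modern compilers do not emit the LOOP and RCL instructions. On the other hand, these instructions are well known to coders who like to write directly in assembly language. If you spot them, it is very probable that this fragment of code was hand-written. Such instructions are marked as (M) in the instruction list in the appendix: section A.6.
18 | 
19 | Also, the function prologue/epilogue are not commonly present in hand-written assembly.
20 | 
21 | Commonly there is no fixed system for passing arguments to functions in hand-written code.
22 | 
23 | An example from the Windows 2003 kernel (the ntoskrnl.exe file):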
24 | 
25 | ```
26 | MultiplyTest	proc near			; CODE XREF: Get386Stepping
27 | 				xor     cx, cx
28 | loc_620555:							; CODE XREF: MultiplyTest+E
29 | 				push 	cx
30 | 				call 	Multiply
31 | 				pop 	cx
32 | 				jb 		short locret_620563
33 | 				loop 	loc_620555
34 | 				clc
35 | locret_620563:						; CODE XREF:MultiplyTest+C
36 | 				retn
37 | MultiplyTest endp
38 | 
39 | Multiply 		proc near 			;CODE XREF:MultiplyTest+5
40 | 				mov ecx,81h
41 | 				mov eax,417A000h
42 | 				mul ecx
43 | 				cmp edx,2
44 | 				stc
45 | 				jnz short locret_62057F
46 | 				cmp  eax,0FE7A000h
47 | 				stc
48 | 				jnz short locret_62057F
49 | 				clc
50 | locret_62057F:						; CODE XREF:Multiply+10
51 | 									; Multiply+18
52 | 				retn
53 | Multiply		endp
54 | ```
55 | 
56 | Indeed, if we look into the WRK v1.2 source code, this code can easily be found in the file WRK-v1.2\base\ntos\ke\i386\cpu.asm.
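
The two CMP instructions in Multiply make sense once the multiplication is worked out: MUL ECX leaves the 64-bit product of EAX and ECX in EDX:EAX. The sketch below redoes that arithmetic; `mul32` is a hypothetical helper that mimics the instruction, not code from the kernel.

```python
def mul32(a: int, b: int):
    """Mimic x86 MUL: return the 64-bit product of two 32-bit values as (EDX, EAX)."""
    product = (a & 0xFFFFFFFF) * (b & 0xFFFFFFFF)
    return product >> 32, product & 0xFFFFFFFF


edx, eax = mul32(0x417A000, 0x81)
print(hex(edx), hex(eax))   # 0x2 0xfe7a000
```

The routine therefore checks that the CPU multiplies correctly: the expected high half is 2 and the expected low half is 0FE7A000h, exactly the values compared in the listing.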


--------------------------------------------------------------------------------
/Chapter-62/62_using_magic_numbers_while_tracing.md:
--------------------------------------------------------------------------------
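 1 | #Chapter 62  
 2 | ##Using magic numbers while tracing
 3 | 
 4 | Often our main goal is to understand how the program uses a value that was read from a file or received via the network. Manually tracing a value is often very labor-intensive. One of the simplest techniques for this (although not 100% reliable) is to use your own magic number.
 5 | 
 6 | This resembles X-ray computed tomography in some sense: a radiocontrast agent is injected into the patient's blood, which improves the visibility of the patient's internal structures in X-rays. For example, it is well known how the blood of a healthy person percolates through the kidneys, and if there is an agent in the blood, it can easily be seen on the tomogram how the blood percolates and whether there are any stones or tumors.
 7 | 
 8 | We can take a 32-bit number such as 0x0badf00d, or someone's birth date such as 0x11101979, and write this 4-byte number to some point in a file used by the program we are investigating.
 9 | 
10 | Then, while tracing this program with tracer in code coverage mode, with the help of grep or just by searching in the text file (of tracing results), we can easily see where the value was used and how.
11 | 
12 | An example of grepable tracer results in cc mode: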
13 | 
14 | ```
15 | 0x150bf66 (_kziaia+0x14), e=		1 [MOV EBX, [EBP+8]] [EBP+8]=0xf59c934
   | 0x150bf69 (_kziaia+0x17), e=		1 [MOV EDX, [69AEB08h]] [69AEB08h]=0
   | 0x150bf6f (_kziaia+0x1d), e=		1 [FS: MOV EAX, [2Ch]]
   | 0x150bf75 (_kziaia+0x23), e=		1 [MOV ECX, [EAX+EDX*4]] [EAX+EDX*4]=0xf1ac360
   | 0x150bf78 (_kziaia+0x26), e=		1 [MOV [EBP-4], ECX] ECX=0xf1ac360
   | ```
16 | This can be used for network packets as well. It is important for the magic number to be unique and not to be present in the program's code.
17 | 
18 | Aside from tracer, [DosBox (an MS-DOS emulator) in heavydebug mode is able to write the state of all registers after each executed instruction to a plain text file](blog.yurichev.com), so this technique may be useful for DOS programs as well.


--------------------------------------------------------------------------------
/Chapter-63/63.1_general_idea.md:
--------------------------------------------------------------------------------
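 1 | #Chapter 63
 2 | ## Other things
 3 | ###63.1  General idea
 4 | 
 5 | A reverse engineer should try to be in the programmer's shoes as often as possible: to take the developer's viewpoint and ask how one would solve the task in this specific case.
 6 | 
 7 | ###63.2 C++
 8 | 
 9 | RTTI (51.1.5) data may also be useful for identifying C++ classes.
10 | 
11 | ###63.3 Some binary file patterns
12 | 
13 | Sometimes we can clearly spot an array of 16/32/64-bit values visually, in a hex editor. Here is an example of very typical MIPS code. Every MIPS instruction (as well as every instruction in ARM mode or ARM64) is 32 bits (4 bytes) long, so such code is an array of 32-bit values. By looking at the screenshot we can see a kind of pattern. I added red lines for clarity:
14 | 
15 | ![](img/63-1.png)
16 | 
17 | Another example of such a pattern in this book: section 86
18 | 
19 | ###63.4 Comparing memory "snapshots"
20 | 
21 | The technique of straightforwardly comparing two memory snapshots in order to see the changes was often used to hack 8-bit computer games and to cheat at high scores.
22 | 
23 | For example, suppose you have loaded a game on an 8-bit computer (there is not much memory in these, and games usually consume even less) and you know that you now have, say, 100 bullets. You can take a "snapshot" of all memory and save it somewhere. Then shoot once so that the bullet count drops to 99, take a second "snapshot" and compare the two: somewhere there must be a byte that was 100 at the beginning and is now 99. Considering that these 8-bit games were often written in assembly language and such variables were global, it can be said for sure which address in memory holds the bullet count. If you search for all references to that address in the disassembled game code, it is not very hard to find the piece of code that decrements the bullet count and to overwrite it with NOP instructions, so that the bullet count stays at 100 forever. Games on these 8-bit computers were commonly loaded at a constant address, and there were not many different versions of each game (commonly just one version was popular for a long time), so enthusiastic gamers knew which bytes had to be overwritten (using the BASIC instruction POKE) at which addresses. This led to "cheat" lists containing POKE instructions, published in magazines about 8-bit games. See also: [wikipedia](http://go.yurichev.com/17114)
24 | 
25 | Likewise, it is easy to modify "high score" files, and this works not only with 8-bit games. Note your score and back the file up somewhere. When the "high score" count changes, just compare the two files; this can even be done with the DOS utility FC ("high score" files are often in binary form). There will be a point where a couple of bytes differ, and it is easy to see which ones hold the score. However, game developers are fully aware of such tricks and may defend the program against them.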
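
The comparison itself is trivial to automate. The sketch below assumes two snapshots of equal size and a counter that fits in one byte, as in the bullet example; `changed_bytes` is a hypothetical helper and the snapshots are tiny made-up buffers.

```python
def changed_bytes(before: bytes, after: bytes, old: int, new: int) -> list:
    """Return the addresses where a byte changed from old to new between two snapshots."""
    return [address
            for address, (a, b) in enumerate(zip(before, after))
            if a == old and b == new]


# Made-up 8-byte "memory": the bullet counter lives at address 5.
snapshot_1 = bytes([0, 100, 7, 7, 42, 100, 0, 0])
snapshot_2 = bytes([0, 100, 7, 8, 42, 99, 0, 0])
print(changed_bytes(snapshot_1, snapshot_2, old=100, new=99))
```

If several candidate addresses remain, repeating the experiment (shoot again, look for 99 becoming 98) and intersecting the results usually narrows them down to one.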
26 | 
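27 | A somewhat similar example in this book: section 85
28 | 
29 | ####63.4.1 Windows registry
30 | 
31 | It is also possible to compare the Windows registry before and after a program is installed. It is a very popular method of finding which registry elements a program uses. Perhaps this is why "windows registry cleaner" shareware is so popular.
32 | 
33 | ####63.4.2 Blink-comparator
34 | 
35 | Comparison of files or memory snapshots reminds us of the [blink-comparator](http://go.yurichev.com/17348): a device formerly used by astronomers to find moving celestial objects. A blink comparator allows switching quickly between two photographs taken at different times, so that an astronomer can spot the difference visually. By the way, Pluto was discovered with a blink comparator in 1930.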
36 | 
37 | 
38 | 
39 | 
40 | 
41 | 
42 | 
43 | 
44 | 


--------------------------------------------------------------------------------
/Chapter-63/img/63-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-63/img/63-1.png


--------------------------------------------------------------------------------
/Chapter-65/ThreadLocalStorage.md:
--------------------------------------------------------------------------------
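  1 | #Chapter 65 Thread Local Storage
  2 | 
  3 | TLS is a data area specific to each thread, where every thread can store what it needs. One well-known example is the C standard global variable errno. Multiple threads may simultaneously call functions that return an error code in errno, so an ordinary global variable would not work correctly in a multi-threaded program; errno must therefore be stored in TLS.
  4 | 
  5 | The C++11 standard added a new thread_local specifier, indicating that each thread has its own copy of the variable. Such a variable can be initialized, and it is located in TLS.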
  6 | 
  7 | Listing 65.1: C++11
  8 | 
  9 | ```
 10 | #include <iostream>
 11 | #include <thread>
 12 | thread_local int tmp=3;
 13 | int main()
 14 | {
 15 |     std::cout << tmp << std::endl;
 16 | };
 17 | ```
 18 | 
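 19 | Compiled with MinGW GCC 4.8.1, but not with MSVC 2012.
 20 | 
 21 | If we look at the resulting PE file, we see the tmp variable placed in the section devoted to TLS.
 22 | 
 23 | ##65.1 Linear congruential generator
 24 | 
 25 | The pseudorandom number generator considered earlier in chapter 20 has a flaw: it is not thread-safe, because its internal state variable can be read and/or modified by different threads simultaneously.
 26 | 
 27 | ###65.1.1 Win32
 28 | 
 29 | ####Uninitialized TLS data
 30 | 
 31 | One solution is to add the __declspec(thread) modifier to the global variable; it will then be allocated in TLS.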
 32 | 
 33 | ```
 34 | #include <stdint.h>
    | #include <stdio.h>
 35 | #include <windows.h>
 36 | #include <winnt.h>
 37 | 
 38 | // from the Numerical Recipes book
 39 | #define RNG_a 1664525
 40 | #define RNG_c 1013904223
 41 | 
 42 | __declspec( thread ) uint32_t rand_state;
 43 | 
 44 | void my_srand (uint32_t init)
 45 | {
 46 |     rand_state=init;
 47 | }
 48 | 
 49 | int my_rand ()
 50 | {
 51 |     rand_state=rand_state*RNG_a;
 52 |     rand_state=rand_state+RNG_c;
 53 |     return rand_state & 0x7fff;
 54 | }
 55 | 
 56 | int main()
 57 | {
 58 |     my_srand(0x12345678);
 59 |     printf ("%d\n", my_rand());
 60 | };
 61 | ```
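
The same idea can be tried without a C compiler. The sketch below is a Python analogue, not the code from the listing: `threading.local()` plays the role of `__declspec(thread)`, the 32-bit mask emulates uint32_t arithmetic, and `worker` and `run_demo` are hypothetical helpers added only to exercise two threads.

```python
import threading

RNG_A = 1664525      # constants from the C listing above
RNG_C = 1013904223

_tls = threading.local()   # attributes set on it are private to each thread


def my_srand(init: int) -> None:
    _tls.rand_state = init & 0xFFFFFFFF


def my_rand() -> int:
    # Emulate uint32_t arithmetic by masking to 32 bits.
    _tls.rand_state = (_tls.rand_state * RNG_A + RNG_C) & 0xFFFFFFFF
    return _tls.rand_state & 0x7FFF


def worker(seed: int, results: dict, key: str) -> None:
    my_srand(seed)
    results[key] = [my_rand() for _ in range(3)]


def run_demo() -> dict:
    results = {}
    threads = [
        threading.Thread(target=worker, args=(0x12345678, results, "first")),
        threading.Thread(target=worker, args=(0x12345678, results, "second")),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


print(run_demo())
```

Because each thread owns its own rand_state, both threads produce the same sequence from the same seed, no matter how their calls interleave.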
 62 | 
 63 | In Hiew we can see that the executable file has a new PE section: .tls.
 64 | 
 65 | Listing 65.2: Optimizing MSVC 2013 x86
 66 | 
 67 | ```
 68 | _TLS SEGMENT
 69 |     _rand_state DD 01H DUP (?)
 70 | _TLS ENDS
 71 | 
 72 | _DATA SEGMENT
 73 |     $SG84851 DB '%d', 0aH, 00H
 74 | _DATA ENDS
 75 | 
 76 | _TEXT SEGMENT
 77 | 
 78 | _init$ = 8  ; size = 4
 79 | 
 80 | _my_srand PROC
 81 | ; FS:0=address of TIB
 82 |     mov eax, DWORD PTR fs:__tls_array ; displayed in IDA as FS:2Ch
 83 | ; EAX=address of TLS of process
 84 |     mov ecx, DWORD PTR __tls_index
 85 |     mov ecx, DWORD PTR [eax+ecx*4]
 86 | ; ECX=current TLS segment
 87 |     mov eax, DWORD PTR _init$[esp-4]
 88 |     mov DWORD PTR _rand_state[ecx], eax
 89 |     ret 0
 90 | _my_srand ENDP
 91 | 
 92 | _my_rand PROC
 93 | ; FS:0=address of TIB
 94 |     mov eax, DWORD PTR fs:__tls_array ; displayed in IDA as FS:2Ch
 95 | ; EAX=address of TLS of process
 96 |     mov ecx, DWORD PTR __tls_index
 97 |     mov ecx, DWORD PTR [eax+ecx*4]
 98 | ; ECX=current TLS segment
 99 |     imul eax, DWORD PTR _rand_state[ecx], 1664525
100 |     add eax, 1013904223 ; 3c6ef35fH
101 |     mov DWORD PTR _rand_state[ecx], eax
102 |     and eax, 32767 ; 00007fffH
103 |     ret 0
104 | _my_rand ENDP
105 | 
106 | _TEXT ENDS
107 | ```
108 | 
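109 | rand_state is now in the TLS segment, and each thread has its own version of this variable. It is accessed as follows: the pointer to the thread's TLS array is loaded from FS:2Ch (a field of the TIB, Thread Information Block), an additional index is applied to it (if needed), and the result is the address of the TLS segment.
110 | 
111 | The rand_state variable can then be accessed through the ECX register, which points to a data area unique to each thread.
112 | 
113 | FS: this selector is familiar to every reverse engineer. It is dedicated to always pointing at the TIB, so thread-specific data can be reached quickly.
114 | 
115 | GS: this selector is used in Win64, and the TLS array pointer is located at offset 0x58.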
116 | 
117 | Listing 65.3: Optimizing MSVC 2013 x64
118 | 
119 | ```
120 | _TLS SEGMENT
121 |     rand_state DD 01H DUP (?)
122 | _TLS ENDS
123 | 
124 | _DATA SEGMENT
125 |     $SG85451 DB '%d', 0aH, 00H
126 | _DATA ENDS
127 | 
128 | _TEXT SEGMENT
129 | init$ = 8
130 | 
131 | my_srand PROC
132 |     mov edx, DWORD PTR _tls_index
133 |     mov rax, QWORD PTR gs:88 ; 58h
134 |     mov r8d, OFFSET FLAT:rand_state
135 |     mov rax, QWORD PTR [rax+rdx*8]
136 |     mov DWORD PTR [r8+rax], ecx
137 |     ret 0
138 | my_srand ENDP
139 | 
140 | my_rand PROC
141 |     mov rax, QWORD PTR gs:88 ; 58h
142 |     mov ecx, DWORD PTR _tls_index
143 |     mov edx, OFFSET FLAT:rand_state
144 |     mov rcx, QWORD PTR [rax+rcx*8]
145 |     imul eax, DWORD PTR [rcx+rdx], 1664525 ;0019660dH
146 |     add eax, 1013904223 ; 3c6ef35fH
147 |     mov DWORD PTR [rcx+rdx], eax
148 |     and eax, 32767 ; 00007fffH
149 |     ret 0
150 | my_rand ENDP
151 | 
152 | _TEXT ENDS
153 | ```
154 | 
155 | ####Initialized TLS data
156 | 
157 | Say we want to give rand_state some fixed value, in case the programmer forgets to initialize it.
158 | 
159 | ```
160 | #include <stdint.h>
161 | #include <windows.h>
162 | #include <winnt.h>
163 | #include <stdio.h> // for printf()
164 | // from the Numerical Recipes book
165 | #define RNG_a 1664525
166 | #define RNG_c 1013904223
167 | 
168 | __declspec( thread ) uint32_t rand_state=1234;
169 | 
170 | void my_srand (uint32_t init)
171 | {
172 |    rand_state=init;
173 | }
174 | 
175 | int my_rand ()
176 | {
177 |    rand_state=rand_state*RNG_a;
178 |    rand_state=rand_state+RNG_c;
179 |    return rand_state & 0x7fff;
180 | }
181 | 
182 | int main()
183 | {
184 |     printf ("%d\n", my_rand());
185 | };
186 | ```
187 | 
188 | Apart from giving rand_state an initial value, the code is no different from before, but in IDA we see:
189 | 
190 | ```
191 | .tls:00404000 ; Segment type: Pure data
192 | .tls:00404000 ; Segment permissions: Read/Write
193 | .tls:00404000 _tls segment para public 'DATA' use32
194 | .tls:00404000 assume cs:_tls
195 | .tls:00404000 ;org 404000h
196 | .tls:00404000 TlsStart db 0 ; DATA XREF: .rdata:TlsDirectory
197 | .tls:00404001 db 0
198 | .tls:00404002 db 0
199 | .tls:00404003 db 0
200 | .tls:00404004 dd 1234
201 | .tls:00404008 TlsEnd db 0 ; DATA XREF: .rdata:TlsEnd_pt
202 | ...
203 | ```
204 | 
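205 | Each time a new thread starts, a new TLS area is allocated for it, and all this data, including 1234, is copied there.
206 | 
207 | This is a typical scenario:
208 | - Thread A starts. A TLS area is allocated for it, and 1234 is copied to rand_state.
209 | - my_rand() is called several times in thread A, so rand_state is no longer 1234.
210 | - Thread B starts. A TLS area is allocated for it, and 1234 is copied to rand_state. At this point both threads use the same variable name, but the values they see differ.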
211 | 
212 | #### TLS callbacks
213 | 
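214 | But what if the variables in TLS have to be filled with data that is not known in advance? For example: the programmer may forget to call my_srand() to initialize the PRNG, yet the generator must start with something truly random rather than 1234. This is a case where TLS callbacks can be used.
215 | 
216 | The following code is not very portable, for reasons that should be clear. We define a function, tls_callback(), which is called before the process and/or thread starts. It initializes the PRNG with the value returned by GetTickCount().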
217 | 
218 | ```
219 | #include <stdint.h>
220 | #include <windows.h>
221 | #include <winnt.h>
222 | #include <stdio.h> // for printf()
223 | // from the Numerical Recipes book
224 | #define RNG_a 1664525
225 | #define RNG_c 1013904223
226 | 
227 | __declspec( thread ) uint32_t rand_state;
228 | 
229 | void my_srand (uint32_t init)
230 | {
231 |     rand_state=init;
232 | }
233 | 
234 | void NTAPI tls_callback(PVOID a, DWORD dwReason, PVOID b)
235 | {
236 |     my_srand (GetTickCount());
237 | }
238 | 
239 | #pragma data_seg(".CRT$XLB")
240 | PIMAGE_TLS_CALLBACK p_thread_callback = tls_callback;
241 | #pragma data_seg()
242 | 
243 | int my_rand ()
244 | {
245 |     rand_state=rand_state*RNG_a;
246 |     rand_state=rand_state+RNG_c;
247 |     return rand_state & 0x7fff;
248 | }
249 | int main()
250 | {
251 |     // rand_state is already initialized at the moment (using GetTickCount())
252 |     printf ("%d\n", my_rand());
253 | };
254 | ```
255 | 
256 | Let's look at it in IDA:
257 | 
258 | Listing 65.4: Optimizing MSVC 2013
259 | 
260 | ```
261 | .text:00401020 TlsCallback_0 proc near ; DATA XREF: .rdata:TlsCallbacks
262 | .text:00401020     call ds:GetTickCount
263 | .text:00401026     push eax
264 | .text:00401027     call my_srand
265 | .text:0040102C     pop ecx
266 | .text:0040102D     retn 0Ch
267 | .text:0040102D TlsCallback_0 endp
268 | ...
269 | .rdata:004020C0 TlsCallbacks dd offset TlsCallback_0 ; DATA XREF: .rdata:TlsCallbacks_ptr
270 | ...
271 | .rdata:00402118 TlsDirectory dd offset TlsStart
272 | .rdata:0040211C TlsEnd_ptr dd offset TlsEnd
273 | .rdata:00402120 TlsIndex_ptr dd offset TlsIndex
274 | .rdata:00402124 TlsCallbacks_ptr dd offset TlsCallbacks
275 | .rdata:00402128 TlsSizeOfZeroFill dd 0
276 | .rdata:0040212C TlsCharacteristics dd 300000h
277 | ```
278 | 
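279 | TLS callback functions are sometimes used in unpacking routines to obscure what they do. This can leave people confused about how some code managed to run stealthily before the OEP (Original Entry Point).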
280 | 
281 | ###65.1.2 Linux
282 | 
283 | This is how thread-local storage is declared in GCC:
284 | 
285 | ```
286 | __thread uint32_t rand_state=1234;
287 | ```
288 | 
289 | This is not a standard C/C++ modifier but a GCC extension.
290 | 
291 | GS: this selector is also used for TLS access here, but in a somewhat different way:
292 | 
293 | Listing 65.5: Optimizing GCC 4.8.1 x86
294 | 
295 | ```
296 | .text:08048460 my_srand proc near
297 | .text:08048460
298 | .text:08048460 arg_0 = dword ptr 4
299 | .text:08048460
300 | .text:08048460     mov eax, [esp+arg_0]
301 | .text:08048464     mov gs:0FFFFFFFCh, eax
302 | .text:0804846A     retn
303 | .text:0804846A my_srand endp
304 | .text:08048470 my_rand proc near
305 | .text:08048470     imul eax, gs:0FFFFFFFCh, 19660Dh
306 | .text:0804847B     add eax, 3C6EF35Fh
307 | .text:08048480     mov gs:0FFFFFFFCh, eax
308 | .text:08048486     and eax, 7FFFh
309 | .text:0804848B     retn
310 | .text:0804848B my_rand endp
311 | ```
312 | 
313 | More about this: [ELF Handling For Thread-Local Storage](http://go.yurichev.com/17272)


--------------------------------------------------------------------------------
/Chapter-66/SystemCalls.md:
--------------------------------------------------------------------------------
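 1 | #Chapter 66: System calls (syscall-s)
 2 | 
 3 | As we know, all running processes in an OS fall into two categories: those with full access to the hardware (kernel space) and those without direct access to it (user space).
 4 | 
 5 | The OS kernel and drivers usually belong to the first category.
 6 | 
 7 | Applications usually belong to the second.
 8 | 
 9 | For example, the Linux kernel runs in kernel space, while Glibc runs in user space.
10 | 
11 | This separation is crucial to the safety of the OS: above all, no process should get the chance to damage another process or the OS kernel itself. On the other hand, a faulty driver or an error in the kernel usually leads to a system crash or a blue screen.
12 | 
13 | The x86 processor in protected mode offers 4 levels of protection (rings). However, both Linux and Windows use only two of them: ring0 (kernel space) and ring3 (user space).
14 | 
15 | System calls (syscall-s) are the point where the two spaces connect. It can be said that this is the main API provided to applications.
16 | 
17 | In Windows NT, the syscall table resides in the SSDT.
18 | 
19 | Using syscalls is very popular among shellcode and computer virus authors, because it is hard to determine the addresses of the needed functions in system libraries, while syscalls are easy to use. However, much more code has to be written because this API is so low-level. It is also worth noting that syscall numbers may differ between OS versions.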
20 | 
21 | ##66.1 Linux
22 | 
23 | In Linux, a syscall is usually made via the int 0x80 interrupt. The call number is passed in the EAX register, and any other parameters in the other registers.
24 | 
25 | Listing 66.1: A simple example of the usage of two syscalls
26 | 
27 | ```
28 | section .text
29 | global _start
30 | _start:
31 | 	mov edx,len	; buf len
32 | 	mov ecx,msg	; buf
33 | 	mov ebx,1	; file descriptor. stdout is 1
34 | 	mov eax,4	; syscall number. sys_write is 4
35 | 	int 0x80
36 | 	mov eax,1	; syscall number. sys_exit is 1
37 | 	int 0x80
38 | section .data
39 | msg db 'Hello, world!',0xa
40 | len equ $ - msg
41 | ```
42 | 
43 | Build:
44 | 
45 | ```
46 | nasm -f elf32 1.s
47 | ld 1.o
48 | ```
49 | 
50 | The full list of Linux syscalls: [http://go.yurichev.com/17319](http://go.yurichev.com/17319).
51 | 
52 | For intercepting and tracing system calls in Linux, strace (chapter 71) can be used.
53 | 
54 | ##66.2 Windows
55 | 
56 | In Windows, syscalls are made via the int 0x2e interrupt or the x86-specific SYSENTER instruction.
57 | 
58 | The full list of Windows syscalls: [http://go.yurichev.com/17320](http://go.yurichev.com/17320).
59 | 
60 | Further reading: ["Windows Syscall Shellcode" by Piotr Bania](http://go.yurichev.com/17321)


--------------------------------------------------------------------------------
/Chapter-67/Linux.md:
--------------------------------------------------------------------------------
  1 | #Chapter 67: Linux
  2 | 
  3 | ##67.1 Position-independent code
  4 | 
  5 | While analyzing Linux shared libraries (.so), one frequently sees code like this:
  6 | 
  7 | Listing 67.1: libc-2.17.so x86
  8 | 
  9 | ```
 10 | .text:0012D5E3 __x86_get_pc_thunk_bx proc near ; CODE XREF: sub_17350+3
 11 | .text:0012D5E3 ; sub_173CC+4 ...
 12 | .text:0012D5E3     mov ebx, [esp+0]
 13 | .text:0012D5E6     retn
 14 | .text:0012D5E6 __x86_get_pc_thunk_bx endp
 15 | ...
 16 | .text:000576C0 sub_576C0 proc near ; CODE XREF: tmpfile+73
 17 | ...
 18 | .text:000576C0     push ebp
 19 | .text:000576C1     mov ecx, large gs:0
 20 | .text:000576C8     push edi
 21 | .text:000576C9     push esi
 22 | .text:000576CA     push ebx
 23 | .text:000576CB     call __x86_get_pc_thunk_bx
 24 | .text:000576D0     add ebx, 157930h
 25 | .text:000576D6     sub esp, 9Ch
 26 | ...
 27 | .text:000579F0     lea eax, (a__gen_tempname - 1AF000h)[ebx] ; "__gen_tempname"
 28 | .text:000579F6     mov [esp+0ACh+var_A0], eax
 29 | .text:000579FA     lea eax, (a__SysdepsPosix - 1AF000h)[ebx] ; "../sysdeps/posix/tempname.c"
 30 | .text:00057A00     mov [esp+0ACh+var_A8], eax
 31 | .text:00057A04     lea eax, (aInvalidKindIn_ - 1AF000h)[ebx] ; "! \"invalid KIND in __gen_tempname\""
 32 | .text:00057A0A     mov [esp+0ACh+var_A4], 14Ah
 33 | .text:00057A12     mov [esp+0ACh+var_AC], eax
 34 | .text:00057A15     call __assert_fail
 35 | ```
 36 | 
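 37 | All pointers to strings are corrected by some constant and the value of EBX, which is calculated at the beginning of each function. This is the so-called PIC (position-independent code): it is meant to execute correctly wherever it is placed in memory, which is why it cannot contain any absolute memory addresses.
 38 | 
 39 | PIC was crucial in early operating systems and is still crucial in embedded systems without virtual memory support (where all processes are placed in a single contiguous memory block). It is also still used in *NIX systems for shared libraries: a shared library is loaded into memory only once and then used by every process that needs it, and these processes may map the same library at different addresses. That is why a shared library has to work correctly without relying on any absolute address.
 40 | 
 41 | Let's do a simple experiment: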
 42 | 
 43 | ```
 44 | #include <stdio.h>
 45 | int global_variable=123;
 46 | int f1(int var)
 47 | {
 48 |     int rt=global_variable+var;
 49 |     printf ("returning %d\n", rt);
 50 |     return rt;
 51 | };
 52 | ```
 53 | 
 54 | Let's compile it with GCC 4.7.3 and look at the resulting .so file in IDA:
 55 | 
 56 | ```
 57 | gcc -fPIC -shared -O3 -o 1.so 1.c
 58 | ```
 59 | 
 60 | ```
 61 | .text:00000440 public __x86_get_pc_thunk_bx
 62 | .text:00000440 __x86_get_pc_thunk_bx proc near ; CODE XREF: _init_proc+4
 63 | .text:00000440 ; deregister_tm_clones+4 ...
 64 | .text:00000440     mov ebx, [esp+0]
 65 | .text:00000443     retn
 66 | .text:00000443 __x86_get_pc_thunk_bx endp
 67 | .text:00000570 public f1
 68 | .text:00000570 f1 proc near
 69 | .text:00000570
 70 | .text:00000570 var_1C = dword ptr -1Ch
 71 | .text:00000570 var_18 = dword ptr -18h
 72 | .text:00000570 var_14 = dword ptr -14h
 73 | .text:00000570 var_8 = dword ptr -8
 74 | .text:00000570 var_4 = dword ptr -4
 75 | .text:00000570 arg_0 = dword ptr 4
 76 | .text:00000570
 77 | .text:00000570     sub esp, 1Ch
 78 | .text:00000573     mov [esp+1Ch+var_8], ebx
 79 | .text:00000577     call __x86_get_pc_thunk_bx
 80 | .text:0000057C     add ebx, 1A84h
 81 | .text:00000582     mov [esp+1Ch+var_4], esi
 82 | .text:00000586     mov eax, ds:(global_variable_ptr - 2000h)[ebx]
 83 | .text:0000058C     mov esi, [eax]
 84 | .text:0000058E     lea eax, (aReturningD - 2000h)[ebx] ; "returning %d\n"
 85 | .text:00000594     add esi, [esp+1Ch+arg_0]
 86 | .text:00000598     mov [esp+1Ch+var_18], eax
 87 | .text:0000059C     mov [esp+1Ch+var_1C], 1
 88 | .text:000005A3     mov [esp+1Ch+var_14], esi
 89 | .text:000005A7     call ___printf_chk
 90 | .text:000005AC     mov eax, esi
 91 | .text:000005AE     mov ebx, [esp+1Ch+var_8]
 92 | .text:000005B2     mov esi, [esp+1Ch+var_4]
 93 | .text:000005B6     add esp, 1Ch
 94 | .text:000005B9     retn
 95 | .text:000005B9 f1 endp
 96 | ```
 97 | 
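 98 | As shown above, the pointers to the "returning %d\n" string and to global_variable are corrected on every execution of the function. The __x86_get_pc_thunk_bx() function returns in EBX the address of the instruction following the call to it (0x57C here). This is a simple way to get the value of the program counter (EIP) at some point. The 0x1A84 constant is the distance from that address to the start of the so-called Global Offset Table Procedure Linkage Table (GOT PLT): 0x57C + 0x1A84 = 0x2000. IDA shows these offsets in processed form to make them easier to understand, but the code is in fact: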
 99 | 
100 | ```
101 | .text:00000577 call __x86_get_pc_thunk_bx
102 | .text:0000057C add ebx, 1A84h
103 | .text:00000582 mov [esp+1Ch+var_4], esi
104 | .text:00000586 mov eax, [ebx-0Ch]
105 | .text:0000058C mov esi, [eax]
106 | .text:0000058E lea eax, [ebx-1A30h]
107 | ```
108 | 
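109 | Here EBX points to the GOT PLT section. To calculate the pointer to global_variable (which is stored in the GOT), 0xC must be subtracted. To calculate the pointer to the "returning %d\n" string, 0x1A30 must be subtracted.
110 | 
111 | By the way, AMD64 supports RIP-relative addressing, which allows simpler PIC code to be generated.
112 | 
113 | Let's compile the same C code with the same GCC version, but for x64.
114 | 
115 | IDA would simplify the resulting code and hide the RIP-relative addressing details, so I use objdump here instead to see everything: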
116 | 
117 | ```
118 | 0000000000000720 <f1>:
119 | 720: 48 8b 05 b9 08 20 00    mov rax,QWORD PTR [rip+0x2008b9] # 200fe0 <_DYNAMIC+0x1d0>
120 | 727: 53                      push rbx
121 | 728: 89 fb                   mov ebx,edi
122 | 72a: 48 8d 35 20 00 00 00    lea rsi,[rip+0x20] #751 <_fini+0x9>
123 | 731: bf 01 00 00 00          mov edi,0x1
124 | 736: 03 18                   add ebx,DWORD PTR [rax]
125 | 738: 31 c0                   xor eax,eax
126 | 73a: 89 da                   mov edx,ebx
127 | 73c: e8 df fe ff ff          call 620 <__printf_chk@plt>
128 | 741: 89 d8                   mov eax,ebx
129 | 743: 5b                      pop rbx
130 | 744: c3                      ret
131 | ```
132 | 
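133 | 0x2008b9 is the difference between the address of the instruction following the one at 0x720 (that is, 0x727) and the GOT entry at 0x200fe0 that holds the address of global_variable. Likewise, 0x20 is the difference between 0x731, the address following the instruction at 0x72a, and the address of the "returning %d\n" string, 0x751.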
134 | 
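135 | As you might see, having to recalculate addresses frequently makes execution slower (it is better on x64, though). So it is probably better to link statically if you care about performance.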
136 | 
137 | ###67.1.1 Windows
138 | 
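139 | The PIC mechanism is not used in Windows DLLs. If the Windows loader has to load a DLL at a different base address, it patches the DLL in memory (at the fixup places listed in its relocation section) so that all addresses become correct. This means that several Windows processes cannot share a single loaded DLL at different addresses in their own address spaces, because each instance loaded in memory is fixed up to work only at its particular addresses.
140 | 
141 | ##67.2 LD_PRELOAD hack in Linux
142 | 
143 | Linux allows us to load our own dynamic libraries before others, even before system ones such as libc.so.6.
144 | 
145 | This, in turn, allows us to "substitute" our own functions for the ones in system libraries. For example, it is easy to intercept all calls to time(), read(), write(), and so on.
146 | 
147 | Let's see if we can fool the uptime utility. As we know, it tells how long the computer has been running. With the help of strace, we can see that the program takes this information from the /proc/uptime file: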
148 | 
149 | ```
150 | $ strace uptime
151 | ...
152 | open("/proc/uptime", O_RDONLY) = 3
153 | lseek(3, 0, SEEK_SET) = 0
154 | read(3, "416166.86 414629.38\n", 2047) = 20
155 | ...
156 | ```
157 | 
158 | /proc/uptime is not a real file on disk but a virtual one, generated by the Linux kernel. It contains two numbers:
159 | 
160 | ```
161 | $ cat /proc/uptime
162 | 416690.91 415152.03
163 | ```
164 | 
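165 | This is what Wikipedia says about them:
166 | ```
167 | The first number is the total number of seconds the system has been up. The second number is how much of that time the machine has spent idle, in seconds.
168 | ```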
169 | 
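170 | Let's write our own dynamic library with open(), read() and close() functions.
171 | 
172 | First, our open() compares the name of the file being opened with the one we are interested in and, if they match, records the file descriptor. Second, read() checks whether it is called for that descriptor: if so, it substitutes its own output, otherwise it calls the original read() from libc.so.6. Finally, close() notes when the file we are tracking is closed.
173 | 
174 | We use the dlopen() and dlsym() functions to determine the addresses of the original functions in libc.so.6, because we have to pass control to the "real" functions.
175 | 
176 | On the other hand, if we intercepted strcmp() to monitor every string comparison in a program, we would have to implement strcmp() ourselves rather than use the original function.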
177 | 
178 | ```
179 | #include <stdio.h>
180 | #include <stdarg.h>
181 | #include <stdlib.h>
182 | #include <stdbool.h>
183 | #include <unistd.h>
184 | #include <dlfcn.h>
185 | #include <string.h>
186 | 
187 | void *libc_handle = NULL;
188 | int (*open_ptr)(const char *, int) = NULL;
189 | int (*close_ptr)(int) = NULL;
190 | ssize_t (*read_ptr)(int, void*, size_t) = NULL;
191 | bool inited = false;
192 | 
193 | _Noreturn void die (const char * fmt, ...)
194 | {
195 |     va_list va;
196 |     va_start (va, fmt);
197 |     vprintf (fmt, va);
198 |     exit(0);
199 | };
200 | 
201 | static void find_original_functions ()
202 | {
203 |     if (inited)
204 |         return;
205 |     libc_handle = dlopen ("libc.so.6", RTLD_LAZY);
206 |     if (libc_handle==NULL)
207 |         die ("can't open libc.so.6\n");
208 |     open_ptr = dlsym (libc_handle, "open");
209 |     if (open_ptr==NULL)
210 |         die ("can't find open()\n");
211 |     close_ptr = dlsym (libc_handle, "close");
212 |     if (close_ptr==NULL)
213 |         die ("can't find close()\n");
214 |     read_ptr = dlsym (libc_handle, "read");
215 |     if (read_ptr==NULL)
216 |         die ("can't find read()\n");
217 |     inited = true;
218 | }
219 | 
220 | static int opened_fd=0;
221 | 
222 | int open(const char *pathname, int flags)
223 | {
224 |     find_original_functions();
225 |     int fd=(*open_ptr)(pathname, flags);
226 |     if (strcmp(pathname, "/proc/uptime")==0)
227 |         opened_fd=fd; // that's our file! record its file descriptor
228 |     else
229 |         opened_fd=0;
230 |     return fd;
231 | };
232 | 
233 | int close(int fd)
234 | {
235 |     find_original_functions();
236 |     if (fd==opened_fd)
237 |         opened_fd=0; // the file is not opened anymore
238 |     return (*close_ptr)(fd);
239 | };
240 | 
241 | ssize_t read(int fd, void *buf, size_t count)
242 | {
243 |     find_original_functions();
244 |     if (opened_fd!=0 && fd==opened_fd)
245 |     {
246 |         // that's our file!
247 |         return snprintf (buf, count, "%d %d", 0x7fffffff, 0x7fffffff)+1;
248 |     };
249 |     // not our file, go to real read() function
250 |     return (*read_ptr)(fd, buf, count);
251 | };
252 | 
253 | ```
254 | 
255 | Let's compile it as a dynamic library:
256 | ```
257 | gcc -fpic -shared -Wall -o fool_uptime.so fool_uptime.c -ldl
258 | ```
259 | 
260 | Now run uptime, loading our library before the others:
261 | 
262 | ```
263 | LD_PRELOAD=`pwd`/fool_uptime.so uptime
264 | ```
265 | 
266 | And we see:
267 | 
268 | ```
269 | 01:23:02 up 24855 days, 3:14, 3 users, load average: 0.00, 0.01, 0.05
270 | ```
271 | 
272 | If the LD_PRELOAD environment variable always points to the filename of our library, it will be loaded into every other program that is started as well.
273 | 
274 | More examples:
275 | 
276 | - [Very simple interception of the strcmp() (Yong Huang)](http://go.yurichev.com/17043)
277 | - [Kevin Pulo: Fun with LD_PRELOAD. A lot of examples and ideas.](http://go.yurichev.com/17145)
278 | - [File functions interception for compression/decompression files on fly (zlibc).](http://go.yurichev.com/17146)


--------------------------------------------------------------------------------
/Chapter-68/img/Figure_68.1_A_scheme_that_unites_all_PE-file_structures_related_to_imports.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/Figure_68.1_A_scheme_that_unites_all_PE-file_structures_related_to_imports.jpg


--------------------------------------------------------------------------------
/Chapter-68/img/Figure_68.2_Windows_XP.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/Figure_68.2_Windows_XP.jpg


--------------------------------------------------------------------------------
/Chapter-68/img/Figure_68.3_Windows_XP.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/Figure_68.3_Windows_XP.jpg


--------------------------------------------------------------------------------
/Chapter-68/img/Figure_68.4_Windows_7.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/Figure_68.4_Windows_7.jpg


--------------------------------------------------------------------------------
/Chapter-68/img/Figure_68.5_Windows_8.1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/Figure_68.5_Windows_8.1.jpg


--------------------------------------------------------------------------------
/Chapter-68/img/exception.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/exception.jpg


--------------------------------------------------------------------------------
/Chapter-68/img/seh3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/seh3.jpg


--------------------------------------------------------------------------------
/Chapter-68/img/seh4.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-68/img/seh4.jpg


--------------------------------------------------------------------------------
/Chapter-69/Disassembler.md:
--------------------------------------------------------------------------------
 1 | #Chapter 69
 2 | ##Disassembler
 3 | ###69.1 IDA
 4 | 
 5 | An older freeware version is available for download: [http://go.yurichev.com/17031](http://go.yurichev.com/17031)
 6 | 
 7 | Hot-keys cheatsheet (page 977)
 8 | 
 9 | 
10 | 


--------------------------------------------------------------------------------
/Chapter-70/Debugger.md:
--------------------------------------------------------------------------------
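 1 | #Chapter 70
 2 | 
 3 | ##Debugger
 4 | 
 5 | ###70.1 tracer
 6 | 
 7 | I use [tracer](http://yurichev.com) instead of a debugger.
 8 | 
 9 | I eventually stopped using a debugger, since all I need from it is to spot function arguments while the code executes, or the state of the registers at some point. Loading a debugger each time takes too long, so I wrote a small utility called tracer. It has a console interface, works from the command line, and lets us set breakpoints on functions, inspect the state of registers, modify values, and so on.
10 | 
11 | For learning purposes, however, it is highly advisable to trace code by hand in a debugger, watch how the state of the registers changes (classic SoftICE, OllyDbg and WinDbg highlight changed registers), flip flags and modify data manually, and observe the effect.
12 | 
13 | ###70.2 OllyDbg
14 | 
15 | A very popular user-mode win32 debugger:
16 | [http://go.yurichev.com/17032](http://go.yurichev.com/17032)
17 | Hot-keys cheatsheet (page 977)
18 | 
19 | ###70.3 GDB
20 | 
21 | GDB is not very popular among reverse engineers, but it is comfortable to use. Some commands: (page 978)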


--------------------------------------------------------------------------------
/Chapter-71/SystemCallTracing.md:
--------------------------------------------------------------------------------
 1 | #Chapter 71
 2 | ##System call tracing
 3 | 
 4 | ###71.0.1 strace/dtruss
 5 | 
 6 | These tools show which system calls a process is making right now (page 697). For example:
 7 | 
 8 | ```
 9 | # strace df -h
   | ...
   | access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
   | open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
   | read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\232\1\0004\0\0\0"..., 512) = 512
   | fstat64(3, {st_mode=S_IFREG|0755, st_size=1770984, ...}) = 0
   | mmap2(NULL, 1780508, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75b3000
10 | 
11 | ```
12 | 
13 | Mac OS X has dtruss for the same purpose.
14 | 
15 | Cygwin has strace too, but if I understand correctly, it only works on .exe files compiled for the Cygwin environment.


--------------------------------------------------------------------------------
/Chapter-72/Decompilers.md:
--------------------------------------------------------------------------------
1 | #Chapter 72
2 | 
3 | ##Decompilers
4 | 
5 | There is only one known, publicly available, high-quality decompiler to C code: Hex-Rays
6 | 
7 | [http://go.yurichev.com/17033](http://go.yurichev.com/17033)
8 | 


--------------------------------------------------------------------------------
/Chapter-73/OtherTools.md:
--------------------------------------------------------------------------------
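1 | #Chapter 73
2 | ##Other tools
3 | [Microsoft Visual Studio Express](http://go.yurichev.com/17034): a stripped-down version of Visual Studio, convenient for simple experiments. Some useful options: (page 978)
4 | 
5 | [Hiew](http://go.yurichev.com/17035): for small modifications of binary files
6 | 
7 | binary grep: a small utility for searching for constants (or any byte sequence) in a large pile of files, including non-executable ones: [GitHub](http://go.yurichev.com/17017)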


--------------------------------------------------------------------------------
/Chapter-84/Primitive XOR-encryption.md:
--------------------------------------------------------------------------------
 1 | #Chapter 84
 2 | 
 3 | ##Primitive XOR encryption
 4 | 
 5 | ###84.1 
 6 | 
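 7 | Norton Guide was popular in the MS-DOS era: it was a resident program that worked as a hypertext reference manual.
 8 | Norton Guide database files have the .ng extension, and their contents look encrypted:
 9 | ![](img/84-1.png)
10 | 
11 | Why do I think the contents are encrypted rather than compressed? We can see that the 0x1A byte (which looks like "→") occurs very often, which would not happen in a compressed file, so this must be encryption. We also see long stretches consisting only of Latin letters, which look like strings in an unknown language.
12 | 
13 | Since the 0x1A byte occurs so often, we can try to decrypt the file, assuming it is encrypted with the simplest XOR encryption. If we XOR every byte with the 0x1A constant in Hiew, we see familiar English text strings:
14 | 
15 | ![](img/84-2.png)
16 | 
17 | XOR with a single constant byte is the simplest possible encryption method, and it is nevertheless encountered from time to time.
18 | 
19 | Now we understand why the 0x1A byte occurred so often: the file contains a lot of zero bytes, and they were replaced by 0x1A after encryption.
20 | 
21 | The constant might be different, though. In that case, we could try every constant in the 0..255 range and look for something familiar in the decrypted file. 256 candidates is not many.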
22 | 
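23 | More about the Norton Guide file format: [http://go.yurichev.com/17317](http://go.yurichev.com/17317)
24 | 
25 | ###84.1.1 Entropy
26 | 
27 | A very important property of such primitive encryption systems is that the information entropy of the encrypted and decrypted block is the same. Here is my analysis in Wolfram Mathematica 10.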
28 | 
29 | ```
30 | In[1]:= input = BinaryReadList["X86.NG"];
   | In[2]:= Entropy[2, input] // N
   | Out[2]= 5.62724
   | In[3]:= decrypted = Map[BitXor[#, 16^^1A] &, input];
   | In[4]:= Export["X86_decrypted.NG", decrypted, "Binary"];
31 | In[5]:= Entropy[2, decrypted] // N
   | Out[5]= 5.62724
   | In[6]:= Entropy[2, ExampleData[{"Text", "ShakespearesSonnets"}]] // N
   | Out[6]= 4.42366
32 | ```
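33 | What I do here is load the file, get its entropy, decrypt and save it, and get the entropy again (it is the same!). Mathematica also offers some well-known English texts for analysis, so I also took the entropy of Shakespeare's sonnets, and it is close to that of the file we analyzed. The file we analyzed consists of English sentences, which are close to the language of Shakespeare, and English text XOR-ed with a constant byte has the same entropy as the original.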
34 | 
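35 | However, this is no longer true when the file is XOR-ed with a pattern longer than one byte.
36 | 
37 | The file we analyzed can be downloaded here: [http://go.yurichev.com/17350](http://go.yurichev.com/17350)
38 | 
39 | ####One more word about the base of entropy
40 | 
41 | Wolfram Mathematica calculates entropy with base e (the base of the natural logarithm), while the UNIX [ent](http://www.fourmilab.ch/random/) utility uses base 2. So I set the base to 2 explicitly in the Entropy command, so that Mathematica gives the same results as ent.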
42 | 
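43 | ###84.2 Simplest possible 4-byte XOR encryption
44 | 
45 | If a longer pattern was used for XOR encryption, for example a 4-byte one, it is easy to spot as well. As an example, here is the beginning of the kernel32.dll file (32-bit version from Windows Server 2008):
46 | 
47 | ![](img/84-3.png)
48 | 
49 | Here it is "encrypted" with a 4-byte key:
50 | 
51 | ![](img/84-4.png)
52 | 
53 | It is easy to spot the four recurring characters. Indeed, the PE file header has a lot of zero-filled areas, which is why the key becomes visible.
54 | 
55 | Here is the beginning of the PE header in hexadecimal form:
56 | 
57 | ![](img/84-5.png)
58 | 
59 | Here it is "encrypted":
60 | 
61 | ![](img/84-6.png)
62 | 
63 | It is easy to spot that the key is the following 4 bytes: 8C 61 D3 63. With this information, decrypting the whole file is easy. So it is important to remember these properties of PE files: 1) the PE header has many zero-filled areas; 2) all PE sections are padded with zeroes to a page boundary (4096 bytes), so long zero areas are usually present after each section.
64 | 
65 | Some other file formats may also contain long zero-filled areas. This is very typical of files used by scientific and engineering software.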
66 | 
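67 | Those who want to inspect these files on their own can download them here: [http://go.yurichev.com/17352](http://go.yurichev.com/17352)
68 | 
69 | ###84.2.1 Exercise
70 | 
71 | As an exercise, try to decrypt the following file. Of course, the key has been changed. [http://go.yurichev.com/17353](http://go.yurichev.com/17353)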
72 | 
73 | 
74 | 
75 | 
76 | 
77 | 
78 | 


--------------------------------------------------------------------------------
/Chapter-84/img/84-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-84/img/84-1.png


--------------------------------------------------------------------------------
/Chapter-84/img/84-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-84/img/84-2.png


--------------------------------------------------------------------------------
/Chapter-84/img/84-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-84/img/84-3.png


--------------------------------------------------------------------------------
/Chapter-84/img/84-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-84/img/84-4.png


--------------------------------------------------------------------------------
/Chapter-84/img/84-5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-84/img/84-5.png


--------------------------------------------------------------------------------
/Chapter-84/img/84-6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-84/img/84-6.png


--------------------------------------------------------------------------------
/Chapter-85/Millenium game save file.md:
--------------------------------------------------------------------------------
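 1 | #Chapter 85
 2 | ##Millenium game save file
 3 | 
 4 | "Millenium Return to Earth" is an old DOS game (1991) in which you mine resources, build ships, equip them for other planets, and so on.
 5 | 
 6 | Like many other games, it lets you save the game state to a file.
 7 | 
 8 | Let's see if we can find something in it.
 9 | 
10 | Here is a mine in the game. Mines on some planets work faster, on others slower, and the set of resources also differs. Here we can see which resources are currently being mined:
11 | 
12 | ![](img/85-1.png)
13 | 
14 | I saved the game state. The save file is 9538 bytes long.
15 | 
16 | I waited a few "days" in the game, and now we have more resources from the mine:
17 | 
18 | ![](img/85-2.png)
19 | 
20 | I saved the game state again.
21 | 
22 | Now let's compare the two save files as binaries, using the simple DOS/Windows FC utility: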
23 | 
24 | ```
25 | ...> FC /b 2200save.i.v1 2200SAVE.I.V2
   | Comparing files 2200save.i.v1 and 2200SAVE.I.V2
   | 00000016: 0D 04
   | 00000017: 03 04
   | 0000001C: 1F 1E
   | 00000146: 27 3B
   | 00000BDA: 0E 16
   | 00000BDC: 66 9B
   | 00000BDE: 0E 16
   | 00000BE0: 0E 16
   | 00000BE6: DB 4C
   | 00000BE7: 00 01
   | 00000BE8: 99 E8
   | 00000BEC: A1 F3
   | 00000BEE: 83 C7
   | 00000BFB: A8 28
   | 00000BFD: 98 18
   | 00000BFF: A8 28
   | 00000C01: A8 28
   | 00000C07: D8 58
   | 00000C09: E4 A4
   | 00000C0D: 38 B8
   | 00000C0F: E8 68
26 | ...
27 | ```
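28 | The output is incomplete here: there are more differences, and I truncated the result to show the most interesting part.
29 | 
30 | In the first state I had 14 "units" of hydrogen and 102 "units" of oxygen; in the second state, 22 and 155 respectively. If these values are stored in the save file, we should see them in the difference. And indeed we do: the old file has 0x0E (14) at offset 0xBDA, and the new one has 0x16 (22) there. This is probably hydrogen. Likewise, 0x66 (102) at offset 0xBDC in the old file becomes 0x9B (155) in the new one. This looks like oxygen. Both files are available on my website for those who want to inspect them or experiment further: [beginners.re](http://beginners.re/examples/millenium_DOS_game/)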
31 | 
32 | Here is the new version of the file opened in Hiew, with the values related to the game's resources marked:
33 | 
34 | ![](img/85-3.png)
35 | 
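36 | I checked each of them, and they are clearly 16-bit values. That is not surprising for 16-bit DOS software, where the int type is 16 bits wide.
37 | 
38 | Let's check our assumption. I write 1234 (0x4D2) at the first position (this should be hydrogen):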
39 | 
40 | ![](img/85-4.png)
41 | 
42 | Then I load the modified file into the game and look at the mine statistics:
43 | 
44 | ![](img/85-5.png)
45 | 
46 | So yes, this is it.
47 | 
48 | 
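49 | Now let's try to finish the game as soon as possible by setting the maximum values everywhere:
50 | 
51 | ![](img/85-6.png)
52 | 
53 | 0xFFFF is 65535, so yes, we now have a lot of resources:
54 | 
55 | ![](img/85-7.png)
56 | 
57 | I skipped a few "days" in the game, and some resources have become scarcer:
58 | 
59 | ![](img/85-8.png)
60 | 
61 | That is simply overflow. The game's developer probably never expected such amounts of resources, so there are probably no overflow checks; but the mines keep working in the game and resources keep being added, hence the overflow. I suppose I should not have been so greedy.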
62 | 
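63 | There are probably many more values stored in this file.
64 | 
65 | So this is a very simple way of cheating in games. A high-score file can often be patched just as easily.
66 | 
67 | More about comparing files and memory snapshots: section 63.4, page 681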
68 | 


--------------------------------------------------------------------------------
/Chapter-85/img/85-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-1.png


--------------------------------------------------------------------------------
/Chapter-85/img/85-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-2.png


--------------------------------------------------------------------------------
/Chapter-85/img/85-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-3.png


--------------------------------------------------------------------------------
/Chapter-85/img/85-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-4.png


--------------------------------------------------------------------------------
/Chapter-85/img/85-5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-5.png


--------------------------------------------------------------------------------
/Chapter-85/img/85-6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-6.png


--------------------------------------------------------------------------------
/Chapter-85/img/85-7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-7.png


--------------------------------------------------------------------------------
/Chapter-85/img/85-8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-85/img/85-8.png


--------------------------------------------------------------------------------
/Chapter-86/Oracle RDBMS SYM-files.md:
--------------------------------------------------------------------------------
  1 | #Chapter 86
  2 | 
  3 | ##Oracle RDBMS: .SYM-files
  4 | 
  5 | When an Oracle RDBMS process crashes for some reason, it writes a lot of information into log files, including a stack trace like this:
  6 | 
  7 | ```
  8 | ----- Call Stack Trace -----
  9 | calling				call    entry                	argument values in hex
 10 | location            type    point                	(? means dubious value)
 11 | ------------------- -------- -------------------- 	----------------------------
 12 | _kqvrow()					00000000
 13 | _opifch2()+2729		CALLptr 00000000			  	23D4B914 E47F264 1F19AE2
 14 | 													EB1C8A8 1
 15 | _kpoal8()+2832		CALLrel _opifch2()				89 5 EB1CC74
 16 | _opiodr()+1248		CALLreg 00000000				5E 1C EB1F0A0
 17 | _ttcpip()+1051		CALLreg 00000000				5E 1C EB1F0A0 0
 18 | _opitsk()+1404		CALL??? 00000000				C96C040 5E EB1F0A0 0 EB1ED30
 19 | 													EB1F1CC 53E52E 0 EB1F1F8
 20 | _opiino()+980		CALLrel	_opitsk()				0   0
 21 | _opiodr()+1248		CALLreg	00000000				3C 4 EB1FBF4
 22 | _opidrv()+1201		CALLrel	_opiodr()				3C 4 EB1FBF4 0
 23 | _sou2o()+55			CALLrel	_opidrv()				3C 4 EB1FBF4
 24 | _opimai_real()+124	CALLrel	_opimai_real()			2 EB1FC2C
 25 | _OracleThreadStart@ CALLrel	_opimai()				2 EB1FF6C 7C88A7F4 EB1FC34 0
 26 | 4()+830												EB1FD04
 27 | 77E6481C			CALLreg 00000000				E41FF9C 0 0 E41FF9C 0 EB1FFC4
 28 | 00000000			CALL??? 00000000
 29 | ```
 30 | 
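 31 | Clearly, the Oracle RDBMS executables must carry some kind of debug information, or map files with symbol information, or something similar.
 32 | 
 33 | On Windows NT, Oracle RDBMS keeps its symbol information in files with the .SYM extension, in a proprietary format. (Plain text files are fine, but they need additional parsing and are therefore slower to access.)
 34 | 
 35 | Let's see if we can understand the format. I picked the shortest file, orawtc8.sym, which comes with orawtc8.dll in Oracle 8.1.7.
 36 | 
 37 | Here is the file loaded in Hiew: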
 38 | 
 39 | ![](img/86-1.png)
 40 | 
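 41 | Comparing it with other .SYM files, we quickly see that "OSYM" is always the header (and the footer), so it is probably the file signature.
 42 | 
 43 | We can also see that the file format is basically: OSYM + some binary data + zero-delimited text strings + OSYM. The strings are evidently function and global variable names.
 44 | 
 45 | I marked the OSYM signatures and the strings here: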
 46 | 
 47 | ![](img/86-2.png)
 48 | 
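 49 | Well, let's see. In Hiew I marked the whole strings block (except the trailing OSYM signature) and put it into a separate file. Then I ran the UNIX strings and wc utilities to count the text strings: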
 50 | 
 51 | ```
 52 | strings strings_block | wc -l
 53 | 66
 54 | ```
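 55 | So there are 66 text strings. Please note that number.
 56 | 
 57 | Generally speaking, the number of items is usually stored somewhere in the binary file itself. That is the case here too: we can find the number 66 (0x42) at the beginning of the file, right after the OSYM signature: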
 58 | 
 59 | ```
 60 | $ hexdump -C orawtc8.sym
    | 00000000  4f 53 59 4d 42 00 00 00  00 10 00 10 80 10 00 10  |OSYMB...........|
    | 00000010  f0 10 00 10 50 11 00 10  60 11 00 10 c0 11 00 10  |....P...`.......|
    | 00000020  d0 11 00 10 70 13 00 10  40 15 00 10 50 15 00 10  |....p...@...P...|
    | 00000030  60 15 00 10 80 15 00 10  a0 15 00 10 a6 15 00 10  |`...............|
    | ....
 61 | ```
 62 | 
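 63 | Of course, 0x42 here is not a byte but most likely a 32-bit value packed as little-endian, which is why 0x42 is followed by at least three zero bytes.
 64 | 
 65 | Why do I think it is 32-bit? Because Oracle RDBMS symbol files are quite large. The oracle.sym file for the main oracle.exe executable (version 10.2.0.4) contains 0x3A38E (238478) symbols. A 16-bit value is clearly not enough here.
 66 | 
 67 | I checked other .SYM files and this confirmed my guess: the 32-bit value after the OSYM signature always reflects the number of text strings in the file.
 68 | 
 69 | This is a feature of almost all binary files: a header with a signature plus some other information about the file.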
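
Reading such a header takes only a few lines. This is a minimal sketch under the assumptions made so far (a 4-byte "OSYM" signature followed by a 32-bit little-endian count); the function names are hypothetical and the file is assumed to be already loaded into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value byte by byte, so the code does not
   depend on the byte order of the machine it runs on. */
uint32_t get_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: check the "OSYM" signature and fetch the symbol
   count. Returns 0 on success, -1 if the buffer is too short or the
   signature does not match. */
int osym_read_header(const unsigned char *file, size_t len, uint32_t *count)
{
    assert(file != NULL && count != NULL);
    if (len < 8 || memcmp(file, "OSYM", 4) != 0)
        return -1;
    *count = get_u32le(file + 4);
    return 0;
}
```

Fed the first eight bytes of the hexdump above, it reports 66 symbols.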
 70 | 
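 71 | Now let's look closer at the binary block. Using Hiew again, I put the block that starts at offset 8 (right after the 32-bit count value) and ends where the strings block begins into a separate file.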
 72 | 
 73 | Let's see the binary block in Hiew:
 74 | 
 75 | ![](img/86-3.png)
 76 | 
 77 | There is a clear pattern in it.
 78 | 
 79 | I added red lines to divide the block:
 80 | 
 81 | ![](img/86-4.png)
 82 | 
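 83 | Hiew, like almost any other hexadecimal editor, shows 16 bytes per line. So the pattern is easy to see: there are four 32-bit values per line.
 84 | 
 85 | The pattern is visually obvious because some of the values (up to address 0x104) always have the form 0x1000xxxx, that is, their two most significant bytes are 0x10 and 0x00. The other values (starting at 0x108) have the form 0x0000xxxx, so they always start with two zero bytes.
 86 | 
 87 | Let's dump the block as an array of 32-bit values: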
 88 | 
 89 | 
 90 | ```
 91 | $ od -v -t x4 binary_block
    | 0000000 10001000 10001080 100010f0 10001150
    | 0000020 10001160 100011c0 100011d0 10001370
    | 0000040 10001540 10001550 10001560 10001580
    | 0000060 100015a0 100015a6 100015ac 100015b2
    | 0000100 100015b8 100015be 100015c4 100015ca
    | 0000120 100015d0 100015e0 100016b0 10001760
    | 0000140 10001766 1000176c 10001780 100017b0
    | 0000160 100017d0 100017e0 10001810 10001816
    | 0000200 10002000 10002004 10002008 1000200c
    | 0000220 10002010 10002014 10002018 1000201c
    | 0000240 10002020 10002024 10002028 1000202c
    | 0000260 10002030 10002034 10002038 1000203c
    | 0000300 10002040 10002044 10002048 1000204c
    | 0000320 10002050 100020d0 100020e4 100020f8
    | 0000340 1000210c 10002120 10003000 10003004
    | 0000360 10003008 1000300c 10003098 1000309c
    | 0000400 100030a0 100030a4 00000000 00000008
    | 0000420 00000012 0000001b 00000025 0000002e
    | 0000440 00000038 00000040 00000048 00000051
    | 0000460 0000005a 00000064 0000006e 0000007a
    | 0000500 00000088 00000096 000000a4 000000ae
    | 0000520 000000b6 000000c0 000000d2 000000e2
    | 0000540 000000f0 00000107 00000110 00000116
    | 0000560 00000121 0000012a 00000132 0000013a
    | 0000600 00000146 00000153 00000170 00000186
    | 0000620 000001a9 000001c1 000001de 000001ed
    | 0000640 000001fb 00000207 0000021b 0000022a
    | 0000660 0000023d 0000024e 00000269 00000277
    | 0000700 00000287 00000297 000002b6 000002ca
    | 0000720 000002dc 000002f0 00000304 00000321
    | 0000740 0000033e 0000035d 0000037a 00000395
    | 0000760 000003ae 000003b6 000003be 000003c6
    | 0001000 000003ce 000003dc 000003e9 000003f8
    | 0001020
 94 | ```
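 95 | There are 132 values, that is 66*2. Perhaps there are two 32-bit values for each symbol, or maybe there are two arrays? Let's see.
 96 | 
 97 | The values starting with 0x1000 may be addresses. This is a .SYM file for a DLL after all, the default base address of win32 DLLs is 0x10000000, and the code usually starts at 0x10001000.
 98 | 
 99 | When I open orawtc8.dll in IDA the base address is different, but nevertheless, the first function is: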
100 | 
101 | ```
102 | .text:60351000 sub_60351000	proc near
103 | .text:60351000
104 | .text:60351000 arg_0		= dword ptr 8
105 | .text:60351000 arg_4		= dword ptr 0Ch
106 | .text:60351000 arg_8		= dword ptr 10h
107 | .text:60351000
108 | .text:60351000 				push 	ebp
109 | .text:60351001				mov		ebp,esp
110 | .text:60351003				mov		eax, dword_60353014
111 | .text:60351008				cmp		eax, 0FFFFFFFFh
112 | .text:6035100B				jnz		short loc_6035104F
113 | .text:6035100D				mov		ecx, hModule
114 | .text:60351013				xor		eax, eax
115 | .text:60351015				cmp		ecx, 0FFFFFFFFh
116 | .text:60351018				mov		dword_60353014, eax
117 | .text:6035101D				jnz		short loc_60351031
118 | .text:6035101F				call	sub_603510F0
119 | .text:60351024				mov		ecx, eax
120 | .text:60351026				mov		eax, dword_60353014
121 | .text:6035102B				mov		hModule, ecx
122 | .text:60351031
123 | .text:60351031 loc_60351031:					; CODE XREF: sub_60351000+1D
124 | .text:60351031 				test	ecx, ecx
125 | .text:60351033				jbe		short loc_6035104F
126 | .text:60351035				push	offset ProcName ; "ax_reg"
127 | .text:6035103A				push	ecx             ; hModule
128 | .text:6035103B				call	ds:GetProcAddress
129 | ...
130 | ```
131 | 
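132 | Wow, the "ax_reg" string rings a bell. It is indeed the first string in the strings block! So the name of this function seems to be "ax_reg".
133 | 
134 | The second function is: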
135 | 
136 | 
137 | ```
138 | .text:60351080 sub_60351080		proc near
139 | .text:60351080
140 | .text:60351080 arg_0			= dword ptr  8
141 | .text:60351080 arg_4			= dword ptr  0Ch
142 | .text:60351080
143 | .text:60351080					push    ebp
144 | .text:60351081					mov     ebp, esp
145 | .text:60351083					mov     eax, dword_60353018
146 | .text:60351088					cmp     eax, 0FFFFFFFFh
147 | .text:6035108B					jnz     short loc_603510CF
148 | .text:6035108D					mov     ecx, hModule
149 | .text:60351093					xor     eax, eax
150 | .text:60351095					cmp     ecx, 0FFFFFFFFh
151 | .text:60351098					mov     dword_60353018, eax
152 | .text:6035109D					jnz     short loc_603510B1
153 | .text:6035109F					call    sub_603510F0
154 | .text:603510A4					mov     ecx, eax
155 | .text:603510A6					mov     eax, dword_60353018
156 | .text:603510AB  				mov     hModule, ecx
157 | .text:603510B1
158 | .text:603510B1 loc_603510B1:							 ; CODE XREF: sub_60351080+1D
159 | .text:603510B1 					test    ecx, ecx
160 | .text:603510B3					jbe     short loc_603510CF
161 | .text:603510B5					push    offset aAx_unreg ; "ax_unreg"
162 | .text:603510BA 					push    ecx              ; hModule
163 | .text:603510BB					call    ds:GetProcAddress
164 | ...
165 | 
166 | ```
167 | 
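168 | The "ax_unreg" string is also the second string in the strings block! The second function starts at 0x60351080, and the second value in the binary block is 10001080. So this is the address, but for a DLL loaded at the default base address.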
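
Translating between the two address spaces is a single subtraction and addition. In the sketch below the function name `rebase` is hypothetical, and the actual image base 0x60350000 is an inference from the listings (0x10001000 in the file corresponds to 0x60351000 in IDA), not a value read from the DLL header.

```c
#include <assert.h>
#include <stdint.h>

/* Move an address recorded for one image base to another image base. */
uint32_t rebase(uint32_t sym_addr, uint32_t assumed_base, uint32_t actual_base)
{
    assert(sym_addr >= assumed_base);   /* the address must lie inside the image */
    return sym_addr - assumed_base + actual_base;
}
```

For example, `rebase(0x10001080, 0x10000000, 0x60350000)` gives 0x60351080, the start of the second function shown above.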
169 | 
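170 | Now we can quickly check and make sure that the first 66 values in the array (i.e., the first half) are simply addresses of functions in the DLL, including some labels and so on. What is the other half, then? The remaining 66 values all start with 0x0000 and seem to lie in the range [0...0x3F8]. They do not look like bit fields: the series of numbers is increasing. The last hexadecimal digit seems to be random, so these are unlikely to be addresses of something (they would be divisible by 4, or maybe 8 or 0x10, otherwise).
171 | 
172 | Let's ask ourselves: what else could the Oracle RDBMS developers store in this file? A quick wild guess: it could be the addresses of the text strings (the function names). This is quickly verified, and yes, each number is simply the position of the first character of a string within the strings block.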
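
Putting the pieces together, one symbol can be looked up as sketched below. This assumes the 32-bit layout deduced above ("OSYM", count, `count` addresses, `count` string offsets, strings, "OSYM") and a file already loaded into memory. The function name `osym_symbol` is hypothetical, and no bounds checking against the file size is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t get_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the name of symbol i and store its address in *addr. */
const char *osym_symbol(const unsigned char *file, uint32_t i, uint32_t *addr)
{
    uint32_t count = get_u32le(file + 4);
    const unsigned char *addrs   = file + 8;                    /* first array  */
    const unsigned char *offsets = addrs + 4 * (size_t)count;   /* second array */
    const char *strings = (const char *)(offsets + 4 * (size_t)count);

    assert(i < count);
    *addr = get_u32le(addrs + 4 * (size_t)i);
    return strings + get_u32le(offsets + 4 * (size_t)i);
}
```

The first two string offsets in the dump, 0x0 and 0x8, fit this reading: "_ax_reg" plus its terminating zero byte occupies exactly 8 bytes.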
173 | 
174 | This is it! All done.
175 | 
176 | I wrote a utility that converts these .SYM files into an IDA script, so I can load the .IDC script and have the function names set:
177 | 
178 | ```
179 | #include <stdio.h>
    | #include <stdint.h>
    | #include <io.h>
    | #include <assert.h>
    | #include <malloc.h>
    | #include <fcntl.h>
    | #include <string.h>
    | 
    | int main (int argc, char *argv[])
    | {
    |         uint32_t sig, cnt, offset;
    |         uint32_t *d1, *d2;
    |         int     h, i, remain, file_len;
    |         char    *d3;
    |         uint32_t array_size_in_bytes;
    | 
    |         assert (argv[1]); // file name
    |         assert (argv[2]); // additional offset (if needed)
    | 
    |         // additional offset
    |         assert (sscanf (argv[2], "%X", &offset)==1);
    | 
    |         // get file length
    |         assert ((h=open (argv[1], _O_RDONLY | _O_BINARY, 0))!=-1);
    |         assert ((file_len=lseek (h, 0, SEEK_END))!=-1);
    |         assert (lseek (h, 0, SEEK_SET)!=-1);
    | 
    |         // read signature
    |         assert (read (h, &sig, 4)==4);
    |         // read count
    |         assert (read (h, &cnt, 4)==4);
    | 
    |         assert (sig==0x4D59534F); // OSYM
    | 
    |         // skip timedatestamp (for 11g)
    |         //_lseek (h, 4, 1);
    | 
    |         array_size_in_bytes=cnt*sizeof(uint32_t);
    | 
    |         // load symbol addresses array
    |         d1=(uint32_t*)malloc (array_size_in_bytes);
    |         assert (d1);
    |         assert (read (h, d1, array_size_in_bytes)==array_size_in_bytes);
    | 
    |         // load string offsets array
    |         d2=(uint32_t*)malloc (array_size_in_bytes);
    |         assert (d2);
    |         assert (read (h, d2, array_size_in_bytes)==array_size_in_bytes);
    | 
    |         // calculate strings block size:
    |         // file length minus 8-byte header, 4-byte trailing signature and both arrays
    |         remain=file_len-(8+4)-(cnt*8);
    | 
    |         // load strings block
    |         assert (d3=(char*)malloc (remain));
    |         assert (read (h, d3, remain)==remain);
    | 
    |         printf ("#include <idc.idc>\n\n");
    |         printf ("static main() {\n");
    | 
    |         for (i=0; i<cnt; i++)
    |                 printf ("\tMakeName(0x%08X, \"%s\");\n", offset + d1[i], &d3[d2[i]]);
    | 
    |         printf ("}\n");
    | 
    |         close (h);
    |         free (d1); free (d2); free (d3);
    |         return 0;
    | }
181 | ```
182 | 
183 | Here is an example of its work:
184 | 
185 | ```
186 | #include <idc.idc>
    | 
    | static main() {
    | 	MakeName(0x60351000, "_ax_reg");
    | 	MakeName(0x60351080, "_ax_unreg");
    | 	MakeName(0x603510F0, "_loaddll");
    | 	MakeName(0x60351150, "_wtcsrin0");
    | 	MakeName(0x60351160, "_wtcsrin");
    | 	MakeName(0x603511C0, "_wtcsrfre");
    | 	MakeName(0x603511D0, "_wtclkm");
    | 	MakeName(0x60351370, "_wtcstu");
    | ...
    | }
188 | ```
189 | 
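190 | The example files I used can be found here: [beginners.re](http://go.yurichev.com/17216)
191 | 
192 | Now let's try the 64-bit Oracle RDBMS. The addresses should be 64-bit there, right?
193 | 
194 | The 8-byte pattern is even easier to see here:
195 | 
196 | ![](img/86-5.png)
197 | 
198 | So yes, all the tables now have 64-bit elements, even the string offsets. The signature is now OSYMAM64, presumably to distinguish the target platform.
199 | 
200 | This is it. Here is my library of functions for accessing Oracle RDBMS .SYM files: [GitHub](https://github.com/dennis714/porg/blob/master/oracle_sym.c)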
201 | 
202 | 
203 |  
204 | 
205 | 
206 | 
207 | 
208 | 


--------------------------------------------------------------------------------
/Chapter-86/img/86-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-86/img/86-1.png


--------------------------------------------------------------------------------
/Chapter-86/img/86-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-86/img/86-2.png


--------------------------------------------------------------------------------
/Chapter-86/img/86-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-86/img/86-3.png


--------------------------------------------------------------------------------
/Chapter-86/img/86-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-86/img/86-4.png


--------------------------------------------------------------------------------
/Chapter-86/img/86-5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-86/img/86-5.png


--------------------------------------------------------------------------------
/Chapter-87/Oracle RDBMS MSB-files.md:
--------------------------------------------------------------------------------
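 1 | #Chapter 87
 2 | ##Oracle RDBMS: .MSB-files
 3 | 
 4 | This is a binary file that contains error messages together with their error codes. Let's try to understand its format and find a way to unpack it.
 5 | 
 6 | Oracle RDBMS error messages are also available in text form, so we can compare the text file with the packed binary file.
 7 | 
 8 | Here is the beginning of the ORAUS.MSG text file, with some irrelevant comments stripped: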
 9 | 
10 | ```
11 | 00000, 00000, "normal, successful completion"
   | 00001, 00000, "unique constraint (%s.%s) violated"
   | 00017, 00000, "session requested to set trace event"
   | 00018, 00000, "maximum number of sessions exceeded"
   | 00019, 00000, "maximum number of session licenses exceeded"
   | 00020, 00000, "maximum number of processes (%s) exceeded"
   | 00021, 00000, "session attached to some other process; cannot switch session"
   | 00022, 00000, "invalid session ID; access denied"
   | 00023, 00000, "session references process private memory; cannot detach session"
   | 00024, 00000, "logins from more than one process not allowed in single-process mode"
   | 00025, 00000, "failed to allocate %s"
   | 00026, 00000, "missing or invalid session ID"
   | 00027, 00000, "cannot kill current session"
   | 00028, 00000, "your session has been killed"
   | 00029, 00000, "session is not a user session"
   | 00030, 00000, "User session ID does not exist."
   | 00031, 00000, "session marked for kill"
   | ...
12 | ```
13 | 
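14 | The first number is the error code. The second may be some additional flags; I'm not sure.
15 | 
16 | Now let's open the ORAUS.MSB binary file and find these text strings. Here they are: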
17 | 
18 | ![](img/87-1.png)
19 | 
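20 | We see the text strings (including those from the beginning of the ORAUS.MSG file) interleaved with some binary values. A quick investigation shows that the main part of the binary file is divided into blocks of 0x200 (512) bytes.
21 | 
22 | Let's look at the contents of the first block: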
23 | 
24 | ![](img/87-2.png)
25 | 
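26 | Here we see the texts of the first error messages. We also see that there are no zero bytes between the error messages, which means these are not null-terminated C strings. Consequently, the length of each error message must be encoded somehow. Let's also try to find the error codes. The ORAUS.MSG file starts with these: 0, 1, 17 (0x11), 18 (0x12), 19 (0x13), 20 (0x14), 21 (0x15), 22 (0x16), 23 (0x17), 24 (0x18)... I found these numbers at the beginning of the block and marked them with red lines. The period between the error codes is 6 bytes, which suggests that 6 bytes are allocated for each error message.
27 | 
28 | The first 16-bit value (0xA, or 10) is the number of messages in the block; I confirmed this by examining other blocks. Indeed: error messages have arbitrary sizes, some long and some short, while the block size is always fixed, so you never know in advance how many messages will fit into a block.
29 | 
30 | As I already noted, since these are not null-terminated C strings, their sizes must be encoded somewhere. The first string, "normal, successful completion", is 29 (0x1D) bytes long. The second, "unique constraint (%s.%s) violated", is 34 (0x22) bytes long. We cannot find these numbers (0x1D and 0x22) in the block.
31 | 
32 | There is one more thing. Oracle RDBMS has to know the position of the string it needs to load within the block, right? The first string, "normal, successful completion", starts at position 0x1444 (counting from the beginning of the file) or 0x44 (counting from the beginning of the block). The second string, "unique constraint (%s.%s) violated", starts at 0x1461 (from the beginning of the file) or 0x61 (from the beginning of the block). These numbers (0x44 and 0x61) look familiar! We can clearly see them at the start of the block.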
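
The relation between the two ways of counting can be written down directly. The sketch assumes that blocks are 0x200 bytes long and aligned to multiples of 0x200 from the start of the file, which is consistent with the offsets quoted above; the names are hypothetical.

```c
#include <assert.h>

#define MSB_BLOCK_SIZE 0x200UL

/* File offset at which the block containing file_off starts. */
unsigned long msb_block_start(unsigned long file_off)
{
    return file_off - file_off % MSB_BLOCK_SIZE;
}

/* Position of file_off counted from the start of its block. */
unsigned long msb_offset_in_block(unsigned long file_off)
{
    return file_off % MSB_BLOCK_SIZE;
}
```

For file offset 0x1444 this gives a block starting at 0x1400 and an in-block position of 0x44; for 0x1461 the in-block position is 0x61.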
33 | 
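34 | So, each 6-byte block (entry) consists of:
35 | 
36 | *	a 16-bit error code;
37 | *	a 16-bit zero (or perhaps additional flags);
38 | *	a 16-bit starting position of the text string within the current block.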
39 | 
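40 | A quick check of the other values confirms that I'm right. There is also a final "dummy" 6-byte block with an error code of zero and a starting position just past the last character of the last error message. Probably that is how the length of a text message is determined? We enumerate the 6-byte blocks to find the error code we need, take the position of the text string from it, and then take the position of the next text string from the following 6-byte block. That gives us the boundaries of the string. This method saves some space by not storing the size of each text string in the file. I can't say it saves a lot, but it is a clever trick.
41 | 
42 | Let's get back to the header of the .MSB file: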
43 | 
44 | ![](img/87-3.png)
45 | 
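46 | We can quickly find the number of blocks in the file (marked with a red line). I checked other .MSB files and found that this holds for all of them. There are many other values here, but I did not investigate them, since my job (an unpacking utility) was done. If I were to write an .MSB file packer, I would probably need to understand the meaning of the other values.
47 | 
48 | The header is followed by a table that probably contains 16-bit values: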
49 | 
50 | ![](img/87-4.png)
51 | 
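52 | Its size can be determined visually (I drew red lines). While dumping these values, I found that each 16-bit value is the last error code of the corresponding block.
53 | 
54 | So this is how an Oracle RDBMS error message can be found quickly:
55 | 
56 | *	load the table I call last_errnos (it holds the last error code of each block);
57 | *	find the block that contains the error code we need, assuming that error codes increase within each block and across the whole file;
58 | *	load that specific block;
59 | *	enumerate the 6-byte structures until the error code we need is found;
60 | *	get the position of the last character from the next 6-byte block;
61 | *	load all the characters of the message in this range.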
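
The lookup steps above can be sketched as two small functions. This is an illustration under the assumptions made in this chapter (a 16-bit message count at the start of each block, followed by 6-byte entries, all little-endian), not the code of the unpacker itself; the function names are hypothetical and the table and block are assumed to be already loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t get_u16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Index of the first block whose last error code is >= code,
   or -1 if the code lies beyond the last block. */
int msb_find_block(const uint16_t *last_errnos, int nblocks, uint16_t code)
{
    int i;

    for (i = 0; i < nblocks; i++)
        if (code <= last_errnos[i])
            return i;
    return -1;
}

/* Locate the text of `code` inside one block. On success, stores the
   text length in *len and returns a pointer to the text, which is NOT
   zero-terminated. Returns NULL if the block does not contain the code. */
const unsigned char *msb_find_text(const unsigned char *block,
                                   uint16_t code, size_t *len)
{
    unsigned count = get_u16le(block);   /* number of messages in the block */
    unsigned i;

    assert(len != NULL);
    for (i = 0; i < count; i++) {
        const unsigned char *e = block + 2 + 6 * i;   /* i-th 6-byte entry */

        if (get_u16le(e) == code) {
            unsigned start = get_u16le(e + 4);
            unsigned end   = get_u16le(e + 6 + 4);    /* next entry's offset */

            *len = end - start;
            return block + start;
        }
    }
    return NULL;
}
```

Because the last real entry is followed by the terminating entry, reading the offset of entry `i + 1` is valid for every message in the block.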
62 | 
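63 | This is the C program I wrote to unpack .MSB files: [beginners.re](http://go.yurichev.com/17213)
64 | 
65 | These are the two files I used as examples (Oracle RDBMS 11.1.0.6): [beginners.re](http://go.yurichev.com/17214), [beginners.re](http://go.yurichev.com/17215)
66 | 
67 | ###87.1 Summary
68 | 
69 | This method is probably too old-school for modern computers. Supposedly, the file format was designed in the mid-1980s by a developer used to economizing on memory and disk space. Nevertheless, it was an interesting and easy task, because the format of a proprietary file could be understood without looking into the Oracle RDBMS code.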
70 | 
71 | 
72 | 	
73 | 
74 | 
75 | 
76 | 
77 | 
78 | 
79 | 
80 | 
81 | 
82 | 
83 | 
84 | 


--------------------------------------------------------------------------------
/Chapter-87/img/87-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-87/img/87-1.png


--------------------------------------------------------------------------------
/Chapter-87/img/87-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-87/img/87-2.png


--------------------------------------------------------------------------------
/Chapter-87/img/87-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-87/img/87-3.png


--------------------------------------------------------------------------------
/Chapter-87/img/87-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leecade/reverse-engineering-for-beginners/400807b9d4b5edfa0173e30f08c5ddb649006bd0/Chapter-87/img/87-4.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Reverse engineering for beginners
 2 | 
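 3 | ## Notes
 4 | 
 5 | This is the repository where a certain same-sex dating community is translating a certain BL novel; heterosexuals, please do not [click](index.md)
 6 | 
 7 | ## Origin
 8 | 
 9 | Two years ago @瞌睡龙 of the WooYun knowledge base started a project to translate "Reverse engineering for beginners", but the translation saw little maintenance afterwards, so none of the changes Dennis Yurichev made to the book over those two years were picked up. The WooYun translation also has quite a few flaws, such as awkward sentences and messy formatting.
10 | 
11 | A few days ago I happened to see on Dennis Yurichev's [home page](http://beginners.re/) that the Korean edition of "Reverse engineering for beginners" has been published and is on sale. By comparison, the book gets very little attention in the Chinese technical community. Having taken part in the earlier translation, I could not bear to see such a good reverse engineering book go unnoticed there. I searched around and so far have not found any translation other than the WooYun one. A link to the WooYun Chinese translation can be found on Dennis Yurichev's blog: http://yurichev.com/blog/2015-aug-20/.
12 | 
13 | Dennis Yurichev's acknowledgements also list a Chinese translation by the company Antiy, but I could not find it. If anyone knows where it is, please let me know. Thanks :)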
14 | 
15 | ![](http://static.wooyun.org/upload/image/201605/2016052513120168158.png)
16 | 
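17 | Finally, I have not read many reverse engineering books, but compared with 《加密与解密》 ("Encryption and Decryption") from the Chinese Kanxue community, "Reverse engineering for beginners" is better suited as an introduction. It also covers more ground: the x86/x64 and ARM instruction sets, OS-level topics on Linux and Windows, and an overview of various reverse engineering tools. So I do recommend it as a first book on reverse engineering.
18 | 
19 | ## Revision plan
20 | 
21 | 1. Review and revise every chapter, fixing language errors, messy formatting and similar problems.
22 | 2. Many terms are translated inconsistently and the formatting varies; this needs to be improved as well.
23 | 3. The author has changed the original considerably in the meantime: quite a few chapters are missing or have been removed, and the chapter division is very different, so the translation has to be brought back in line with the original.
24 | 4. Migrate to gitbook and KanCloud, and start a series on WooYun drops to serialize the book, for a better reading experience.
25 | 5. Migrate to a LaTeX version. The markdown version will still be maintained, and markdown should remain the main editing format, since more people know it :) LaTeX fans, please bear with us.
26 | 
27 | ## Task list
28 | 
29 | - Unify the translated terminology and the formatting.
30 | - Adjust the table of contents so that every chapter lines up with the original.
31 | - Add the missing chapters and remove those that no longer match the original; the book had a little over 800 pages at the time of the first translation and now has more than 1300.
32 | - Proofreading; ideally every chapter has someone who helps maintain and proofread it.
33 | - Migration to a LaTeX version. Quite a few people asked me for one, so it is on the TODO list; I hope someone picks this task up.
34 | - Building and maintaining the e-book on gitbook.
35 | - Building and maintaining the e-book on KanCloud.
36 | - Building and maintaining the e-book on WooYun drops.
37 | 
38 | If you are interested, please open an issue on my GitHub fork, say which part you would like to work on, and briefly describe your plan. Others interested in the same work can follow up and discuss in that issue.
39 | 
40 | Even if your change is a single comma, I will add you to the list of contributors.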
41 | 
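42 | ## Contributors
43 | 
44 | ### Translators of the first edition
45 | 
46 | 瞌睡龙、糖果、blast、magix526、Larryxi、左懒、DM_、Zing
47 | 
48 | ### Contributors to the revision
49 | 
50 | If anyone is missing, please let me know. Thanks!
51 | 
52 | ## Copyright
53 | 
54 | The copyright of the original book belongs to its author, Dennis Yurichev. The right to interpret the translation belongs to the WooYun community (thanks to @瞌睡龙 for leading and supporting the effort) and to everyone who took part in the translation.
55 | 
56 | ## Finally
57 | 
58 | Finally, I posted a bounty thread on WooYun zone; the bounty goes to the first contributor :)
59 | 
60 | http://zone.wooyun.org/content/27463
61 | 
62 | I hope everyone interested will help improve this book!
63 | 
64 | - Original: https://github.com/dennis714/RE-for-beginners
65 | - WooYun main repository: https://github.com/woolabs/Reverseng
66 | - My fork on GitHub: https://github.com/veficos/reverse-engineering-for-beginners
67 | 
68 | Since the administrators of the WooYun main repository are rarely online, I agreed with @瞌睡龙 to do the revision work on my GitHub fork first and merge it back into the main repository once most of the tasks are done.
69 | 
70 | You can also add me on QQ: 736809745.
71 | 
72 | Anything is welcome, not only translation work: small talk, technical discussion, and so on :)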
73 | 


--------------------------------------------------------------------------------
/index.md:
--------------------------------------------------------------------------------
 1 | # Table of contents
 2 | 
 3 | - [A short introduction to the CPU](Chapter-01/Chapter-1.md)
 4 | - [Hello, world!](Chapter-02/Chapter-2.md)
 5 | - [Function prologue](Chapter-03/Chapter-3.md)
 6 | - [Stack](Chapter-04/Chapter-4.md)
 7 | - [printf() and argument handling](Chapter-05/Chapter-5.md)
 8 | - [scanf()](Chapter-06/Chapter-6.md)
 9 | - [Accessing passed arguments](Chapter-07/Chapter-7.md)
10 | - [Return values of one or more words](Chapter-08/Chapter-8.md)
11 | - [Pointers](Chapter-09/Chapter-9.md)
12 | - [Conditional jumps](Chapter-10/Chapter-10.md)
13 | - [switch()/case/default](Chapter-11/Chapter-11.md)
14 | - [Loops](Chapter-12/Chapter-12.md)
15 | - [strlen()](Chapter-13/Chapter-13.md)
16 | - [Division](Chapter-14/Chapter-14.md)
17 | - [Working with the FPU](Chapter-15/Chapter-15.md)
18 | - [Arrays](Chapter-16/Chapter-16.md)
19 | - [Bit fields](Chapter-17/Chapter-17.md)
20 | - [Structures](Chapter-18/Chapter-18.md)
21 | - [Unions](Chapter-19/Chapter-19.md)
22 | - [Pointers to functions](Chapter-20/Chapter-20.md)
23 | - [64-bit values in a 32-bit environment](Chapter-21/Chapter-21.md)
24 | - [SIMD](Chapter-22/Chapter-22.md)
25 | - [64 bits](Chapter-23/Chapter-23.md)
26 | - [Working with floating point numbers using SIMD on x64](Chapter-24/Chapter-24.md)
27 | - [Temperature conversion](Chapter-25/Chapter-25.md)
28 | - [C99 restrict](Chapter-26/Chapter-26.md)
29 | - [Inline functions](Chapter-27/Chapter-27.md)
30 | - [Incorrectly disassembled code](Chapter-28/Chapter-28.md)
31 | - [Obfuscation](Chapter-29/Chapter-29.md)
32 | - [16-bit Windows](Chapter-30/Chapter-30.md)
33 | - [Classes](Chapter-31/Chapter-31.md)
34 | - [ostream](Chapter-32/Chapter-32.md)
35 | - [STL](Chapter-33/Chapter-33.md)
36 | 
37 | The links below still need fixing; for now, please look up the corresponding content in the folders yourself.
38 | - [Java](Chapter-54/Chapter-54.md)
39 | - [MicrosoftVisualC++](Chapter-55/Chapter-55.md)
40 | - [communication_with_the_outer_world_(win32)](Chapter-56/Chapter-56.md)
41 | - [text_strings](Chapter-57/Chapter-57.md)
42 | - [call_to_assert](Chapter-58/Chapter-58.md)
43 | - [constants](Chapter-59/Chapter-59.md)
44 | - [finding_the_right_instructions](Chapter-60/Chapter-60.md)
45 | - [xor_instructions](Chapter-61/Chapter-61.md)
46 | - [using_magic_numbers_while_tracing](Chapter-62/Chapter-62.md)
47 | - [general_idea](Chapter-63/Chapter-63.md)
48 | - [ArgumentsPassingMethods](Chapter-64/Chapter-64.md)
49 | - [ThreadLocalStorage](Chapter-65/Chapter-65.md)
50 | - [SystemCalls](Chapter-66/Chapter-66.md)
51 | - [Linux](Chapter-67/Chapter-67.md)
52 | - [Windows-NT](Chapter-68/Chapter-68.md)
53 | - [Disassembler](Chapter-69/Chapter-69.md)
54 | - [Debugger](Chapter-70/Chapter-70.md)
55 | - [SystemCallTracing](Chapter-71/Chapter-71.md)
56 | - [Decompilers](Chapter-72/Chapter-72.md)
57 | - [OtherTools](Chapter-73/Chapter-73.md)
58 | - [Primitive XOR-encryption](Chapter-84/Chapter-84.md)
59 | - [Millenium game save file](Chapter-85/Chapter-85.md)
60 | - [Oracle RDBMS SYM-files](Chapter-86/Chapter-86.md)
61 | - [Oracle RDBMS MSB-files](Chapter-87/Chapter-87.md)
62 | 


--------------------------------------------------------------------------------