├── .gitignore ├── LICENSE ├── OCaml_crackmes ├── README.md ├── baby ├── imgs │ ├── 1.png │ ├── 10.png │ ├── 2.png │ ├── 3.png │ ├── 4.png │ ├── 5.png │ ├── 6.png │ ├── 7.png │ ├── 8.png │ └── 9.png ├── solve.py └── teenager ├── README.md ├── ReverseMe3 ├── README.md ├── ReverseMe3.EXE ├── ReverseMe3_vm_fixed.exe ├── imgs │ ├── 1.png │ ├── 10.png │ ├── 11.png │ ├── 12.png │ ├── 13.png │ ├── 14.png │ ├── 15.png │ ├── 16.png │ ├── 17.png │ ├── 18.png │ ├── 19.png │ ├── 2.png │ ├── 20.png │ ├── 21.png │ ├── 22.png │ ├── 23.png │ ├── 3.png │ ├── 4.png │ ├── 5.png │ ├── 6.png │ ├── 7.png │ ├── 8.png │ └── 9.png └── name_hash.py ├── armageddon ├── angr-solve.py ├── armageddon ├── imgs │ ├── 1.png │ ├── 2.png │ └── 3.png ├── qiling_emulate.py ├── solve.md └── z3_solve.py ├── automating-gdb ├── README.md ├── creator-writeup.rar ├── gdb_solve.py ├── imgs │ ├── 1.png │ ├── 2.png │ ├── 3.png │ ├── 4.png │ ├── 5.png │ ├── 6.png │ ├── binaryninja-recursion.bndb-20ab.html │ └── binaryninja-recursion.bndb-code_buffer.html ├── recursion.elf └── solution.txt ├── client_houseplant_ctf_2020 ├── client.apk ├── imgs │ ├── 1.png │ ├── 2.png │ ├── 3.png │ ├── 4.png │ └── video.png ├── mitm-solve.py └── solve.md ├── elf_format ├── README.md ├── binaries │ ├── dumped-elf │ ├── dumped-elf_fixed │ ├── dumped-elf_fixed_patched │ └── tricky-crackme ├── imgs │ ├── 1.png │ ├── 10.png │ ├── 11.png │ ├── 12.png │ ├── 13.png │ ├── 2.png │ ├── 3.png │ ├── 4.png │ ├── 5.png │ ├── 6.png │ ├── 7.png │ ├── 8.png │ └── 9.png └── test_lazy_binding │ ├── .gdb_history │ ├── simple │ ├── simple.c │ ├── simple_4 │ ├── simple_non_lazy │ ├── simple_patched │ └── simple_patched_2 ├── how_to_not_write_a_bad_crackme └── how_to_not_write_a_bad_crackme.md ├── obfuscation ├── README.md ├── imgs │ ├── 1.png │ ├── 10.png │ ├── 2.png │ ├── 3.png │ ├── 4.png │ ├── 5.png │ ├── 6.png │ ├── 7.png │ ├── 8.png │ └── 9.png ├── keygenme4.exe └── solution │ ├── de-obfuscate │ ├── __init__.py │ └── de-obfuscate.py │ └── keygen.py 
├── quarkslab_android_crackme ├── imgs │ ├── analyze_me.png │ ├── app.png │ ├── country_code.png │ ├── data.png │ ├── jni.png │ ├── key.png │ ├── loop.png │ ├── loop_body.png │ ├── phone_func.png │ ├── wrong.png │ └── wrong_number.png └── main.md ├── reverse_engineering_and_fixing_a_fan ├── imgs │ ├── img-1.jpg │ ├── img-2.jpg │ ├── img-3.jpg │ └── img-4.jpg └── main.md ├── shl_undefined_behavior ├── imgs │ ├── 1.png │ ├── 2.png │ └── 3.png ├── src_and_binaries │ ├── test │ ├── test.c │ ├── test_O2 │ ├── test_missing_UL │ ├── test_missing_UL.c │ └── test_missing_UL_O2 └── writeup.md └── x86 ├── README.md ├── imgs └── 1.png ├── source ├── Makefile ├── code.h ├── generator.py └── main.c ├── x86 └── z3_solve.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.bndb 2 | .vscode -------------------------------------------------------------------------------- /OCaml_crackmes/README.md: -------------------------------------------------------------------------------- 1 | # Solving Two OCaml Crackmes Without Knowing Much about OCaml 2 | 3 | Earlier this year, my friend [Towel](https://crackmes.one/user/Towel) uploaded two OCaml crackmes to crackmes.one. One of them is [`Baby OCaml`](https://crackmes.one/crackme/5f600af333c5d4357b3b01d6), and the other one is called [`Teenager OCaml`](https://crackmes.one/crackme/5f600b9933c5d4357b3b01d7). Well, interesting names! 4 | 5 | This is not the first time Towel has come up with OCaml crackmes. [`Qt Scanner`](https://crackmes.one/crackme/5ec1b82133c5d449d91ae539), rated as level 5, is a hard challenge. I attempted it, but have not succeeded yet. So, when I first saw these two new OCaml challenges, I was not very eager to try them, even though they are rated as levels 1 and 3. Nevertheless, we cannot hide from challenges forever, so I decided to try them last week. The outcome was good: I managed to solve both without digging deep into the OCaml runtime. 
6 | 7 | ## Baby OCaml 8 | 9 | OCaml can run as interpreted bytecode, but it can also be compiled to native code. This is in contrast to Python/PyInstaller, where the script is just packaged into the generated binary and we can restore its original source. The OCaml compiler generates native code based on the source code, and the source is not present within the generated binary. Worse still, when we deal with less familiar programming languages, e.g., OCaml, Go, Rust, we are likely to encounter some novel things we do not expect. For example, Rust has a very different way of passing parameters and return values of a function. We need to first get familiar with it, then start reversing the actual code logic. 10 | 11 | The Baby binary is 2.0 MB in size, which is HUGE for a crackme. The OCaml runtime occupies most of that space, so we need to find the code that we are interested in. Opening the binary in BinaryNinja reveals that it is a statically linked binary: 12 | 13 | 14 | 15 | OK, so even libc functions are not easy to find. But the entry point looks so familiar to me that I can still recognize that the `call 0x470980` at 0x401c48 is `__libc_start_main`, and that `sub_401770`, loaded at 0x401c41, is the `main` function. However, the `main` function is mostly initializing the OCaml runtime, and I could not find the actual entry point to the crackme's code. 16 | 17 | Then I decided to run the binary and see if I could get any clue from it: 18 | 19 | ```Bash 20 | $ ./baby 21 | -= Montrehack =- 22 | Baby OCaml 23 | 24 | [!] Nope, try again. 25 | ``` 26 | 27 | OK, it does not ask for input, so the input should probably be supplied as a command line argument. I tried to find the strings it prints but failed. Well, the strings must be encrypted or otherwise obfuscated. Without them I could not quickly find the logic that checks the input, so again this was a dead end. 28 | 29 | I tried to reverse the binary for half a day but could not make a breakthrough. The call stack is deep and lots of function pointers are used. 
I was lost and put the binary aside for a while until one day Towel poked me to try his challenges. I told him that I could not even solve the baby one, thanks to the string obfuscation. We chatted about the challenges a little bit, and I decided to give it another try. 30 | 31 | This time, I have to admit, I was super lucky. I browsed the string list and spotted something unusual in the first few: 32 | 33 | 34 | 35 | Looks readable, right? I navigated to the location and the code seems to be comparing strings: 36 | 37 | 38 | 39 | I am pretty sure the code is checking whether the ASCII string at `rax` is `Getting_Warmed_Up`. Note, the last char, `'p'`, with an ASCII value 0x70, is checked against 0x600000000000070. Because of little-endian byte order, `'p'` sits in the lowest byte of that qword, but I have no idea what the 0x06 in the highest byte means. So the OCaml runtime does have some quite unusual quirks. 40 | 41 | Anyways, I solved the challenge: 42 | 43 | ```Bash 44 | $ ./baby Getting_Warmed_Up 45 | -= Montrehack =- 46 | Baby OCaml 47 | 48 | [+] Success! 49 | 50 | FLAG-c34bc2bd73fdb06799061a8e76f62664 51 | ``` 52 | 53 | ## Teenager OCaml 54 | 55 | Although I did not solve the last challenge decently, I could not wait to start working on the Teenager one. This binary is 1.9 MB in size. So, yeah, the size is mostly static libraries + OCaml runtime, and the size of the actual logic is almost negligible within it. 56 | 57 | This time it does not use string obfuscation, so I can easily locate the place where the binary asks for input: 58 | 59 | 60 | 61 | The control flow seems quite obvious: the first node asks for input, then there are two checks, and we must get to the lower left node to pass. I was pretty relieved when I saw this since there aren't many functions in this graph. However, it turns out I was naive and too optimistic about it. 62 | 63 | The first thing that I cannot understand is.... the first check. 
64 | 65 | 66 | 67 | At 0x403168, `rbx` must be 0x2b, from which we can deduce that `rbx` must be 0x15 at 0x403163. Tracing back from there, it becomes weird. From debugging I noticed that at 0x40314b, `rax` actually points to the ASCII string of the input. What could be located at `rax-0x8`? Well, I am not sure, but it is highly likely to be something related to the string's length. However, reading the code I could not make any sense of it. I tried inputs with different lengths and the value did not change according to the input length. 68 | 69 | Furthermore, at 0x40315b there is a `movzx rdi, byte [rax+rbx]`. We know `rax` is the string, so if this reads one of the input chars, then this check is very strange: the length would be combined with one particular char, and the result must be 0x15. 70 | 71 | Luckily, I debugged the code more and found that after the code at 0x403160, `rbx` always holds the length of the input. So this one is checking whether the string length is 0x15. The OCaml mystery is still unsolved, but I managed to get some information out of it. 72 | 73 | Now, there are only three functions ahead, but I cannot trace the execution easily. The code uses lots of function pointers and I quickly got lost. A patient reverser would study the OCaml compiler to figure out how the code is generated, but I still have one thing to try: a hardware breakpoint on the input string. 74 | 75 | The plan is simple: we now know that `rax` points to the string at 0x40314b, so we can set a hardware breakpoint on it and see who accesses it. If everything goes well, we can find the code that reads the input, which is very likely to also be the checking logic. 76 | 77 | I set a breakpoint at 0x40317f, and supplied the input string "111111111111111111111" (which is just '1' * 0x15). It hits! Not bad, at least we are correct on the length check. 
The pwndbg output shows that `rax` does point to the input string: 78 | 79 | ```Bash 80 | RAX 0x7ffff7ff9b90 ◂— '111111111111111111111' 81 | ``` 82 | 83 | Then I set a hardware read breakpoint: 84 | 85 | ```Bash 86 | pwndbg> rwatch *0x7ffff7ff9b95 87 | Hardware read watchpoint 3: *0x7ffff7ff9b95 88 | ``` 89 | 90 | Note, the string starts at 0x7ffff7ff9b90, but I set the breakpoint at 0x7ffff7ff9b95, which is the 6th char of the input. This is a personal habit, since there could be many places that access the first char, more than we are interested in. On the other hand, code that reads a char in the middle is more likely to be interesting and worth checking out. 91 | 92 | The hardware breakpoint is hit at 0x402c07, and the instruction above it is reading the 6th char of the input: 93 | 94 | 95 | 96 | This function (sub_4024f0) looks like: 97 | 98 | 99 | 100 | So it is very likely that the function is checking every char one by one. This function has no xref to it at all, so I probably would not have found it easily without the hardware breakpoint. Inspecting the stack gives me the actual caller, 0x402410, and I have to say it is not easy to find the actual callee without debugging. The good news is that if I were to reverse OCaml in the future, I know where to look, and with the help of debugging I can hopefully find the callee and sort out the execution flow. 101 | 102 | 103 | 104 | 105 | I notice that if the check passes, the return value is set to 0xa7. Remember the check at 0x4031e6? 0xd9f is quite a strange value, but it could be related to the 0xa7 here. 106 | 107 | 108 | 109 | ```Python 110 | >>> hex(0xd9f // 0x15) 111 | '0xa6' 112 | ``` 113 | 114 | So, there is some code, which I have not discovered, that subtracts 1 from each return value and then sums everything up. Now we know the check for each char, and it should not be hard to dump the constraints and solve them with z3. 115 | 116 | I am lazy and do not wish to manually transcribe the constraints into z3 syntax. 
However, angr does not easily work with it, thanks to the OCaml runtime, which angr does not understand. So I need to use the BinaryNinja API to carve out just the pieces angr needs, so that angr can work on one check at a time. 117 | 118 | ## Solving with BinaryNinja and Angr 119 | 120 | 121 | 122 | If we look at the basic block at 0x402d4c, there are two inputs to it: 1) the ASCII string in `rax`, 2) the value of `rbx` set at 0x402d24. We also need to extract the char index from the instruction at 0x402d4c (0x3 in this screenshot). To get the initial value of `rbx`, we do not need to search for the instruction at 0x402d24. Instead, we can use the possible value set of `rbx` to get it. To drive angr, we also need to get the target addresses of the true/false branches of the conditional at 0x402d64. 123 | 124 | ### Getting True/False Branch Address 125 | 126 | To get the good/bad branch, we first get the `outgoing_edges` of a basic block and check the `edge.type`: 127 | 128 | ```Python 129 | bbl = bv.get_basic_blocks_at(addr)[0] 130 | edges = bbl.outgoing_edges 131 | for edge in edges: 132 | if edge.type == BranchType.TrueBranch: 133 | good_addr = edge.target.start 134 | elif edge.type == BranchType.FalseBranch: 135 | bad_addr = edge.target.start 136 | ``` 137 | 138 | ### Parsing LLIL and Getting Char Index 139 | 140 | For each constraint, we need to know the index of the char being checked. For example, for the instruction `movzx rax, byte [rax+0x3]`, we need to get 0x3 from it. This requires us to walk the LLIL instruction and find the value. 
141 | 142 | ```Python 143 | def find_llil_basic_block(llil_basic_blocks, addr): 144 | for llil_bbl in llil_basic_blocks: 145 | if llil_bbl[0].address == addr: 146 | return llil_bbl 147 | 148 | func = bv.get_functions_containing(addr)[0] 149 | llil_basic_blocks = list(func.llil_basic_blocks) 150 | llil_bbl = find_llil_basic_block(llil_basic_blocks, addr) 151 | src = llil_bbl[0].operands[1].operands[0].operands[0] 152 | 153 | char_idx = 0 154 | if src.operation == LowLevelILOperation.LLIL_ADD: 155 | char_idx = src.operands[1].value.value 156 | ``` 157 | 158 | Note, the above code might not be very reader-friendly, e.g., `src = llil_bbl[0].operands[1].operands[0].operands[0]`. This is because LLIL is essentially a tree, and we are travelling down it. 159 | 160 | 161 | ### Getting the Possible Value of rbx 162 | 163 | To get the possible value of rbx when the execution enters the basic block, we need to use the `get_possible_reg_values` API. 164 | 165 | ```Python 166 | rbx_value = 0 167 | value_set = llil_bbl[0].get_possible_reg_values('rbx') 168 | if value_set.type == RegisterValueType.ConstantValue: 169 | rbx_value = value_set.value 170 | rbx_value &= 0xffffffffffffffff 171 | ``` 172 | 173 | Note, not all of the checks use rbx. For those, the `value_set.type` will be `UndeterminedValue`, and rbx_value will remain 0x0. This has no side effect on solving. 
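The `& 0xffffffffffffffff` mask is easy to overlook. A minimal sketch of what it does, assuming the constant can come back from the analysis as a negative Python integer (that is my assumption about why the mask is there, and `to_u64` is a name made up for this illustration): masking folds the value into the unsigned 64-bit pattern a register would actually hold.

```python
MASK64 = 0xffffffffffffffff

def to_u64(value):
    """Reinterpret a (possibly negative) Python int as an unsigned 64-bit value."""
    return value & MASK64

# A negative constant becomes its two's-complement bit pattern.
print(hex(to_u64(-1)))
# Non-negative values that already fit in 64 bits pass through unchanged.
print(hex(to_u64(0x2b)))
```

Python integers are arbitrary precision, so without the mask a negative value would stay negative instead of wrapping the way a 64-bit register does.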
174 | 175 | ### Angr Time 176 | 177 | The last step is to solve it with angr: 178 | 179 | ```Python 180 | def angr_solve(addr, good_addr, bad_addr, char_idx, rbx_value): 181 | proj = angr.Project('./teenager') 182 | state = proj.factory.entry_state(addr = addr) 183 | # suppose the input string (ASCII) is stored at 0xaa000000 184 | input_addr = 0xaa000000 185 | state.regs.rax = input_addr 186 | state.regs.rbx = rbx_value 187 | flag = state.solver.BVS('flag', 8) 188 | state.memory.store(input_addr + char_idx, flag) 189 | simgr = proj.factory.simgr(state) 190 | simgr.explore(find = good_addr, avoid = [bad_addr]) 191 | if simgr.found: 192 | solution_state = simgr.found[0] 193 | char_solution = solution_state.solver.eval(flag, cast_to = bytes) 194 | return True, char_solution 195 | else: 196 | return False, None 197 | ``` 198 | 199 | Note, the above script is not super robust, since we really expect the solving to succeed. 200 | 201 | We still need to manually collect the addresses of the 0x15 basic blocks. Although it is possible to collect them automatically, I feel that making that work would take longer than just selecting and copying 0x15 addresses. 202 | 203 | The script returns `0CamL_Ints_Ar3_W4rped`, and feeding it to the challenge gives me: 204 | 205 | ```Bash 206 | $ ./teenager 207 | -= Montrehack =- 208 | Teenager 209 | 210 | Enter Password: 0CamL_Ints_Ar3_W4rped 211 | 212 | [+] Success! 213 | FLAG-221fddd2bbf810be10d156b060b0eda5 214 | ``` 215 | 216 | This reminds me of the description of the challenge: 217 | 218 | ``` 219 | A slightly harder OCaml challenge to get practice with OCaml integer representations. 220 | ``` 221 | 222 | So, it seems that I solved it without knowing anything about OCaml integer representation. 
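For what it is worth, the odd constants in both challenges line up with how the OCaml native runtime is documented to represent values: an integer `n` is stored as `2n + 1`, and a string is padded to a whole number of 8-byte words, with the very last byte holding the padding count. The sketch below only redoes the arithmetic from this writeup under that model. The function names are mine, and the idea that the hidden summing code is a plain tagged addition is an inference, not something I traced in the binary.

```python
WORD = 8  # bytes per machine word on x86-64

def tag(n):
    """OCaml stores the integer n as 2n + 1 (lowest bit set)."""
    return 2 * n + 1

def tagged_add(a, b):
    """Add two tagged integers: (2x+1) + (2y+1) - 1 == 2(x+y) + 1."""
    return a + b - 1

def last_qword(text):
    """Last 8-byte word of an OCaml string block, read as a little-endian int."""
    data = text.encode('ascii')
    words = (len(data) + WORD) // WORD        # words occupied by the string
    pad = words * WORD - 1 - len(data)        # value stored in the final byte
    padded = data + b'\x00' * pad + bytes([pad])
    return int.from_bytes(padded[-WORD:], 'little')

# Teenager: a length of 0x15 compares against its tagged form.
print(hex(tag(0x15)))                         # 0x2b

# Teenager: 0x15 passing checks each return 0xa7; summing them as tagged ints.
acc = tag(0)
for _ in range(0x15):
    acc = tagged_add(acc, 0xa7)
print(hex(acc))                               # 0xd9f

# Baby: the final qword of 'Getting_Warmed_Up', padding byte included.
print(hex(last_qword('Getting_Warmed_Up')))   # 0x600000000000070
```

Under this model the "minus 1" before the sum is just the tag bit being cancelled on each addition, and the high byte of the qword compared against `'p'` is the string's padding count.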
-------------------------------------------------------------------------------- /OCaml_crackmes/baby: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/baby -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/1.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/10.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/2.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/3.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/4.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/5.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/5.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/6.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/7.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/8.png -------------------------------------------------------------------------------- /OCaml_crackmes/imgs/9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/imgs/9.png -------------------------------------------------------------------------------- /OCaml_crackmes/solve.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python3 2 | from binaryninja import * 3 | import angr 4 | 5 | bv = BinaryViewType.get_view_of_file('teenager.bndb') 6 | 7 | basic_block_addrs = [ 8 | 0x402d4c, 9 | 0x402cdc, 10 | 0x402c6c, 11 | 0x402c02, 12 | 0x402ba2, 13 | 0x402b3c, 14 | 0x402ad2, 15 | 0x402a6c, 16 | 0x4029fc, 17 | 0x40298c, 18 | 0x402922, 19 | 0x4028b2, 20 | 0x402842, 21 | 0x4027e2, 
22 | 0x402782, 23 | 0x40271c, 24 | 0x4026b2, 25 | 0x40264c, 26 | 0x4025e2, 27 | 0x402582, 28 | 0x402512 29 | ] 30 | 31 | def find_llil_basic_block(llil_basic_blocks, addr): 32 | for llil_bbl in llil_basic_blocks: 33 | if llil_bbl[0].address == addr: 34 | return llil_bbl 35 | 36 | def collect_info(bv, addr): 37 | 38 | bbl = bv.get_basic_blocks_at(addr)[0] 39 | 40 | edges = bbl.outgoing_edges 41 | good_addr = 0 42 | bad_addr = 0 43 | for edge in edges: 44 | if edge.type == BranchType.TrueBranch: 45 | good_addr = edge.target.start 46 | elif edge.type == BranchType.FalseBranch: 47 | bad_addr = edge.target.start 48 | 49 | func = bv.get_functions_containing(addr)[0] 50 | llil_basic_blocks = list(func.llil_basic_blocks) 51 | llil_bbl = find_llil_basic_block(llil_basic_blocks, addr) 52 | src = llil_bbl[0].operands[1].operands[0].operands[0] 53 | 54 | char_idx = 0 55 | if src.operation == LowLevelILOperation.LLIL_ADD: 56 | char_idx = src.operands[1].value.value 57 | 58 | rbx_value = 0 59 | value_set = llil_bbl[0].get_possible_reg_values('rbx') 60 | if value_set.type == RegisterValueType.ConstantValue: 61 | rbx_value = value_set.value 62 | rbx_value &= 0xffffffffffffffff 63 | 64 | return good_addr, bad_addr, char_idx, rbx_value 65 | 66 | def angr_solve(addr, good_addr, bad_addr, char_idx, rbx_value): 67 | proj = angr.Project('./teenager') 68 | state = proj.factory.entry_state(addr = addr) 69 | # suppose the input string (ASCII) is stored at 0xaa000000 70 | input_addr = 0xaa000000 71 | state.regs.rax = input_addr 72 | state.regs.rbx = rbx_value 73 | flag = state.solver.BVS('flag', 8) 74 | state.memory.store(input_addr + char_idx, flag) 75 | simgr = proj.factory.simgr(state) 76 | simgr.explore(find = good_addr, avoid = [bad_addr]) 77 | if simgr.found: 78 | solution_state = simgr.found[0] 79 | char_solution = solution_state.solver.eval(flag, cast_to = bytes) 80 | return True, char_solution 81 | else: 82 | return False, None 83 | 84 | def main(): 85 | solution = [0] * 0x15 86 | for 
addr in basic_block_addrs: 87 | good_addr, bad_addr, char_idx, rbx_value = collect_info(bv, addr) 88 | print(hex(addr), hex(good_addr), hex(bad_addr), hex(char_idx), hex(rbx_value)) 89 | solved, output = angr_solve(addr, good_addr, bad_addr, char_idx, rbx_value) 90 | if solved: 91 | print(hex(char_idx), output) 92 | solution[char_idx] = output.decode('ascii') 93 | 94 | flag = ''.join(solution) 95 | print(flag) 96 | 97 | if __name__ == '__main__': 98 | main() 99 | 100 | # 0CamL_Ints_Ar3_W4rped 101 | 102 | # $ ./teenager 103 | # -= Montrehack =- 104 | # Teenager 105 | 106 | # Enter Password: 0CamL_Ints_Ar3_W4rped 107 | 108 | # [+] Success! 109 | # FLAG-221fddd2bbf810be10d156b060b0eda5 -------------------------------------------------------------------------------- /OCaml_crackmes/teenager: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/OCaml_crackmes/teenager -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # writeups 2 | writeups for CTFs and other stuff 3 | -------------------------------------------------------------------------------- /ReverseMe3/ReverseMe3.EXE: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/ReverseMe3.EXE -------------------------------------------------------------------------------- /ReverseMe3/ReverseMe3_vm_fixed.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/ReverseMe3_vm_fixed.exe -------------------------------------------------------------------------------- /ReverseMe3/imgs/1.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/1.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/10.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/11.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/12.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/13.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/14.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/15.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/15.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/16.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/17.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/17.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/18.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/18.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/19.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/19.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/2.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/20.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/20.png 
-------------------------------------------------------------------------------- /ReverseMe3/imgs/21.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/21.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/22.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/22.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/23.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/23.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/3.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/4.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/5.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/6.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/6.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/7.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/8.png -------------------------------------------------------------------------------- /ReverseMe3/imgs/9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/ReverseMe3/imgs/9.png -------------------------------------------------------------------------------- /ReverseMe3/name_hash.py: -------------------------------------------------------------------------------- 1 | def rol(val, n): 2 | bin_str = bin(val)[2:] 3 | bin_str = '0' * (32 - len(bin_str)) + bin_str 4 | bin_str = bin_str[n : ] + bin_str[ : n] 5 | return int(bin_str, 2) 6 | 7 | def calc_hash(name): 8 | val = 0 9 | for c in name: 10 | val += ord(c) 11 | val &= 0xffffffff 12 | val = rol(val, ord(c) & 31) 13 | return val 14 | 15 | def main(): 16 | hash_1 = calc_hash('LoadLibraryA') 17 | print(hex(hash_1)) 18 | hash_2 = calc_hash('MessageBoxA') 19 | print(hex(hash_2)) 20 | 21 | main() -------------------------------------------------------------------------------- /armageddon/angr-solve.py: -------------------------------------------------------------------------------- 1 | import angr 2 | import 
claripy 3 | 4 | proj = angr.Project('./armageddon') 5 | print(hex(proj.entry)) 6 | start_address = 0x14a88 7 | state = proj.factory.entry_state(addr = start_address) 8 | 9 | input_addr = 0xaa000000 10 | r11 = input_addr + 0x34 11 | state.regs.r11 = r11 12 | 13 | n = 42 14 | flag = state.solver.BVS('flag', n * 8) 15 | state.memory.store(input_addr, flag) 16 | 17 | simgr = proj.factory.simgr(state) 18 | good = 0x1504c 19 | simgr.explore(find = good, 20 | avoid = [ 21 | 0x10674, 22 | 0x107c8, 23 | 0x109ac, 24 | 0x10b6c, 25 | 0x10cf0, 26 | 0x10ea4, 27 | 0x11010, 28 | 0x11190, 29 | 0x11308, 30 | 0x114a4, 31 | 0x116a8, 32 | 0x1185c, 33 | 0x119c8, 34 | 0x11b84, 35 | 0x11d38, 36 | 0x11f10, 37 | 0x120c4, 38 | 0x122e4, 39 | 0x124c8, 40 | 0x1264c, 41 | 0x12800, 42 | 0x12948, 43 | 0x12b1c, 44 | 0x12d30, 45 | 0x12e9c, 46 | 0x13070, 47 | 0x13248, 48 | 0x133e0, 49 | 0x135f0, 50 | 0x137d4, 51 | 0x13970, 52 | 0x13b50, 53 | 0x13cbc, 54 | 0x13e6c, 55 | 0x14014, 56 | 0x141c8, 57 | 0x1434c, 58 | 0x144c4, 59 | 0x14648, 60 | 0x1485c, 61 | 0x149a0 62 | ] 63 | ) 64 | 65 | if simgr.found: 66 | solution_state = simgr.found[0] 67 | input1 = solution_state.solver.eval(flag, cast_to = bytes) 68 | print('flag: ', input1) 69 | else: 70 | print('Could not find flag') 71 | 72 | -------------------------------------------------------------------------------- /armageddon/armageddon: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/armageddon/armageddon -------------------------------------------------------------------------------- /armageddon/imgs/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/armageddon/imgs/1.png -------------------------------------------------------------------------------- /armageddon/imgs/2.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/armageddon/imgs/2.png -------------------------------------------------------------------------------- /armageddon/imgs/3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/armageddon/imgs/3.png -------------------------------------------------------------------------------- /armageddon/qiling_emulate.py: -------------------------------------------------------------------------------- 1 | from qiling import * 2 | 3 | if __name__ == "__main__": 4 | ql = Qiling(["./armageddon"], "/mnt/F/reversing/qiling/examples/rootfs/arm_linux") 5 | ql.run() 6 | -------------------------------------------------------------------------------- /armageddon/solve.md: -------------------------------------------------------------------------------- 1 | # Solving an ARM challenge with z3 2 | 3 | ## First Impression 4 | 5 | Last week's challenge is hosted at https://crackmes.one/crackme/5edb0b8533c5d449d91ae73b. It was authored by Towel and was a real challenge in UMDCTF 2019. 6 | 7 | Loading it into BinaryNinja reveals that it is an ARM binary. That is not very surprising, as its name is `armageddon`. ARM is no longer special to me, as I have gradually become familiar with the ISA. After all, it is simpler than x86, and the frequently used instructions are easy to understand and remember. 8 | 9 | BinaryNinja has no problem recognizing `__libc_start_main`, so I can get to `main` easily. The first thing I find is that `main` is a **long** function. 10 | 11 | 12 | 13 | Well, a long function is not necessarily hard to analyze. It probably leverages some obfuscation and/or its code is pretty repetitive. I started browsing the code from the beginning. 
14 | 15 | 16 | 17 | It first prints a welcoming message and then asks the user to type the input. After that, it calls `scanf` with `"%41s"`, which reads at most 41 chars from the terminal. Not bad: we now know it accepts a string as the input, and we know its maximum length. 18 | 19 | We also notice that the basic blocks are split into quite short ones. This is probably an obfuscation technique. Nevertheless, BinaryNinja more or less accounts for it automatically, so we are not bothered by it. If a disassembler does not correctly inline the blocks after the jump (`b`), the code could be harder to analyze. 20 | 21 | After reading the input, the code becomes repetitive: each time, a function is called with the user input as its only parameter. The pattern is repeated until near the bottom of the function, where a loop is found. The loop could be decrypting the flag based on the correct user input. The checks on the input are evidently inside these called functions. 22 | 23 | I followed the first check function and it looks like this: 24 | 25 | 26 | 27 | Near the bottom of the function we see the comparison, and if the two values are not equal, an error message is printed. After analyzing the algorithm, I find the constraint is: 28 | 29 | ``` 30 | passwd[1] * passwd[0x27] * passwd[0x15] + passwd[0x11] + passwd[0x13] * passwd[0x1e] == 0xdb11e 31 | ``` 32 | 33 | I browsed several other check functions and they all look similar. Now it becomes obvious: there is a series of constraints, and the correct input must satisfy all of them. 34 | 35 | 36 | ## Round One: Failure of angr 37 | 38 | This challenge is very suitable for tools like `angr` or `z3`. In fact, angr also uses z3 as its constraint-solving backend. However, angr can automatically extract constraints from the binary, which could save a lot of time for reversers. So I decided to give angr a try first. 39 | 40 | The code is not hard to write, especially since angr scripts look similar across different binaries. 
41 | 42 | Code for [angr-solve.py](angr-solve.py) 43 | 44 | ```Python 45 | import angr 46 | import claripy 47 | 48 | proj = angr.Project('./armageddon') 49 | print(hex(proj.entry)) 50 | start_address = 0x14a88 51 | state = proj.factory.entry_state(addr = start_address) 52 | 53 | input_addr = 0xaa000000 54 | r11 = input_addr + 0x34 55 | state.regs.r11 = r11 56 | 57 | n = 42 58 | flag = state.solver.BVS('flag', n * 8) 59 | state.memory.store(input_addr, flag) 60 | 61 | simgr = proj.factory.simgr(state) 62 | good = 0x1504c 63 | simgr.explore(find = good, 64 | avoid = [ 65 | 0x10674, 66 | 0x107c8, 67 | 0x109ac, 68 | 0x10b6c, 69 | 0x10cf0, 70 | 0x10ea4, 71 | 0x11010, 72 | 0x11190, 73 | 0x11308, 74 | 0x114a4, 75 | 0x116a8, 76 | 0x1185c, 77 | 0x119c8, 78 | 0x11b84, 79 | 0x11d38, 80 | 0x11f10, 81 | 0x120c4, 82 | 0x122e4, 83 | 0x124c8, 84 | 0x1264c, 85 | 0x12800, 86 | 0x12948, 87 | 0x12b1c, 88 | 0x12d30, 89 | 0x12e9c, 90 | 0x13070, 91 | 0x13248, 92 | 0x133e0, 93 | 0x135f0, 94 | 0x137d4, 95 | 0x13970, 96 | 0x13b50, 97 | 0x13cbc, 98 | 0x13e6c, 99 | 0x14014, 100 | 0x141c8, 101 | 0x1434c, 102 | 0x144c4, 103 | 0x14648, 104 | 0x1485c, 105 | 0x149a0 106 | ] 107 | ) 108 | 109 | if simgr.found: 110 | solution_state = simgr.found[0] 111 | input1 = solution_state.solver.eval(flag, cast_to = bytes) 112 | print('flag: ', input1) 113 | else: 114 | print('Could not find flag') 115 | 116 | ``` 117 | 118 | We tell angr where the input is, and specify a good address to reach, as well as an (optional) list of addresses to avoid. The addresses to be avoided are the ones that print error messages. 119 | 120 | This, in theory, should work. However, after running for several minutes, angr tells me there is no solution. This is a little bit surprising, as I assumed that as long as a solution exists, angr would either return it or keep running. There could be multiple reasons for it, e.g., a bug in angr, or the constraints are not properly lifted, etc. 
We could output the constraints that angr is solving and troubleshoot what went wrong. But please allow me to leave that as future work. 121 | 122 | ## Round Two: Conquering it with z3 123 | 124 | The next option is to convert the constraints into Python syntax and solve them with z3. Transcribing them by hand is arduous and error-prone. It is better done in an automated or semi-automated way. 125 | 126 | I opened the challenge binary in Ghidra and found that the decompilation is generally good: 127 | 128 | ```C 129 | int FUN_000104fc(int param_1) 130 | 131 | { 132 | if ((uint)*(byte *)(param_1 + 1) * 133 | (uint)*(byte *)(param_1 + 0x27) * (uint)*(byte *)(param_1 + 0x15) + 134 | (uint)*(byte *)(param_1 + 0x11) + 135 | (uint)*(byte *)(param_1 + 0x13) * (uint)*(byte *)(param_1 + 0x1e) != 0xdb11e) { 136 | puts("\n[!] Code did not validate! :(\n"); 137 | /* WARNING: Subroutine does not return */ 138 | exit(0); 139 | } 140 | return param_1; 141 | } 142 | ``` 143 | Then I copied and pasted all the constraints into a temp script to convert them into Python syntax. This work is still quite repetitive, so I decided to convert the code with a regular expression. 144 | 145 | I did it in VS Code. I searched for 146 | 147 | ```Python 148 | \(uint\)\*\(byte \*\)\(param_1 \+ ((0x)?[0-9a-f]+)\) 149 | ``` 150 | 151 | and replaced the matches with 152 | 153 | ``` 154 | passwd[$1] 155 | ``` 156 | 157 | Basically, this converts `(uint)*(byte *)(param_1 + 1)` to `passwd[1]`. Some manual work is still needed, like removing the `if`, but it is not hard to do. 
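
The same search-and-replace can also be scripted instead of done in the editor. Below is a minimal sketch using Python's `re` module with the pattern quoted above; the helper name `to_passwd_syntax` and the `sample` string are made up for illustration, and a read at offset 0 may be decompiled in a different shape that this pattern would not match.

```python
import re

# Pattern from the write-up: matches Ghidra's `(uint)*(byte *)(param_1 + N)`.
BYTE_READ = re.compile(r"\(uint\)\*\(byte \*\)\(param_1 \+ ((0x)?[0-9a-f]+)\)")


def to_passwd_syntax(decompiled: str) -> str:
    """Rewrite every byte read from param_1 as passwd[N] (hypothetical helper)."""
    # \1 is the captured offset, the equivalent of $1 in VS Code.
    return BYTE_READ.sub(r"passwd[\1]", decompiled)


# A fragment of the decompiled check function shown above.
sample = "(uint)*(byte *)(param_1 + 0x13) * (uint)*(byte *)(param_1 + 0x1e) != 0xdb11e"
print(to_passwd_syntax(sample))
# prints: passwd[0x13] * passwd[0x1e] != 0xdb11e
```

The `if (...) {` wrapper and the `!=` still have to be turned into `s.add(... == ...)` by hand or with a second substitution.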
158 | 159 | Eventually, the solving script looks like this [(z3_solve.py)](z3_solve.py): 160 | 161 | ```Python 162 | from z3 import * 163 | 164 | n = 41 165 | passwd = [BitVec('s_%d' % i, 32) for i in range(n)] 166 | 167 | s = Solver() 168 | for i in range(n): 169 | s.add(passwd[i] >= 0x21) 170 | s.add(passwd[i] <= 127) 171 | 172 | s.add(passwd[1] * passwd[0x27] * passwd[0x15] + passwd[0x11] + passwd[0x13] * passwd[0x1e] == 0xdb11e) 173 | s.add(passwd[0x25] - passwd[0x13] * passwd[0xc] == -0xc0c) 174 | s.add(((passwd[2] - passwd[0x1f]) + passwd[0x21] * passwd[0xd] * passwd[0x14]) - passwd[0x11] == 0xebd1d) 175 | s.add((passwd[7] + passwd[0x24] * passwd[0xf]) - passwd[0x1d] * passwd[0x22] == 0x18e5) 176 | s.add((passwd[0x15] - passwd[0x1b] * passwd[0xf]) - passwd[0x11] == -0x2e3b) 177 | s.add(((passwd[0xf] - passwd[0x25] * passwd[8]) - passwd[5]) - passwd[6] == -0x19a5) 178 | s.add(((passwd[0x23] + passwd[0x1d]) - passwd[0x14]) + passwd[0x1a] == 0xc4) 179 | s.add(passwd[7] * passwd[0x20] + passwd[0x1f] * passwd[0xb] == 0x45ca) 180 | s.add(passwd[0x1d] * passwd[0x18] * passwd[0x24] + passwd[0x25] == 0xac3fb) 181 | s.add(((passwd[8] - passwd[0x10]) - passwd[0xc]) + passwd[0x28] + passwd[0xf] == 0xd0) 182 | s.add((passwd[0x23] * passwd[0x11] * passwd[0x0] - passwd[0xb]) + passwd[0xc] * passwd[7] * passwd[0x26] == 0x172e48) 183 | s.add(((passwd[0x1a] - passwd[0xd]) + passwd[3] * passwd[8]) - passwd[5] == 0x10b8) 184 | s.add(passwd[3] + passwd[0x11] + passwd[0x24] + passwd[0x14] == 0x160) 185 | s.add((passwd[0x1a] - passwd[0x15] * passwd[0x12]) + passwd[0x1b] * passwd[0x19] == 0x8a2) 186 | s.add((passwd[0x22] - passwd[0xe]) + passwd[5] * passwd[0x21] + passwd[0x23] == 0x1bd8) 187 | s.add(passwd[5] * passwd[8] * passwd[0x26] * passwd[0x19] + passwd[0x15] + passwd[0x23] == 0x2ca6988) 188 | s.add((passwd[8] * passwd[8] + passwd[0x15] * passwd[0xc]) - passwd[0x24] == 0x2430) 189 | s.add((((passwd[0x23] + passwd[2]) - passwd[7]) - passwd[9] * passwd[0x12]) + passwd[2] * 
passwd[0x27] == 0x2de) 190 | s.add(((passwd[5] * (passwd[0x11] - 1) - passwd[6]) - passwd[0x14]) - passwd[0x22] * passwd[0x17] == -0x11d5) 191 | s.add((passwd[0x22] - passwd[0xb]) + passwd[0xb] * passwd[0xd] == 0x2aba) 192 | s.add((passwd[0x22] - passwd[0xb]) + passwd[0xb] * passwd[0xd] == 0x2aba) 193 | s.add(passwd[0x1b] + passwd[0x12] * passwd[0xf] + passwd[0x20] + passwd[9] == 0x2668) 194 | s.add(passwd[0x15] - passwd[0xe] * passwd[0x1d] == -0x1400) 195 | s.add((((passwd[9] * passwd[9] - passwd[10]) + passwd[0xd]) - passwd[0x24]) - passwd[0x14] == 0x19ac) 196 | s.add(((passwd[0xc] + passwd[2] + passwd[0x22]) - passwd[4] * passwd[0x14] * passwd[0x17]) + passwd[0x16] == -0xafa0c) 197 | s.add(((passwd[4] + passwd[5]) - passwd[10]) + passwd[0x1b] == 0xb4) 198 | s.add(((passwd[0xf] - passwd[0x1c]) - passwd[0x25]) - passwd[0x18] * passwd[0x12] * passwd[0x0] == -0xd06e8) 199 | s.add(((passwd[4] * passwd[0x23] + passwd[0x19]) - passwd[0x15]) - passwd[0x18] * passwd[0x14] == -0x1f8) 200 | s.add((((passwd[0x19] + passwd[10]) - passwd[0xf]) + passwd[0x1c]) - passwd[0x21] == 0x3e) 201 | s.add((((passwd[6] - passwd[0x19]) + passwd[2]) - passwd[0x19]) + passwd[1] + passwd[0x12] * passwd[0x1c] == 0x1eb9) 202 | s.add(passwd[0xb] * (passwd[5] + passwd[0x22] * passwd[0x16]) + passwd[0xc] + passwd[0x22] == 0x121b93) 203 | s.add((((passwd[3] + passwd[0xe]) - passwd[0x26]) - passwd[0xd]) - passwd[1] == -0x80) 204 | s.add((((passwd[0x1e] + passwd[0x15]) - passwd[0x11]) - passwd[0x17] * passwd[5]) + passwd[0x21] == -0x1afd) 205 | s.add((passwd[7] - passwd[0xe]) + passwd[0x11] + passwd[0x21] == 0xdf) 206 | s.add((passwd[8] - passwd[3]) + passwd[2] * passwd[10] * passwd[10] == 0x626e2) 207 | s.add(((passwd[0x25] + passwd[7]) - passwd[0x13]) + passwd[0xc] + passwd[0xb] == 0x12f) 208 | s.add(passwd[1] + passwd[8] * passwd[0x14] + passwd[0x20] + passwd[0xf] == 0x167a) 209 | s.add((passwd[0x11] - passwd[4]) - passwd[0x1d] * passwd[0x12] == -0x11ca) 210 | s.add((passwd[0xd] * passwd[0x16] - 
passwd[10]) - passwd[0x23] == 0x32e9) 211 | s.add(passwd[0xd] + passwd[0xb] + passwd[0x1d] * passwd[0x13] == 0xec9) 212 | s.add((((passwd[0x19] + passwd[0x26] * passwd[0xf]) - passwd[0xb]) + passwd[0x20]) - passwd[0x15] * passwd[0x22] == 0x2a) 213 | s.add(passwd[6] * passwd[9] + passwd[0x23] == 0xedd) 214 | 215 | if s.check() == sat: 216 | print('solved!') 217 | m = s.model() 218 | flag = '' 219 | for i in range(n): 220 | c = m[passwd[i]].as_long() 221 | flag += chr(c) 222 | print(flag) 223 | else: 224 | print('failed') 225 | ``` 226 | 227 | One thing to mention here is that although the individual chars of passwd are only 8 bits wide, we declare them as 32-bit values. Otherwise, the products and sums would wrap around at 8 bits, and the comparison with the wider constants at the end of each line would no longer mean what the binary computes. We also have to add the constraints `passwd[i] >= 0x21` and `passwd[i] <= 127` to restrict the chars to printable ASCII. 228 | 229 | Running this immediately returns the flag: 230 | 231 | ``` 232 | UMDCTF-{ARM_1s_s0_SATisfying_7y8fdlsjebn} 233 | ``` 234 | 235 | ## Epilog 236 | 237 | Although z3 returns a result that looks quite convincing, there is still some code below the last constraint. Typically, in CTFs, this means the correct input that passes the constraints is NOT the actual flag; rather, the input is used to decrypt the flag to be submitted. However, the result above is already in a good flag format. This confused me, so I decided to run the binary to see what happens. There is an excellent tool for this situation: the process-level emulator [`Qiling`](https://www.qiling.io/). 238 | 239 | Qiling is an emulator based on Unicorn. It is simpler than Qemu since it only emulates the process that we are interested in. So there is no need to set up a bulky OS to run it. 
The code is extremely simple [(qiling_emulate.py)](qiling_emulate.py): 240 | 241 | ```Python 242 | from qiling import * 243 | 244 | if __name__ == "__main__": 245 | ql = Qiling(["./armageddon"], "QILING_INSTALL_PATH/examples/rootfs/arm_linux") 246 | ql.run() 247 | ``` 248 | 249 | Since the major system calls are implemented by Qiling, the program executes properly. Below is an excerpt of the output: 250 | 251 | ``` 252 | write(1,27008,16) = 0 253 | [+] write() CONTENT: bytearray(b'[+] Enter Code: ') 254 | [+] Enter Code: UMDCTF-{ARM_1s_s0_SATisfying_7y8fdlsjebn} 255 | read(0, 0x29010, 0x2000) = 42 256 | write(1,27008,1) = 0 257 | [+] write() CONTENT: bytearray(b'\n') 258 | 259 | write(1,27008,33) = 0 260 | [+] write() CONTENT: bytearray(b'[+] Code validated successfully!\n') 261 | [+] Code validated successfully! 262 | write(1,27008,1) = 0 263 | [+] write() CONTENT: bytearray(b'\n') 264 | 265 | [!] 0xf7ca9be8: syscall number = 0x8c(140) not implemented 266 | exit_group(0) 267 | ``` 268 | 269 | So after we supply the correct code, it simply prints `Code validated successfully!\n`. 270 | 271 | LoL! I forgot that, up to that point, the program had not yet told us that the code is correct. Well, not bad, since playing with Qiling is quite painless. 
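
As an extra sanity check that does not depend on emulation, the recovered flag can be plugged back into a few of the constraints using plain Python integers. This is only a sketch: it checks three constraints copied from the solving script rather than all of them, and the `& 0xFF` line assumes 8-bit arithmetic to illustrate why the z3 variables were declared 32 bits wide.

```python
flag = "UMDCTF-{ARM_1s_s0_SATisfying_7y8fdlsjebn}"
passwd = [ord(c) for c in flag]

# Three of the constraints from z3_solve.py, evaluated on the concrete flag.
checks = [
    passwd[1] * passwd[0x27] * passwd[0x15] + passwd[0x11]
    + passwd[0x13] * passwd[0x1e] == 0xdb11e,
    passwd[0x25] - passwd[0x13] * passwd[0xc] == -0xc0c,
    passwd[6] * passwd[9] + passwd[0x23] == 0xedd,
]
print(all(checks))  # prints: True

# If the arithmetic were truncated to 8 bits, the last left-hand side
# would wrap around and no longer equal the constant 0xedd.
wrapped = (passwd[6] * passwd[9] + passwd[0x23]) & 0xFF
print(hex(wrapped))  # prints: 0xdd
```

Plain integers are enough here because every intermediate value in these three constraints fits comfortably in 32 bits.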
-------------------------------------------------------------------------------- /armageddon/z3_solve.py: -------------------------------------------------------------------------------- 1 | from z3 import * 2 | 3 | n = 41 4 | passwd = [BitVec('s_%d' % i, 32) for i in range(n)] 5 | 6 | s = Solver() 7 | for i in range(n): 8 | s.add(passwd[i] >= 0x21) 9 | s.add(passwd[i] <= 127) 10 | 11 | s.add(passwd[1] * passwd[0x27] * passwd[0x15] + passwd[0x11] + passwd[0x13] * passwd[0x1e] == 0xdb11e) 12 | s.add(passwd[0x25] - passwd[0x13] * passwd[0xc] == -0xc0c) 13 | s.add(((passwd[2] - passwd[0x1f]) + passwd[0x21] * passwd[0xd] * passwd[0x14]) - passwd[0x11] == 0xebd1d) 14 | s.add((passwd[7] + passwd[0x24] * passwd[0xf]) - passwd[0x1d] * passwd[0x22] == 0x18e5) 15 | s.add((passwd[0x15] - passwd[0x1b] * passwd[0xf]) - passwd[0x11] == -0x2e3b) 16 | s.add(((passwd[0xf] - passwd[0x25] * passwd[8]) - passwd[5]) - passwd[6] == -0x19a5) 17 | s.add(((passwd[0x23] + passwd[0x1d]) - passwd[0x14]) + passwd[0x1a] == 0xc4) 18 | s.add(passwd[7] * passwd[0x20] + passwd[0x1f] * passwd[0xb] == 0x45ca) 19 | s.add(passwd[0x1d] * passwd[0x18] * passwd[0x24] + passwd[0x25] == 0xac3fb) 20 | s.add(((passwd[8] - passwd[0x10]) - passwd[0xc]) + passwd[0x28] + passwd[0xf] == 0xd0) 21 | s.add((passwd[0x23] * passwd[0x11] * passwd[0x0] - passwd[0xb]) + passwd[0xc] * passwd[7] * passwd[0x26] == 0x172e48) 22 | s.add(((passwd[0x1a] - passwd[0xd]) + passwd[3] * passwd[8]) - passwd[5] == 0x10b8) 23 | s.add(passwd[3] + passwd[0x11] + passwd[0x24] + passwd[0x14] == 0x160) 24 | s.add((passwd[0x1a] - passwd[0x15] * passwd[0x12]) + passwd[0x1b] * passwd[0x19] == 0x8a2) 25 | s.add((passwd[0x22] - passwd[0xe]) + passwd[5] * passwd[0x21] + passwd[0x23] == 0x1bd8) 26 | s.add(passwd[5] * passwd[8] * passwd[0x26] * passwd[0x19] + passwd[0x15] + passwd[0x23] == 0x2ca6988) 27 | s.add((passwd[8] * passwd[8] + passwd[0x15] * passwd[0xc]) - passwd[0x24] == 0x2430) 28 | s.add((((passwd[0x23] + passwd[2]) - passwd[7]) - 
passwd[9] * passwd[0x12]) + passwd[2] * passwd[0x27] == 0x2de) 29 | s.add(((passwd[5] * (passwd[0x11] - 1) - passwd[6]) - passwd[0x14]) - passwd[0x22] * passwd[0x17] == -0x11d5) 30 | s.add((passwd[0x22] - passwd[0xb]) + passwd[0xb] * passwd[0xd] == 0x2aba) 31 | s.add((passwd[0x22] - passwd[0xb]) + passwd[0xb] * passwd[0xd] == 0x2aba) 32 | s.add(passwd[0x1b] + passwd[0x12] * passwd[0xf] + passwd[0x20] + passwd[9] == 0x2668) 33 | s.add(passwd[0x15] - passwd[0xe] * passwd[0x1d] == -0x1400) 34 | s.add((((passwd[9] * passwd[9] - passwd[10]) + passwd[0xd]) - passwd[0x24]) - passwd[0x14] == 0x19ac) 35 | s.add(((passwd[0xc] + passwd[2] + passwd[0x22]) - passwd[4] * passwd[0x14] * passwd[0x17]) + passwd[0x16] == -0xafa0c) 36 | s.add(((passwd[4] + passwd[5]) - passwd[10]) + passwd[0x1b] == 0xb4) 37 | s.add(((passwd[0xf] - passwd[0x1c]) - passwd[0x25]) - passwd[0x18] * passwd[0x12] * passwd[0x0] == -0xd06e8) 38 | s.add(((passwd[4] * passwd[0x23] + passwd[0x19]) - passwd[0x15]) - passwd[0x18] * passwd[0x14] == -0x1f8) 39 | s.add((((passwd[0x19] + passwd[10]) - passwd[0xf]) + passwd[0x1c]) - passwd[0x21] == 0x3e) 40 | s.add((((passwd[6] - passwd[0x19]) + passwd[2]) - passwd[0x19]) + passwd[1] + passwd[0x12] * passwd[0x1c] == 0x1eb9) 41 | s.add(passwd[0xb] * (passwd[5] + passwd[0x22] * passwd[0x16]) + passwd[0xc] + passwd[0x22] == 0x121b93) 42 | s.add((((passwd[3] + passwd[0xe]) - passwd[0x26]) - passwd[0xd]) - passwd[1] == -0x80) 43 | s.add((((passwd[0x1e] + passwd[0x15]) - passwd[0x11]) - passwd[0x17] * passwd[5]) + passwd[0x21] == -0x1afd) 44 | s.add((passwd[7] - passwd[0xe]) + passwd[0x11] + passwd[0x21] == 0xdf) 45 | s.add((passwd[8] - passwd[3]) + passwd[2] * passwd[10] * passwd[10] == 0x626e2) 46 | s.add(((passwd[0x25] + passwd[7]) - passwd[0x13]) + passwd[0xc] + passwd[0xb] == 0x12f) 47 | s.add(passwd[1] + passwd[8] * passwd[0x14] + passwd[0x20] + passwd[0xf] == 0x167a) 48 | s.add((passwd[0x11] - passwd[4]) - passwd[0x1d] * passwd[0x12] == -0x11ca) 49 | 
s.add((passwd[0xd] * passwd[0x16] - passwd[10]) - passwd[0x23] == 0x32e9) 50 | s.add(passwd[0xd] + passwd[0xb] + passwd[0x1d] * passwd[0x13] == 0xec9) 51 | s.add((((passwd[0x19] + passwd[0x26] * passwd[0xf]) - passwd[0xb]) + passwd[0x20]) - passwd[0x15] * passwd[0x22] == 0x2a) 52 | s.add(passwd[6] * passwd[9] + passwd[0x23] == 0xedd) 53 | 54 | if s.check() == sat: 55 | print('solved!') 56 | m = s.model() 57 | flag = '' 58 | for i in range(n): 59 | c = m[passwd[i]].as_long() 60 | flag += chr(c) 61 | print(flag) 62 | else: 63 | print('failed') 64 | 65 | # UMDCTF-{ARM_1s_s0_SATisfying_7y8fdlsjebn} -------------------------------------------------------------------------------- /automating-gdb/README.md: -------------------------------------------------------------------------------- 1 | # Solving a Recursive Crackme by Automating GDB 2 | 3 | Last week's challenge is called [`Recursion`](https://0x00sec.org/t/reverseme-recursion/21802). From the name we already expect to do some automation: manually solving stuff recursively is not a wise idea. 4 | 5 | ## First Impression 6 | 7 | The forum probably does not allow users to post binary files, so challenges are all posted base64-encoded. There are many ways to restore the binary, but Binary Ninja saves you from remembering a command: just copy the encoded text, create a new empty binary, and then click "Paste From" -> "Base64". Then you are done! 8 | 9 | 10 | 11 | We get a 14.5 kB ELF file. There is some mild obfuscation at the start of `main`, which does not pose a serious challenge. In the middle of `main` we see the program reading input and checking its length: 12 | 13 | 14 | 15 | The first thing I notice is that the input must be exactly 0x50 chars, which is quite unusual. Note that it reads at most 0x50 chars and then checks that at least 0x50 chars were read, which means the input must be exactly 0x50 chars. 16 | 17 | Besides, after the length check, we see it calls `mmap`. 
In reversing challenges, once we see an `mmap`, there is probably self-modifying code. 18 | 19 | 20 | 21 | Moving downward we see that the program copies a 0xae4-byte buffer into the newly allocated buffer, and then calls it. A strange thing here is that the user input is moved into register `r12`. Typically, no compiler uses register `r12` to pass a function argument, so this code might be hand-crafted. 22 | 23 | After the `call rdx`, the program decides whether the flag is correct based on the return value. Now the next step is obvious: we need to define a function on that code buffer and see what it contains. 24 | 25 | ## Decryption routine 26 | 27 | 28 | 29 | The function looks like this. The loop decrypts another buffer at `data_20ab`, whose size is 0xa59. The decryption is just an xor with 0x9f. Note that the code_size variable sits right after this function, and right before the next data buffer to be decrypted. Meanwhile, the loop calculates a checksum of the next data buffer and compares it with the dword that register `r12` points to. What is it? It is the user input! So the user input must match the checksum value. 30 | 31 | If the checksum matches, the program continues to execute the second, newly decrypted buffer. Here, we can use Binary Ninja's transformation to transform the data in place, after which we define a function at the start of it. 32 | 33 | 34 | 35 | The newly defined function looks like this: 36 | 37 | 38 | 39 | It looks almost the same as the previous one, except for some small mutations. The xor key is different: it is 0xb6 this time. The buffer size is 0x9ce this time, which is smaller than the previous one. That indicates we are probably decrypting this buffer recursively, and each time we only decrypt the first part of it, which forms a function. 40 | 41 | I tried to repeat the process a few times and it just repeats. RECURSION. That is probably a good reason for the name. 42 | 43 | The first way to solve this is to solve it statically. 
We only need to get the xor key and the buffer size to decrypt the buffer and calculate the checksum. However, due to the mutation, it is not that easy to get it correct. It is definitely possible, though not optimal, so I came up with a dynamic approach. 44 | 45 | ## Using Hardware Breakpoints and Automating GDB 46 | 47 | I did not reimplement the checksum algorithm myself, even though it is super simple. Even if it were super complex and I could not reverse or rewrite it, I could still solve this challenge. Why? 48 | 49 | Because we can wait at the line where the dword from the user input is compared with the correct checksum. Particularly, it is the `cmp esi, edi` line. The register `esi` holds our input, which, during debugging, is trash. Register `edi` holds the correct checksum. If we set a breakpoint here and examine the value of `edi`, we directly get the correct checksum. 50 | 51 | However, this approach cannot easily scale to the entire challenge. The problem here is that we do not know where to set the next breakpoint before we decrypt the code. Manually decrypting the code is arduous and error-prone, so we had better automate the solution. 52 | 53 | Note that the address of the user input buffer is moved into r12 and never changed. If one checksum matches, the program executes `add r12, 0x4` to move to the next dword. So we can use a hardware breakpoint to catch the program when it reads the buffer that `r12` points to, and read the value of `edi`. Then we remove the current hardware breakpoint, set a new one on the next address, and wait for the program to break again. 54 | 55 | Automating GDB is easier said than done. I have known for a long time that it is possible, though I had never done it before. After duckduckgo-ing a little bit, I found there are two ways to do it. The first one is to implement a GDB command in Python; the second way is to use pygdbmi to interact with GDB's machine interface. 
56 | 57 | Both methods allow us to execute gdb commands as if we were using GDB directly, and to get the output from GDB afterward. However, I found the pygdbmi approach much harder to use for the current purpose. First of all, it runs GDB headlessly, so if there is an error in the script, it is hard to find. Conversely, with the first approach we register a new command (`solve` in this case), so after it runs we are still inside GDB. We can see the commands we executed and the outputs from GDB, which allows painless debugging. Also, despite the name machine interface, it does not automatically parse the string output from GDB. For example, if we examine the value of `rdi` by executing 58 | 59 | ```p/x $rdi``` 60 | 61 | GDB returns something like: 62 | 63 | ```$1 = 0x555555557e90``` 64 | 65 | I would expect pygdbmi to parse the value for me. However, it does nothing of the sort and directly returns the string output. We get the very same string with the first approach, so the first approach is the better fit here. 66 | 67 | Note that I am not saying gdbmi is not good. It is used by various projects, e.g., gdbgui, which is a browser-based GDB frontend. If you have not tried it, I strongly recommend experimenting with it. It is just that using gdbmi requires more development work, which makes it less suitable for reversing challenges, where we care more about getting things rolling quickly. 68 | 69 | Ok, so much for the comparison. It is time to get to the code. The code is not fancy; it just requires some effort to write it correctly. 
70 | 71 | ```Python 72 | import gdb 73 | import struct 74 | 75 | def get_reg_value(response): 76 | response = response.split()[2] 77 | value = int(response, 16) 78 | return value 79 | 80 | class Solve(gdb.Command): 81 | def __init__(self): 82 | # This registers our class as "solve" 83 | super(Solve, self).__init__("solve", gdb.COMMAND_DATA) 84 | 85 | def invoke(self, arg, from_tty): 86 | # When we call "solve" from gdb, this is the method 87 | # that will be called. 88 | 89 | dummy_input = open('input.txt', 'wb') 90 | dummy_input.write(b'1' * 0x50) 91 | dummy_input.close() 92 | 93 | solution = bytes() 94 | 95 | inferiors = gdb.inferiors() 96 | inferior = inferiors[0] 97 | gdb.execute('del') 98 | gdb.execute('file crackme.elf') 99 | gdb.execute('set breakpoint pending on') 100 | gdb.execute('b __libc_start_main') 101 | gdb.execute('r < input.txt') 102 | response = gdb.execute('p/x $rdi', to_string = True) 103 | main_addr = get_reg_value(response) 104 | main_addr_raw = 0x1229 105 | print(main_addr) 106 | base = main_addr - main_addr_raw 107 | 108 | gdb.execute('b *%d' % (base + 0x1399)) 109 | gdb.execute('c') 110 | 111 | response = gdb.execute('p/x $rax', to_string = True) 112 | input_addr = get_reg_value(response) 113 | print('input_addr', hex(input_addr)) 114 | 115 | i = 0 116 | while True: 117 | try: 118 | gdb.execute('del') 119 | gdb.execute('rwatch *%d' % (input_addr + i * 4)) 120 | gdb.execute('c') 121 | 122 | response = gdb.execute('p/x $edi', to_string = True) 123 | checksum = get_reg_value(response) 124 | print('checksum', hex(checksum)) 125 | solution += struct.pack('

Function Graph 0

Basic Block 0
sub_20ab:
000020ab  xor     r9, r9  {0x0}                         ; opcode: 4d 31 c9
000020ae  mov     r8, 0xffffffffffffffff                ; opcode: 49 c7 c0 ff ff ff ff
000020b5  xor     rdi, rdi  {0x0}                       ; opcode: 48 31 ff
000020b8  mov     edx, 0x7                              ; opcode: ba 07 00 00 00
000020bd  mov     eax, 0x9                              ; opcode: b8 09 00 00 00
000020c2  mov     r10d, 0x22                            ; opcode: 41 ba 22 00 00 00
000020c8  mov     rsi, qword [rel data_212e]  {0x9ce}   ; opcode: 48 8b 35 5f 00 00 00
000020cf  syscall                                       ; opcode: 0f 05
000020d1  xor     rcx, rcx  {0x0}                       ; opcode: 48 31 c9
000020d4  xor     rdi, rdi  {0x0}                       ; opcode: 48 31 ff

Basic Block 1
000020d7  lea     rdx, [rel data_2136]                  ; opcode: 48 8d 15 58 00 00 00
000020de  mov     dl, byte [rdx+rcx]                    ; opcode: 8a 14 0a
000020e1  xor     dl, 0xb6                              ; opcode: 80 f2 b6
000020e4  mov     byte [rax+rcx], dl                    ; opcode: 88 14 08
000020e7  mov     esi, edi                              ; opcode: 89 fe
000020e9  shr     rsi, 0x18                             ; opcode: 48 c1 ee 18
000020ed  shl     rdi, 0x8                              ; opcode: 48 c1 e7 08
000020f1  add     dil, dl                               ; opcode: 40 00 d7
000020f4  xor     rdi, rsi                              ; opcode: 48 31 f7
000020f7  inc     rcx                                   ; opcode: 48 ff c1
000020fa  cmp     rcx, qword [rel data_212e]            ; opcode: 48 3b 0d 2d 00 00 00
00002101  jne     0x20d7  {data_212e}                   ; opcode: 75 d4

Basic Block 2
00002103  mov     esi, dword [r12]                      ; opcode: 41 8b 34 24
00002107  cmp     esi, edi                              ; opcode: 39 fe
00002109  sete    bl                                    ; opcode: 0f 94 c3
0000210c  jne     0x2118                                ; opcode: 75 0a

Basic Block 3
00002118  mov     rdi, rax                              ; opcode: 48 89 c7
0000211b  mov     eax, 0xb                              ; opcode: b8 0b 00 00 00
00002120  mov     rsi, qword [rel data_212e]  {0x9ce}   ; opcode: 48 8b 35 07 00 00 00
00002127  syscall                                       ; opcode: 0f 05
00002129  movzx   rax, bl                               ; opcode: 48 0f b6 c3
0000212d  retn     {__return_addr}                      ; opcode: c3

Basic Block 4
0000210e  push    rax {var_8_1}                         ; opcode: 50
0000210f  add     r12, 0x4                              ; opcode: 49 83 c4 04
00002113  call    rax                                   ; opcode: ff d0
00002115  mov     bl, al                                ; opcode: 88 c3
00002117  pop     rax {var_8_1}                         ; opcode: 58

This CFG generated by Binary Ninja from recursion.bndb on Wed 29 Jul 2020 02:39:15 PM CST showing 20ab as Assembly.

-------------------------------------------------------------------------------- /automating-gdb/imgs/binaryninja-recursion.bndb-code_buffer.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | Function Graph 0 98 | 99 | Basic Block 0 100 | 101 | 102 | Opcode: 48 8b 35 7c 00 00 00 code_buffer_copy: 103 | Opcode: 48 8b 35 7c 00 00 00 00002020  mov     rsi, qword [rel code_size]  {0xa59} 104 | Opcode: 41 ba 22 00 00 00 00002027  mov     r10d, 0x22 105 | Opcode: b8 09 00 00 00 0000202d  mov     eax, 0x9 106 | Opcode: ba 07 00 00 00 00002032  mov     edx, 0x7 107 | Opcode: 49 c7 c0 ff ff ff ff 00002037  mov     r8, 0xffffffffffffffff 108 | Opcode: 48 31 ff 0000203e  xor     rdi, rdi  {0x0} 109 | Opcode: 4d 31 c9 00002041  xor     r9, r9  {0x0} 110 | Opcode: 0f 05 00002044  syscall 111 | Opcode: 48 31 c9 00002046  xor     rcx, rcx  {0x0} 112 | Opcode: 48 31 ff 00002049  xor     rdi, rdi  {0x0} 113 | 114 | 115 | 116 | Basic Block 1 117 | 118 | 119 | Opcode: 48 8d 15 58 00 00 00 0000204c  lea     rdx, [rel data_20ab] 120 | Opcode: 8a 14 0a 00002053  mov     dl, byte [rdx+rcx] 121 | Opcode: 80 f2 9f 00002056  xor     dl, 0x9f 122 | Opcode: 88 14 08 00002059  mov     byte [rax+rcx], dl 123 | Opcode: 89 fe 0000205c  mov     esi, edi 124 | Opcode: 48 c1 ee 18 0000205e  shr     rsi, 0x18 125 | Opcode: 48 c1 e7 08 00002062  shl     rdi, 0x8 126 | Opcode: 40 00 d7 00002066  add     dil, dl 127 | Opcode: 48 31 f7 00002069  xor     rdi, rsi 128 | Opcode: 48 ff c1 0000206c  inc     rcx 129 | Opcode: 48 3b 0d 2d 00 00 00 0000206f  cmp     rcx, qword [rel code_size] 130 | Opcode: 75 d4 00002076  jne     0x204c  {code_size} 131 | 132 | 133 | 134 | Basic Block 2 135 | 136 | 137 | Opcode: 41 8b 34 24 00002078  mov     esi, dword [r12] 138 | Opcode: 39 fe 0000207c  cmp     esi, edi 139 | Opcode: 0f 94 c3 0000207e  sete    bl 140 | Opcode: 75 0a 00002081  jne     0x208d 141 | 142 | 143 | 144 | Basic Block 3 145 | 146 |
147 | Opcode: 48 89 c7 0000208d  mov     rdi, rax 148 | Opcode: 48 8b 35 0c 00 00 00 00002090  mov     rsi, qword [rel code_size]  {0xa59} 149 | Opcode: b8 0b 00 00 00 00002097  mov     eax, 0xb 150 | Opcode: 0f 05 0000209c  syscall 151 | Opcode: 48 0f b6 c3 0000209e  movzx   rax, bl 152 | Opcode: c3 000020a2  retn     {__return_addr} 153 | 154 | 155 | 156 | Basic Block 4 157 | 158 | 159 | Opcode: 49 83 c4 04 00002083  add     r12, 0x4 160 | Opcode: 50 00002087  push    rax {var_8_1} 161 | Opcode: ff d0 00002088  call    rax 162 | Opcode: 88 c3 0000208a  mov     bl, al 163 | Opcode: 58 0000208c  pop     rax {var_8_1} 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 |

This CFG was generated by Binary Ninja from recursion.bndb on Wed 29 Jul 2020 02:35:21 PM CST, showing code_buffer_copy as Assembly.
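
The loop in Basic Block 1 may be easier to follow when restated in Python. This is a minimal sketch derived only from the listing above, not code from the repository: the function name `decode_and_checksum` is hypothetical, the XOR key `0x9f` is simply the immediate that appears in the disassembly, and `blob` stands in for the `code_size` bytes stored at `data_20ab`.

```python
MASK32 = 0xFFFFFFFF


def decode_and_checksum(blob, key=0x9F):
    """Mirror Basic Block 1: XOR-decode each byte and fold it into a rolling value.

    Only the low 32 bits of rdi are modelled, because Basic Block 2 compares
    edi (not rdi) against the dword at [r12].
    """
    decoded = bytearray()
    h = 0  # models edi, cleared by `xor rdi, rdi` before the loop
    for b in blob:
        d = b ^ key                    # xor dl, 0x9f
        decoded.append(d)              # mov byte [rax+rcx], dl
        top = (h >> 24) & 0xFF         # mov esi, edi ; shr rsi, 0x18
        h = (h << 8) & MASK32          # shl rdi, 0x8 (low 32 bits only)
        # add dil, dl touches only the low byte, so no carry leaves it
        h = (h & 0xFFFFFF00) | ((h + d) & 0xFF)
        h ^= top                       # xor rdi, rsi
    return bytes(decoded), h


if __name__ == "__main__":
    # Toy input chosen so that the decoded bytes are 1, 2, 3, 4, 5.
    sample = bytes(v ^ 0x9F for v in (1, 2, 3, 4, 5))
    decoded, checksum = decode_and_checksum(sample)
    print(decoded.hex(), hex(checksum))
```

The returned value corresponds to what Basic Block 2 compares against the dword at `[r12]`; according to the graph, the decoded buffer is called only when the two match, and the mapping is released afterwards either way.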

-------------------------------------------------------------------------------- /automating-gdb/recursion.elf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/automating-gdb/recursion.elf -------------------------------------------------------------------------------- /automating-gdb/solution.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/automating-gdb/solution.txt -------------------------------------------------------------------------------- /client_houseplant_ctf_2020/client.apk: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/client_houseplant_ctf_2020/client.apk -------------------------------------------------------------------------------- /client_houseplant_ctf_2020/imgs/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/client_houseplant_ctf_2020/imgs/1.png -------------------------------------------------------------------------------- /client_houseplant_ctf_2020/imgs/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/client_houseplant_ctf_2020/imgs/2.png -------------------------------------------------------------------------------- /client_houseplant_ctf_2020/imgs/3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/client_houseplant_ctf_2020/imgs/3.png 
-------------------------------------------------------------------------------- /client_houseplant_ctf_2020/imgs/4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/client_houseplant_ctf_2020/imgs/4.png -------------------------------------------------------------------------------- /client_houseplant_ctf_2020/imgs/video.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/client_houseplant_ctf_2020/imgs/video.png -------------------------------------------------------------------------------- /client_houseplant_ctf_2020/mitm-solve.py: -------------------------------------------------------------------------------- 1 | import json 2 | from mitmproxy import ctx 3 | import pytesseract 4 | import pyautogui 5 | import time 6 | import threading 7 | 8 | i = 0 9 | 10 | def recognize_char(): 11 | print('recognizing ...') 12 | image = pyautogui.screenshot() 13 | image = image.crop((1540, 430, 1560, 465)) 14 | # image.save('sc.png') 15 | txt = pytesseract.image_to_string(image, config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123') 16 | val = int(txt, 10) 17 | if not val in [0, 1, 2, 3]: 18 | print('recognition failed!!!!!!!!!!!!!!!!') 19 | val = 0 20 | return val 21 | 22 | def solve_and_inject(flow): 23 | global i 24 | time.sleep(0.5) 25 | ans = recognize_char() 26 | sol = {'method' : 'answer', 'answer' : ans} 27 | print(sol) 28 | sol_str = json.dumps(sol) 29 | flow.inject_message(flow.server_conn, sol_str) 30 | i += 1 31 | print('solved: %d' % i) 32 | pass 33 | 34 | def websocket_message(flow): 35 | 36 | # get the latest message 37 | message = flow.messages[-1] 38 | 39 | # was the message sent from the client or server? 
40 | if message.from_client: 41 | ctx.log.info("Client sent a message: {}".format(message.content)) 42 | else: 43 | ctx.log.info("Server sent a message: {}".format(message.content)) 44 | 45 | if 'correctAnswer' in message.content: 46 | 47 | message.content = message.content.replace('questionText', 'replaced') 48 | message.content = message.content.replace('correctAnswer', 'questionText') 49 | 50 | t = threading.Thread(target = solve_and_inject, args = [flow]) 51 | t.start() 52 | -------------------------------------------------------------------------------- /client_houseplant_ctf_2020/solve.md: -------------------------------------------------------------------------------- 1 | # Solving a Reversing Challenge with Mitmproxy and OCR 2 | 3 | Over the weekend I had some fun with the [Houseplant CTF](https://houseplant.riceteacatpanda.wtf/home). Among the reversing challenges, the [**RTCP Trivia**](https://houseplant.riceteacatpanda.wtf/challenge?id=8) is particularly interesting and I would like to share my unconventional way of solving it. 4 | 5 | ## First Impression 6 | 7 | We get a **client.apk** after downloading the challenge. I have no Android phones so I ran it in an emulator. It has no ARM native library so it runs well in x86 emulators. 8 | 9 | After asking for a user name, the app presents a multiple-choice problem with four options (shown below). The problem itself is not difficult. However, there is a ten-second countdown and we must answer it before the time elapses. The challenge description says that we need to correctly answer 1000 such problems. So manual solving is probably not a wise idea. 10 | 11 | 12 | 13 | ## Inspecting the Traffic 14 | 15 | After I unzipped the apk and inspected the files inside of it, I found the challenges are not stored inside the apk. I confirmed this by cutting the network to the emulator -- it no longer shows new challenges or tells you the answer is wrong. 
16 | 17 | I inspected the resources of this app and found the real flag is not there (a fake flag can be found in the strings). So it probably comes from the server after we solve 1000 problems. 18 | 19 | I then launched Wireshark to have a look at the traffic. The app uses websocket to communicate with the server. The problem is sent from the server and the choice is submitted to the server. So the logic is not local. But I quickly noticed something strange: 20 | 21 | ```JSON 22 | { 23 | "method": "question", 24 | "id": "30a3956f-cd60-4c51-bc01-dbbf1b09f9b0", 25 | "questionText": "S62ZtWoNqto0jxuZalalAmv4s/n2GmaTai5Z7/bVsk6W48CbtUvYcOyVRi7qcPeP", 26 | "options": [ 27 | "bNMO3oWCI/s5OHBEiXfgkg==", 28 | "qpDFxRVJXyczm52QbPTa8A==", 29 | "8UQQMs42vvLpLIq0wNEIaw==", 30 | "cLYF4H6LVlIi3YPF3R4MUg==" 31 | ], 32 | "correctAnswer": "mboZgfosD3S1ZUf330zmxaeq+bR2vzKkCV2AKOB8vlA=", 33 | "requestIdentifier": "f814ce11519a16be435ac73bc0e89238" 34 | } 35 | ``` 36 | 37 | Although most of the data is encrypted, we see that the **correctAnswer** is also sent to the client. This means if we can decrypt it, we get the correct answer. And we know the app can decrypt the questionText and options, since it needs to show them to us. It is highly likely that the answer is encrypted in the same way and we can also decrypt it. 38 | 39 | ## Reversing the Algorithm? No! 40 | 41 | A routine way to solve this is: 1). reverse the app to find out the encryption algorithm; 2). rewrite a client to communicate with the server. I did not take this approach since: 1). although it is easy to find out the encryption algorithm is AES and the iv is indeed requestIdentifier, it is not immediately clear how the key is generated. 2). I mistakenly thought the traffic sent from the client to the server was encrypted using a custom crypto (which later turned out to be just compression). 
These two obstacles would not have stopped me from solving it, but I expected them to take longer than I wanted, so I decided to try a novel method. 42 | 43 | After reading how the app displays the question text, I found that if I swap the keyword "questionText" with "correctAnswer" in the json, the correct answer will be displayed on the screen! 44 | 45 | Since the traffic is plaintext websocket, it is quite easy to implement it. I first tried Burp but it does not support match-and-replace in websocket. Then I used [mitmproxy](https://mitmproxy.org/). Mitmproxy allows us to script in Python, so we can easily modify the traffic. 46 | 47 | I copied and pasted one example from the official repo and made some changes. The following code will change ```'correctAnswer'``` to ```'questionText'``` and change ```'questionText'``` to ```'replaced'```: 48 | 49 | ```Python 50 | from mitmproxy import ctx 51 | def websocket_message(flow): 52 | 53 | message = flow.messages[-1] 54 | 55 | if message.from_client: 56 | ctx.log.info("Client sent a message: {}".format(message.content)) 57 | else: 58 | ctx.log.info("Server sent a message: {}".format(message.content)) 59 | 60 | if 'correctAnswer' in message.content: 61 | 62 | message.content = message.content.replace('questionText', 'replaced') 63 | message.content = message.content.replace('correctAnswer', 'questionText') 64 | ``` 65 | 66 | Mitmproxy scripts are not meant to be run on their own. Instead, we should run one of the mitmproxy tools and pass the script to it with the ```-s``` option: 67 | 68 | ``` 69 | mitmdump -s ./mitm-solve.py 70 | ``` 71 | 72 | And it works! Now instead of the question text, the app shows the index of the correct answer to us. 73 | 74 | 75 | 76 | I tried to solve it by hand. But even with the correct answer in front of me, I kept clicking the wrong button. I do not want to solve it as an action game, so I started to look for viable ways to automate the solving. 77 | 78 | The good thing is, mitmproxy allows us to inject packets. 
And thanks to the nature of websocket, this will not disrupt the communication between the client and the server. So the last problem is how to get the correct answer. Reversing the crypto algorithm is always an option, but I decided not to do it this time. 79 | 80 | ## Solving a Reversing Challenge with OCR 81 | 82 | It quickly occurred to me that I could use OCR to recognize the correct answer. Does it work? I had not really done it before. Nevertheless the workflow is really simple: 1). capture a screenshot and crop it to the desired region. 2). use some OCR tool to recognize it. 83 | 84 | I use **pyautogui** to capture a screenshot of my laptop screen. I already measured the bounding box of the answer digit with gimp. Then I just crop it accordingly. It looks like: 85 | 86 | ```Python 87 | image = pyautogui.screenshot() 88 | image = image.crop((1540, 430, 1560, 465)) 89 | ``` 90 | 91 | After that I used a well-known open-source OCR engine [tesseract](https://github.com/tesseract-ocr/tesseract) to recognize the digit on it. I had not used it before but it is quite reliable (at least for our super easy case). 92 | 93 | ```Python 94 | txt = pytesseract.image_to_string(image, 95 | config = '--psm 10 --oem 3 -c tessedit_char_whitelist=0123') 96 | ``` 97 | 98 | The config option was found on Stack Overflow and I do not really understand it. But it works! 99 | 100 | ![digits](imgs/4.png) 101 | 102 | Now we come to the last step: injecting the solution. Note we need to first do the keyword swap, let the traffic reach the client app, wait for the answer to be displayed on the screen, and then read it and inject it. In my script, I wait 0.5 seconds before starting the recognition. 
103 | 104 | ```Python 105 | def solve_and_inject(flow): 106 | global i 107 | time.sleep(0.5) 108 | ans = recognize_char() 109 | sol = {'method' : 'answer', 'answer' : ans} 110 | print(sol) 111 | sol_str = json.dumps(sol) 112 | flow.inject_message(flow.server_conn, sol_str) 113 | i += 1 114 | print('solved: %d' % i) 115 | ``` 116 | 117 | Alright, it now works! Wait for some 20 minutes and we get the flag: rtcp{qu1z_4pps_4re_c00l_aeecfa13}. 118 | 119 | 120 | 121 | I actually recorded a [video](https://youtu.be/Acp8PDbsvQk) to demonstrate the solving. 122 | 123 | [![Video](imgs/video.png)](https://youtu.be/Acp8PDbsvQk "Video") -------------------------------------------------------------------------------- /elf_format/binaries/dumped-elf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/binaries/dumped-elf -------------------------------------------------------------------------------- /elf_format/binaries/dumped-elf_fixed: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/binaries/dumped-elf_fixed -------------------------------------------------------------------------------- /elf_format/binaries/dumped-elf_fixed_patched: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/binaries/dumped-elf_fixed_patched -------------------------------------------------------------------------------- /elf_format/binaries/tricky-crackme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/binaries/tricky-crackme 
-------------------------------------------------------------------------------- /elf_format/imgs/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/1.png -------------------------------------------------------------------------------- /elf_format/imgs/10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/10.png -------------------------------------------------------------------------------- /elf_format/imgs/11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/11.png -------------------------------------------------------------------------------- /elf_format/imgs/12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/12.png -------------------------------------------------------------------------------- /elf_format/imgs/13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/13.png -------------------------------------------------------------------------------- /elf_format/imgs/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/2.png -------------------------------------------------------------------------------- /elf_format/imgs/3.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/3.png -------------------------------------------------------------------------------- /elf_format/imgs/4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/4.png -------------------------------------------------------------------------------- /elf_format/imgs/5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/5.png -------------------------------------------------------------------------------- /elf_format/imgs/6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/6.png -------------------------------------------------------------------------------- /elf_format/imgs/7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/7.png -------------------------------------------------------------------------------- /elf_format/imgs/8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/8.png -------------------------------------------------------------------------------- /elf_format/imgs/9.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/imgs/9.png -------------------------------------------------------------------------------- /elf_format/test_lazy_binding/.gdb_history: -------------------------------------------------------------------------------- 1 | b main 2 | r 3 | ni 4 | q 5 | b main 6 | c 7 | r 8 | si 9 | q 10 | -------------------------------------------------------------------------------- /elf_format/test_lazy_binding/simple: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/test_lazy_binding/simple -------------------------------------------------------------------------------- /elf_format/test_lazy_binding/simple.c: -------------------------------------------------------------------------------- 1 | //simple.c 2 | //gcc -Wl,-z,lazy -o simple simple.c 3 | #include <stdio.h> 4 | // #include 5 | 6 | int main() 7 | { 8 | // Elf32_Dyn s; 9 | puts("0xdeadbeef\n"); 10 | getchar(); 11 | return 0; 12 | } -------------------------------------------------------------------------------- /elf_format/test_lazy_binding/simple_4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/test_lazy_binding/simple_4 -------------------------------------------------------------------------------- /elf_format/test_lazy_binding/simple_non_lazy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/test_lazy_binding/simple_non_lazy -------------------------------------------------------------------------------- /elf_format/test_lazy_binding/simple_patched: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/test_lazy_binding/simple_patched -------------------------------------------------------------------------------- /elf_format/test_lazy_binding/simple_patched_2: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/elf_format/test_lazy_binding/simple_patched_2 -------------------------------------------------------------------------------- /how_to_not_write_a_bad_crackme/how_to_not_write_a_bad_crackme.md: -------------------------------------------------------------------------------- 1 | # How to Avoid Writing a Bad Crackme 2 | 3 | Recently, I was promoted to a reviewer on crackmes.one (along with @zed). I am honored by this and I appreciate the recognition and trust from @stan (creator of crackmes.one) and the entire community. The task of a reviewer is interesting: I read submitted solutions and verify new crackmes. This allows me to follow the latest trends on the website. 4 | 5 | I did not tally the statistics, but there is a fairly good amount of new submissions (of both crackmes and solutions) every week. And most of them are nice! For me, crackmes.one is a place for reversers to exchange knowledge and joyfulness. So I am very glad that we see a steady flow of input. Now that we have three reviewers, I hope an increase in the reviewing speed will shorten the feedback loop for contributors, which, in turn, will lead to more contributions from the community. 6 | 7 | Nevertheless, some of the submissions did not meet our standards and got rejected. The reasons vary, but many of them use an existing obfuscator/protector. Among them, many have a dull verification algorithm and the sole challenge is to get past the obfuscator. 
We welcome the use of protectors/obfuscators, but we do not like the use of existing ones, especially commercial ones, e.g., VMP, WinLicense. These protectors are definitely breakable (trust me), but breaking them is too hard for a crackme and takes very long. Folks who can do it would probably rather invest the time in something more important or interesting than spend a long time breaking the protector, only to find the actual algorithm is just an XOR. 8 | 9 | Meanwhile, using existing tools deviates from the spirit of crackmes.one. As I wrote above, I believe this is a place for us reversers to `"exchange knowledge and joyfulness"`. We not only practice and improve our reversing skills but also share and obtain knowledge. However, using an existing tool does not help the author learn anything, beyond how to execute the tool, which is relatively simple. Conversely, if the author digs deep into an existing (open-source) tool, understands how it works, makes certain changes to defeat existing tools, s/he would learn more. 10 | 11 | Below, I will list some of the things that we should avoid when writing a crackme. Note, these rules are not absolute and I will give a longer explanation after the list. 12 | 13 | ## Don'ts 14 | 15 | 1. Do not upload crackmes that are not written by you. 16 | 2. Do not upload malware or unwanted software of any kind, e.g., trojan, ransomware, adware, etc. 17 | 3. Do not use a commercial packer, protector, or obfuscator. 18 | 4. Do not upload a crackme that you cannot solve. 19 | 5. The crackme should not fail to execute. Please, no missing library dependencies or internal errors! 20 | 6. The crackme should not make network connections to any host other than localhost (127.0.0.1). 21 | 7. The crackme should make it clear how it accepts/expects input (if any). And it should also clearly tell the player whether the input is correct. 22 | 8. 
The crackme must be solvable without guessing or a non-trivial amount of brute-forcing. 23 | 9. The crackme must be solvable in a reasonable time -- when solved optimally. 24 | 10. The crackme should not rely on any hardware unique identifier as part of the algorithm. 25 | 11. The crackme should not stack unrelated levels of protection together. 26 | 27 | 28 | ## Justifications 29 | 30 | A reader might find some of the items above too restrictive. So I will now explain the reasons for them and some of the exceptions to them. Also, if you are in doubt about a specific crackme or crackme idea, please contact one of the reviewers on Discord. 31 | 32 | 1. The crackme should be the uploader's original work. Do not upload crackmes that have potential copyright issues. Do not upload crackmes you see on the Internet or in CTFs, unless you get permission to do so. 33 | 2. An exception is that one might make a pseudo-ransomware/malware that is a reverse engineering challenge. If that is the case, be sure to limit the damage to a very small and specific range (e.g., a `flag.txt` in the current dir), and state it clearly before the actual payload runs. 34 | 3. Using commercial packers, protectors, or obfuscators does not help challenge authors to learn and improve. And it could also take too long to solve. Also, avoid using any of these tools that already exist. Making your own or improving existing tools is very welcome! 35 | 4. Related to #3 and #8, do not make a crackme that even the author cannot solve. 36 | 5. This is a disappointing situation. Try to be compatible with as many of the systems you target as possible, though we know that compatibility with all systems is impossible. At least test it on another computer and see if it works! 37 | 6. We discourage the use of network connections. Network traffic makes it harder to determine whether the program has any malicious behavior. If you need to have a network connection, only do that with the localhost. 
If you do not wish the player to tamper with the "remote" server, still bundle the server and run it on the localhost, but tell the player not to reverse it. 38 | 7. If the crackme accepts inputs, e.g., user name and passwords, do not obscure the way it reads them. Also, be honest and tell the player if s/he solves it. Do not accept fake flags. Do not hide the flag somewhere that cannot be triggered by code execution. Note, this rule does not state that a crackme has to do password validation. We do have crackmes that ask the player to defeat the anti-debugging or decrypt a file that gets encrypted. These are good and not affected by this rule. In other words, if you have a novel challenge style, explain it to the player so they do not get lost. 39 | 8. Do not put the flag/secret in a function that is never going to be executed. Do not make crackmes where the player has to guess something important to proceed. 40 | 9. Most crackmes can be solved instantly, or in a few seconds. I think 1 minute is a reasonable recommended maximum. 41 | 11. Do not blindly add layers of protection, unless they form a cohesive whole. If protections are duplicated in large numbers, there should be a way to automatically tackle them. 
-------------------------------------------------------------------------------- /obfuscation/imgs/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/1.png -------------------------------------------------------------------------------- /obfuscation/imgs/10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/10.png -------------------------------------------------------------------------------- /obfuscation/imgs/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/2.png -------------------------------------------------------------------------------- /obfuscation/imgs/3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/3.png -------------------------------------------------------------------------------- /obfuscation/imgs/4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/4.png -------------------------------------------------------------------------------- /obfuscation/imgs/5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/5.png -------------------------------------------------------------------------------- /obfuscation/imgs/6.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/6.png -------------------------------------------------------------------------------- /obfuscation/imgs/7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/7.png -------------------------------------------------------------------------------- /obfuscation/imgs/8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/8.png -------------------------------------------------------------------------------- /obfuscation/imgs/9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/imgs/9.png -------------------------------------------------------------------------------- /obfuscation/keygenme4.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/obfuscation/keygenme4.exe -------------------------------------------------------------------------------- /obfuscation/solution/de-obfuscate/__init__.py: -------------------------------------------------------------------------------- 1 | from binaryninja import * 2 | import os 3 | 4 | def bootstrap(bv, addr): 5 | 6 | plugin_dir = os.path.dirname(os.path.abspath(__file__)) 7 | src_path = os.path.join(plugin_dir, 'de-obfuscate.py') 8 | print(src_path) 9 | src = open(src_path, 'rb').read() 10 | exec src in globals() 11 | 12 | # deobfuscate_function is defined in de-obfuscate.py 13 | 
deobfuscate_function(bv, addr) 14 | 15 | 16 | def bootstrap2(bv, addr): 17 | 18 | plugin_dir = os.path.dirname(os.path.abspath(__file__)) 19 | src_path = os.path.join(plugin_dir, 'de-obfuscate.py') 20 | print(src_path) 21 | src = open(src_path, 'rb').read() 22 | exec src in globals() 23 | 24 | # deobfuscate_function is defined in de-obfuscate.py 25 | simplify_func(bv, addr) 26 | 27 | def bootstrap3(bv, addr): 28 | 29 | plugin_dir = os.path.dirname(os.path.abspath(__file__)) 30 | src_path = os.path.join(plugin_dir, 'de-obfuscate.py') 31 | print(src_path) 32 | src = open(src_path, 'rb').read() 33 | exec src in globals() 34 | 35 | # deobfuscate_function is defined in de-obfuscate.py 36 | simplify_bbl_handler(bv, addr) 37 | 38 | PluginCommand.register_for_address("Deobfuscate", 39 | "Remove tcc", 40 | bootstrap) 41 | 42 | PluginCommand.register_for_address("Simplify", 43 | "Simplify tcc", 44 | bootstrap2) 45 | 46 | PluginCommand.register_for_address("Simplify BBL", 47 | "Simplify tcc", 48 | bootstrap3) -------------------------------------------------------------------------------- /obfuscation/solution/de-obfuscate/de-obfuscate.py: -------------------------------------------------------------------------------- 1 | from binaryninja import * 2 | # from __future__ import print_function 3 | from triton import * 4 | import struct 5 | import re 6 | 7 | arch = Architecture['x86'] 8 | 9 | def is_opaque_predicate(instr): 10 | 11 | tokens = instr.tokens 12 | if tokens[0].text == 'xor' and tokens[2].text == tokens[4].text: 13 | return True 14 | if tokens[0].text == 'sub' and tokens[2].text == tokens[4].text: 15 | return True 16 | return False 17 | 18 | def should_patch_to_always_branch(instr): 19 | 20 | tokens = instr.tokens 21 | opcode = tokens[0].text 22 | if opcode in ['je', 'jz']: 23 | return True 24 | 25 | return False 26 | 27 | def should_patch_to_never_branch(instr): 28 | 29 | tokens = instr.tokens 30 | opcode = tokens[0].text 31 | if opcode in ['jne', 'jnz']: 32 | 
return True 33 | 34 | return False 35 | 36 | def print_slice_result(slice_result): 37 | 38 | # print(slice_result) 39 | for addr, dis in sorted(slice_result.items()): 40 | # Here we display the comment to understand the correspondence 41 | # between an expression and its referenced instruction. 42 | print('[slicing] 0x%x: %s' % (addr, dis)) 43 | # print(v) 44 | 45 | def slice_expr(ctx, regExpr): 46 | 47 | # print(regExpr) 48 | 49 | slicing = ctx.sliceExpressions(regExpr) 50 | 51 | result = {} 52 | # print(slicing.items()) 53 | for k, v in slicing.items(): 54 | # print(type(v)) 55 | # print(v.getReadRegisters()) 56 | comment = v.getComment() 57 | print(comment) 58 | try: 59 | addr, dis = comment.split(': ') 60 | addr = int(addr, 16) 61 | result[addr] = dis 62 | except: 63 | pass 64 | 65 | # print_slice_result(result) 66 | return result 67 | 68 | 69 | 70 | def init_triton(): 71 | 72 | ctx = TritonContext() 73 | ctx.setArchitecture(ARCH.X86) 74 | ctx.setMode(MODE.ALIGNED_MEMORY, True) 75 | ctx.setAstRepresentationMode(AST_REPRESENTATION.PYTHON) 76 | return ctx 77 | 78 | def symbolize_regs(ctx): 79 | 80 | ctx.symbolizeRegister(ctx.registers.eax) 81 | ctx.symbolizeRegister(ctx.registers.ebx) 82 | ctx.symbolizeRegister(ctx.registers.ecx) 83 | ctx.symbolizeRegister(ctx.registers.edx) 84 | # ctx.symbolizeRegister(ctx.registers.ebp) 85 | 86 | def merge_dict(dict1, dict2): 87 | 88 | ret = dict1.copy() 89 | ret.update(dict2) 90 | # print(len(ret)) 91 | 92 | return ret 93 | 94 | 95 | # def further_taint_bbl_instrs(bv, bbl, addr_to_start_taint, reg_to_taint): 96 | def further_taint_bbl_instrs(bv, bbl, instrs_to_taint): 97 | 98 | newly_tainted_instrs = {} 99 | all_addrs_to_taint = instrs_to_taint.keys() 100 | 101 | ctx = init_triton() 102 | symbolize_regs(ctx) 103 | 104 | pc = bbl.start 105 | for inst in bbl: 106 | tokens, inst_size = inst 107 | 108 | inst = Instruction() 109 | curr_pc = pc 110 | inst.setAddress(pc) 111 | inst_bytes = bv.read(pc, inst_size) 112 | 
inst.setOpcode(inst_bytes) 113 | pc += inst_size 114 | 115 | ctx.processing(inst) 116 | for se in inst.getSymbolicExpressions(): 117 | se.setComment(str(inst)) 118 | 119 | if curr_pc in all_addrs_to_taint: 120 | 121 | # print('here') 122 | 123 | read_registers = inst.getReadRegisters() 124 | # print(read_registers) 125 | if type(read_registers) == tuple: 126 | read_registers = [read_registers] 127 | 128 | if len(read_registers) > 0: 129 | # print(hex(inst.getAddress())) 130 | for read_reg, ast in read_registers: 131 | # print(read_reg.getName()) 132 | if read_reg.getName() == 'eax': 133 | # print('here') 134 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.EAX]) 135 | # print(result) 136 | newly_tainted_instrs = merge_dict(newly_tainted_instrs, result) 137 | 138 | if read_reg.getName() == 'ebx': 139 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.EBX]) 140 | newly_tainted_instrs = merge_dict(newly_tainted_instrs, result) 141 | 142 | if read_reg.getName() == 'ecx': 143 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.ECX]) 144 | newly_tainted_instrs = merge_dict(newly_tainted_instrs, result) 145 | 146 | if read_reg.getName() == 'edx': 147 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.EDX]) 148 | newly_tainted_instrs = merge_dict(newly_tainted_instrs, result) 149 | 150 | print('===============') 151 | 152 | return newly_tainted_instrs 153 | 154 | 155 | # def further_taint_bbl_instrs(bv, bbl, instrs): 156 | # pass 157 | 158 | def find_instrs_need_further_taint(bv, bbl, newly_tainted_instrs, all_instrs): 159 | 160 | instrs_need_further_taint = {} 161 | for addr in newly_tainted_instrs.keys(): 162 | inst = all_instrs[addr] 163 | if inst.isMemoryRead(): 164 | read_registers = inst.getReadRegisters() 165 | # print(read_registers) 166 | if type(read_registers) == tuple: 167 | read_registers = [read_registers] 168 | 169 | if len(read_registers) > 0: 170 | # print(hex(inst.getAddress())) 171 | for read_reg, ast in 
read_registers: 172 | # print(read_reg.getName()) 173 | if read_reg.getName() in ['eax', 'ebx', 'ecx', 'edx']: 174 | instrs_need_further_taint[addr] = all_instrs[addr] 175 | 176 | return instrs_need_further_taint 177 | 178 | def is_inst_writes_reg(inst, reg): 179 | 180 | found = False 181 | write_registers = inst.getWrittenRegisters() 182 | if type(write_registers) == tuple: 183 | write_registers = [write_registers] 184 | 185 | if len(write_registers) > 0: 186 | # print(hex(inst.getAddress())) 187 | for read_reg, ast in write_registers: 188 | if read_reg.getName() == reg: 189 | found = True 190 | break 191 | 192 | return found 193 | 194 | def simplify_bbl(bv, bbl): 195 | 196 | 197 | ctx = init_triton() 198 | symbolize_regs(ctx) 199 | 200 | # ctx.symbolizeMemory(MemoryAccess(0, 4)) 201 | # ctx.symbolizeMemory(MemoryAccess(0x2b918004, 4)) 202 | 203 | # bv.begin_undo_actions() 204 | 205 | # make a copy of all instructions in the bbl 206 | bbl_all_instrs = {} 207 | 208 | tainted_instrs = {} 209 | instrs_to_include = {} 210 | 211 | pc = bbl.start 212 | for inst in bbl: 213 | tokens, inst_size = inst 214 | 215 | inst = Instruction() 216 | curr_pc = pc 217 | inst.setAddress(pc) 218 | inst_bytes = bv.read(pc, inst_size) 219 | inst.setOpcode(inst_bytes) 220 | bbl_all_instrs[pc] = inst 221 | 222 | pc += inst_size 223 | 224 | ctx.processing(inst) 225 | for se in inst.getSymbolicExpressions(): 226 | se.setComment(str(inst)) 227 | 228 | # if pc == 0x416e07: 229 | # print(inst.getLoadAccess()) 230 | # print(inst.getOperands()) 231 | 232 | # print('abc') 233 | 234 | # if inst. 
235 | 236 | # if pc == 0x00416e62: 237 | # # inst.setReadRegister 238 | # read_registers = inst.getReadRegisters() 239 | # if inst.isMemoryRead() and len(read_registers) > 0: 240 | # ecxExpr = read_registers[0] 241 | # print(ecxExpr[0]) 242 | # # print(ecxExpr[0].getName()) 243 | # print('===============') 244 | 245 | # if pc == 0x0040114c: 246 | # print(inst.getReadRegisters()) 247 | 248 | # print(hex(pc)) 249 | 250 | # skip call and jmp instructions 251 | if inst_bytes[0] in ['\xe8', '\xe9']: 252 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 253 | continue 254 | 255 | if is_inst_writes_reg(inst, 'ebp'): 256 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 257 | 258 | if is_inst_writes_reg(inst, 'esp'): 259 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 260 | 261 | # related_regs = [REG.X86.EAX, REG.X86.EBX, REG.X86.ECX, REG.X86.EDX] 262 | if inst.isMemoryWrite(): 263 | read_registers = inst.getReadRegisters() 264 | # print(read_registers) 265 | if type(read_registers) == tuple: 266 | read_registers = [read_registers] 267 | 268 | if len(read_registers) > 0: 269 | print(hex(inst.getAddress())) 270 | for read_reg, ast in read_registers: 271 | # print(read_reg.getName()) 272 | if read_reg.getName() == 'eax': 273 | # print('here') 274 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 275 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.EAX]) 276 | tainted_instrs = merge_dict(tainted_instrs, result) 277 | 278 | if read_reg.getName() == 'ebx': 279 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 280 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.EBX]) 281 | tainted_instrs = merge_dict(tainted_instrs, result) 282 | 283 | if read_reg.getName() == 'ecx': 284 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 285 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.ECX]) 286 | tainted_instrs = merge_dict(tainted_instrs, result) 287 | 288 | if 
read_reg.getName() == 'edx': 289 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 290 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.EDX]) 291 | tainted_instrs = merge_dict(tainted_instrs, result) 292 | 293 | print('===============') 294 | 295 | 296 | if inst.isBranch(): 297 | if inst.getDisassembly().startswith('jne') or \ 298 | inst.getDisassembly().startswith('je') or \ 299 | inst.getDisassembly().startswith('jz') or \ 300 | inst.getDisassembly().startswith('jnz'): 301 | 302 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 303 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.ZF]) 304 | tainted_instrs = merge_dict(tainted_instrs, result) 305 | 306 | 307 | if inst.getDisassembly().startswith('jb') or \ 308 | inst.getDisassembly().startswith('jnae') or \ 309 | inst.getDisassembly().startswith('jnb') or \ 310 | inst.getDisassembly().startswith('jae'): 311 | 312 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 313 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.CF]) 314 | tainted_instrs = merge_dict(tainted_instrs, result) 315 | 316 | 317 | if inst.getDisassembly().startswith('jl') or \ 318 | inst.getDisassembly().startswith('jge') or \ 319 | inst.getDisassembly().startswith('jnl') or \ 320 | inst.getDisassembly().startswith('jg') or \ 321 | inst.getDisassembly().startswith('jnle'): 322 | 323 | instrs_to_include[inst.getAddress()] = inst.getDisassembly() 324 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.SF]) 325 | tainted_instrs = merge_dict(tainted_instrs, result) 326 | result = slice_expr(ctx, ctx.getSymbolicRegisters()[REG.X86.OF]) 327 | tainted_instrs = merge_dict(tainted_instrs, result) 328 | 329 | # print_slice_result(tainted_instrs) 330 | newly_tainted_instrs = tainted_instrs.copy() 331 | 332 | if len(tainted_instrs) > 0: 333 | while True: 334 | instrs_need_further_taint = \ 335 | find_instrs_need_further_taint(bv, bbl, newly_tainted_instrs, bbl_all_instrs) 336 
| 337 | print('there are %d instructions that need further tainting' \ 338 | % len(instrs_need_further_taint)) 339 | for addr in instrs_need_further_taint.keys(): 340 | print(hex(addr)) 341 | 342 | if len(instrs_need_further_taint) == 0: 343 | break 344 | 345 | newly_tainted_instrs = further_taint_bbl_instrs(bv, bbl, instrs_need_further_taint) 346 | # print_slice_result(newly_tainted_instrs) 347 | 348 | tainted_instrs = merge_dict(tainted_instrs, newly_tainted_instrs) 349 | 350 | # print_slice_result(tainted_instrs) 351 | # print('===============================') 352 | # print_slice_result(instrs_to_include) 353 | # print('===============================') 354 | 355 | # print_slice_result(instrs_to_include) 356 | instrs_to_include = merge_dict(instrs_to_include, tainted_instrs) 357 | print_slice_result(instrs_to_include) 358 | 359 | return instrs_to_include 360 | 361 | 362 | def highlight_included_instrs(bv, bbl, instrs_to_include): 363 | for addr in instrs_to_include.keys(): 364 | bv.set_comment_at(addr, '1') 365 | 366 | def nop_excluded_instrs(bv, bbl, instrs_to_include): 367 | 368 | print('inside nop_excluded_instrs()') 369 | included_addrs = instrs_to_include.keys() 370 | 371 | nop_started = False 372 | # nop_len = 0 373 | nop_start_addr = 0 374 | 375 | for inst in bbl.get_disassembly_text(): 376 | addr = inst.address 377 | if not addr in included_addrs: 378 | bv.convert_to_nop(addr) 379 | if not nop_started: 380 | nop_started = True 381 | nop_start_addr = addr 382 | # accumulate the length 383 | # nop_len += bv.get_instruction_length(addr) 384 | else: 385 | # a sequence of nop comes to an end 386 | if nop_started: 387 | 388 | # a jmp is 5 bytes long 389 | if addr - nop_start_addr >= 5: 390 | dis = 'jmp 0x%x' % addr 391 | inst_bytes = arch.assemble(dis, nop_start_addr) 392 | bv.write(nop_start_addr, inst_bytes) 393 | print('0x%x: %s' % (nop_start_addr, dis)) 394 | 395 | nop_started = False 396 | # nop_len = 0 397 | nop_start_addr = 0 398 | 399 | def 
solve_opaque_predicate(bv, func): 400 | 401 | print('solve_opaque_predicate()') 402 | 403 | for bbl in func.basic_blocks: 404 | 405 | # print('processing basic block at addr: 0x%x' % bbl.start) 406 | 407 | # jne to self 408 | if bbl.instruction_count == 1: 409 | instr = bbl.get_disassembly_text()[0] 410 | if instr.tokens[0].text.startswith('jne'): 411 | bv.never_branch(instr.address) 412 | continue 413 | 414 | instrs = bbl.get_disassembly_text() 415 | 416 | try: 417 | instr1, instr2 = instrs[-2 :] 418 | except: 419 | print('error at: 0x%x' % bbl.start) 420 | # break 421 | 422 | # print(instr1, instr2) 423 | 424 | if is_opaque_predicate(instr1): 425 | # print('found opaque predicate') 426 | if should_patch_to_always_branch(instr2): 427 | log_info('always branch at: 0x%x' % instr2.address) 428 | bv.always_branch(instr2.address) 429 | elif should_patch_to_never_branch(instr2): 430 | log_info('never branch at: 0x%x' % instr2.address) 431 | bv.never_branch(instr2.address) 432 | 433 | def solve_push_jmp(bv, func): 434 | 435 | print('solve_push_jmp()') 436 | for bbl in func.basic_blocks: 437 | if bbl.instruction_count < 5: 438 | continue 439 | 440 | disassembly_text = bbl.get_disassembly_text() 441 | if str(disassembly_text[-5]).startswith('call $+5') and \ 442 | str(disassembly_text[-4]).startswith('pop eax') and \ 443 | str(disassembly_text[-3]).startswith('add eax, 0xa') and \ 444 | str(disassembly_text[-2]).startswith('push eax') and \ 445 | str(disassembly_text[-1]).startswith('jmp'): 446 | 447 | patch_addr = disassembly_text[-5].address 448 | print('push_jump at: 0x%x' % patch_addr) 449 | 450 | jmp_addr = disassembly_text[-1].address 451 | callee_offset_bytes = bv.read(jmp_addr + 1, 4) 452 | caller_offset = struct.unpack('> int(operand, 16) 493 | else: 494 | print('unknown operation: %s' % opcode) 495 | 496 | return val 497 | 498 | def is_valid_op_sequence(ops): 499 | if not ops[-1][0] == 'sub': 500 | # print('warning:sequence does not end with sub') 501 | return 
False 502 | if not (5 <= len(ops) <= 8): 503 | # print('warning: there are %d operations in the sequence' % len(ops)) 504 | return False 505 | 506 | return True 507 | 508 | def solve_load_for_reg_bbl(bv, bbl, reg): 509 | pass 510 | 511 | def solve_load_bbl(bv, bbl): 512 | 513 | for reg in ['eax', 'ebx', 'ecx', 'edx']: 514 | 515 | sequence_found = False 516 | sequence_start = 0 517 | sequence_instr_count = 0 518 | sequence_byte_length = 0 519 | ops = [] 520 | 521 | for inst in bbl.get_disassembly_text(): 522 | 523 | if not sequence_found: 524 | # print(str(inst)) 525 | # print(r'mov\s*%s, dword \[data_(.*)\]' % reg) 526 | found = re.findall(r'mov\s*%s, dword \[data_(.*)\]' % reg, str(inst)) 527 | if len(found) >= 1: 528 | # print('found') 529 | sequence_found = True 530 | sequence_start = inst.address 531 | ops.append(('mov', found[0])) 532 | sequence_instr_count = 1 533 | sequence_byte_length = bv.get_instruction_length(inst.address) 534 | 535 | else: 536 | matched = re.findall(r'(add|sub|shl|shr|xor)\s*%s, 0x(.*)' % reg, str(inst)) 537 | if len(matched) >= 1: 538 | ops.append((matched[0][0], matched[0][1])) 539 | sequence_instr_count += 1 540 | sequence_byte_length += bv.get_instruction_length(inst.address) 541 | 542 | else: 543 | # we reached an end of a sequence 544 | if is_valid_op_sequence(ops): 545 | 546 | val = solve_load_ops(bv, ops) 547 | dis = 'mov %s, 0x%x' % (reg, val) 548 | inst_bytes = arch.assemble(dis, sequence_start) 549 | mov_instr_len = len(inst_bytes) 550 | bv.write(sequence_start, inst_bytes) 551 | print('0x%x: %s' % (sequence_start, dis)) 552 | 553 | dis = 'jmp 0x%x' % (sequence_start + sequence_byte_length) 554 | inst_bytes = arch.assemble(dis, sequence_start + mov_instr_len) 555 | bv.write(sequence_start + mov_instr_len, inst_bytes) 556 | 557 | # reset 558 | sequence_found = False 559 | sequence_start = 0 560 | sequence_instr_count = 0 561 | sequence_byte_length = 0 562 | ops = [] 563 | 564 | # print(sequence_byte_length) 565 | 566 | 567 | 
def solve_load(bv, func): 568 | 569 | print('solve_load()') 570 | for bbl in func.basic_blocks: 571 | solve_load_bbl(bv, bbl) 572 | 573 | 574 | def deobfuscate_function(bv, addr): 575 | 576 | func = bv.get_functions_containing(addr)[0] 577 | print('there are %d basic blocks in func %s' % 578 | (len(func.basic_blocks), func.name)) 579 | 580 | bv.begin_undo_actions() 581 | 582 | solve_opaque_predicate(bv, func) 583 | 584 | solve_push_jmp(bv, func) 585 | 586 | # solve_load(bv, func) 587 | 588 | bv.commit_undo_actions() 589 | 590 | 591 | def simplify_func(bv, addr): 592 | 593 | func = bv.get_functions_containing(addr)[0] 594 | print('there are %d basic blocks in func %s' % 595 | (len(func.basic_blocks), func.name)) 596 | 597 | bv.begin_undo_actions() 598 | 599 | simplify_bbls(bv, func) 600 | 601 | solve_load(bv, func) 602 | 603 | bv.commit_undo_actions() 604 | 605 | def simplify_bbls(bv, func): 606 | 607 | for bbl in func: 608 | # bbl = bv.get_basic_blocks_at(addr)[0] 609 | instrs_to_include = simplify_bbl(bv, bbl) 610 | 611 | # bv.begin_undo_actions() 612 | 613 | # highlight_included_instrs(bv, bbl, instrs_to_include) 614 | nop_excluded_instrs(bv, bbl, instrs_to_include) 615 | 616 | # bv.commit_undo_actions() 617 | 618 | def simplify_bbl_handler(bv, addr): 619 | 620 | bbl = bv.get_basic_blocks_at(addr)[0] 621 | 622 | instrs_to_include = simplify_bbl(bv, bbl) 623 | 624 | bv.begin_undo_actions() 625 | 626 | nop_excluded_instrs(bv, bbl, instrs_to_include) 627 | 628 | solve_load_bbl(bv, bbl) 629 | 630 | bv.commit_undo_actions() -------------------------------------------------------------------------------- /obfuscation/solution/keygen.py: -------------------------------------------------------------------------------- 1 | import struct 2 | def crc32(s, init_val = 0, final_xor = 0): 3 | 4 | poly = 0xedb88320 5 | crc = init_val 6 | for c in s: 7 | if type(c) == str: 8 | asc = ord(c) 9 | else: 10 | asc = c 11 | 12 | asc ^= 0xffffffff 13 | crc ^= asc 14 | 15 | for _ in 
range(8): 16 | eax = crc & 1 17 | var_c_1 = (-eax) % 0xffffffff 18 | 19 | var_8 = crc >> 1 20 | var_c_1 &= poly 21 | 22 | crc = var_8 ^ var_c_1 23 | 24 | crc ^= 0xffffffff 25 | 26 | crc ^= final_xor 27 | return crc 28 | 29 | def transform_back(buffer): 30 | 31 | rngs_vals = [ 32 | 0x10D88067, 33 | 0xBC16D3D5, 34 | 0xE7039A64, 35 | 0x39EC8A6D, 36 | 0xFF09B4BF, 37 | 0xF828DB76, 38 | 0x8BE40C8E, 39 | 0xF7AA583E, 40 | 0x60858E23, 41 | 0xE487F5A3, 42 | 0x39A57B89, 43 | 0xB006573E, 44 | 0x79609807, 45 | 0x620AD108, 46 | 0x5CD86398, 47 | 0x6CA94B51 48 | ] 49 | var_0x8c = 0 50 | 51 | ints = struct.unpack('<' + 'I' * 16, buffer) 52 | restored_ints = [] 53 | 54 | for i in range(16): 55 | restored_int = ints[i] ^ rngs_vals[i] ^ var_0x8c 56 | restored_ints.append(restored_int) 57 | var_0x8c = restored_int 58 | 59 | return restored_ints 60 | 61 | def main(): 62 | 63 | name = 'jeff' 64 | sn = 0x12348765 65 | feature = 0x123 66 | expire_year = 0x2981 67 | expire_month = 0x34 68 | expire_date = 0x12 69 | 70 | buffer = name + '\x00' * (32 - len(name)) 71 | buffer += struct.pack(' 18 | 19 | If we randomly input a phone number, we will be greeted by an error message. So this is already the main crackme. We need to find a special phone number (along with the country code) that is accepted by the app. 20 | 21 | ## Finding the Code 22 | 23 | As always, we need to first locate the code which does the verification. One clue is the error message itself: **Wrong number! Try again**. This string can be found in the libtmessages.29.so. However, there are no Xrefs to it. Now there are several possibilities: 1. the string will be used but the code is obfuscated so my disassembler does not find a reference to it; 2. the string is not used and the code is somewhere else. I continued to search in class.dex, libtmessages.28.so, and also used [Apktool](https://ibotpeaches.github.io/Apktool/) to unpack the resources.arsc. Nothing else can be found. 
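The string hunt described above can also be scripted. Below is a minimal sketch, assuming the APK has already been unpacked into a directory; `find_bytes` and the `unpacked_apk` directory name are hypothetical and not part of the original workflow. It only finds strings stored as plain bytes, so it would miss UTF-16 strings and strings that are assembled on the stack at runtime.

```python
import os

def find_bytes(root, needle):
    """Return a list of (path, offset) for every occurrence of needle under root."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            with open(path, 'rb') as f:
                data = f.read()
            # Walk through all matches in this file, not just the first one.
            start = 0
            while True:
                idx = data.find(needle, start)
                if idx < 0:
                    break
                hits.append((path, idx))
                start = idx + 1
    return hits

# 'unpacked_apk' is an assumed directory name; nothing is printed if it is missing.
for path, offset in find_bytes('unpacked_apk', b'Wrong number! Try again'):
    print('%s: 0x%x' % (path, offset))
```

A hit in a file only shows that the bytes are present; whether any code references them still has to be checked in the disassembler.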
24 | 25 | I do not want to create the illusion that I found the verification function **systematically**. Actually, I took some detours here. I reversed the class.dex and libtmessages.28.so for a while without success before I tried the libtmessages.29.so. This is quite common in reverse engineering. Going back to the libtmessages.29.so, I had a look at the JNI_OnLoad(), which has some related stuff but does not contain the verification function. I checked the functions before and after the JNI_OnLoad() to see if there are any interesting ones. The logic is that compilers tend to place functions that are close to each other in the source code adjacent to each other in the generated binary. So there is a chance the important function is near the JNI_OnLoad(). 26 | 27 | I spotted the **data_6871** that sits right after the function. It starts with 0x5b81, which looks like code to me. 28 | 29 | 30 | 31 | Then I defined a function here and it is real code. It seems innocent at first look, but I quickly noticed that it is preparing a constant string on the stack: 32 | 33 | 34 | 35 | ``` 36 | Are you trying to analyze me? 37 | ``` 38 | 39 | It looks like a message related to anti-debugging -- we might see this while debugging the app. Remember we are not yet sure whether this is related to the verification, so it is worth debugging it now to see if this function is called. 40 | 41 | ## Setting up Debugging 42 | 43 | Simply put, debugging an Android app is a remote debugging scenario. We run the **gdbserver** on the phone (either an emulator or a real one) and attach it to the target process. Then we launch **gdb** on our computer and connect to the remote target. After that, there is no difference between debugging locally and remotely. 44 | 45 | An Android app may run inside a Dalvik VM. However, the VM is just a regular process and can be debugged like any other process. 
Furthermore, the native libraries are directly loaded into the process memory space, so we can debug them as well. 46 | 47 | We first need to download the [Android NDK](https://developer.android.com/ndk/) since we need the prebuilt gdbserver in it. The NDK is large and we do not need anything else in it (for debugging purposes). However, this is better than randomly searching the Internet for a gdbserver binary -- such a binary may not work properly inside the AVD. 48 | 49 | The gdbserver can be found at ```android-ndk-r21b/prebuilt/android-x86/gdbserver```. Note I have an x86 AVD so I need the x86 version of it. First I push it to the device: 50 | 51 | ``` 52 | $ adb push ./android-ndk-r21b/prebuilt/android-x86/gdbserver /system/bin/ 53 | ``` 54 | 55 | After that, I launch the app on the device. Then, on my computer, I spawn an adb shell by running: 56 | 57 | ``` 58 | $ adb shell 59 | ``` 60 | 61 | The app is called **telegram**, so I run the following command to find the PID of the target process: 62 | 63 | ``` 64 | # ps -A | grep telegram 65 | u0_a80 4165 7934 1562976 153292 ep_poll e9897b59 S org.telegram.messenger 66 | ``` 67 | 68 | Note: I use ```$``` for any command to be executed in the host shell and ```#``` for anything inside the adb shell. 69 | 70 | The PID of our target is 4165. The command to attach gdbserver to the process is: 71 | 72 | ``` 73 | # gdbserver --attach host:port PID 74 | ``` 75 | 76 | In my case, I use: 77 | 78 | ``` 79 | # gdbserver --attach localhost:12345 4165 80 | ``` 81 | 82 | Now the gdbserver will attach to the process with PID 4165 and listen on port 12345 for a remote connection. Meanwhile, the app will hang. 83 | 84 | We need to set up port forwarding before connecting to it. This is because the gdbserver is listening on port 12345 of the device, not of our host computer. 85 | 86 | ``` 87 | $ adb forward tcp:12345 tcp:12345 88 | ``` 89 | 90 | This will forward port 12345 on the host to port 12345 on the device. 
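The setup steps above can be collected in one place. This is a minimal sketch that only builds the command lines and does not run anything; `debug_setup_commands` is a hypothetical helper, and the gdbserver path, PID and port are simply the values used in this walkthrough.

```python
def debug_setup_commands(gdbserver_path, pid, port=12345):
    """Build the commands from the steps above, in order.

    Each entry is (where, argv): 'host' commands run on the computer,
    'device' commands run inside the adb shell.
    """
    return [
        ('host', ['adb', 'push', gdbserver_path, '/system/bin/']),
        ('device', ['gdbserver', '--attach', 'localhost:%d' % port, str(pid)]),
        ('host', ['adb', 'forward', 'tcp:%d' % port, 'tcp:%d' % port]),
    ]

# Print the commands with the same '$' (host) and '#' (adb shell) prompts.
for where, argv in debug_setup_commands(
        './android-ndk-r21b/prebuilt/android-x86/gdbserver', 4165):
    prompt = '$' if where == 'host' else '#'
    print('%s %s' % (prompt, ' '.join(argv)))
```

The PID changes every time the app is restarted, so it has to be looked up again with `ps` before each session.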
91 | 92 | Now launch gdb on the computer and attach to it: 93 | 94 | ``` 95 | pwndbg> target remote localhost:12345 96 | ``` 97 | 98 | If everything works fine, gdb should print a lot of information about the remote target. This might take a while, and eventually it should stop and ask for your input. The prompt starts with ```pwndbg>``` because I installed the pwndbg enhancement, which makes gdb more usable. 99 | 100 | The next thing to figure out is the base address of the loaded libtmessages.29.so. 101 | 102 | ``` 103 | pwndbg> info sharedlibrary 104 | From To Syms Read Shared Object Library 105 | // many lines omitted 106 | 0xc9b69000 0xc9b6eaf7 Yes (*) target:/data/app/org.telegram.messenger-o_d807FF7eGAXMhf5s3qqQ==/oat/x86/base.odex 107 | 0xc9406570 0xc9406830 Yes (*) target:/data/app/org.telegram.messenger-o_d807FF7eGAXMhf5s3qqQ==/lib/x86/libtmessages.29.so 108 | 0xc8896400 0xc8f70f71 Yes (*) target:/data/app/org.telegram.messenger-o_d807FF7eGAXMhf5s3qqQ==/lib/x86/libtmessages.28.so 109 | 0xc7d329b0 0xc7d36ea5 Yes (*) target:/vendor/lib/hw/gralloc.ranchu.so 110 | (*): Shared library is missing debugging information. 111 | ``` 112 | 113 | We can see the address of libtmessages.29.so is 0xc9406570. Interestingly, the address reported by **info sharedlibrary** is the address of the **.text** section rather than the load base of the library, which is not very convenient for rebasing. But it is fine since we can calculate it manually. 114 | 115 | In BinaryNinja we can see the start of the .text is at 0x5570, while the start of the function is at 0x6871. We know the offset between them remains the same after loading, so the actual address to set the breakpoint is: 116 | 117 | ``` 118 | >>> hex(0xc9406570 + (0x6871 - 0x5570)) 119 | '0xc9407871' 120 | ``` 121 | 122 | We can also rebase the binary in BinaryNinja so that it shows this address directly. 123 | 124 | ``` 125 | pwndbg> b *0xc9407871 126 | Breakpoint 1 at 0xc9407871 127 | pwndbg> c 128 | Continuing. 
129 | ``` 130 | 131 | Now, give a random phone number and hit enter on the phone. And the breakpoint hits! We have found the verification function! 132 | 133 | ## Solving the Country Code 134 | 135 | The function is medium-sized and we need to have a big picture of it before plunging into lines of assembly. Near the bottom of the function, we see the string "Wrong number" being created in a buffer: 136 | 137 | 138 | 139 | So we need to avoid this basic block. Scrolling up a little bit, we find two checks that must be satisfied: 140 | 141 | 142 | 143 | These are testing if the lowest bit is set. However, if we go further upward we can find that both **check_1** and **check_2** are booleans and they represent whether a check is satisfied. For check_1, we have the following block: 144 | 145 | 146 | 147 | We see a string input is passed into function **std::__ndk1::stoul** and converted to an integer using base 10. Then **7 * int + 9** is calculated and the result is fed into function **__umoddi3**. I have seen **__umoddi3** before so I quickly figured out the divisor is 0x25. In fact, **__umoddi3** calculates a 64-bit unsigned modulus. This is a 32-bit binary so it has to use two registers to hold 64-bit values. The **edx** pushed onto the stack is the higher 32 bits of the dividend; the **eax** is the lower 32 bits. If I had not seen it before, I could also figure it out by debugging the code and observing its input and output. The modulus is returned as **edx:eax** too. 148 | 149 | We want variable **check_1** to be 1, so we must set it at 0x6a60. To ensure the **ZF** is set when it gets to 0x6a60, the eax must be 0x17 and the edx must be 0. This means the modulus must be equal to 0x17. 150 | 151 | A quick debugging session verifies that the input string is the country code we entered. 
So the constraint here is: 152 | 153 | ``` 154 | (7 * country_code + 9) % 0x25 == 0x17 155 | ``` 156 | 157 | A simple script to print the accepted country_code is as follows: 158 | 159 | ```Python 160 | for country_code in range(999): 161 | if (7 * country_code + 9) % 0x25 == 0x17: 162 | print(country_code) 163 | ``` 164 | 165 | We know there are many values that satisfy the above equation, but only one among them is a valid country code. It is +39, which is the code for Italy. 166 | 167 | ## Solving the Phone Number 168 | 169 | Below the country code check, we can find the check for the phone number. At 0x6c6f it calls into another medium-sized function, which is probably the check function. It looks like this: 170 | 171 | 172 | 173 | It is not immediately obvious what this function does. Though from the first few basic blocks we can observe the **std::string** being used and the valid length is probably 0x16. Remember the correct phone number is not necessarily a phone number at all and it does not have to have a length that looks like a phone number (e.g., 10 digits for the U.S.). 174 | 175 | To approach a function like this, there are two methods. The first way is to check how the return value is calculated, back-slice it, and do the taint analysis in our head. From the previous analysis we know this function should return 1 in **eax**. We can go back from the last instruction that touches **eax** and see how it can be set to 1. 176 | 177 | Besides, we see there is a loop in the lower-right side of the mini graph. Loops can give us a lot of information about what is happening. My way to reverse a loop is to identify the iteration variable (similar to **i** in C code), and see what the initial value, final value, and stride are. Or more generally speaking, what the exit condition is and what the update rule is. This lets us know how many times this loop is going to be executed. 178 | 179 | Then we should get into the loop body and analyze it. 
This lets us know what is done in one iteration. These two combined tell us what the loop is doing as a whole. 180 | 181 | It is hard to include every step I took to reverse this loop, but let me describe the major steps. 182 | First things first, there are many ways to exit this loop, but the exit at 0x7325 is the only place where the return value of this function can be 1. Above it, we see **cmp ecx, esi**, which is probably comparing the iterator with the final value. But which one is the iterator? 183 | 184 | 185 | 186 | In many cases, we can figure it out by looking at the code, but for this one, I am not so sure. Never mind, we can debug it. I set a breakpoint at 0x7323 and send an input with length 0x16 (if the length is wrong, the execution never enters the loop). In the first iteration **ecx** is 0 and **esi** is 0x16; in the second iteration **ecx** is 2 and **esi** is 0x16. 187 | 188 | So, it looks like ecx is **i** and esi is the final value 0x16. Going up a little bit, at 0x7306 we see that i is incremented by 2 each time. So this loop probably processes two bytes of the input at a time. 189 | 190 | Now it is time to analyze one iteration. We want the code at **0x732d** to set al to 1, so **edi** must not be 0. At **0x7300** there is an **and**, so ecx must not be 0. **ecx** is updated at 0x72d8, which has a **cmp** before it. So to have 1 as the return value for the function, the two operands of this **cmp** must be equal. Then we move further upward to see what **al** and **byte [esp+0x36]** are. 191 | 192 | 193 | 194 | It turns out the al is the result of another **std::__ndk1::stoul**. The base is still 10 and the input is the two chars (for every iteration) from the input string. The other operand is a little bit complex. During debugging, I find that at 0x7283, the **eax** points to a string 195 | ``` 196 | org.telegram.messenger 197 | ``` 198 | It is the name of the app. 
I did not bother to trace back how it gets here, but this is an interesting finding: it is probably used in the algorithm. At 0x7295, it takes the ith char of the above string. At 0x729d, the char (ASCII value) is xor-ed with a variable we do not understand yet. 199 | 200 | Then we see a **division by multiplication**. This is an optimization technique used by compilers to speed up divisions. Division instructions (e.g., **idiv**) are super slow to execute so the compilers calculate the result differently. Even though we end up with more instructions, the code executes faster. For more details on this topic, please refer to: [ref 1](https://stackoverflow.com/questions/30790184/perform-integer-division-using-multiplication) or [ref 2](https://gmplib.org/~tege/divcnst-pldi94.pdf). 201 | 202 | It is not hard to recognize the divisor from the assembly after we know how it works. Furthermore, if the division is used to calculate a modulus, it is easier to recognize. For example, if the code calculates **eax % n**, it will do the following two things: 203 | 204 | ``` 205 | quotient = eax / n 206 | modulus = eax - quotient * n 207 | ``` 208 | 209 | The "divide by n" part might not be immediately obvious, but the "multiply by n" part is super easy to spot. 210 | 211 | ``` 212 | 000072a3 movsx ecx, byte [esp+0x36 {var_3a_1}] 213 | 000072a8 mov eax, ecx 214 | 000072aa mov edx, 0x51eb851f 215 | 000072af imul edx 216 | 000072b1 mov eax, edx 217 | 000072b3 shr eax, 0x1f 218 | 000072b6 shr edx, 0x5 219 | 000072b9 add edx, eax 220 | 000072bb imul eax, edx, 0x64 221 | 000072be sub ecx, eax 222 | 000072c0 mov byte [esp+0x36 {xored_val % 0x64}], cl 223 | ``` 224 | 225 | At 0x72bb we see an **imul eax, edx, 0x64** followed by a **sub ecx, eax**. Obviously, this is calculating the modulus and the divisor is 0x64. So this whole thing is calculating **ecx % 0x64**. 226 | 227 | For many other divisor values, the multiplication will be further optimized. 
For example, if the divisor is 9, it will become something like **mov edx, eax; shl edx, 3; add edx, eax**. But the "shift left and add" trick is still more obvious than the division. 228 | 229 | Now the only missing piece is the mysterious variable referenced at address 0x7299. Notice it is initialized to 0 before entering the loop and updated at 0x72c9 according to the result of the transformation in each iteration. In fact, this is similar to the [Cipher block chaining (CBC)](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation) in block ciphers, where an initialization vector is provided and updated on every block. 230 | 231 | We can now reconstruct the algorithm as the following pseudo-code: 232 | 233 | ```Python 234 | def check(input_string): 235 | if len(input_string) != 0x16: 236 | return False 237 | 238 | s = 'org.telegram.messenger' 239 | IV = 0 240 | check_ok = True 241 | for i, two_char in yield_two_char_every_time(input_string): 242 | val = int(two_char, 10) 243 | IV = (IV ^ i ^ ord(s[i])) % 0x64 244 | if val != IV: 245 | check_ok = False 246 | break 247 | return check_ok 248 | ``` 249 | 250 | Which can be solved by the following script: 251 | 252 | ```Python 253 | s = 'org.telegram.messenger' 254 | val = 0 255 | i = 0 256 | flag = '' 257 | while i < 0x16: 258 | c = s[i] 259 | asc = val ^ i ^ ord(c) 260 | asc %= 0x64 261 | val = asc 262 | flag += '%02d' % asc 263 | i += 2 264 | 265 | print(flag) 266 | # the flag is: 267 | # 1110222419205493626651 268 | ``` 269 | 270 | -------------------------------------------------------------------------------- /reverse_engineering_and_fixing_a_fan/imgs/img-1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/reverse_engineering_and_fixing_a_fan/imgs/img-1.jpg -------------------------------------------------------------------------------- /reverse_engineering_and_fixing_a_fan/imgs/img-2.jpg:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/reverse_engineering_and_fixing_a_fan/imgs/img-2.jpg -------------------------------------------------------------------------------- /reverse_engineering_and_fixing_a_fan/imgs/img-3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/reverse_engineering_and_fixing_a_fan/imgs/img-3.jpg -------------------------------------------------------------------------------- /reverse_engineering_and_fixing_a_fan/imgs/img-4.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/reverse_engineering_and_fixing_a_fan/imgs/img-4.jpg -------------------------------------------------------------------------------- /reverse_engineering_and_fixing_a_fan/main.md: -------------------------------------------------------------------------------- 1 | # Reverse Engineering and Repairing a Fan 2 | 3 | Last summer, I broke a fan and managed to repair it. Although the repair itself was not that exciting, I recently realized it can serve as a good example of a reverser's mindset: how I approached the problem and solved it. I hope to share some of my understanding of reverse engineering in this writeup. 4 | 5 | ## A Broken Fan 6 | 7 | I have a fan -- an eight-year-old fan -- that is NOT smart or IoT. It is just a simple fan. One day it fell from the table to the ground and stopped working. *RIP*. It accompanied me for several summers and I love it. I decided to take it apart and see what was actually broken, before saying farewell to my friend. 8 | 9 | I know little about electronics, but a fan should not be too complex.
It looks like this after I opened it: 10 | 11 | 12 | 13 | We can see the fan blades, the power cord, the timer, and the ON/OFF switch in it. Everything looks in good shape despite the impact. How should I start reversing it? 14 | 15 | I quickly noticed a small metal cylinder moving freely inside the fan enclosure. Normally we do not have such small loose parts in a fan (they would easily collide with the blades). It probably broke off from somewhere in the fan due to the impact. It is a reasonable guess. But how could I prove it or refute it? 16 | 17 | 18 | 19 | I decided to look at other parts of the fan. The logic is, if the cylinder broke off from somewhere, there should be a trace of it. I then spotted a previously unnoticed part. It is a plastic box and there is a sharp irregular edge on it, which is a sign of a broken part. I had no idea what the box was, and it did not look critical to the fan's functionality, since I had already identified the timer, the switch, etc. 20 | 21 | ## Making the Fan Alive Again! 22 | 23 | Upon closer inspection of the plastic box, I saw two wires going into it, each connected to a small metal blade. One weird thing is the blades are **NOT** connected. And they are likely to remain unconnected during the operation of the fan. What is this? 24 | 25 | Now it becomes interesting: I have examined most parts of the fan, and found things that are worth investigating. I need to connect the dots. Some creativity, as well as luck, is needed here. I stared at the plastic box and the cylinder for a while, and I suddenly **had a hypothesis**. If I put this metal cylinder inside the plastic box, the blades will be connected. Could it be the reason that the fan stopped working? 26 | 27 | 28 | 29 | I did a quick test: I put the cylinder inside it and turned the switch on. Wow, it works! The fan is alive again! 30 | 31 | ## Why is there a Plastic Box and a Metal Cylinder? 32 | 33 | Not all of the mysteries are solved yet.
I am still puzzled by this plastic box and the metal cylinder. What is the purpose of having them? What did they look like before the fall? 34 | 35 | Now we come to the fundamental part of reverse engineering: **understanding how the system works**. The fan works, but something is odd: the plastic box could be removed and the two wires connected directly. There must be a reason to have it. 36 | 37 | There are two ways to reason about it in such a situation. The first way is to think of what could go wrong if we do not have it, like asking why we need to check whether the divisor is zero before we divide. However, as mentioned above, nothing seems wrong without this box. This method does not work here. It must be serving **a certain purpose yet unknown** to us. This is quite a typical scenario in reverse engineering. 38 | 39 | The other way is to imagine different inputs to the system (the fan), and predict the possible states or outcomes of the system. Then we deduce the purpose of it. This is harder to do because we need to generate lots of inputs and examine many possible states or outcomes. And it is not guaranteed to succeed! It could be intended for a situation we would never think of, so we would never know why it is here. 40 | 41 | Let us start with it. The cylinder currently connects the two metal blades. What would stop it from doing so? Not too hard, right? If it leaves the current position and goes up, the blades are disconnected. However, due to gravity, it will not move up by itself. Can we come up with a case where gravity does not hold this cylinder in place? Well, if this fan were used on a space station then the cylinder could move freely. But it is not the case here. It is a consumer product. What could be another case where the effect of gravity is gone or altered? 42 | 43 | **WHEN IT FALLS!** 44 | 45 | When the fan falls, gravity will no longer pull the cylinder toward the position that connects the two blades.
The result is, the cylinder moves, leaving the two blades disconnected, and the fan stops working. Now we have a reasonable explanation for the plastic part and the metal cylinder: it is a **fall-protection mechanism**! 46 | 47 | ## Connecting the Dots 48 | 49 | Note the cylinder can not only move vertically, but it can also move horizontally. It can leave the plastic box and never (easily) get back. In fact, this is probably the cause of the fan's failure. We are still missing something. 50 | 51 | I did not guess it, though some readers may have already guessed it. I examined the fan again and found another plastic piece in it. It looks like a lid for the plastic box. If there is a lid, then the cylinder will not leave the plastic box. And if the fan falls, once it is set upright again, the cylinder will go back to its original position and connect the blades again. 52 | 53 | All the dots are finally connected. The metal cylinder was confined in a plastic box. It serves as a fall-protection mechanism. However, the fan fell from a high desk and the impact was so strong that the cylinder broke the plastic box apart. (We can see the lid was somehow connected to the box before it broke.) The cylinder was unable to go back to its original position and the fan stopped working. 54 | 55 | I have to admit this is quite simple yet effective. If I were to implement such functionality, what comes to my mind first is gyroscopes and a program, which is both complex and expensive. Through reverse engineering, I learned the same thing can be achieved like this. 56 | 57 | Finally we come to the last step in reverse engineering. We need to repair the fan. For this particular one, it is not hard to repair. We first put the cylinder inside the box, then put the lid on top of the box, and then use some tape to secure it.
It looks like this after it is repaired: 58 | 59 | 60 | 61 | ## Relating to Reverse Engineering 62 | 63 | I admit this example of reverse engineering the fall-protection mechanism and repairing the fan is trivial. However, it does show some important steps in reverse engineering. Let me explain. 64 | 65 | In the first step, I opened the fan to see its internals. This is analogous to analyzing a binary statically. I did some preliminary analysis on the fan, like identifying the core components. In reverse engineering, we do this too. Typically we would have a quick look at the binary to get some information about it, such as what platform it runs on and what API functions it calls. 66 | 67 | Then I spotted a metal cylinder that moves freely. This is called (by me) a **pivot**. A binary program can be huge and we cannot blindly reverse it entirely. We need to focus on something. It could be a string, an API function, or a constant value (in a crypto function). 68 | 69 | From the cylinder, I investigated the fan and came up with a possible **hypothesis** for it. I tested it by putting the cylinder back and turning the fan on. Then the hypothesis was confirmed. This loop is quite common in real-world reverse engineering. For example, there is a function that we are not sure about. We could study it and get several possible guesses for it. Then we confirm or refute them. What I did is closest to **debugging**, where I launch the fan and see if it works. I was lucky since my first hypothesis was correct. In reversing, this loop could repeat several times before one understands a complex function. 70 | 71 | Now we come to the hard part. I did not immediately understand why there is such a plastic box and a cylinder. This is also common in reverse engineering. We encounter lots of things whose meaning we cannot properly understand or guess. The approach I took can be understood as a **symbolic execution** of the fan.
I tried to reason about what could happen to the fan in a different scenario. While doing this, **constraint solving** is quite helpful as it gave me several cases of why the cylinder could move. Symbolic execution and constraint solving are intermediate topics in reverse engineering. They could look like *magic* in many cases. 72 | 73 | After I get a comprehensive understanding of the fan, I need to repair it. In reverse engineering, most likely we do not need to repair anything (well, in certain cases we need to fix a bug in the binary, but that is rare). We need to re-implement it, either as code or documentation. 74 | 75 | The above can be summarized in the following chart: 76 | 77 | 78 | |Repairing a fan | Reverse Engineering| 79 | |--- | --- | 80 | |take the fan apart | static analysis| 81 | |spot the cylinder | find a pivot| 82 | |guess the cylinder can connect the circuit | have a hypothesis| 83 | |put the cylinder back and turn the fan on | test the hypothesis (debugging)| 84 | |reason about the plastic box's functionality | symbolic execution & constraint solving| 85 | |understand it is a fall-protection mechanism | understand the functionality of code| 86 | |repair the fan | reimplement as code or documentation| 87 | 88 | Of course, this analogy is not meant to be complete or always accurate. For example, debugging is only one of the ways to test the hypothesis. And we do not explicitly use symbolic execution and constraint solving every time we reverse. An interesting fact is, when we reason about a piece of code, we probably symbolically execute it many times **in our mind** without using any external tools like Triton or angr.
-------------------------------------------------------------------------------- /shl_undefined_behavior/imgs/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/shl_undefined_behavior/imgs/1.png -------------------------------------------------------------------------------- /shl_undefined_behavior/imgs/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/shl_undefined_behavior/imgs/2.png -------------------------------------------------------------------------------- /shl_undefined_behavior/imgs/3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/shl_undefined_behavior/imgs/3.png -------------------------------------------------------------------------------- /shl_undefined_behavior/src_and_binaries/test: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/shl_undefined_behavior/src_and_binaries/test -------------------------------------------------------------------------------- /shl_undefined_behavior/src_and_binaries/test.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdint.h> 3 | 4 | int main() 5 | { 6 | int n = 64; 7 | uint64_t ret = (1UL << n) - 1; 8 | printf("0x%lx\n", ret); 9 | } -------------------------------------------------------------------------------- /shl_undefined_behavior/src_and_binaries/test_O2: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/shl_undefined_behavior/src_and_binaries/test_O2 -------------------------------------------------------------------------------- /shl_undefined_behavior/src_and_binaries/test_missing_UL: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/shl_undefined_behavior/src_and_binaries/test_missing_UL -------------------------------------------------------------------------------- /shl_undefined_behavior/src_and_binaries/test_missing_UL.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdint.h> 3 | 4 | int main() 5 | { 6 | int n = 48; 7 | uint64_t ret = (1 << n) - 1; 8 | printf("0x%lx\n", ret); 9 | } -------------------------------------------------------------------------------- /shl_undefined_behavior/src_and_binaries/test_missing_UL_O2: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/shl_undefined_behavior/src_and_binaries/test_missing_UL_O2 -------------------------------------------------------------------------------- /shl_undefined_behavior/writeup.md: -------------------------------------------------------------------------------- 1 | # Examining the difference between a C program and Assembly -- An Example of << and shl 2 | 3 | ## Encountering a Weird Issue 4 | 5 | Recently, I needed to write a function that returns a bitmask according to the number of bits. Basically, if the input is 8, it should return 0xff. The input `n` is in the range of 0-64 (both ends inclusive).
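
To make the expected results concrete, here is a small reference model in Python. It is only a statement of the specification, not the C function itself: Python integers have arbitrary precision, so shifting by 64 is well defined there. The name `bit_mask` is made up for this sketch.

```python
def bit_mask(n):
    """Reference model: an integer with the n lowest bits set, for 0 <= n <= 64."""
    if not 0 <= n <= 64:
        raise ValueError('n must be in the range 0-64')
    # Python ints never overflow, so (1 << 64) - 1 is simply 2**64 - 1.
    return (1 << n) - 1


for n in (0, 8, 63, 64):
    print(n, hex(bit_mask(n)))
# 0 0x0
# 8 0xff
# 63 0x7fffffffffffffff
# 64 0xffffffffffffffff
```

The last row is the interesting one: it is the value a 64-bit C implementation is expected to return for `n == 64`.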
6 | 7 | The first idea is to left-shift and then subtract 1: 8 | 9 | ```C 10 | uint64_t getBitMask(size_t n) 11 | { 12 | uint64_t ret = (1UL << n) - 1; 13 | return ret; 14 | } 15 | ``` 16 | 17 | This works well when n is in the range of 0-63. However, when n is 64, the code returns 0 instead of 0xffffffffffffffff. 18 | 19 | I isolated the problem and created the following minimal PoC: 20 | 21 | ```C 22 | #include <stdio.h> 23 | #include <stdint.h> 24 | 25 | int main() 26 | { 27 | int n = 64; 28 | uint64_t ret = (1UL << n) - 1; 29 | printf("0x%lx\n", ret); 30 | } 31 | ``` 32 | 33 | And the command to compile and run it: 34 | 35 | ``` 36 | $ gcc -o test test.c 37 | $ ./test 38 | 0x0 39 | ``` 40 | 41 | This result is counter-intuitive since when n is 64, the only bit in 1 should be shifted out and it becomes `0 - 1`, which should give me 0xffffffffffffffff. 42 | 43 | I had no idea why it behaves like this so I decided to load the compiled binary into BinaryNinja to see what is happening. 44 | 45 | ## Assembly Never Lies 46 | 47 | 48 | 49 | It looks correct to me. Completely confused, I launched gdb to see what is happening. It quickly turned out that after the `shl rdx, cl` at 0x663, rdx remains 0x1 rather than becoming 0. And 1 - 1 is 0 -- that is why 0x0 is printed. 50 | 51 | A vague memory of the shl instruction came back to me. `cl` is 64 now, which is also the size of the register being shifted. Does it affect the execution? I navigated to the Intel reference manual and started reading the page that documents the shl instruction. I found this: 52 | 53 | ``` 54 | The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). 55 | ``` 56 | 57 | We are in 64-bit mode here. The documentation states that the bits beyond the lowest 6 are discarded. Now we have 64 (0b1000000) in cl, whose lowest 6 bits are zeros.
No wonder rdx remains 1 after the shl -- we are effectively shifting by 0 bits. 58 | 59 | Ok, things are sorted out now. But I decided to test how gcc handles this when optimizations are on. Because when we turn on optimization (e.g., -O2), it is very likely the value of `ret` is calculated by the compiler at compile time rather than at runtime. Does gcc also enforce the width limit on the shift count? 60 | 61 | ``` 62 | $ gcc -O2 -o test_O2 test.c 63 | $ ./test_O2 64 | 0xffffffffffffffff 65 | ``` 66 | 67 | Wow, the output is different from the previous one! And the disassembly looks like this: 68 | 69 | 70 | 71 | The value 0xffffffffffffffff is directly printed. It seems gcc -O2 behaves in the way I originally expected -- it ignores the limit on the shift count. 72 | 73 | Well, we now have one source file that gives different results when compiled with `-O0` and `-O2`. Is this a gcc bug? 74 | 75 | Nope, it is not. The C standard explicitly lists this as undefined behavior: 76 | 77 | ``` 78 | -- An expression is shifted by a negative number or by an amount greater than or equal to the width of the promoted expression (6.5.7). 79 | ``` 80 | 81 | Since this behavior is undefined, the difference between `-O0` and `-O2` is not a bug. 82 | 83 | Back to the function I need to write: although there might be a way to implement the functionality without a branch, it would probably rely on the behavior of a particular compiler, which is unreliable and bad for cross-platform and cross-compiler compatibility. I decided to put an `if` for the case `n == 64`. 84 | 85 | ## Epilog 86 | 87 | Differences between the C source code and the compiled x86 binary are a well-known issue. This paper comes to my mind first: [WYSINWYX:What You See Is Not What You eXecute](https://research.cs.wisc.edu/wpis/papers/wysinwyx.final.pdf). 88 | 89 | C is quite low level so it has a close relation with the underlying hardware. The C standard defines certain behavior as undefined to save the effort of C compiler authors.
If the `<<` operator were defined when the shift count is larger than or equal to the register width, there would be more branches in the compiler code to take care of the edge cases. 90 | 91 | Reading the assembly is probably the best method to resolve similar issues. In fact, during the development I once missed the `UL` after the constant `1`. And the code stops working once n is 32 or larger. 92 | 93 | ```C 94 | int main() 95 | { 96 | int n = 48; 97 | uint64_t ret = (1 << n) - 1; 98 | printf("0x%lx\n", ret); 99 | } 100 | ``` 101 | 102 | When the above code is compiled with `-O0`, it prints `0xffff`. Why? Because 1 is considered a 32-bit integer and gcc decides to use `edx` (instead of `rdx`) to hold it. 103 | 104 | 105 | 106 | Since 48 = 0b110000, and only the lowest 5 bits are involved in the calculation, we are effectively left shifting by 16 bits. That is why we get 0xffff as the output. 107 | 108 | Last but not least, what would we get if we compile the above code with `-O2`? The result was surprising to me at first sight, followed by an aha moment. 109 | 110 | -------------------------------------------------------------------------------- /x86/README.md: -------------------------------------------------------------------------------- 1 | # Making and solving a Reversing Challenge Based on x86 ISA Encoding 2 | 3 | This time the writeup is a little bit different -- I am the maker of this challenge so the narrative is from a different perspective. I will first cover how I made it, and then show two possible ways to solve it. 4 | 5 | ## The Plan 6 | 7 | I have always wanted to make some reversing challenges based on the encoding of the x86 instruction set. It does not have to be super hard, maybe just explore some interesting aspects of x86 that sit at a lower level than the disassembly.
Recently, thanks to my intern task that lifts x86 instructions, as well as reading this [blog post](https://www.msreverseengineering.com/blog/2015/6/9/x86-trivia-for-nerds), I decided to do it rather than postpone it indefinitely. 8 | 9 | There are several ways to do it, and I think it is not a bad idea to mutate the executable code according to the user input. It is interesting because, for most reversing challenges, the solver is not expected to change (patch) the code. However, we can take the user input and explicitly use it, in certain ways, to modify the code. 10 | 11 | So how do we do it? Executing the user input directly is probably not a good idea. Since code is typically non-printable, the solution is going to be ugly. More importantly, when we grant the player arbitrary code execution, it is hard to enforce that they solve it in our intended way. 12 | 13 | So it is best to modify existing code according to the user input. The first thing that came to my mind is we can do some arithmetic with it. We can have an equation like: 14 | 15 | ``` 16 | start_value ± a1 ± a2 ± .. ± an == result 17 | ``` 18 | 19 | Where the user has to figure out the correct plus or minus signs to make this equation hold. The `start_value`, `result`, as well as the `ai` (1 <= i <= n), are all randomly generated. I made them 32-bit integers. 20 | 21 | ## Implementing and Automating 22 | 23 | There are a couple of things to settle to make the idea concrete. 24 | 25 | Firstly, how do we accept the user inputs? We can directly take plus or minus signs as string literals but I wish to make it slightly twisted here: the program will take a 32-bit integer and use each of its bits as the indicator of plus/minus. 26 | 27 | The next thing is about the x86 instruction encoding. I decided to use the register `eax` to hold the accumulated value and eventually compare it with the target value.
We know that x86 instructions encode the opcode in a straightforward way, so it is quite easy to switch between an add instruction and a sub instruction. 28 | 29 | 30 | 31 | If you look at the highlighted line, you will notice that `ADD EAX, imm32` is encoded as `05 id`, where the `05` stands for the opcode, and the `id` means a 32-bit immediate follows it. So if we have bytes `0512345678`, it will decode to `ADD EAX, 0x78563412` (note the endianness). Similarly, `SUB EAX, imm32` is encoded as `2D id`. So the real difference between an `ADD EAX, imm32` and a `SUB EAX, imm32` is the opcode, i.e., the first byte of the instruction. 32 | 33 | So modifying the code is easy: we just need to check every bit of the user input and overwrite the opcode byte with the correct one (05 or 2D). Each instruction is 5 bytes and the latter four bytes encode the immediate value in the equation. 34 | 35 | This challenge can be made manually, but I prefer to be able to generate it automatically. That brings several benefits, e.g., the ease of debugging during development. The source code of the challenge is provided in the [source folder](/source), and you can have a look at it. 36 | 37 | The code that does not change is written in C, whereas a Python script generates the random constant values and writes them to a .h header file. The header file is included in the C source file so it can compile end-to-end. I also wrote a Makefile so I can easily build debug and release versions of it.
The Python generator looks like this: 38 | 39 | ```Python 40 | import random 41 | import os 42 | 43 | rounds = 32 44 | MAXINT = 0xffffffff 45 | 46 | output = open('code.h', 'w') 47 | 48 | val = random.randint(0, MAXINT) 49 | # mov eax, val 50 | output.write('{0xb8, 0x%x},\n' % val) 51 | ans = 0 52 | 53 | for i in range(rounds): 54 | op = random.randint(0, 1) 55 | round_val = random.randint(0, MAXINT) 56 | ans |= (op << i) 57 | if op == 0: 58 | val -= round_val 59 | else: 60 | val += round_val 61 | 62 | val &= MAXINT 63 | 64 | junk_opcode = random.randint(0, 0xff) 65 | output.write('{0x%x, 0x%x},\n' % (junk_opcode, round_val)) 66 | 67 | # cmp eax, val 68 | output.write('{0x3d, 0x%x},' % val) 69 | output.close() 70 | 71 | print('the answer is: %d' % ans) 72 | os.system('make') 73 | ``` 74 | 75 | The C source file defines a struct to describe the two particular instructions we are using: 76 | 77 | ```C 78 | #pragma pack(1) 79 | typedef struct 80 | { 81 | unsigned char opCode; 82 | uint32_t operand; 83 | }instr; 84 | ``` 85 | 86 | The main.c is the core part of the challenge: 87 | 88 | ```C 89 | #define N 32 90 | 91 | instr code[] __attribute__ ((section (".x86"))) = { 92 | #include "code.h" 93 | {0x0f, 0x9090d094}, 94 | // 00201043 0f94d0 sete al {0x1} 95 | // 00201046 90 nop 96 | // 00201047 90 nop 97 | {0xc3, 0} 98 | // 00201048 c3 retn {__return_addr} 99 | }; 100 | 101 | int main() 102 | { 103 | // read the input 104 | int input = 0; 105 | int unused = scanf("%d", &input); 106 | // modify the code according to the user input 107 | for(int i = 0; i < N; i ++) 108 | { 109 | bool bit = input & 1; 110 | input >>= 1; 111 | if (bit) 112 | { 113 | // add eax, imm32 114 | code[i + 1].opCode = 0x05; 115 | } 116 | else 117 | { 118 | // sub eax, imm32 119 | code[i + 1].opCode = 0x2d; 120 | } 121 | } 122 | // set page to executable 123 | void *page = 124 | (void *) ((unsigned long) (&code) & 125 | ~(getpagesize() - 1)); 126 | mprotect(page, getpagesize(), PROT_READ | 
PROT_WRITE | PROT_EXEC); 127 | 128 | // call the code and check result 129 | bool (*func_ptr)() = (void*)&code; 130 | if (func_ptr()) 131 | { 132 | printf("Well done!\n"); 133 | } 134 | else 135 | { 136 | printf("Try again!\n"); 137 | } 138 | } 139 | ``` 140 | 141 | ## Solving it with Z3 142 | 143 | Now it is time to solve it. A dull brute-force solves it, though it could take a while to complete. The most straightforward idea is to use Z3. We create 32 booleans and transcribe the calculations into Z3 syntax. Of course, we need to extract those constant values, but it should be relatively easy. Then I get: 144 | 145 | ```Python 146 | from z3 import * 147 | 148 | # extracted from the challenge binary 149 | init_val = 0x3df2f794 150 | target_val = 0x7a612770 151 | constants = [ 152 | 0x52ae22f2, 153 | 0xbf409bcc, 154 | 0x46417dc1, 155 | 0x25f7d9a1, 156 | 0xef83a7ce, 157 | 0x2dd63e8e, 158 | 0x584a1ec5, 159 | 0x8e58e1df, 160 | 0xf2705f70, 161 | 0x2e94ef1e, 162 | 0x3ca9e080, 163 | 0xa617b5df, 164 | 0x29ae9c3d, 165 | 0x7461ed52, 166 | 0x7125faac, 167 | 0x65dfffd6, 168 | 0x97f1f41c, 169 | 0x6f4e0648, 170 | 0xd803e5d0, 171 | 0xf358f0eb, 172 | 0xbc3b30c7, 173 | 0x585685f8, 174 | 0x2a9cc47c, 175 | 0x7f03d175, 176 | 0xc1d942ae, 177 | 0x174c7d4f, 178 | 0xb7d004f0, 179 | 0xbec8b077, 180 | 0x8ce8eaa2, 181 | 0x2510e330, 182 | 0x4aed0eee, 183 | 0x4043cd91 184 | ] 185 | 186 | # solver script 187 | n = 32 188 | inputs = [Bool('bit_%d' % i) for i in range(n)] 189 | 190 | val = BitVecVal(init_val, 32) 191 | for i in range(n): 192 | val = If(inputs[i], val + constants[i], val - constants[i]) 193 | 194 | s = Solver() 195 | s.add(val == BitVecVal(target_val, 32)) 196 | 197 | if s.check() == sat: 198 | print('solved') 199 | m = s.model() 200 | solution = 0 201 | for i in range(n): 202 | bit = m.evaluate(inputs[i]) 203 | if bit: 204 | solution |= (1 << i) 205 | print(solution) 206 | else: 207 | print('failed') 208 | ``` 209 | 210 | It works but it is a little bit slow. 
It took about 5 minutes to solve it, IIRC. The solution I get is: 211 | 212 | ``` 213 | $ python z3_solve.py 214 | solved 215 | 2371132652 216 | ``` 217 | 218 | And it works: 219 | 220 | ``` 221 | $ ./x86 222 | 2371132652 223 | Well done! 224 | ``` 225 | 226 | Interestingly, the solution found by Z3 is different from the seed I used to generate the challenge, which is `1804139300`. But this is not surprising since there could exist other solutions than the original one. And I did not do anything to enforce the uniqueness of the solution. 227 | 228 | 229 | ## Solving it with Divide-and-Conquer 230 | 231 | Z3 is good enough. However, there is another way to solve it. We can use divide-and-conquer to accelerate the brute-force. We first try every combination of the first 16 bits, which makes up 2 ^ 16 = 65536 possibilities, and take note of the values we get. After that, we do the same thing for the latter 16 bits. Now we compare the two sets and check whether a value from the first half and a value from the second half combine to hit the target. This allows us to find solutions much faster than trying all 2 ^ 32 inputs. Also, this can help us find ALL the solutions to this challenge. 232 | 233 | I am too lazy to do it by myself. I will leave it for interested readers!
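
For readers who want a starting point, here is a minimal sketch of the divide-and-conquer (meet-in-the-middle) idea described above. It is not part of the original challenge files: the helper names `apply_bits`, `half_sums` and `solve_all` are made up for this sketch, and the constants are the ones listed in `code.h`. It assumes the same convention as `main.c`: bit `i` set means add the i-th constant, bit `i` clear means subtract it, all modulo 2^32.

```python
from collections import defaultdict

MASK = 0xffffffff

# values copied from code.h
INIT_VAL = 0x3df2f794
TARGET_VAL = 0x7a612770
CONSTANTS = [
    0x52ae22f2, 0xbf409bcc, 0x46417dc1, 0x25f7d9a1,
    0xef83a7ce, 0x2dd63e8e, 0x584a1ec5, 0x8e58e1df,
    0xf2705f70, 0x2e94ef1e, 0x3ca9e080, 0xa617b5df,
    0x29ae9c3d, 0x7461ed52, 0x7125faac, 0x65dfffd6,
    0x97f1f41c, 0x6f4e0648, 0xd803e5d0, 0xf358f0eb,
    0xbc3b30c7, 0x585685f8, 0x2a9cc47c, 0x7f03d175,
    0xc1d942ae, 0x174c7d4f, 0xb7d004f0, 0xbec8b077,
    0x8ce8eaa2, 0x2510e330, 0x4aed0eee, 0x4043cd91,
]


def apply_bits(init_val, constants, bits):
    """Replay the add/sub chain for one input value (bit set -> add)."""
    val = init_val
    for i, c in enumerate(constants):
        val = (val + c if (bits >> i) & 1 else val - c) & MASK
    return val


def half_sums(constants):
    """Map each reachable signed sum (mod 2**32) to the bit patterns giving it."""
    sums = defaultdict(list)
    for bits in range(1 << len(constants)):
        total = 0
        for i, c in enumerate(constants):
            total += c if (bits >> i) & 1 else -c
        sums[total & MASK].append(bits)
    return sums


def solve_all(init_val, constants, target_val):
    """Return every bit pattern for which the chain ends at target_val."""
    half = len(constants) // 2
    low = half_sums(constants[:half])
    high = half_sums(constants[half:])
    solutions = []
    for low_sum, low_patterns in low.items():
        # the upper half must contribute exactly what is still missing
        need = (target_val - init_val - low_sum) & MASK
        for high_bits in high.get(need, []):
            for low_bits in low_patterns:
                solutions.append(low_bits | (high_bits << half))
    return sorted(solutions)


if __name__ == '__main__':
    for sol in solve_all(INIT_VAL, CONSTANTS, TARGET_VAL):
        print(sol)
```

Each half only needs 2^16 evaluations and one dictionary lookup per value, so the whole search is a matter of seconds in plain Python. Every printed number can be double-checked with `apply_bits` or by feeding it to `./x86`.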
-------------------------------------------------------------------------------- /x86/imgs/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/x86/imgs/1.png -------------------------------------------------------------------------------- /x86/source/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | $(CC) -O2 -o x86 main.c 3 | strip x86 4 | 5 | debug: 6 | $(CC) -g -O2 -o x86 main.c 7 | 8 | clean: 9 | rm x86 -------------------------------------------------------------------------------- /x86/source/code.h: -------------------------------------------------------------------------------- 1 | {0xb8, 0x3df2f794}, 2 | {0xf3, 0x52ae22f2}, 3 | {0x4f, 0xbf409bcc}, 4 | {0xdd, 0x46417dc1}, 5 | {0xd3, 0x25f7d9a1}, 6 | {0x21, 0xef83a7ce}, 7 | {0x41, 0x2dd63e8e}, 8 | {0xc7, 0x584a1ec5}, 9 | {0x8b, 0x8e58e1df}, 10 | {0x25, 0xf2705f70}, 11 | {0x6c, 0x2e94ef1e}, 12 | {0xac, 0x3ca9e080}, 13 | {0x98, 0xa617b5df}, 14 | {0xf5, 0x29ae9c3d}, 15 | {0x56, 0x7461ed52}, 16 | {0xed, 0x7125faac}, 17 | {0x31, 0x65dfffd6}, 18 | {0x41, 0x97f1f41c}, 19 | {0x7f, 0x6f4e0648}, 20 | {0xec, 0xd803e5d0}, 21 | {0xe, 0xf358f0eb}, 22 | {0xae, 0xbc3b30c7}, 23 | {0xd, 0x585685f8}, 24 | {0xa5, 0x2a9cc47c}, 25 | {0x15, 0x7f03d175}, 26 | {0xe8, 0xc1d942ae}, 27 | {0x5b, 0x174c7d4f}, 28 | {0xbd, 0xb7d004f0}, 29 | {0x71, 0xbec8b077}, 30 | {0x47, 0x8ce8eaa2}, 31 | {0xd7, 0x2510e330}, 32 | {0xf7, 0x4aed0eee}, 33 | {0xff, 0x4043cd91}, 34 | {0x3d, 0x7a612770}, -------------------------------------------------------------------------------- /x86/source/generator.py: -------------------------------------------------------------------------------- 1 | import random 2 | import os 3 | 4 | rounds = 32 5 | MAXINT = 0xffffffff 6 | 7 | output = open('code.h', 'w') 8 | 9 | val = random.randint(0, MAXINT) 10 | # mov eax, val 11 | 
output.write('{0xb8, 0x%x},\n' % val)
ans = 0

for i in range(rounds):
    op = random.randint(0, 1)
    round_val = random.randint(0, MAXINT)
    ans |= (op << i)
    if op == 0:
        val -= round_val
    else:
        val += round_val

    val &= MAXINT

    junk_opcode = random.randint(0, 0xff)
    output.write('{0x%x, 0x%x},\n' % (junk_opcode, round_val))

# cmp eax, val
output.write('{0x3d, 0x%x},' % val)
output.close()

print('the answer is: %d' % ans)
os.system('make')

# 1804139300

--------------------------------------------------------------------------------
/x86/source/main.c:
--------------------------------------------------------------------------------
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/mman.h>

#pragma pack(1)
#define N 32

typedef struct
{
    unsigned char opCode;
    uint32_t operand;
}instr;

instr code[] __attribute__ ((section (".x86"))) = {
    #include "code.h"
    {0x0f, 0x9090d094},
    // 00201043  0f94d0  sete al  {0x1}
    // 00201046  90      nop
    // 00201047  90      nop
    {0xc3, 0}
    // 00201048  c3      retn  {__return_addr}
};

int main()
{
    // read the input
    int input = 0;
    int unused = scanf("%d", &input);
    // modify the code according to the user input
    for(int i = 0; i < N; i ++)
    {
        bool bit = input & 1;
        input >>= 1;
        if (bit)
        {
            // add eax, imm32
            code[i + 1].opCode = 0x05;
        }
        else
        {
            // sub eax, imm32
            code[i + 1].opCode = 0x2d;
        }
    }
    // set page to executable
    void *page =
        (void *) ((unsigned long) (&code) &
                ~(getpagesize() - 1));
    mprotect(page, getpagesize(), PROT_READ | PROT_WRITE | PROT_EXEC);

    // call the code and check result
    bool (*func_ptr)() = (void*)&code;
    if (func_ptr())
    {
        printf("Well done!\n");
    }
    else
    {
        printf("Try again!\n");
    }
}

--------------------------------------------------------------------------------
/x86/x86:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jeffli678/writeups/2bfd17d87af0b4e356b94d28c5f66fe91f529701/x86/x86

--------------------------------------------------------------------------------
/x86/z3_solve.py:
--------------------------------------------------------------------------------
from z3 import *

# extracted from the challenge binary
init_val = 0x3df2f794
target_val = 0x7a612770
constants = [
    0x52ae22f2,
    0xbf409bcc,
    0x46417dc1,
    0x25f7d9a1,
    0xef83a7ce,
    0x2dd63e8e,
    0x584a1ec5,
    0x8e58e1df,
    0xf2705f70,
    0x2e94ef1e,
    0x3ca9e080,
    0xa617b5df,
    0x29ae9c3d,
    0x7461ed52,
    0x7125faac,
    0x65dfffd6,
    0x97f1f41c,
    0x6f4e0648,
    0xd803e5d0,
    0xf358f0eb,
    0xbc3b30c7,
    0x585685f8,
    0x2a9cc47c,
    0x7f03d175,
    0xc1d942ae,
    0x174c7d4f,
    0xb7d004f0,
    0xbec8b077,
    0x8ce8eaa2,
    0x2510e330,
    0x4aed0eee,
    0x4043cd91
]

# solver script
n = 32
# one boolean per round: True selects add, False selects sub
inputs = [Bool('bit_%d' % i) for i in range(n)]

val = BitVecVal(init_val, 32)
for i in range(n):
    val = If(inputs[i], val + constants[i], val - constants[i])

s = Solver()
s.add(val == BitVecVal(target_val, 32))

if s.check() == sat:
    print('solved')
    m = s.model()
    solution = 0
    for i in range(n):
        # model_completion gives a concrete value even for unconstrained bits
        bit = m.evaluate(inputs[i], model_completion=True)
        if is_true(bit):
            solution |= (1 << i)
    print(solution)
else:
    print('failed')

# $ python z3_solve.py
# solved
# 2371132652

# $ ./x86
# 2371132652
# Well done!

--------------------------------------------------------------------------------