├── .gitignore
├── Makefile
├── README.md
├── final
    ├── Makefile
    ├── hello.asm
    └── hello.out
├── step0
    ├── Makefile
    └── hello.c
├── step1
    ├── Makefile
    └── hello.c
├── step2
    ├── Makefile
    └── hello.c
├── step3
    ├── Makefile
    └── hello.c
├── step4
    ├── Makefile
    └── hello.c
├── step5
    ├── Makefile
    ├── hello.c
    └── link.lds
├── step6
    ├── Makefile
    ├── hello.asm
    └── link.lds
└── step7
    ├── Makefile
    └── hello.asm


/.gitignore:
--------------------------------------------------------------------------------
.DS_Store
*.out
*.o
!final/hello.out

--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
SHELL = /bin/bash

all := step0,step1,step2,step3,step4,step5,step6,step7,final

build:
	@for dir in {$(all)}; do \
		(cd $$dir && make build); \
	done
.PHONY: build

clean:
	@for dir in {$(all)}; do \
		rm $$dir/hello.out; \
	done
.PHONY: clean

list:
	@ls -U -l {$(all)}/hello.out
.PHONY: list

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Tiny x64 Hello World

A step-by-step adventure to find out how small an x64 binary that prints "hello, world" can be.

- OS: CentOS 7, Linux 3.10.0-862.el7.x86_64
- GCC: gcc (Homebrew GCC 5.5.0_7) 5.5.0

## Overview

- `make build`: build all steps
- `make list`: list the binary size of every step
- `final/hello.out` is our final result, and it's 170 bytes 🎉

```
$ make build && make list
-rwxr-xr-x 1 root root 16712 Dec  4 00:08 step0/hello.out
-rwxr-xr-x 1 root root 14512 Dec  4 00:08 step1/hello.out
-rwxr-xr-x 1 root root 14512 Dec  4 00:08 step2/hello.out
-rwxr-xr-x 1 root root 13664 Dec  4 00:08 step3/hello.out
-rwxr-xr-x 1 root root 12912 Dec  4 00:08 step4/hello.out
-rwxr-xr-x 1 root root   584 Dec  4 00:08 step5/hello.out
-rwxr-xr-x 1 root root   440 Dec  4 00:08 step6/hello.out
-rwxr-xr-x 1 root root   170 Dec  4 00:08 step7/hello.out
-rwxr-xr-x 1 root root   170 Dec  4 00:08 final/hello.out
```

## Step0

This is our first try, the good old program that prints "hello, world".

```c
#include <stdio.h>

int
main()
{
  printf("hello, world\n");
  return 0;
}
```

Run `cd step0 && make build` and we get our first `hello.out` binary.

Unfortunately, it's too big: 16712 bytes!

## Step1: Strip Symbols

Let's start with an easy move: strip all the symbols.

Let `gcc -s` do the work, and we get our new binary: 14512 bytes.

Still big, but hey, we are certainly making progress, no need to hurry 😉.

## Step2: Optimization

Modern compilers can do a lot of "magic" to optimize our program, so let's give it a try.

`gcc -O3` enables aggressive optimization, but we find out that our binary size stays the same 😢, 14512 bytes.

It actually makes sense though. Our program is too simple; there isn't any room left to optimize.

## Step3: Remove Startup Files

Our C program always starts with `main`, but have you ever wondered who calls `main`?

It turns out that `main` is called by something called `crt`, the C runtime startup code that ships with the toolchain and is linked in by default.

If we remove it, our binary must be smaller, right?

We need to change our program a little bit.

- Let's rename our entry function to `nomain` to make it more obvious that we don't use crt
- Since we don't use crt, we need to explicitly use a system call to exit

```c
#include <stdio.h>
#include <unistd.h>

int
nomain()
{
  printf("hello, world\n");
  _exit(0);
}
```

You may wonder why we are using `_exit` rather than `exit`. Good old StackOverflow always helps, check [What is the difference between using \_exit() & exit() in a conventional Linux fork-exec?](https://stackoverflow.com/questions/5422831/what-is-the-difference-between-using-exit-exit-in-a-conventional-linux-fo).

Use `gcc -e nomain -nostartfiles` to compile our program and now our binary is 13664 bytes.

We are making progress again!

## Step4: Remove Standard Library

We can go even wilder. We don't need the crt to do the startup, so why do we need `printf` to print? We can certainly do it on our own!

To print something to the terminal, we need to use the `write` system call. [Here](https://github.com/torvalds/linux/blob/v3.13/arch/x86/syscalls/syscall_64.tbl) is the full x64 system call table.

To invoke a system call directly in C, we need to use [inline assembly](https://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html).

```c
char *str = "hello, world\n";

// write(1, str, 13)
void
myprint()
{
  asm("movq $1, %%rax \n"   // syscall number 1: write
      "movq $1, %%rdi \n"   // fd 1: stdout
      "movq %0, %%rsi \n"   // buffer address
      "movq $13, %%rdx \n"  // length of "hello, world\n"
      "syscall \n"
      : // no output
      : "r"(str)
      : "rax", "rdi", "rsi", "rdx");
}

// exit(0)
void
myexit()
{
  asm("movq $60, %rax \n"   // syscall number 60: exit
      "xor %rdi, %rdi \n"   // status 0
      "syscall \n");
}

int
nomain()
{
  myprint();
  myexit();
}
```

Looks kind of messy, but it's actually very simple code.

Use `gcc -nostdlib` to tell GCC that we don't want the standard library, since we are cool enough to do all the things ourselves.

And we get 12912 bytes.

Our program doesn't depend on anything now, but it's still very big. Why?

## Step5: Custom Linker Script

Let's examine the sections of our binary.

```
$ readelf -S -W step4/hello.out
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        0000000000401000 001000 00006e 00  AX  0   0 16
  [ 2] .rodata           PROGBITS        0000000000402000 002000 00000e 01 AMS  0   0  1
  [ 3] .eh_frame_hdr     PROGBITS        0000000000402010 002010 000024 00   A  0   0  4
  [ 4] .eh_frame         PROGBITS        0000000000402038 002038 000054 00   A  0   0  8
  [ 5] .data             PROGBITS        0000000000404000 003000 000008 00  WA  0   0  8
  [ 6] .comment          PROGBITS        0000000000000000 003008 000022 01  MS  0   0  1
  [ 7] .shstrtab         STRTAB          0000000000000000 00302a 000040 00      0   0  1
```

The `Off` column looks kind of strange: some sections start at very large offsets.

Maybe the linker does some alignment? Check `ld --verbose`, and yes, it does!

So our binary is big because of alignment. If we use `xxd` to look at the binary content, we can see that there are a lot of zeroes.

Time to write our own linker script `link.lds`.

```
ENTRY(nomain)

SECTIONS
{
  . = 0x8048000 + SIZEOF_HEADERS;

  tiny : { *(.text) *(.data) *(.rodata*) }

  /DISCARD/ : { *(*) }
}
```

Use `gcc -T link.lds` and we get 584 bytes, a huge step 🔥.

## Step6: Assembly

Can we do better? There is nothing more we can do inside the C world, so it's time to move to a lower level.

Let's write some assembly code! It sounds terrifying, but just give it a try and you will find it's actually very interesting.

We are the gods of the computer, we can control everything!

```nasm
section .data
message: db "hello, world", 0xa

section .text

global nomain
nomain:
  mov rax, 1        ; syscall number 1: write
  mov rdi, 1        ; fd 1: stdout
  mov rsi, message  ; buffer address
  mov rdx, 13       ; length
  syscall
  mov rax, 60       ; syscall number 60: exit
  xor rdi, rdi      ; status 0
  syscall
```

Use `nasm -f elf64` to assemble our code, link it with `ld -T link.lds`, run `strip`, and we get 440 bytes.

## Step7: Handmade Binary

Is there anything we can do now? We are at the lowest level, there is no "lower level" for us to go to.

There is no room left to shrink our code, but the binary that runs on the OS is not just code. It is a file in a format called ELF, and it contains some extra info.

So maybe we can do something to shrink that extra info?

Or maybe we can write the ELF from scratch? This way, we can control every bit of our binary.

```nasm
BITS 64
  org 0x400000

ehdr:           ; Elf64_Ehdr
  db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
  times 8 db 0
  dw  2         ; e_type
  dw  0x3e      ; e_machine
  dd  1         ; e_version
  dq  _start    ; e_entry
  dq  phdr - $$ ; e_phoff
  dq  0         ; e_shoff
  dd  0         ; e_flags
  dw  ehdrsize  ; e_ehsize
  dw  phdrsize  ; e_phentsize
  dw  1         ; e_phnum
  dw  0         ; e_shentsize
  dw  0         ; e_shnum
  dw  0         ; e_shstrndx
  ehdrsize  equ  $ - ehdr

phdr:           ; Elf64_Phdr
  dd  1         ; p_type
  dd  5         ; p_flags
  dq  0         ; p_offset
  dq  $$        ; p_vaddr
  dq  $$        ; p_paddr
  dq  filesize  ; p_filesz
  dq  filesize  ; p_memsz
  dq  0x1000    ; p_align
  phdrsize  equ  $ - phdr

_start:
  mov rax, 1
  mov rdi, 1
  mov rsi, message
  mov rdx, 13
  syscall
  mov rax, 60
  xor rdi, rdi
  syscall

message: db "hello, world", 0xa

filesize  equ  $ - $$
```

Use `nasm -f bin` to bake our binary, and our final result is 170 bytes.

## Final Binary Anatomy

And now we reach our final limit, 170 bytes. There is no way to reduce that any more.

PS: Actually, there is, check the post [A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux](http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html). I am not going to use the techniques from that post, because they are too hacky.

Now let's see what exactly every byte does in our 170-byte final binary.

```elixir
# ELF Header
00:   7f 45 4c 46 02 01 01 00 # e_ident
08:   00 00 00 00 00 00 00 00 # reserved
10:   02 00                   # e_type
12:   3e 00                   # e_machine
14:   01 00 00 00             # e_version
18:   78 00 40 00 00 00 00 00 # e_entry
20:   40 00 00 00 00 00 00 00 # e_phoff
28:   00 00 00 00 00 00 00 00 # e_shoff
30:   00 00 00 00             # e_flags
34:   40 00                   # e_ehsize
36:   38 00                   # e_phentsize
38:   01 00                   # e_phnum
3a:   00 00                   # e_shentsize
3c:   00 00                   # e_shnum
3e:   00 00                   # e_shstrndx

# Program Header
40:   01 00 00 00             # p_type
44:   05 00 00 00             # p_flags
48:   00 00 00 00 00 00 00 00 # p_offset
50:   00 00 40 00 00 00 00 00 # p_vaddr
58:   00 00 40 00 00 00 00 00 # p_paddr
60:   aa 00 00 00 00 00 00 00 # p_filesz
68:   aa 00 00 00 00 00 00 00 # p_memsz
70:   00 10 00 00 00 00 00 00 # p_align

# Code
78:   b8 01 00 00 00                # mov    $0x1,%eax
7d:   bf 01 00 00 00                # mov    $0x1,%edi
82:   48 be 9d 00 40 00 00 00 00 00 # movabs $0x40009d,%rsi
8c:   ba 0d 00 00 00                # mov    $0xd,%edx
91:   0f 05                         # syscall
93:   b8 3c 00 00 00                # mov    $0x3c,%eax
98:   48 31 ff                      # xor    %rdi,%rdi
9b:   0f 05                         # syscall
9d:   68 65 6c 6c 6f 2c 20 77 6f 72 6c 64 0a # "hello, world\n"
```

--------------------------------------------------------------------------------
/final/Makefile:
--------------------------------------------------------------------------------
build:
	nasm -f bin hello.asm -o hello.out
	chmod +x hello.out
.PHONY: build

--------------------------------------------------------------------------------
/final/hello.asm:
--------------------------------------------------------------------------------
BITS 64
  org 0x400000

ehdr:           ; Elf64_Ehdr
  db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
  times 8 db 0
  dw  2         ; e_type
  dw  0x3e      ; e_machine
  dd  1         ; e_version
  dq  _start    ; e_entry
  dq  phdr - $$
                ; e_phoff
  dq  0         ; e_shoff
  dd  0         ; e_flags
  dw  ehdrsize  ; e_ehsize
  dw  phdrsize  ; e_phentsize
  dw  1         ; e_phnum
  dw  0         ; e_shentsize
  dw  0         ; e_shnum
  dw  0         ; e_shstrndx
  ehdrsize  equ  $ - ehdr

phdr:           ; Elf64_Phdr
  dd  1         ; p_type
  dd  5         ; p_flags
  dq  0         ; p_offset
  dq  $$        ; p_vaddr
  dq  $$        ; p_paddr
  dq  filesize  ; p_filesz
  dq  filesize  ; p_memsz
  dq  0x1000    ; p_align
  phdrsize  equ  $ - phdr

_start:
  mov rax, 1
  mov rdi, 1
  mov rsi, message
  mov rdx, 13
  syscall
  mov rax, 60
  xor rdi, rdi
  syscall

message: db "hello, world", 0xa

filesize  equ  $ - $$

--------------------------------------------------------------------------------
/final/hello.out:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cj1128/tiny-x64-helloworld/ba9f63c6396d50fe61f16387061c16b542d8d5ef/final/hello.out

--------------------------------------------------------------------------------
/step0/Makefile:
--------------------------------------------------------------------------------
build:
	gcc hello.c -o hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step0/hello.c:
--------------------------------------------------------------------------------
#include <stdio.h>

int
main()
{
  printf("hello, world\n");
  return 0;
}

--------------------------------------------------------------------------------
/step1/Makefile:
--------------------------------------------------------------------------------
build:
	gcc hello.c -s -o hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step1/hello.c:
--------------------------------------------------------------------------------
#include <stdio.h>

int
main()
{
  printf("hello, world\n");
  return 0;
}

--------------------------------------------------------------------------------
/step2/Makefile:
--------------------------------------------------------------------------------
build:
	gcc hello.c -s -O3 -o hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step2/hello.c:
--------------------------------------------------------------------------------
#include <stdio.h>

int
main()
{
  printf("hello, world\n");
  return 0;
}

--------------------------------------------------------------------------------
/step3/Makefile:
--------------------------------------------------------------------------------
build:
	gcc hello.c -s -O3 -e nomain -nostartfiles -o hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step3/hello.c:
--------------------------------------------------------------------------------
#include <stdio.h>
#include <unistd.h>

int
nomain()
{
  printf("hello, world\n");
  _exit(0);
}

--------------------------------------------------------------------------------
/step4/Makefile:
--------------------------------------------------------------------------------
build:
	gcc hello.c -s -O3 -e nomain -nostartfiles -nostdlib -o hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step4/hello.c:
--------------------------------------------------------------------------------
char *str = "hello, world\n";

void
myprint()
{
  asm("movq $1, %%rax \n"
      "movq $1, %%rdi \n"
      "movq %0, %%rsi \n"
      "movq $13, %%rdx \n"
      "syscall \n"
      : // no output
      : "r"(str)
      : "rax", "rdi", "rsi", "rdx");
}

void
myexit()
{
  asm("movq $60, %rax \n"
      "xor %rdi, %rdi \n"
      "syscall \n");
}

int
nomain()
{
  myprint();
  myexit();
}

--------------------------------------------------------------------------------
/step5/Makefile:
--------------------------------------------------------------------------------
build:
	gcc hello.c -s -O3 -e nomain -nostartfiles -nostdlib -T link.lds -o hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step5/hello.c:
--------------------------------------------------------------------------------
char *str = "hello, world\n";

void
myprint()
{
  asm("movq $1, %%rax \n"
      "movq $1, %%rdi \n"
      "movq %0, %%rsi \n"
      "movq $13, %%rdx \n"
      "syscall \n"
      : // no output
      : "r"(str)
      : "rax", "rdi", "rsi", "rdx");
}

void
myexit()
{
  asm("movq $60, %rax \n"
      "xor %rdi, %rdi \n"
      "syscall \n");
}

int
nomain()
{
  myprint();
  myexit();
}

--------------------------------------------------------------------------------
/step5/link.lds:
--------------------------------------------------------------------------------
ENTRY(nomain)

SECTIONS
{
  .
    = 0x8048000 + SIZEOF_HEADERS;

  tiny : { *(.text) *(.data) *(.rodata*) }

  /DISCARD/ : { *(*) }
}

--------------------------------------------------------------------------------
/step6/Makefile:
--------------------------------------------------------------------------------
build:
	nasm -f elf64 hello.asm
	ld hello.o -T link.lds -o hello.out
	strip hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step6/hello.asm:
--------------------------------------------------------------------------------
section .data
message: db "hello, world", 0xa

section .text

global nomain
nomain:
  mov rax, 1
  mov rdi, 1
  mov rsi, message
  mov rdx, 13
  syscall
  mov rax, 60
  xor rdi, rdi
  syscall

--------------------------------------------------------------------------------
/step6/link.lds:
--------------------------------------------------------------------------------
ENTRY(nomain)

SECTIONS
{
  .
    = 0x8048000 + SIZEOF_HEADERS;

  tiny : { *(.text) *(.data) *(.rodata*) }

  /DISCARD/ : { *(*) }
}

--------------------------------------------------------------------------------
/step7/Makefile:
--------------------------------------------------------------------------------
build:
	nasm -f bin hello.asm -o hello.out
	chmod +x hello.out
.PHONY: build

--------------------------------------------------------------------------------
/step7/hello.asm:
--------------------------------------------------------------------------------
BITS 64
  org 0x400000

ehdr:           ; Elf64_Ehdr
  db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
  times 8 db 0
  dw  2         ; e_type
  dw  0x3e      ; e_machine
  dd  1         ; e_version
  dq  _start    ; e_entry
  dq  phdr - $$ ; e_phoff
  dq  0         ; e_shoff
  dd  0         ; e_flags
  dw  ehdrsize  ; e_ehsize
  dw  phdrsize  ; e_phentsize
  dw  1         ; e_phnum
  dw  0         ; e_shentsize
  dw  0         ; e_shnum
  dw  0         ; e_shstrndx
  ehdrsize  equ  $ - ehdr

phdr:           ; Elf64_Phdr
  dd  1         ; p_type
  dd  5         ; p_flags
  dq  0         ; p_offset
  dq  $$        ; p_vaddr
  dq  $$        ; p_paddr
  dq  filesize  ; p_filesz
  dq  filesize  ; p_memsz
  dq  0x1000    ; p_align
  phdrsize  equ  $ - phdr

_start:
  mov rax, 1
  mov rdi, 1
  mov rsi, message
  mov rdx, 13
  syscall
  mov rax, 60
  xor rdi, rdi
  syscall

message: db "hello, world", 0xa

filesize  equ  $ - $$

--------------------------------------------------------------------------------