├── LICENSE ├── README.md └── x64.h /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Martin Cohen 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # x64 2 | 3 | x64 assembler library in C. 4 | 5 | # About 6 | 7 | The goal is to be able to comfortably generate x64 native code to memory and then run it. Useful for runtime code generation, compilers, and code optimization. 8 | 9 | Made as part of learning how x64 encoding works. The library is currently set up to produce the same code as `ML64`, since that's the assembler used for generating tests. `GCC` prefers slightly different variants of some byte encodings. 
I'm trying to keep track of these so I will be able to generate tests from `GCC` too. 10 | 11 | # x64 instructions 12 | 13 | - `mov` 14 | - `add` 15 | - `sub` 16 | - `and` 17 | - `or` 18 | - `xor` 19 | - `push` 20 | - `pop` 21 | - `ret` 22 | 23 | Features: 24 | 25 | - All `reg/mem`, `mem/reg`, `mem/i` and `reg/i` supported. 26 | - `REX` prefix generated only when needed. 27 | - Absolute addressing mode supported. 28 | - Relative addressing mode supported. 29 | - `RBP` indexing supported. 30 | - GCC-style `RSP` indexing supported (as long as scale is 1). 31 | - Always choosing the shortest byte sequence as long as the result is the same. (This will need some more testing, though.) 32 | 33 | Missing: 34 | 35 | - Opcode variants that work only with `RAX` (the `reg/*` forms) are not supported yet. 36 | 37 | # Usage 38 | 39 | **Still under development with some major changes pending.** Released early for people to use should they need it. The library is developed in my private repository and each major change is pushed here. I'll eventually move further development to this repo. 40 | 41 | ```c 42 | // Binary 43 | X64Inst x64_mov(X64Size size, X64Operand D, X64Operand S); 44 | X64Inst x64_add(X64Size size, X64Operand D, X64Operand S); 45 | X64Inst x64_sub(X64Size size, X64Operand D, X64Operand S); 46 | X64Inst x64_and(X64Size size, X64Operand D, X64Operand S); 47 | X64Inst x64_or (X64Size size, X64Operand D, X64Operand S); 48 | X64Inst x64_xor(X64Size size, X64Operand D, X64Operand S); 49 | 50 | // Unary 51 | X64Inst x64_pop(X64Operand S); 52 | X64Inst x64_push(X64Operand S); 53 | 54 | // Nullary 55 | X64Inst x64_ret(); 56 | ``` 57 | 58 | Functions return an `X64Inst` structure, which is just a static buffer with `.bytes` and `.count`. The structure also contains an `.error` string, which is set in case an error occurred during processing. This behavior will eventually change to a proper buffered writer and custom error handler. 
59 | 60 | `X64Size size` denotes the intended operand size of the operation. In case the size and operands cannot be satisfied for the given instruction, an error is returned via `X64Inst.error`. 61 | 62 | ```c 63 | enum X64Size { 64 | X64_S8, 65 | X64_S16, 66 | X64_S32, 67 | X64_S64, 68 | }; 69 | ``` 70 | 71 | `X64Operand` can be constructed either directly or via three helper functions: 72 | 73 | - Register operand: 74 | - `X64Operand x64r(X64Reg reg)` 75 | - Memory expression operand: 76 | - `X64Operand x64m(X64Reg base, X64Reg index, X64Scale scale, uint64_t displacement)` 77 | - Immediate (constant) operand: 78 | - `X64Operand x64i(uint64_t imm)` 79 | 80 | # TODO 81 | 82 | - Obviously more instructions. 83 | - Priority on instructions that map to C-like language expressions (arithmetic, calls). 84 | - Floating point (via SSE+). 85 | - Vector operations. 86 | - Some API changes regarding how `size` argument is used. 87 | - Use proper buffer writer with user callback. 88 | - Use user callback for error handling. 89 | 90 | # Testing 91 | 92 | There's a test generator for all of the instructions with all happy-path possibilities. Additionally I'm adding tests for error cases. The tests are not yet part of the release, but will be pushed out soon. 93 | 94 | The tests work as follows: 95 | 96 | 1. First we generate all permutations of all valid arguments. 97 | 2. Then we generate an assembly file with the corresponding `ML64` notation. 98 | 3. The assembly file is passed through the `ML64` assembler, which produces an `OBJ` file. 99 | 4. The `OBJ` file is disassembled with `DUMPBIN` and we extract the bytes for each instruction. 100 | 5. We generate a C source file with each permutation mapped to its expected result bytes from `DUMPBIN`. 101 | 6. We compare the result from the library to the result from `ML64`. 
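As an illustration of steps 5 and 6, the generated C file might take roughly this shape. This is a hypothetical sketch, not the actual generator output; the expected byte sequences are the standard ML64 encodings of the two sample instructions:

```c
#include <stdint.h>
#include <string.h>

// Hypothetical shape of one generated test case: the ML64 notation that
// produced it, plus the bytes DUMPBIN reported for that instruction.
typedef struct TestCase {
    const char* asm_text;   // ML64 source line
    uint8_t expected[16];   // bytes extracted from DUMPBIN output
    uint8_t expected_count;
} TestCase;

static const TestCase cases[] = {
    { "mov rax, rcx", { 0x48, 0x8B, 0xC1 }, 3 }, // REX.W + 8B /r
    { "push r8",      { 0x41, 0x50 },       2 }, // REX.B + 50+r
};

// Step 6: compare the library's output (a bytes/count pair, as in X64Inst)
// against the expected ML64 bytes.
static int test_case_matches(const TestCase* tc,
                             const uint8_t* bytes, uint8_t count) {
    return count == tc->expected_count
        && memcmp(bytes, tc->expected, count) == 0;
}
```

The real generator emits thousands of such entries per instruction (see the coverage table below).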
102 | 103 | Test coverage per instruction: 104 | 105 | ``` 106 | mov 7510 tests 107 | sub 7508 tests 108 | add 7508 tests 109 | and 7508 tests 110 | or 7508 tests 111 | xor 7508 tests 112 | pop 442 tests 113 | push 445 tests 114 | ``` 115 | 116 | # Links 117 | 118 | - [x64 encoding writeup](https://github.com/martincohen/Wiki/wiki/x64) 119 | - Not comprehensive yet, but can help with additional instructions should one need them. 120 | - [Development streams](https://twitch.tv/martincohen) 121 | - Occasional streams. 122 | - [Development streams archive on YouTube](https://www.youtube.com/playlist?list=PLPdqby1EYYdUJw27y0LpIffko8EhP6ICs) 123 | - Kept in sync with archive on Twitch. 124 | -------------------------------------------------------------------------------- /x64.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | // --- 4 | // Author: Martin 'Halt' Cohen, @martin_cohen 5 | // License: MIT (see LICENSE) 6 | // --- 7 | 8 | #include <stdint.h> 9 | #include <stdbool.h> 10 | 11 | // NOTE: To use these, you'll have to include appropriate headers yourself. 12 | 13 | #ifndef X64_ERROR 14 | #define X64_ERROR(Message) { printf(Message); abort(); } 15 | #endif 16 | 17 | #ifndef X64_ASSERT 18 | #define X64_ASSERT assert 19 | #endif 20 | 21 | #ifndef X64_ASSERT_DEBUG 22 | #define X64_ASSERT_DEBUG assert 23 | #endif 24 | 25 | #define X64_TODO X64_ERROR("todo") 26 | 27 | typedef enum X64Reg { 28 | X64_None = 0, 29 | 30 | X64_RAX, // 000 31 | X64_RCX, // 001 32 | X64_RDX, // 010 33 | X64_RBX, // 011 34 | X64_RSP, // 100 35 | X64_RBP, // 101 36 | X64_RSI, // 110 37 | X64_RDI, // 111 38 | 39 | X64_R8, // 1.000 40 | X64_R9, // 1.001 41 | X64_R10, // 1.010 42 | X64_R11, // 1.011 43 | X64_R12, // 1.100 44 | X64_R13, // 1.101 45 | X64_R14, // 1.110 46 | X64_R15, // 1.111 47 | 48 | X64Reg_LAST__, 49 | 50 | // Special-case register, used for RIP-based addressing mode with Mod R/M. 
51 | X64_RIP, 52 | } X64Reg; 53 | 54 | #define x64reg_check(Reg) \ 55 | X64_ASSERT_DEBUG((Reg) > X64_None && (Reg) < X64Reg_LAST__) 56 | 57 | #define x64reg_is_int_ext(Reg) \ 58 | ((Reg) >= X64_R8 && (Reg) <= X64_R15) 59 | // 60 | // 61 | // 62 | 63 | typedef enum X64Size { 64 | X64_SDefault, 65 | X64_S8, 66 | X64_S16, 67 | X64_S32, 68 | X64_S64, 69 | } X64Size; 70 | 71 | typedef enum X64Scale { 72 | X64_X1 = 0b00, 73 | X64_X2 = 0b01, 74 | X64_X4 = 0b10, 75 | X64_X8 = 0b11, 76 | } X64Scale; 77 | 78 | typedef enum X64OperandKind { 79 | X64O_Reg, 80 | X64O_Mem, 81 | X64O_Imm 82 | } X64OperandKind; 83 | 84 | typedef struct X64Operand { 85 | X64OperandKind kind; 86 | union { 87 | X64Reg reg; 88 | struct { 89 | X64Reg base; 90 | X64Reg index; 91 | X64Scale scale; 92 | // TODO: Is this supposed to be signed int? 93 | int32_t displacement; 94 | } mem; 95 | uint64_t imm; 96 | }; 97 | } X64Operand; 98 | 99 | #define x64o_pair(A, B) ((A << 4) | B) 100 | 101 | #define x64r(Reg) \ 102 | (X64Operand) { .kind = X64O_Reg, .reg = Reg } 103 | 104 | #define x64m(Base, Index, Scale, Displacement) \ 105 | (X64Operand) { \ 106 | .kind = X64O_Mem, \ 107 | .mem.base = Base, \ 108 | .mem.index = Index, \ 109 | .mem.scale = Scale, \ 110 | .mem.displacement = Displacement \ 111 | } 112 | 113 | #define x64i(Immediate) \ 114 | (X64Operand) { .kind = X64O_Imm, .imm = (Immediate) } 115 | 116 | #define x64o_swap(A, B) { \ 117 | X64Operand t = A; \ 118 | A = B; \ 119 | B = t; \ 120 | } 121 | 122 | // 123 | // 124 | // 125 | 126 | typedef enum X64ModRMMode { 127 | X64ModRM_Indirect = 0b00, 128 | X64ModRM_IndirectDisp8 = 0b01, 129 | X64ModRM_IndirectDisp32 = 0b10, 130 | X64ModRM_Direct = 0b11, 131 | } X64ModRMMode; 132 | 133 | // 134 | // 135 | // 136 | 137 | // Whenever we have `rm` we encode other register in modrm.reg. 138 | // Otherwise (reg/imm) we encode register in opcode. 139 | 140 | // Using int16_t to be able to denote -1 as "not available" in 141 | // case the opset doesn't support it. 
This is because `0` is 141 | // used with ADD r/m8,r8. 142 | 143 | typedef struct X64OpBinary 144 | { 145 | // Register in modrm.reg 146 | // Opcode must support 16, 32 and 64 operand sizes. 147 | int16_t reg_rm; 148 | // Register in modrm.reg 149 | // Opcode must support 16, 32 and 64 operand sizes. 150 | int16_t rm_reg; 151 | 152 | // Register in modrm.reg. 153 | int16_t rm8_reg8; 154 | // Register in modrm.reg. 155 | // In case of (reg8, reg8) GCC seems to prefer rm8_reg8, 156 | // while ML64 prefers reg8_rm8. 157 | // TODO: Check if there's any difference. 158 | int16_t reg8_rm8; 159 | 160 | // rmX_immX: 161 | // - In case when used for writing to a register: 162 | // - modrm.mode = 11 163 | // - modrm.reg = 0 164 | // - modrm.rm = destination register 165 | 166 | // rmX_immX 167 | // Opcode must support 16, 32 and 64 operand sizes. 168 | int16_t rm_imm8; 169 | // Extends rm_imm8 opcode with modrm.reg field. 170 | // Opcode must support 16, 32 and 64 operand sizes. 171 | int16_t rm_imm8_op; 172 | 173 | // rmX_immX 174 | // Opcode must support 16, 32 and 64 operand sizes. 175 | int16_t rm_imm32; 176 | // Extends rm_imm32 opcode with modrm.reg field. 177 | int16_t rm_imm32_op; 178 | 179 | // rmX_immX 180 | // So far all opcodes have this note: In 64-bit no AH, BH, CH, DH. 181 | int16_t rm8_imm8; 182 | // Extends rm8_imm8 opcode with modrm.reg field. 183 | int16_t rm8_imm8_op; 184 | 185 | // Register in opcode. 186 | int16_t reg8_imm8; 187 | // Register in opcode. 188 | // In 64-bit the imm32 is sign-extended to 64-bit. 189 | // So far all opcodes support 16, 32 and sign-extended 64. 190 | int16_t reg32_imm32; 191 | // Register in opcode. 192 | // So far reg32_imm32 with reg. 193 | // So far only mov has this variant. 
194 | int16_t reg64_imm64; 195 | } X64OpBinary; 196 | 197 | const X64OpBinary X64Op_Mov = 198 | { 199 | .reg_rm = 0x8B, 200 | .rm_reg = 0x89, 201 | .rm8_reg8 = 0x88, 202 | .reg8_rm8 = 0x8A, 203 | .rm_imm8 = -1, 204 | .rm_imm32 = 0xC7, .rm_imm32_op = 0, 205 | .rm8_imm8 = 0xC6, .rm8_imm8_op = 0, 206 | .reg8_imm8 = 0xB0, 207 | .reg32_imm32 = 0xB8, 208 | .reg64_imm64 = 0xB8, 209 | }; 210 | 211 | const X64OpBinary X64Op_Sub = 212 | { 213 | .reg_rm = 0x2B, 214 | .rm_reg = 0x29, 215 | .rm8_reg8 = 0x28, 216 | .reg8_rm8 = 0x2A, 217 | .rm_imm8 = 0x83, .rm_imm8_op = 5, 218 | .rm_imm32 = 0x81, .rm_imm32_op = 5, 219 | .rm8_imm8 = 0x80, .rm8_imm8_op = 5, 220 | .reg8_imm8 = -1, // Not available. 221 | .reg32_imm32 = -1, // Not available. 222 | .reg64_imm64 = -1, // Not available. 223 | // TODO: reg8_imm8 only for AL 224 | // TODO: reg32_imm32 only for AX, EAX, RAX. 225 | }; 226 | 227 | const X64OpBinary X64Op_Add = 228 | { 229 | .reg_rm = 0x03, 230 | .rm_reg = 0x01, 231 | .rm8_reg8 = 0x00, 232 | .reg8_rm8 = 0x02, 233 | .rm_imm8 = 0x83, .rm_imm8_op = 0, 234 | .rm_imm32 = 0x81, .rm_imm32_op = 0, 235 | .rm8_imm8 = 0x80, .rm8_imm8_op = 0, 236 | .reg8_imm8 = -1, 237 | .reg32_imm32 = -1, 238 | .reg64_imm64 = -1, 239 | // TODO: reg8_imm8 only for AL 240 | // TODO: reg32_imm32 only for AX, EAX, RAX. 
241 | }; 242 | 243 | const X64OpBinary X64Op_And = 244 | { 245 | .reg_rm = 0x23, 246 | .rm_reg = 0x21, 247 | .rm8_reg8 = 0x20, 248 | .reg8_rm8 = 0x22, 249 | .rm_imm8 = 0x83, .rm_imm8_op = 4, 250 | .rm_imm32 = 0x81, .rm_imm32_op = 4, 251 | .rm8_imm8 = 0x80, .rm8_imm8_op = 4, 252 | .reg8_imm8 = -1, 253 | .reg32_imm32 = -1, 254 | .reg64_imm64 = -1, 255 | }; 256 | 257 | const X64OpBinary X64Op_Or = 258 | { 259 | .reg_rm = 0x0B, 260 | .rm_reg = 0x09, 261 | .rm8_reg8 = 0x08, 262 | .reg8_rm8 = 0x0A, 263 | .rm_imm8 = 0x83, .rm_imm8_op = 1, 264 | .rm_imm32 = 0x81, .rm_imm32_op = 1, 265 | .rm8_imm8 = 0x80, .rm8_imm8_op = 1, 266 | .reg8_imm8 = -1, 267 | .reg32_imm32 = -1, 268 | .reg64_imm64 = -1, 269 | }; 270 | 271 | const X64OpBinary X64Op_Xor = 272 | { 273 | .reg_rm = 0x33, 274 | .rm_reg = 0x31, 275 | .rm8_reg8 = 0x30, 276 | .reg8_rm8 = 0x32, 277 | .rm_imm8 = 0x83, .rm_imm8_op = 6, 278 | .rm_imm32 = 0x81, .rm_imm32_op = 6, 279 | .rm8_imm8 = 0x80, .rm8_imm8_op = 6, 280 | .reg8_imm8 = -1, 281 | .reg32_imm32 = -1, 282 | .reg64_imm64 = -1, 283 | }; 284 | 285 | typedef struct X64OpUnary { 286 | uint8_t rm; 287 | // Number to use on 'modrm.reg' when we're encoding memory expression. 288 | // This is an extension of the opcode. 289 | // SDM notes this after the opcode as /digit, for example 290 | // 8F /0 -> pop, opcode = 8F, modrm.reg set to 0 (RAX) 291 | // FF /6 -> push, opcode = FF, modrm.reg set to 6 (RSI) 292 | uint8_t rm_op; 293 | uint8_t reg; 294 | uint8_t imm8; 295 | uint8_t imm32; 296 | 297 | } X64OpUnary; 298 | 299 | static inline bool x64opunary_has_imm(const X64OpUnary op) { 300 | return op.imm8 || op.imm32; 301 | } 302 | 303 | const X64OpUnary X64Op_Pop = { 304 | .reg = 0x58, 305 | // TODO: Cannot encode 32-bit operand size. 306 | // NOTE: Notated as 8F /0 in SDM. 307 | .rm = 0x8F, .rm_op = 0, 308 | }; 309 | 310 | const X64OpUnary X64Op_Push = { 311 | .reg = 0x50, 312 | // NOTE: Notated as FF /6 in SDM. 
313 | .rm = 0xFF, .rm_op = 6, 314 | .imm8 = 0x6A, 315 | .imm32 = 0x68, 316 | }; 317 | 318 | // 319 | // 320 | // 321 | 322 | typedef struct X64Inst { 323 | uint8_t bytes[ 324 | 3 + // prefixes 325 | 3 + // opcode 326 | 1 + // mod r/m 327 | 1 + // sib 328 | 8 + // displacement (some rare instructions take 8B displacement) 329 | 8 + // immediate (some rare instructions take 8B immediate) 330 | 0 331 | ]; 332 | uint8_t count; 333 | const char* error; 334 | } X64Inst; 335 | 336 | // 337 | // 338 | // 339 | 340 | static inline int8_t 341 | x64imm_get_size(uint64_t imm) 342 | { 343 | if (imm <= 0xff) { 344 | return 1; 345 | } else if (imm <= 0xffffffffull) { 346 | return 4; 347 | } else { 348 | return 8; 349 | } 350 | } 351 | 352 | // 'w' operand size is 64-bit 353 | // 'r' extension of modrm.reg 354 | // 'x' extension of sib.index 355 | // 'b' extension of modrm.rm, sib.base or opcode reg field 356 | static inline uint8_t 357 | x64rex(int8_t w, int8_t r, uint8_t x, uint8_t b) { 358 | // bits: 0100 W R X B 359 | uint8_t wrxb = (w & 1) << 3 | (r & 1) << 2 | (x & 1) << 1 | (b & 1) << 0; 360 | if (wrxb) { 361 | return 0b01000000 | wrxb; 362 | } 363 | return 0; 364 | } 365 | 366 | // 'mode' 367 | // - 00 - memory expression with no displacement 368 | // - 01 - memory expression with 8-bit displacement 369 | // - 10 - memory expression with 32-bit displacement 370 | // - 11 - register 371 | // 'reg' is reg/opcode field 372 | // - specifies either a register number or three more bits of opcode information. 373 | // - for example PUSH is FF opcode, with value 6 in this field. 374 | // 'rm' 375 | // - can specify a register as an operand or it can be combined with 376 | // the mod field to encode an addressing mode. Sometimes, certain 377 | // combinations of the mod field and the rm field are used to express 378 | // opcode information for some instructions. 
379 | static inline uint8_t 380 | x64modrm(X64ModRMMode mode, X64Reg reg, X64Reg rm) { 381 | X64_ASSERT_DEBUG(mode >= 0 && mode < 4); 382 | x64reg_check(reg); 383 | x64reg_check(rm); 384 | return 385 | (mode << 6) | 386 | (((reg - 1) & 7) << 3) | 387 | ((rm - 1) & 7); 388 | } 389 | 390 | static inline uint8_t 391 | x64sib(X64Scale scale, X64Reg index, X64Reg base) { 392 | X64_ASSERT_DEBUG(scale >= 0 && scale < 4); 393 | x64reg_check(index); 394 | x64reg_check(base); 395 | return 396 | (scale << 6) | 397 | (((index - 1) & 7) << 3) | 398 | ((base - 1) & 7); 399 | } 400 | 401 | static inline uint8_t 402 | x64op_reg(int16_t op, X64Reg reg) { 403 | X64_ASSERT_DEBUG(op != -1); 404 | x64reg_check(reg); 405 | return op | ((reg - 1) & 7); 406 | } 407 | 408 | // 409 | // Full instruction encoders 410 | // 411 | 412 | static inline uint8_t* 413 | x64e_bytes_(uint8_t* it, uint8_t* bytes, int bytes_count) { 414 | while (bytes_count) { 415 | *it++ = *bytes++; 416 | --bytes_count; 417 | } 418 | return it; 419 | } 420 | 421 | // Mem/Reg (rm_reg) 422 | // Reg/Mem (reg_rm) 423 | // Mem/Imm (some of rm_imm, others are encoded with x64e_modrm_) 424 | // Instruction with memory expression operand. 
425 | static inline uint8_t* 426 | x64e_modrm_sib_disp_(uint8_t* it, X64Size size, int opcode, X64Reg reg, X64Reg base, X64Reg index, X64Scale scale, uint64_t displacement, char** error) 427 | { 428 | X64_ASSERT_DEBUG(opcode != -1); 429 | 430 | int modrm_mode = -1; 431 | int modrm_reg = reg; 432 | int modrm_rm = base; 433 | 434 | bool sib = false; 435 | int sib_scale = 0; 436 | int sib_index = 0; 437 | int sib_base = 0; 438 | 439 | uint8_t rex = 0; 440 | 441 | int8_t displacement_size = 0; 442 | if (displacement == 0) { 443 | displacement_size = 0; 444 | modrm_mode = X64ModRM_Indirect; 445 | } else if (displacement < 0x100) { 446 | displacement_size = 1; 447 | modrm_mode = X64ModRM_IndirectDisp8; 448 | } else { 449 | displacement_size = 4; 450 | modrm_mode = X64ModRM_IndirectDisp32; 451 | } 452 | 453 | if (base == X64_RIP) 454 | { 455 | if (index != 0 || scale != 0) { 456 | *error = "index and scale must be 0 when base is RIP"; 457 | return NULL; 458 | } 459 | // Special case. 460 | // Forcing mode to Indirect, and displacement size to 4. 461 | modrm_mode = X64ModRM_Indirect; 462 | modrm_rm = X64_RBP; 463 | displacement_size = 4; 464 | } 465 | else if (index == 0) 466 | { 467 | if (scale != 0) { 468 | X64_ERROR("scale must be set to X1"); 469 | } 470 | 471 | // Signal no index. 472 | sib_index = X64_RSP; 473 | 474 | if (base == 0) { 475 | // No base, no index, assuming absolute addressing. 476 | sib = true; 477 | modrm_rm = X64_RSP; 478 | sib_base = X64_RBP; 479 | modrm_mode = X64ModRM_Indirect; 480 | displacement_size = 4; 481 | } else if (base == X64_RBP || base == X64_R13) { 482 | // RBP-base, no index. 483 | if (modrm_mode == X64ModRM_Indirect) { 484 | // Special case. 485 | X64_ASSERT_DEBUG(displacement == 0); 486 | displacement_size = 1; 487 | modrm_mode = X64ModRM_IndirectDisp8; 488 | } 489 | } else if (base == X64_RSP || base == X64_R12) { 490 | // RSP-base, no index. 
491 | // Because RSP has special meaning in modrm.rm, 492 | // we need to force SIB here and do it through sib.base. 493 | sib = true; 494 | modrm_rm = X64_RSP; 495 | sib_base = base; 496 | } else { 497 | // Base only, no index. 498 | // Other-than RBP base, no SIB. 499 | X64_ASSERT_DEBUG(sib == false); 500 | X64_ASSERT_DEBUG(modrm_rm); 501 | } 502 | } 503 | else if (index == X64_RSP) 504 | { 505 | // This is a special feature provided on assembler level. 506 | // If we want to index by RSP, we can only do so by setting it as sib.base. 507 | // That means that sib.scale has to be set to X1. 508 | if (scale != X64_X1) { 509 | *error = "cannot index by RSP with scale other than 1"; 510 | return NULL; 511 | } 512 | 513 | // Same as branch index == 0 && base == X64_RSP. 514 | sib = true; 515 | modrm_rm = X64_RSP; 516 | sib_base = X64_RSP; 517 | if (base == 0) { 518 | sib_index = X64_RSP; 519 | } else { 520 | sib_index = base; 521 | } 522 | } 523 | else 524 | { 525 | sib = true; 526 | modrm_rm = X64_RSP; 527 | 528 | sib_index = index; 529 | sib_scale = scale; 530 | sib_base = base; 531 | if (base == 0) { 532 | // Flag we're using no base. 533 | sib_base = X64_RBP; 534 | // We have to switch mode to 0b00 and force displacement_size to 4. 535 | modrm_mode = X64ModRM_Indirect; 536 | displacement_size = 4; 537 | } else if (base == X64_RBP || base == X64_R13) { 538 | // RBP-base. 539 | if (modrm_mode == X64ModRM_Indirect) { 540 | // Special case. 541 | X64_ASSERT_DEBUG(displacement == 0); 542 | displacement_size = 1; 543 | modrm_mode = X64ModRM_IndirectDisp8; 544 | } 545 | } 546 | } 547 | 548 | rex = x64rex( 549 | size == X64_S64, 550 | x64reg_is_int_ext(reg), 551 | sib && x64reg_is_int_ext(sib_index), 552 | sib 553 | ?
x64reg_is_int_ext(sib_base) 554 | : x64reg_is_int_ext(modrm_rm)); 555 | if (rex) *it++ = rex; 556 | 557 | *it++ = opcode; 558 | *it++ = x64modrm(modrm_mode, modrm_reg, modrm_rm); 559 | if (sib) *it++ = x64sib(sib_scale, sib_index, sib_base); 560 | it = x64e_bytes_(it, (uint8_t*)&displacement, displacement_size); 561 | 562 | return it; 563 | } 564 | 565 | // Reg/Imm (rmX_immX) 566 | // Reg and OpCode extension in ModRm 567 | static inline uint8_t* 568 | x64e_modrm_(uint8_t* it, X64Size size, int opcode, int opcode_ext, X64Reg reg) 569 | { 570 | X64_ASSERT_DEBUG(opcode != -1); 571 | X64_ASSERT_DEBUG(reg != 0); 572 | 573 | // This encodes: 574 | // 1. OpCode 575 | // 2. ModRm with Mode=11, Reg=opcode_ext, RM=reg 576 | // 3. No SIB it seems. 577 | // TODO: Test variants of this with ModRm/SIB registers. 578 | 579 | uint8_t rex = x64rex(size == X64_S64, 0, 0, x64reg_is_int_ext(reg)); 580 | if (rex) *it++ = rex; 581 | *it++ = opcode; 582 | *it++ = x64modrm(X64ModRM_Direct, opcode_ext + 1, reg); 583 | return it; 584 | } 585 | 586 | // Reg/Imm (regX_immX) 587 | // Reg in OpCode (hence rex extends it via `.b`) 588 | static inline uint8_t* 589 | x64e_op_reg_(uint8_t* it, X64Size size, int opcode, X64Reg reg) 590 | { 591 | X64_ASSERT_DEBUG(opcode != -1); 592 | 593 | uint8_t rex = x64rex(size == X64_S64, 0, 0, x64reg_is_int_ext(reg)); 594 | if (rex) *it++ = rex; 595 | // Encode reg in opcode (bottom 3 bits). 
596 | *it++ = x64op_reg(opcode, reg); 597 | 598 | return it; 599 | } 600 | 601 | // 602 | // 603 | // 604 | 605 | X64Inst 606 | x64_emit_error(const char* error) 607 | { 608 | X64_ASSERT_DEBUG(error); 609 | return (X64Inst){ .error = error }; 610 | } 611 | 612 | // 613 | // 614 | // 615 | 616 | 617 | uint8_t* 618 | x64_emit_binary_reg_reg_(uint8_t* it, X64Size size, const X64OpBinary op, X64Operand D, X64Operand S, char** error) 619 | { 620 | // reg_rm 621 | // rm8_reg8 -- in case size == X64_S8 622 | 623 | if (D.reg == 0) { 624 | *error = "destination register cannot be none"; 625 | return 0; 626 | } 627 | 628 | int16_t opcode = op.reg_rm; 629 | uint8_t rex = 0; 630 | if (size == X64_S8) { 631 | // ML64 seems to prefer rm8_reg8, while 632 | // GCC prefers reg8_rm8. 633 | #if 0 634 | if (op.rm8_reg8) { 635 | opcode = op.rm8_reg8; 636 | x64o_swap(D, S); 637 | } 638 | #else 639 | if (op.reg8_rm8) { 640 | opcode = op.reg8_rm8; 641 | } 642 | #endif 643 | } 644 | X64_ASSERT_DEBUG(opcode != -1); 645 | 646 | rex = x64rex(size == X64_S64, x64reg_is_int_ext(D.reg), 0, x64reg_is_int_ext(S.reg)); 647 | if (rex) *it++ = rex; 648 | *it++ = opcode; 649 | *it++ = x64modrm(X64ModRM_Direct, D.reg, S.reg); 650 | return it; 651 | } 652 | 653 | uint8_t* 654 | x64_emit_binary_reg_imm_(uint8_t* it, X64Size size, const X64OpBinary op, X64Operand D, X64Operand S, char **error) 655 | { 656 | // This encodes one of two op codes: 657 | // - reg8_imm8 (or rm8_imm8) 658 | // - reg32_imm32 (or rm_imm32) 659 | // In case of rmX_immX variant, we encode: 660 | // - modrm.mode == 0b11 661 | // - modrm.reg = op.reg8_imm8_op 662 | // - modrm.rm = D.reg 663 | // In case of regX_immX: 664 | // - we encode D.reg into opcode directly 665 | 666 | // In case this is specified and required, we'll use: 667 | // reg64_imm64 668 | // If not defined, but required, we'll raise an error. 
669 | 670 | X64_ASSERT_DEBUG(D.kind == X64O_Reg); 671 | X64_ASSERT_DEBUG(S.kind == X64O_Imm); 672 | if (D.reg == 0) { 673 | *error = "destination register cannot be none"; 674 | return 0; 675 | } 676 | 677 | int imm_size = x64imm_get_size(S.imm); 678 | 679 | // (uint8_t)a -= 400 680 | 681 | switch (size) 682 | { 683 | case X64_S8: 684 | if (imm_size > 1) { 685 | *error = "immediate value truncated to 8 bits because of size argument"; 686 | return NULL; 687 | } 688 | imm_size = 1; 689 | if (op.reg8_imm8 == -1) { 690 | it = x64e_modrm_(it, X64_S32, op.rm8_imm8, op.rm8_imm8_op, D.reg); 691 | } else { 692 | it = x64e_op_reg_(it, X64_S8, op.reg8_imm8, D.reg); 693 | } 694 | break; 695 | 696 | case X64_S64: 697 | if (imm_size > 4) { 698 | // 64-bit 699 | if (op.reg64_imm64 == -1) { 700 | *error = "64-bit immediate value not supported with this instruction"; 701 | return NULL; 702 | } 703 | it = x64e_op_reg_(it, X64_S64, op.reg64_imm64, D.reg); 704 | break; 705 | } 706 | // Fallthrough. 707 | case X64_S32: 708 | // WARNING: X64_S64 falls through here as well. 709 | if (imm_size == 1 && op.rm_imm8 != -1) { 710 | // We're using `size` here because we want REX in case we fall through 711 | // from size == X64_S64. 712 | it = x64e_modrm_(it, size, op.rm_imm8, op.rm_imm8_op, D.reg); 713 | } else { 714 | if (imm_size > 4) { 715 | *error = "immediate value truncated to 32 bits because of size argument"; 716 | return NULL; 717 | } 718 | imm_size = 4; 719 | if (size == X64_S32 && op.reg32_imm32 != -1) { 720 | // We're not taking this branch in case size 64 falls through here. 721 | // Otherwise we'd have to encode imm64, even if it's way smaller. 722 | it = x64e_op_reg_(it, X64_S32, op.reg32_imm32, D.reg); 723 | } else { 724 | // We're using `size` here because we want REX in case we fall through 725 | // from size == X64_S64. 
726 | it = x64e_modrm_(it, size, op.rm_imm32, op.rm_imm32_op, D.reg); 727 | } 728 | } 729 | break; 730 | 731 | } 732 | 733 | return x64e_bytes_(it, (uint8_t*)&S.imm, imm_size); 734 | } 735 | 736 | uint8_t* 737 | x64_emit_binary_mem_imm_(uint8_t* it, X64Size size, const X64OpBinary op, X64Operand D, X64Operand S, char** error) 738 | { 739 | X64_ASSERT(D.kind == X64O_Mem && S.kind == X64O_Imm); 740 | 741 | int8_t imm_size = x64imm_get_size(S.imm); 742 | 743 | int opcode = -1; 744 | int opcode_ext = 0; 745 | 746 | switch (size) 747 | { 748 | case X64_S8: 749 | if (imm_size > 1) { 750 | *error = "immediate value truncated to 8 bits because of size argument"; 751 | return NULL; 752 | } 753 | imm_size = 1; 754 | opcode = op.rm8_imm8; 755 | opcode_ext = op.rm8_imm8_op; 756 | break; 757 | 758 | case X64_S64: 759 | if (imm_size > 4) { 760 | *error = "operation mem64, imm64 is not supported"; 761 | return NULL; 762 | } 763 | // Fallthrough, as we'll use rm_imm8 or rm_imm32 with REX (produced by the encoder used below). 764 | case X64_S32: 765 | // TODO: Check whether SDefault works here as it's supposed to. 
766 | if (imm_size == 1 && op.rm_imm8 != -1) { 767 | opcode = op.rm_imm8; 768 | opcode_ext = op.rm_imm8_op; 769 | } else { 770 | if (imm_size > 4) { 771 | *error = "immediate value truncated to 32 bits because of size argument"; 772 | return NULL; 773 | } 774 | opcode = op.rm_imm32; 775 | opcode_ext = op.rm_imm32_op; 776 | imm_size = 4; 777 | } 778 | break; 779 | } 780 | 781 | it = x64e_modrm_sib_disp_(it, 782 | size, 783 | opcode, 784 | opcode_ext + 1, 785 | D.mem.base, 786 | D.mem.index, 787 | D.mem.scale, 788 | D.mem.displacement, 789 | error); 790 | return x64e_bytes_(it, (uint8_t*)&S.imm, imm_size); 791 | } 792 | 793 | uint8_t* 794 | x64_emit_binary_reg_mem_(uint8_t* it, X64Size size, const X64OpBinary op, X64Operand D, X64Operand S, char** error) 795 | { 796 | int opcode = 0; 797 | switch (x64o_pair(D.kind, S.kind)) 798 | { 799 | case x64o_pair(X64O_Reg, X64O_Mem): 800 | // All good. 801 | opcode = size == X64_S8 802 | ? op.reg8_rm8 803 | : op.reg_rm; 804 | break; 805 | case x64o_pair(X64O_Mem, X64O_Reg): 806 | x64o_swap(D, S); 807 | opcode = size == X64_S8 808 | ? 
op.rm8_reg8 809 | : op.rm_reg; 810 | break; 811 | default: 812 | X64_ERROR("invalid operands"); 813 | return it; 814 | } 815 | 816 | if (opcode == -1) { 817 | X64_ERROR("opcode is not defined"); 818 | return it; 819 | } 820 | 821 | return x64e_modrm_sib_disp_(it, size, opcode, D.reg, S.mem.base, S.mem.index, S.mem.scale, S.mem.displacement, error); 822 | } 823 | 824 | X64Inst 825 | x64_emit_binary(X64Size size, const X64OpBinary op, X64Operand D, X64Operand S) 826 | { 827 | X64Inst inst = {0}; 828 | uint8_t* it = inst.bytes; 829 | 830 | char* error = "unknown error"; 831 | switch (x64o_pair(D.kind, S.kind)) 832 | { 833 | case x64o_pair(X64O_Reg, X64O_Reg): 834 | it = x64_emit_binary_reg_reg_(it, size, op, D, S, &error); 835 | break; 836 | 837 | case x64o_pair(X64O_Reg, X64O_Imm): 838 | it = x64_emit_binary_reg_imm_(it, size, op, D, S, &error); 839 | break; 840 | 841 | case x64o_pair(X64O_Mem, X64O_Reg): 842 | case x64o_pair(X64O_Reg, X64O_Mem): 843 | it = x64_emit_binary_reg_mem_(it, size, op, D, S, &error); 844 | break; 845 | 846 | case x64o_pair(X64O_Mem, X64O_Imm): 847 | it = x64_emit_binary_mem_imm_(it, size, op, D, S, &error); 848 | break; 849 | 850 | 851 | default: 852 | X64_ERROR("unexpected arguments"); 853 | } 854 | 855 | if (it == 0) { 856 | return x64_emit_error(error); 857 | } 858 | 859 | inst.count = it - inst.bytes; 860 | return inst; 861 | } 862 | 863 | X64Inst x64_emit_unary(X64OpUnary op, X64Operand D) 864 | { 865 | X64Inst inst = {0}; 866 | uint8_t* it = inst.bytes; 867 | 868 | char* error = "unknown error"; 869 | uint8_t rex = 0; 870 | switch (D.kind) 871 | { 872 | case X64O_Reg: 873 | if (D.reg == 0) { 874 | return x64_emit_error("invalid register"); 875 | } 876 | rex = x64rex(0, 0, 0, x64reg_is_int_ext(D.reg)); 877 | if (rex) *it++ = rex; 878 | *it++ = op.reg | ((D.reg - 1) & 7); 879 | break; 880 | 881 | case X64O_Imm: { 882 | if (!x64opunary_has_imm(op)) { 883 | return x64_emit_error("immediate values are not supported with this operation"); 884
| } 885 | if (op.imm8 && D.imm <= 0xFF) { 886 | *it++ = op.imm8; 887 | it = x64e_bytes_(it, (uint8_t*)&D.imm, 1); 888 | } else if (op.imm32 && D.imm <= 0xFFFFffff) { 889 | *it++ = op.imm32; 890 | it = x64e_bytes_(it, (uint8_t*)&D.imm, 4); 891 | } else { 892 | return x64_emit_error("32-bit immediate is maximum"); 893 | } 894 | break; 895 | } 896 | 897 | case X64O_Mem: { 898 | it = x64e_modrm_sib_disp_(it, X64_SDefault, op.rm, op.rm_op + 1, D.mem.base, D.mem.index, D.mem.scale, D.mem.displacement, &error); 899 | if (it == 0) { 900 | return x64_emit_error(error); 901 | } 902 | break; 903 | } 904 | 905 | default: 906 | return x64_emit_error("invalid operand"); 907 | } 908 | 909 | inst.count = it - inst.bytes; 910 | return inst; 911 | } 912 | 913 | // 914 | // 915 | // 916 | 917 | static inline X64Inst x64_mov(X64Size size, X64Operand D, X64Operand S) { return x64_emit_binary(size, X64Op_Mov, D, S); } 918 | static inline X64Inst x64_sub(X64Size size, X64Operand D, X64Operand S) { return x64_emit_binary(size, X64Op_Sub, D, S); } 919 | static inline X64Inst x64_add(X64Size size, X64Operand D, X64Operand S) { return x64_emit_binary(size, X64Op_Add, D, S); } 920 | static inline X64Inst x64_and(X64Size size, X64Operand D, X64Operand S) { return x64_emit_binary(size, X64Op_And, D, S); } 921 | static inline X64Inst x64_or(X64Size size, X64Operand D, X64Operand S) { return x64_emit_binary(size, X64Op_Or, D, S); } 922 | static inline X64Inst x64_xor(X64Size size, X64Operand D, X64Operand S) { return x64_emit_binary(size, X64Op_Xor, D, S); } 923 | 924 | static inline X64Inst x64_pop(X64Operand D) { return x64_emit_unary(X64Op_Pop, D); } 925 | static inline X64Inst x64_push(X64Operand S) { return x64_emit_unary(X64Op_Push, S); } 926 | 927 | // TODO: What about '0xCB RET Far return to calling procedure.'? 
928 | static inline X64Inst x64_ret() { return (X64Inst){ .bytes = { 0xC3 }, .count = 1 }; } 929 | 930 | // TODO: `lea` 931 | // TODO: `not` 932 | // TODO: `shl` 933 | // TODO: `shr` 934 | // TODO: `sal` 935 | // TODO: `sar` 936 | // TODO: `int3` 937 | // TODO: `call reg` 938 | // TODO: `call imm64` 939 | // TODO: `imul` (result in rdx and rax) 940 | // - https://youtu.be/ieuUHIWaIqM?list=PL0C5C980A28FEE68D&t=340 941 | // - there are single-operand, double-operand and triple-operand versions 942 | // TODO: `idiv` --------------------------------------------------------------------------------