├── .mds-list
├── .gitignore
├── en.md
├── sync-en.sh
├── readme.md
└── index.md


/.mds-list:
--------------------------------------------------------------------------------
1 | ./source/README.md
2 | ./source/index.md
3 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | node_modules/
2 | .DS_Store
3 | fork
4 | source
5 | hub-create.sh


--------------------------------------------------------------------------------
/en.md:
--------------------------------------------------------------------------------
1 | Easy x86-64
2 | ===========
3 | 
4 | A tutorial on programming with the x86-64 in Assembly.
5 | 
6 | [http://ianseyler.github.io/easy_x86-64/](http://ianseyler.github.io/easy_x86-64/)


--------------------------------------------------------------------------------
/sync-en.sh:
--------------------------------------------------------------------------------
1 | cat './.mds-list' | while read -r line || [[ -n ${line} ]]
2 | do
3 |   testseq="zh.md"
4 |   if [[ $line =~ $testseq || "$line" == "" ]]; then
5 |     echo "skip $line"
6 |   else
7 |     lowline=$(echo "$line" | awk '{print tolower($0)}')
8 |     # lowercased copy of the path, used for the readme comparison below
9 |     zh=${line//source\//}
10 |     dir=$(dirname "$zh")
11 | 
12 |     source_readme="./source/readme.md"
13 |     if [[ $lowline == "$source_readme" ]]; then
14 |       # source/[readme|README].md => en.md
15 |       filename="en.md"
16 |     else
17 |       # source/other.md => ./other.md
18 |       filename=$(basename "$zh")
19 |     fi
20 |     echo "$line >> $dir/$filename"
21 |     mkdir -p "$dir" && cp "$line" "$dir/$filename"
22 |   fi
23 | done


--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
1 | # IanSeyler/easy_x86-64 [![explain]][source] [![translate-svg]][translate-list]
2 | 
3 | 
4 | 
5 | [explain]: http://llever.com/explain.svg
6 | [source]: 
https://github.com/chinanf-boy/Source-Explain
7 | [translate-svg]: http://llever.com/translate.svg
8 | [translate-list]: https://github.com/chinanf-boy/chinese-translate-list
9 | [size-img]: https://packagephobia.now.sh/badge?p=Name
10 | [size]: https://packagephobia.now.sh/result?p=Name
11 | 
12 | "A surface-level introduction to assembly in Intel syntax"
13 | 
14 | [Chinese](./readme.md) | [english](https://github.com/IanSeyler/easy_x86-64)
15 | 
16 | ---
17 | 
18 | ## Proofreading ✅
19 | 
20 | 
21 | 
22 | 
23 | 
24 | 
25 | | Translated source | Date          | Last update | More                                    |
26 | | ----------------- | ------------- | ----------- | --------------------------------------- |
27 | | [commit]          | ⏰ 2013-04-17 | ![last]     | [Chinese translations][translate-list] |
28 | 
29 | [last]: https://img.shields.io/github/last-commit/IanSeyler/easy_x86-64.svg
30 | [commit]: https://github.com/IanSeyler/easy_x86-64/tree/9000b7d515e1bd3d03e508fd178c441051a91203
31 | 
32 | 
33 | 
34 | - [x] index.zh.md
35 | 
36 | ### Contributing
37 | 
38 | Corrections, proofreading, and updates are welcome 👏 😊 [See how to contribute](https://github.com/chinanf-boy/chinese-translate-list#贡献)
39 | 
40 | ## Living
41 | 
42 | [If this helped, **buy** me a coffee; I am running low on nutrition, so send me a bottle of Nutri-Express! 
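💰](https://github.com/chinanf-boy/live-need-money)
43 | 
44 | ---
45 | 
46 | "I'm the operator with my pocket calculator." - Kraftwerk
47 | 
48 | There has been much interest in assembly lately (whether the real [6502](http://skilldrick.github.com/easy6502/), or the fictional DCPU-16; I even created my own virtual 8-bit CPU called i808 in 2007), but none of this attention focuses on the architecture that is most popular in today's computers. If you are reading this on a desktop, laptop, or server, your computer is most likely using x86-64 (or x86). x86-64 is the 64-bit superset of the 32-bit x86 architecture, and any modern CPU from AMD or Intel supports it. This document will focus on the most used parts of x86-64.
49 | 
50 | Assembly language is the lowest level of abstraction in computers that is still easily readable by humans. Assembly language translates directly to the bytes that are executed by the computer's processor.
51 | 
52 | Learning assembly is a useful exercise and gives you a deeper understanding of what takes place "under the hood". While the vast majority of programming is done in high-level languages such as C, C++, and Java, it is sometimes worthwhile to write parts of the code in assembly when execution speed is a high priority (and you get to show off a little). For instance, code segments that do heavy math for 3D games or scientific processing stand to benefit from the speedup that assembly can achieve.
53 | 
54 | In this document we will use 'Intel' [syntax](http://en.wikipedia.org/wiki/X86_assembly_language#Syntax) instead of 'AT&T'. Therefore, opcodes that take multiple arguments have the following form:
55 | 
56 | ```
57 | opcode destination, source
58 | ```
59 | 
60 | Any number with the prefix "0x" in x86-64 assembly language (and by extension, in this document) is in hexadecimal (hex) format. If you are not familiar with hex numbers, I recommend reading the [Wikipedia article](http://en.wikipedia.org/wiki/Hexadecimal) before beginning.
61 | 
62 | ## Registers
63 | 
64 | Registers are probably the most complicated part of the x86-64 architecture, mainly because of the carry-over from the legacy 32-bit and 16-bit x86 architectures. x86-64 has sixteen 64-bit general purpose registers named R0-R15. These registers can be broken down into separate parts by bit size and can also be referenced by their legacy x86 names. In practice, the first eight are almost always written with their legacy names (rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi), which is how the examples in this document refer to them. More information on register names and breakdowns can be found [here](http://www.sandpile.org/x86/gpr.htm).
65 | 
66 | For instance, R0 is a 64-bit register (also known as a quad word). If you only want to use 32 bits, that part can be referenced as R0D (a double word), 16 bits as R0W (a word), or 8 bits as R0B (a byte). Under the legacy name rax, the same parts are eax, ax, and al.
67 | 
68 | These D, W, and B references are a carry-over from the days when a [word was 16 bits](http://en.wikipedia.org/wiki/Computer_word):
69 | 
70 | - 8 bits = 1 byte or 'half word'
71 | - 16 bits = 2 bytes = 1 word
72 | - 32 bits = 4 bytes = 2 words = 1 double word
73 | - 64 bits = 8 bytes = 4 words = 1 quad word
74 | 
75 | Further complications arise with certain opcodes that depend on specific registers. This is explored in more detail in the Multiplication and Division section.
76 | 
77 | ## Basic Operations
78 | 
79 | The most basic operations are assigning a value to a register or moving a value between two registers. In x86-64 this is called a move, or **mov**. The term is misleading, as nothing is actually moved; the value is merely copied or stored.
80 | 
81 | ```
82 | mov rax, 15 ; Store the value 15 in rax
83 | mov rcx, rax ; Copy the value in rax to rcx
84 | mov rbx, 18446744073709551615 ; Store the largest possible unsigned 64-bit number in rbx
85 | ```
86 | 
87 | ## Addition and Subtraction
88 | 
89 | We can **add** specific registers together:
90 | 
91 | ```
92 | mov rax, 11 ; Store the value 11 in rax
93 | mov rcx, 500 ; Store the value 500 in rcx
94 | add rax, rcx ; Add the value in rcx to rax
95 | ```
96 | 
97 | We can also **add** a value to a register:
98 | 
99 | ```
100 | 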
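mov rax, 25 ; Store the value 25 in rax
101 | add rax, 12 ; Add 12 to rax; rax now contains 37
102 | ```
103 | 
104 | We can **sub**tract the value of one register from another:
105 | 
106 | ```
107 | mov r15, 1337 ; Store the value 1337 in r15
108 | mov r12, 55 ; Store the value 55 in r12
109 | sub r15, r12 ; Subtract the value in r12 from r15
110 | ```
111 | 
112 | We can also **sub**tract a value from a register:
113 | 
114 | ```
115 | mov rcx, 123 ; Store the value 123 in rcx
116 | sub rcx, 24 ; Subtract 24 from rcx; rcx now contains 99
117 | ```
118 | 
119 | Additions and subtractions can be used with any of the available registers.
120 | 
121 | ## Multiplication and Division
122 | 
123 | In this section we will use the **mul** and **div** opcodes. These operations are more complicated and highlight the special purposes of several registers.
124 | 
125 | ```
126 | mov rax, 50 ; Store the value 50 in rax
127 | mov rcx, 12 ; Store the value 12 in rcx
128 | mul rcx ; Multiply rax by rcx. In this case rax will be set to 600 and rdx to 0
129 | ```
130 | 
131 | The initial number must be stored in rax. rax can be multiplied by the value in any of the other registers. The 128-bit result is stored in rdx:rax, with the high 64 bits in rdx and the low 64 bits in rax.
132 | 
133 | ```
134 | mov rax, 800 ; Store the value 800 in rax
135 | mov rdx, 0 ; Clear rdx to 0
136 | mov rbx, 100 ; Store the value 100 in rbx
137 | div rbx ; Divide rdx:rax by rbx. In this case rdx will be set to 0, and rax will be set to 8
138 | ```
139 | 
140 | Registers rdx:rax must hold the dividend, which is why rdx is cleared first when the dividend fits in rax, while any other register can hold the divisor. After the div opcode executes, the quotient is stored in rax and the remainder in rdx.
141 | 
142 | ## Branching
143 | 
144 | Branching allows us to redirect the program flow based on certain conditions. These conditions can be checked using comparisons.
145 | 
146 | A comparison (cmp) compares two registers, or a register and a value, and sets the system flags according to the result. We can then change the code execution based on these flags.
147 | 
148 | Let's try something like a simple C 'for' loop.
149 | 
150 | ```
151 | mov rax, 0 ; Set rax to 0
152 | increment_loop:
153 | add rax, 1 ; Add 1 to rax
154 | cmp rax, 10 ; Compare the value in rax to 10
155 | jne increment_loop ; If they are not equal then jump to increment_loop
156 | ```
157 | 
158 | The above code will loop 10 times. jne stands for "Jump if Not Equal". This means execution jumps back to "increment_loop" if rax does not contain the value 10. There are many other jump commands:
159 | 
160 | | jmp | Jump: a direct jump without looking at the system flags |
161 | | --- | ------------------------------------------------------- |
162 | | je  | Jump if Equal                                           |
163 | | jne | Jump if Not Equal                                       |
164 | | jl  | Jump if Less (signed)                                   |
165 | | jle | Jump if Less or Equal (signed)                          |
166 | | jg  | Jump if Greater (signed)                                |
167 | | jge | Jump if Greater or Equal (signed)                       |
168 | 
169 | Another kind of branch is a function call (call). A function call lets us jump to a specific section of code that returns to where we left off once the function is finished.
170 | 
171 | ```
172 | mov rax, 14 ; Set rax to 14
173 | mov rcx, 23 ; Set rcx to 23
174 | call add_and_subtract_one ; Call the function
175 | cmp rax, 36 ; 14 + 23 - 1 = 36
176 | je test_function_success ; If rax == 36 then jump, if not then continue to the next line
177 | 
178 | ... 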
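179 | 
180 | add_and_subtract_one: ; Function that adds rcx to rax and then subtracts 1
181 | add rax, rcx
182 | sub rax, 1
183 | ret
184 | ```
185 | 
186 | ## Accessing Memory
187 | 
188 | The registers can be used to read from and write to system memory. The mov opcode is used in much the same way as before. Instead of providing a literal value, we can use a memory address enclosed in [square brackets].
189 | 
190 | ```
191 | mov rax, [0x200000] ; Copy a 64-bit value from memory address 0x200000 to rax
192 | mov [0x402000], rbx ; Copy a 64-bit value from rbx to memory address 0x402000
193 | ```
194 | 
195 | ## The Stack
196 | 
197 | The stack is an area of memory used for storing temporary information. A stack is a last in, first out (LIFO) data structure. The `push` operation adds an item to the top of the stack, and the `pop` operation removes the item at the top. If you push the numbers 5, 7, and 15 onto the stack, you pop them off as 15 first, then 7, and lastly 5. In assembly, you can push registers onto the stack and pop them off later - this is useful when you want to save the value of a register while using that register for another purpose.
198 | 
199 | ```
200 | mov rax, 25 ; Store the value 25 in rax
201 | push rax ; Push the value in rax onto the stack
202 | mov rax, 12 ; Store the value 12 in rax
203 | pop rax ; Pop the top value of the stack into rax. In this case rax is set to 25 again.
204 | ```
205 | 
206 | There is no requirement to push and pop to and from the same register. For instance, both of these segments have the same result:
207 | 
208 | ```
209 | mov rcx, rax ; Copy the value in rax to rcx
210 | 
211 | push rax ; Push the value in rax onto the stack
212 | pop rcx ; Pop the top value of the stack into rcx
213 | ```
214 | 
215 | ## Further Reading
216 | 
217 | This document only scratches the surface of the opcodes and functionality available in the x86-64 architecture.
218 | 
219 | The Intel Software Developer's Manuals can be found on Intel's website. The AMD manuals can be found on AMD's website.
220 | 
221 | *Shameless plug*: x86-64 assembly in use can be seen in BareMetal OS (Source Code), which I wrote entirely in assembly.
222 | 
223 | // EOF
224 | 


--------------------------------------------------------------------------------
/index.md:
--------------------------------------------------------------------------------
1 | “I’m the operator with my pocket calculator.” -Kraftwerk
2 | 
3 | There has been much interest in assembly lately (whether the real [6502](http://skilldrick.github.com/easy6502/), or the fictional DCPU–16; I even created my own virtual 8-bit CPU called i808 in 2007), but none of this attention focuses on the architecture that is most popular in today’s computers. If you are reading this on a desktop, laptop, or server then your computer is most likely using x86–64 (or x86). x86–64 is the 64-bit superset of the 32-bit x86 architecture and any modern CPU from AMD or Intel supports it. 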
This document will focus on the most used parts of x86–64.
4 | 
5 | Assembly language is the lowest level of abstraction in computers that is still easily readable by humans. Assembly language translates directly to the bytes that are executed by your computer’s processor.
6 | 
7 | Learning assembly is a useful exercise and will give you a deeper understanding of what takes place ‘under the hood’. While the vast majority of programming is done via high-level languages such as C, C++, Java, etc., it is sometimes advantageous to write partial segments of code in assembly if execution speed is a high priority. For instance, code segments with heavy math calculations for 3D games or scientific processes stand to benefit significantly from the speedup that can be achieved with assembly.
8 | 
9 | In this document we will be using ‘Intel’ [syntax](http://en.wikipedia.org/wiki/X86_assembly_language#Syntax) instead of ‘AT&T’. Therefore, opcodes that use multiple arguments work in the following form:
10 | 
11 |     opcode destination, source
12 | 
13 | Any numbers with the prefix ‘0x’ in x86–64 assembly language (and by extension, in this document) are in hexadecimal (hex) format. If you’re not familiar with hex numbers, I recommend you read the [Wikipedia article](http://en.wikipedia.org/wiki/Hexadecimal) before beginning.
14 | 
15 | ## Registers
16 | 
17 | Registers are probably the most complicated part of the x86–64 architecture, and the complications that arise from them are mainly due to the carry-over from the legacy 32-bit and 16-bit x86 architectures. x86–64 has sixteen 64-bit general purpose registers named R0 - R15. These registers can be broken down into separate parts by bit size and can also be referenced by their legacy x86 names. In practice, the first eight are almost always written with their legacy names (rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi), which is how the examples in this document refer to them. More information on register names and breakdowns can be found [here](http://www.sandpile.org/x86/gpr.htm).
18 | 
19 | For instance, R0 is a 64-bit register (also known as a quad word). 
If you only want to use 32 bits, then that section can be referenced by R0D (a double word), 16 bits by R0W (a word), or 8 bits by R0B (a byte). Under the legacy name rax, the same parts are eax, ax, and al.
20 | 
21 | These D, W, and B references are examples of carry-over from the [16-bit word](http://en.wikipedia.org/wiki/Computer_word) days:
22 | 
23 | - 8 bits = 1 byte or ‘halfword’
24 | - 16 bits = 2 bytes = 1 word
25 | - 32 bits = 4 bytes = 2 words = 1 double word
26 | - 64 bits = 8 bytes = 4 words = 1 quad word
27 | 
28 | Further complications present themselves with certain opcodes depending on specific registers. This will be explored in more detail in the Multiplication and Division section.
29 | 
30 | ## Basic Operations
31 | 
32 | The most basic operations are assigning a value to a register or moving a value between two registers. In x86–64 this is called a move or **mov**. This terminology is misleading, as nothing is moved; it is merely copied or stored.
33 | 
34 |     mov rax, 15 ; Store the value 15 in rax
35 |     mov rcx, rax ; Copy the value in rax to rcx
36 |     mov rbx, 18446744073709551615 ; Store the largest possible unsigned 64-bit number in rbx
37 | 
38 | 
39 | ## Addition and Subtraction
40 | 
41 | We can **add** specific registers together:
42 | 
43 |     mov rax, 11 ; Store the value 11 in rax
44 |     mov rcx, 500 ; Store the value 500 in rcx
45 |     add rax, rcx ; Add the value in rcx to rax
46 | 
47 | We can also **add** a value to a register:
48 | 
49 |     mov rax, 25 ; Store the value 25 in rax
50 |     add rax, 12 ; Add 12 to rax; rax now contains 37
51 | 
52 | We can **sub**tract the value of one register from another:
53 | 
54 |     mov r15, 1337 ; Store the value 1337 in r15
55 |     mov r12, 55 ; Store the value 55 in r12
56 |     sub r15, r12 ; Subtract the value in r12 from r15
57 | 
58 | We can also **sub**tract a value from a register:
59 | 
60 |     mov rcx, 123 ; Store the value 123 in rcx
61 |     sub rcx, 24 ; Subtract 24 from rcx; rcx now contains 99
62 | 
63 | Additions and subtractions can be used with any of the available registers.
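
As a minimal sketch, the additions and subtractions above can be combined into one small program that can actually be run. The `_start` entry point, the Linux `exit` system call (number 60, status passed in rdi), and the build commands in the header comment are assumptions for illustration and are not part of the original tutorial:

```nasm
; Sketch only: assumes NASM on x86-64 Linux.
; Build: nasm -f elf64 addsub.asm -o addsub.o && ld addsub.o -o addsub
global _start

section .text
_start:
    mov rax, 25         ; rax = 25
    add rax, 12         ; rax = 25 + 12 = 37
    mov rcx, 7          ; rcx = 7
    sub rax, rcx        ; rax = 37 - 7 = 30
    mov rdi, rax        ; Pass the result as the exit status
    mov rax, 60         ; Linux x86-64 system call number for exit (assumption: Linux)
    syscall             ; Exit with status 30
```

After running the program, `echo $?` in the shell shows the exit status, which here is the final value of rax. Only the low 8 bits of the status are reported, so this trick is only useful for small results.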
64 | 
65 | ## Multiplication and Division
66 | 
67 | In this section we will be using the **mul** and **div** opcodes. These operations are more complicated and highlight the unique purposes of several registers.
68 | 
69 |     mov rax, 50 ; Store the value 50 in rax
70 |     mov rcx, 12 ; Store the value 12 in rcx
71 |     mul rcx ; Multiply rax by rcx. In this case rax will be set to 600 and rdx to 0
72 | 
73 | The initial number must be stored in rax. rax can be multiplied by a value in any of the other registers. The 128-bit result will be stored in rdx:rax, with the high 64 bits in rdx and the low 64 bits in rax.
74 | 
75 |     mov rax, 800 ; Store the value 800 in rax
76 |     mov rdx, 0 ; Clear rdx to 0
77 |     mov rbx, 100 ; Store the value 100 in rbx
78 |     div rbx ; Divide rdx:rax by rbx. In this case rdx will be set to 0, and rax will be set to 8
79 | 
80 | Registers rdx:rax must hold the dividend, which is why rdx is cleared first when the dividend fits in rax, while any other register can hold the divisor. After the div opcode executes, the quotient is stored in rax and the remainder in rdx.
81 | 
82 | ## Branching
83 | 
84 | Branching allows us to redirect the program flow based on certain conditions. These conditions can be checked using comparisons.
85 | 
86 | Comparisons (**cmp**) allow us to compare two registers, or a register and a value, and the system flags will be set depending on the result of the comparison. We can then change the code execution based on these system flags.
87 | 
88 | Let’s try something like a simple C ‘for’ loop.
89 | 
90 |     mov rax, 0 ; Set rax to 0
91 |     increment_loop:
92 |     add rax, 1 ; Add 1 to rax
93 |     cmp rax, 10 ; Compare the value in rax to 10
94 |     jne increment_loop ; If they are not equal then jump to increment_loop
95 | 
96 | The above code will loop 10 times. jne refers to ‘Jump if Not Equal’. This means the execution will jump back to ‘increment_loop’ if rax does not contain the value 10. 
There are many other jump commands:
97 | 
98 | 
99 | - jmp - JuMP - A direct jump without looking at the system flags
100 | - je - Jump if Equal
101 | - jne - Jump if Not Equal
102 | - jl - Jump if Less (signed)
103 | - jle - Jump if Less or Equal (signed)
104 | - jg - Jump if Greater (signed)
105 | - jge - Jump if Greater or Equal (signed)
106 | 
107 | Another kind of branch is a function call. A function call allows us to jump to a specific section of code that will return us to where we left off when the function call is completed.
108 | 
109 |     mov rax, 14 ; Set rax to 14
110 |     mov rcx, 23 ; Set rcx to 23
111 |     call add_and_subtract_one ; Call the function
112 |     cmp rax, 36 ; 14 + 23 - 1 = 36
113 |     je test_function_success ; If rax == 36 then jump, if not then continue to next line
114 | 
115 |     ...
116 | 
117 |     add_and_subtract_one: ; Function to add rcx to rax and then subtract 1
118 |     add rax, rcx
119 |     sub rax, 1
120 |     ret
121 | 
122 | ## Accessing Memory
123 | 
124 | The registers can be used to read from and write to system memory. The mov opcode is used in a similar manner as we have seen earlier. Instead of providing a literal value we can use a memory address that is enclosed in [square brackets].
125 | 
126 |     mov rax, [0x200000] ; Copy a 64-bit value from memory address 0x200000 to rax
127 |     mov [0x402000], rbx ; Copy a 64-bit value from rbx to memory address 0x402000
128 | 
129 | ## The Stack
130 | 
131 | The stack is an area of memory used for storing temporary information. A stack is a last in, first out (LIFO) data structure. The push operation adds an item to the top of the stack and the pop operation removes the item at the top. If you were to push the numbers 5, 7, and 15 onto the stack, you would pop them out as 15 first, then 7, and lastly 5. In assembly, you can push registers onto the stack and pop them out later - this ability is useful when you want to save the value of a register while utilizing that register for another purpose.
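
The LIFO order described above (push 5, 7, and 15, then pop 15, 7, and 5) can be sketched as a small runnable program. As with any complete program, the `_start` entry point, the Linux `exit` system call (number 60, status in rdi), and the build commands are assumptions for illustration that the original text does not cover:

```nasm
; Sketch only: assumes NASM on x86-64 Linux.
; Build: nasm -f elf64 lifo.asm -o lifo.o && ld lifo.o -o lifo
global _start

section .text
_start:
    push 5              ; Stack, top first: 5
    push 7              ; Stack, top first: 7, 5
    push 15             ; Stack, top first: 15, 7, 5
    pop rbx             ; rbx = 15 (the last value pushed comes off first)
    pop rcx             ; rcx = 7
    pop rdi             ; rdi = 5 (the first value pushed comes off last)
    mov rax, 60         ; Linux x86-64 system call number for exit (assumption: Linux)
    syscall             ; Exit with the status in rdi, which is 5
```

The exit status is the value popped last, so it shows that the first value pushed is the last one to leave the stack.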
132 | 
133 |     mov rax, 25 ; Store the value 25 in rax
134 |     push rax ; Push the value in rax onto the stack
135 |     mov rax, 12 ; Store the value 12 in rax
136 |     pop rax ; Pop the top value of the stack into rax. In this case rax is set to 25 again.
137 | 
138 | There is no requirement to push and pop to/from the same register. For instance, both of these segments have the same result:
139 | 
140 |     mov rcx, rax ; Copy the value in rax to rcx
141 | 
142 |     push rax ; Push the value in rax onto the stack
143 |     pop rcx ; Pop the top value of the stack into rcx
144 | 
145 | ## Further Reading
146 | 
147 | This document only scratches the surface of the opcodes and functionality available in the x86-64 architecture.
148 | 
149 | The Intel Software Developer's Manuals can be found on Intel's website. The AMD manuals can be found on AMD's website.
150 | 
151 | *Shameless plug*: Usage of x86-64 assembly can be seen in BareMetal OS (Source Code), which I wrote entirely in assembly.
152 | 
153 | // EOF
154 | 


--------------------------------------------------------------------------------