├── .gitignore ├── book ├── !000!-intro.md ├── !000!-toc.md └── 00.md ├── build ├── build.bat ├── code └── day00 │ └── boot.asm ├── html ├── fonts │ ├── Alegreya.ttf │ ├── Cutive.ttf │ └── EBGaramond.ttf ├── media │ └── 00_qemu_result.png ├── reset.css ├── style.css └── template.html ├── readme └── simple.css /.gitignore: -------------------------------------------------------------------------------- 1 | .~* 2 | myos 3 | html/index.html 4 | html/book.html -------------------------------------------------------------------------------- /book/!000!-intro.md: -------------------------------------------------------------------------------- 1 | 2 | ## Preface 3 | 4 | Welcome aboard, young OS developer! You have set sail on the endless ocean of one of the most interesting hobbies there is. Trust me, this book barely scratches the surface of what's possible, but it gives you enough to build something you can show off to your friends and, if you want, to learn more about OS development. That gives you not only the opportunity to build **an** OS, but the opportunity to build **the** OS you want. 5 | 6 | Each chapter is named after what you're about to do in it, but besides the instructions, each chapter contains a detailed explanation of what we're doing and why we're doing it. If you're blindly copying the code, don't expect it to work! Read the book carefully and make sure you understand everything. 7 | 8 | -------------------------------------------------------------------------------- /book/!000!-toc.md: -------------------------------------------------------------------------------- 1 | 2 | ## Table of contents 3 | 4 | 0. [Day 00 / Hello, world!](#ch00) 5 | 1. [Setting up the stage](#ch00-01) 6 | 2. [How an OS boots](#ch00-02) 7 | 3. [Making a bootable image](#ch00-03) 8 | 4. [Intro to the NASM assembler](#ch00-04) 9 | 10 | 1. [Day 01 / Reading the disk](#ch01) 11 | 1. 12 | 13 | 2. [Day 02 / Starting C programming](#ch02) 14 | 1. 15 | 16 | 3.
[Day 03 / GDT/IDT setup](#ch03) 17 | 1. 18 | 19 | 4. [Day 04 / Interrupt handling](#ch04) 20 | 1. 21 | -------------------------------------------------------------------------------- /book/00.md: -------------------------------------------------------------------------------- 1 | 2 | ## Hello, world! {id=ch00} 3 | 4 | Today is the first (actually zeroth) day of our OS development. There will be a lot of theory, but at the end you will build something tangible. The plan is to build the most primitive OS possible: one that simply displays "Hello, World!" on the screen and does nothing else. 5 | 6 | To actually build and run our OS we first need a few programs. The following section lists all the requirements and the installation instructions. 7 | 8 | ### 1. Setting up the stage {id=ch00-01} 9 | 10 | The OS will be developed using two programming languages: C and assembly. This choice is partly my personal preference and partly a necessity. 11 | 12 | In principle you can make your OS in any systems-level programming language you want, be it C, C++, Rust or Go. But when you do so, some parts of the language can't be used, since they require an OS in order to function. For example, `printf`, the C function that prints character strings to the screen, can not be used. Since we're writing the OS ourselves, a call to `printf` would either crash or prevent our code from building properly. 13 | 14 | Once you disable such functionality, you can use whatever programming language you want. C is a good choice here because it doesn't have many features that depend on existing OS functionality. Basically, I chose C because it makes the explanation easier. 15 | 16 | Assembly is our second language, but only because it is necessary to write some parts of the OS in it.
When you develop an OS you sometimes need to play around with specific registers in the processor or have more fine-grained control over what code gets generated. Since C is a general-purpose language, it naturally doesn't have those things. With assembly you can do pretty much anything, which is why we need it. We could even write the whole OS in assembly, but it's incredibly tiring, so I'd rather pass. Besides, who would read a book like that, haha. 17 | 18 | For the C compiler in this book we will use GCC. If you're using Windows I recommend downloading TDM-GCC. It's a fairly minimal binary distribution of GCC, so it won't require you to install as much as 4 GB. 19 | 20 | Note that you won't be able to use Visual Studio's C compiler. Or rather you can, but that will require you to search the internet for how to disable certain compiler features that we can't use, because the options of these two compilers are different. In this book I'm only explaining GCC flags. 21 | 22 | Download TDM-GCC into whatever directory you want. You will need to remember that directory for later. 23 | 24 | For the assembler we will use NASM. It also comes as a fairly minimal distribution. Download and install it into whatever directory you want. You will need to remember the name of the directory for later. 25 | 26 | After we have compiled our OS we will need to somehow run it. When you turn on your computer the OS is loaded from the disk. So we will need to create a disk image with the OS on it and then write that image onto the disk (we'll talk later about why it's not as simple as copying the OS file onto the disk). The program called Win32DiskImager can write image files onto thumb drives, so you can download it as well. 27 | 28 | While developing our OS you are likely to recompile it lots of times, and going through the cycle of writing the OS to a thumb drive and restarting the PC might not be the best use of time.
To make the process simpler we will use an emulator. There are two good emulators I know of: one is Bochs and the other is QEMU. We will use QEMU, because it's easier to set up. You can download QEMU from the official website. Again you'll need to remember the directory where it's installed. 29 | 30 | Last but not least, it's useful (though optional) to have a hex editor installed. This is a program that lets you view the bytes of a file directly, which is very useful in many situations when you develop an OS. I suggest downloading and using HxD. 31 | 32 | When developing the OS we will use one program more often than any other -- the command line. It is installed on Windows by default. In case you're not familiar with it, the following is a small tutorial on cmd. First, to start the command line press `win+r`, type `cmd` in the window that appears and hit enter. 33 | 34 | The part after `>` is the place where you enter commands that tell your OS to perform certain operations. Try typing `dir` and hitting enter, like this: 35 | 36 | ``` 37 | > dir 38 | ``` 39 | 40 | In this book, whenever there is a direction to type a command into the command line, I will start the line with `>`. This represents the command line prompt, so you don't type this symbol. 41 | 42 | `dir` is a command that prints the contents of the current directory. Your current directory is printed before the `>` (for example `C:\Windows\System32\`). You can change the current directory to something else by typing `cd` followed by the new path. Let's, for example, change our current directory to `C:\`. 43 | 44 | ``` 45 | > cd C:\ 46 | ``` 47 | 48 | Now we'll create the directory where we will be developing our OS. You can choose whatever path you want. For this example I chose `C:\os`, for lack of imagination. 49 | 50 | ``` 51 | > mkdir os 52 | ``` 53 | 54 | `mkdir` followed by a name creates a directory with that name inside the current directory. Now switch your current directory to the newly created directory.
This directory is where I will assume your current directory to be every time we use the command line. If you close the command line and come back the next day, just remember to switch your current directory to the path where your OS is located. 55 | 56 | Now we want to be able to use nasm and all the other programs that we just downloaded. We can do this by typing the full path to the exe, like this: 57 | 58 | ``` 59 | > path/to/nasm.exe 60 | ``` 61 | 62 | But that's lengthy. What I want to do now is to add `path/to/nasm` to the PATH environment variable. If we do that, we can just type `nasm.exe` and Windows will automatically know that we are referring to `path/to/nasm.exe`. 63 | 64 | Here's how to do it. Hit `win+r`, this time type `sysdm.cpl` and hit enter. In the window that appears switch to the "Advanced" tab and press the "Environment Variables..." button. In "System variables" select "Path" and then press the "Edit..." button. 65 | 66 | If you're using Windows 8 or above you simply need to add the directories containing `qemu-system-i386.exe`, `nasm.exe` and `gcc.exe` (remember the installation directories). If you're using Windows 7, you need to paste these paths and make sure they are separated with `;`. Save the changes and close the environment variable editor. 67 | 68 | To check that everything has been done correctly, close and re-open the command line. Then type 69 | 70 | ``` 71 | > nasm -v 72 | ``` 73 | 74 | This command should print the NASM version and return to the prompt. If you see something along the lines of "nasm is not a command or a file", then something went wrong. 75 | 76 | This concludes our tool setup. Now we'll learn more about how exactly we need to make our OS. 77 | 78 | ### 2. How an OS boots {id=ch00-02} 79 | 80 | When you turn on the PC, the OS loader is loaded from the disk into memory and then executed. The OS loader is then supposed to load the rest of the OS.
The program called BIOS is responsible for loading the OS loader from the disk. 81 | 82 | You may be familiar with BIOS if you have ever entered the BIOS settings menu. The BIOS settings menu is not the same as BIOS itself (but rather a part of it). Please be careful not to confuse the two. The BIOS is the first program to run on your PC right after you press the power button. 83 | 84 | Unlike the OS, the BIOS is not stored on disk; it's stored in the motherboard ROM. It comes already installed when you buy a fresh motherboard. 85 | 86 | After BIOS has started it performs hardware initialization. It checks all the disks in the system (thumb drives, SSDs and hard drives), and if it finds a bootable disk, it loads the OS from that disk. Note that BIOS boots from the first bootable device it finds, which is why the order is important. If there is one OS on the thumb drive and another on the hard drive, then the order in which BIOS checks these disks determines which OS will be loaded. That's why you can change this checking order in the BIOS settings. 87 | 88 | Most disks are divided into 512-byte chunks called *sectors*. When you interact with the disk you can not read less than a sector, and you can not write less than a sector. So the sector is the minimum addressable unit of any disk. 89 | 90 | The first sector of every disk is reserved for the OS boot code. If the last two bytes of this sector are `55 aa` (hex), then the disk is considered bootable by BIOS and the boot code in that sector is run. That's why the first sector is also called the *boot sector*. 91 | 92 | The boot sector looks roughly like this: 93 | 94 | 95 | 96 | 97 | 98 |
| Offset (hex) | Size (bytes) | Description |
|---|---|---|
| 000 | 510 | The loader code |
| 1fe | 2 | The `55 aa` boot signature |
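
The layout in the table can be turned into a small check. The following is only a minimal sketch in C and is not part of the book's code; the names `SIGNATURE_OFFSET` and `has_boot_signature` are made up for illustration. It tests nothing but the two signature bytes at offset `0x1fe`, which is the condition described above for a sector to count as bootable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name, chosen for this sketch only. */
#define SIGNATURE_OFFSET 0x1fe   /* 510: the two bytes right after the loader code */

/* Returns 1 if the 512-byte sector ends with the bytes 55 aa, otherwise 0.
   The caller must pass a buffer that holds a full sector (512 bytes). */
int has_boot_signature(const unsigned char *sector)
{
    assert(sector != NULL);
    return sector[SIGNATURE_OFFSET] == 0x55 &&
           sector[SIGNATURE_OFFSET + 1] == 0xaa;
}
```

Note that the order of the two bytes matters: `aa 55` at the same offsets would not pass this check.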
99 | 100 | We have only 510 bytes for the code! But don't worry: for a trivial loader like ours that is not a real limitation. If anything, it's more than enough for our purposes. 101 | 102 | There's one more sweet thing about BIOS. It also provides its own functions that help the loader boot the rest of the OS. There are BIOS functions that print characters to the screen, functions that load sectors from disks and other functions related to primitive hardware control. 103 | 104 | Now we will write our first boot sector and figure out how to actually run it. 105 | 106 | ### 3. Making a bootable image {id=ch00-03} 107 | 108 | Now I want to create an image of a bootable disk with some code on it. The meaning of the code will be explained later; right now let's focus on how to actually get the bootable image. 109 | 110 | A disk image is a byte-by-byte representation of the data on the disk. Since disks are only addressed by sectors, the size of that file is a multiple of 512 bytes. If this data is then written ("burned") to the disk, the disk will contain these exact bytes. 111 | 112 | Note that we can actually use images that are much smaller than the disk. In this case the part of the disk that the image didn't overwrite remains untouched. We will take advantage of this and use a 512-byte image that contains only the boot sector. 113 | 114 | First, we'll need a 512-byte binary file that contains the boot sector.
For this open HxD, and type the following bytes: 115 | 116 | ``` 117 | b800 008e d8be 1d7c b40e bb07 008a 0483 118 | c601 3c00 7404 cd10 ebf3 f4eb fd48 656c 119 | 6c6f 2c20 576f 726c 6421 0000 0000 0000 120 | 0000 0000 0000 0000 0000 0000 0000 0000 121 | 0000 0000 0000 0000 0000 0000 0000 0000 122 | 0000 0000 0000 0000 0000 0000 0000 0000 123 | 0000 0000 0000 0000 0000 0000 0000 0000 124 | 0000 0000 0000 0000 0000 0000 0000 0000 125 | 0000 0000 0000 0000 0000 0000 0000 0000 126 | 0000 0000 0000 0000 0000 0000 0000 0000 127 | 0000 0000 0000 0000 0000 0000 0000 0000 128 | 0000 0000 0000 0000 0000 0000 0000 0000 129 | 0000 0000 0000 0000 0000 0000 0000 0000 130 | 0000 0000 0000 0000 0000 0000 0000 0000 131 | 0000 0000 0000 0000 0000 0000 0000 0000 132 | 0000 0000 0000 0000 0000 0000 0000 0000 133 | 0000 0000 0000 0000 0000 0000 0000 0000 134 | 0000 0000 0000 0000 0000 0000 0000 0000 135 | 0000 0000 0000 0000 0000 0000 0000 0000 136 | 0000 0000 0000 0000 0000 0000 0000 0000 137 | 0000 0000 0000 0000 0000 0000 0000 0000 138 | 0000 0000 0000 0000 0000 0000 0000 0000 139 | 0000 0000 0000 0000 0000 0000 0000 0000 140 | 0000 0000 0000 0000 0000 0000 0000 0000 141 | 0000 0000 0000 0000 0000 0000 0000 0000 142 | 0000 0000 0000 0000 0000 0000 0000 0000 143 | 0000 0000 0000 0000 0000 0000 0000 0000 144 | 0000 0000 0000 0000 0000 0000 0000 0000 145 | 0000 0000 0000 0000 0000 0000 0000 0000 146 | 0000 0000 0000 0000 0000 0000 0000 0000 147 | 0000 0000 0000 0000 0000 0000 0000 0000 148 | 0000 0000 0000 0000 0000 0000 0000 55aa 149 | ``` 150 | 151 | There are lots of zeroes, so you can try using the copy and paste feature of the HxD to finish the job more quickly. After you're done save the file in your OS directory and name it `boot.bin` (*a rose by any other name would smell as sweet*). 152 | 153 | `boot.bin` is our disk image. Now we need to run our OS, (if I dare calling the loader an OS). First, let's try running it in an emulator to make sure everything is correct. 
For this, type the command `qemu-system-i386` followed by the image file name into the command line: 154 | 155 | ``` 156 | > qemu-system-i386 boot.bin 157 | ``` 158 | 159 | Note that when you do that QEMU prints a warning: 160 | 161 | ``` 162 | WARNING: Image format was not specified for 'boot.bin' and probing guessed raw. 163 | Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted. 164 | Specify the 'raw' format explicitly to remove the restrictions. 165 | ``` 166 | 167 | You can ignore this warning for now; I will show you how to specify the format of the disk image explicitly later. If you copied the bytes of the boot sector correctly, you should see the following result: 168 | 169 | ![](media/00_qemu_result.png) 170 | 171 | The particular content of the screen may differ depending on the QEMU version, but if you don't see the "Hello, World!" string, something is wrong. Check that you typed the boot sector correctly and that its size is exactly 512 bytes. 172 | 173 | If your image file is correct, we can move on to running it on real hardware. Make sure you have a thumb drive (or another storage medium) that holds no important data, because any data stored on it will become unavailable. Open Win32DiskImager, the program that writes disk images to disks. 174 | 175 | In the "Image File" field, choose the file `boot.bin`. If you can't find it, make sure to disable the `.iso` filter in Win32DiskImager. Then in the "Device" dropdown choose your thumb drive. Make sure you picked the correct device. Then hit the "Write" button. After the process has finished you have a bootable USB drive! 176 | 177 | Now you need to know how to enter your BIOS settings. Restart your PC, enter the BIOS settings and set the USB drive as the first boot option and the hard drive as the second (you have to do it only once, just don't forget to eject the USB drive before rebooting back into Windows). I can't give you more detailed instructions, because the specifics differ from one PC to another.
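
Before an image goes onto a real drive, the two properties mentioned earlier (exactly 512 bytes, `55 aa` at the end) can also be checked mechanically instead of by eye in the hex editor. Below is a minimal sketch in C, assuming nothing beyond the standard library; the function name `check_boot_image` and its return codes are invented for this example and do not come from the book's code.

```c
#include <stdio.h>

/* Hypothetical helper for this sketch.
   Returns 0 if the file is exactly 512 bytes long and ends with 55 aa,
   1 if it can't be opened, 2 if the size is wrong, 3 if the signature is missing. */
int check_boot_image(const char *path)
{
    unsigned char buf[513];          /* one extra byte so oversized files are detected */
    FILE *f = fopen(path, "rb");     /* "rb": binary mode, which matters on Windows */
    size_t n;

    if (f == NULL)
        return 1;
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    if (n != 512)
        return 2;
    if (buf[510] != 0x55 || buf[511] != 0xaa)
        return 3;
    return 0;
}
```

Calling `check_boot_image("boot.bin")` from a small `main` and printing the result is enough to use it. A return value of 0 only says the file has the right shape, not that the loader code inside it is correct.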
178 | 179 | Plug in your USB drive and restart your computer. Since the USB drive is the first boot device, BIOS checks its boot signature (`55 aa`). If the boot signature is there, BIOS will run our boot code. If it is not there, the PC will boot back into Windows. 180 | 181 | If after that you still boot back into Windows (assuming that the image worked in the emulator), the most likely problem is that your USB drive is not the first boot option. Another possible cause is that your motherboard doesn't start up in BIOS mode. This can also be changed in the BIOS settings. 182 | 183 | If you see the "Hello, World!" message, I congratulate you! You can show this off to your friends and say how cool you are. There is only one problem -- we still have no idea what these bytes are and how they correspond to the "Hello, World!" message. We'll start by rewriting our image in assembly language. 184 | 185 | ### 4. Intro to the NASM assembler {id=ch00-04} 186 | 187 | The CPU reads the code as bytes, instruction by instruction. Assembly language is relatively straightforward -- you write one instruction per line, and the assembler converts it into the corresponding machine code. 188 | 189 | One of the first instructions that we'll learn is `db` (define byte). It is not a real instruction, in the sense that it's not processed by the CPU; it's a pseudo-instruction that instructs NASM to do something special. The `db` instruction tells NASM to put the given byte value into the output file. 190 | 191 | Try this: create a file, which we will use later, and call it `boot.asm`. Then write the following contents into the file: 192 | 193 | ``` 194 | db 0xb8 195 | ``` 196 | 197 | And save it. Then in the command line run 198 | 199 | ``` 200 | > nasm boot.asm -o boot.bin 201 | ``` 202 | 203 | This command tells NASM to assemble `boot.asm` and put the result into `boot.bin`. This will overwrite the contents of `boot.bin`, so be careful. `-o` stands for "output".
The dash there signifies that it is a command-line option. 204 | 205 | After you have assembled the file, open `boot.bin` in the hex editor. You should see 0xb8 as its only byte. So as you can see, NASM simply put the given value 0xb8 into the file. We can write multiple bytes like this: 206 | 207 | ``` 208 | db 0xb8 209 | db 0x00 210 | ``` 211 | 212 | Or, for brevity, we can list several bytes separated by commas: 213 | 214 | ``` 215 | db 0xb8, 0x00 216 | ``` 217 | 218 | By continuing to copy the bytes from our initial `boot.bin` we arrive at the following assembly: 219 | 220 | ``` 221 | db 0xb8, 0x00, 0x00, 0x8e, 0xd8, 0xbe, 0x1d, 0x7c 222 | db 0xb4, 0x0e, 0xbb, 0x07, 0x00, 0x8a, 0x04, 0x83 223 | db 0xc6, 0x01, 0x3c, 0x00, 0x74, 0x04, 0xcd, 0x10 224 | db 0xeb, 0xf3, 0xf4, 0xeb, 0xfd, 0x48, 0x65, 0x6c 225 | db 0x6c, 0x6f, 0x2c, 0x20, 0x57, 0x6f, 0x72, 0x6c 226 | db 0x64, 0x21, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 227 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 228 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 229 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 230 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 231 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 232 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 233 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 234 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 235 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 236 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 237 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 238 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 239 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 240 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 241 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 242 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 243 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 244 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 245 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
0x00 246 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 247 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 248 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 249 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 250 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 251 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 252 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 253 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 254 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 255 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 256 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 257 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 258 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 259 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 260 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 261 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 262 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 263 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 264 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 265 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 266 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 267 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 268 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 269 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 270 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 271 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 272 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 273 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 274 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 275 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 276 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 277 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 278 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 279 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 280 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 281 | db 0x00, 0x00, 0x00, 0x00, 
0x00, 0x00, 0x00, 0x00 282 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 283 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 284 | db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x55, 0xaa 285 | ``` 286 | 287 | This is way more typing! But you don't have to do it all manually; I simply left it here for demonstration purposes. If you copy-paste this into `boot.asm` and assemble it, you should get exactly the same `boot.bin` as before, except now it is written in assembly and then assembled, instead of being written directly as bytes. 288 | 289 | So NASM is the all-powerful tool which you can use to type whatever program and whatever file you want! Of course, this is just a joke, kind of like saying that you can write any great book with just a pencil and paper. But it shows the versatility of the language. 290 | 291 | Now I will introduce another useful instruction. It is also a pseudo-instruction; this time I'd even say a "meta-instruction". This instruction is `times`. Writing `times N` in front of an instruction achieves the same result as if the instruction was typed out N times. We can use this to get rid of all the zeroes in `boot.asm` above. 292 | 293 | ``` 294 | db 0xb8, 0x00, 0x00, 0x8e, 0xd8, 0xbe, 0x1d, 0x7c 295 | db 0xb4, 0x0e, 0xbb, 0x07, 0x00, 0x8a, 0x04, 0x83 296 | db 0xc6, 0x01, 0x3c, 0x00, 0x74, 0x04, 0xcd, 0x10 297 | db 0xeb, 0xf3, 0xf4, 0xeb, 0xfd, 0x48, 0x65, 0x6c 298 | db 0x6c, 0x6f, 0x2c, 0x20, 0x57, 0x6f, 0x72, 0x6c 299 | db 0x64, 0x21, 0x00 300 | 301 | times 467 db 0x00 302 | 303 | db 0x55 304 | db 0xaa 305 | ``` 306 | 307 | There were 467 zeroes, so I removed them and replaced them with `times 467 db 0x00`. 308 | 309 | By the way, here is a fun observation: for some reason you can't use `times` as the argument to `times`. Somehow NASM breaks, what a pity. In any case you can use arithmetic expressions as `N`, so you should be able to do stuff like `times 400+67 db 0x00`. 310 | 311 | Now prepare for the great reveal.
I will uncover the rest of the assembly code and we will look at each of its parts in greater detail, as well as learn a few more facts about x86, BIOS and assembly language. 312 | 313 | ``` 314 | org 0x7c00 315 | 316 | start: 317 | mov ax, 0 318 | mov ds, ax 319 | print: 320 | mov si, string 321 | mov ah, 0x0e 322 | mov bx, 0x0007 323 | putchar: 324 | mov al, [si] 325 | add si, 1 326 | cmp al, 0 327 | je end 328 | int 0x10 329 | jmp putchar 330 | end: 331 | hlt 332 | jmp end 333 | 334 | string: 335 | db 'Hello, World!', 0 336 | 337 | times 510 - ($-$$) db 0 338 | db 0x55 339 | db 0xaa 340 | ``` 341 | 342 | Finally some letters! Let's start digging into the instructions. I will skip the first line until later. Then we see the following part: 343 | 344 | ``` 345 | start: 346 | mov ax, 0 347 | mov ds, ax 348 | ``` 349 | 350 | `start:` is a label. When you put a label before an instruction, in this case before `mov ax, 0`, that label points to that instruction. We can later use that label to refer to that instruction, for example to figure out its memory address. 351 | 352 | The `mov` instruction copies the value of the second operand into the first operand. Actually, `mov` is a kind of misnomer. When you move a thing, it is understood that the thing is not in its old place anymore, but `mov` is more like copying than moving. So the guys at Intel should have thought better and named this instruction "copy" or "set". In any case `mov ax,0` means "ax = 0". It sets the register ax to zero. Likewise, `mov ds,ax` sets the register ds to the value of ax. 353 | 354 | For now think of registers as a kind of variables; when you program in assembly there is only a limited number of those "variables".
The registers in x86 have their own names; here are the main 16-bit ones: 355 | 356 | - **ax** - accumulator 357 | - **cx** - counter 358 | - **dx** - data 359 | - **bx** - base 360 | - **si** - source index 361 | - **di** - destination index 362 | - **sp** - stack pointer 363 | - **bp** - base pointer 364 | - **cs** - code segment 365 | - **ss** - stack segment 366 | - **ds** - (default) data segment 367 | - **es** - extra data segment 368 | 369 | The first 8 are general-purpose registers. This means they can be used interchangeably for most general-purpose instructions, such as arithmetic, comparisons, jumps and memory accesses. The reason for their names is that some instructions favor certain registers over the others, and if you use the registers according to their names you can write relatively fast and short code. E.g. if you use the accumulator register to hold the running value during the computation of a complex expression, you can avoid extra `mov` instructions. 370 | 371 | The last four registers are segment registers. They are used for memory addressing. Note that above we were setting the data segment to zero. I won't explain why this is important yet; for now just know that doing so helps avoid problems with addressing. 372 | 373 | With this we have figured out that the first 3 lines of code do roughly what I can describe as "register initialization". The next block is the following: 374 | 375 | ``` 376 | print: 377 | mov si, string 378 | mov ah, 0x0e 379 | mov bx, 0x0007 380 | 381 | ; ... Some code ... 382 | 383 | string: 384 | db 'Hello, World!', 0 385 | ``` 386 | 387 | A quick remark: `;` starts a comment. The comment spans until the end of the line and is ignored by the assembler. 388 | 389 | After the `print:` label, the first thing we do is set `si` to `string`. But if `si` is a 16-bit register, what exactly does it mean to set it to a string?
Well, it is finally time to tell you about labels in NASM. 390 | 391 | A label in NASM stands for the memory address of whatever comes right after it. The string "Hello, World!" is located at offset `0x1d` from the start of the file. And since the boot sector is loaded starting at address `0x7c00`, the address of the string in memory is `0x7c00+0x1d=0x7c1d`. This is the value of the label `string`. 392 | 393 | So in this case, `mov si,string` means the same thing as `mov si,0x7c1d`. The reason we're using labels is that as soon as we remove or add instructions, we would have to recalculate the address of the string within the file. This is dirty work that we don't want to do and don't have to do. What a convenient world! 394 | 395 | Also note how the load address of the program participates in this calculation. This is exactly why we needed that `org` instruction at the beginning. If we forgot it or specified a wrong address, the address calculation would yield a wrong value (`0x1d` would be added to a wrong base address) pointing to some random location in memory, and we wouldn't print anything coherent. 396 | 397 | Next, here is a seemingly new register, `ah`. In fact, the first four general-purpose registers (`ax`, `cx`, `dx`, `bx`) can be split up into the low 8 bits and the high 8 bits. The low 8 bits of `ax` are called `al`, and the high 8 bits of `ax` are called `ah`. Similarly `cl`, `dl`, `bl` are the low 8 bits of `cx`, `dx`, `bx`, and `ch`, `dh`, `bh` are their high 8 bits. 398 | 399 | When you `mov` to the high 8 bits of one of these registers, the low 8 bits remain untouched. The other general-purpose registers and the segment registers can't be split like that. Don't ask why, I didn't design x86, haha. 400 | 401 | Then we set the `bx` register to `0x0007`. This is self-explanatory. 402 | 403 | And lastly, there is a new way of using the `db` command.
Rather than specifying numbers you can specify character strings. For that you surround the string with single quotes (') or double quotes ("). When you do that NASM stores the entire string, character by character, into the output file. 404 | 405 | Moving to the next part: 406 | 407 | ``` 408 | putchar: 409 | mov al, [si] 410 | add si, 1 411 | cmp al, 0 412 | je end 413 | int 0x10 414 | jmp putchar 415 | ``` 416 | 417 | Following the label we see another `mov` instruction. But this time it's something new. Time to talk about memory. 418 | 419 | Memory can be imagined as a flat array of bytes. The indices into that array are what's called "memory addresses". Whenever you load something from memory or store something into memory, you always say something like "Hey, can you put the value XX into the box number YYYY" to the CPU. 420 | 421 | The square brackets around an operand turn the simple assignment into a memory load. So rather than setting `al` to the value of `si`, we're setting `al` to the value of the *memory cell* at address `si`. 422 | 423 | In this case only a single byte is read from memory, because the size of the destination (`al`) is 8 bits. If we wrote `mov ax,[si]` instead, that would load 2 bytes from memory into `ax`. So the number of bytes loaded depends on the operands. But sometimes it's not possible to infer the size just from the operands. Consider `mov [si],2`. If you wrote that, NASM would kindly say the following: 424 | 425 | ``` 426 | error: operation size not specified 427 | ``` 428 | 429 | You can specify the operand size by prepending `byte`, `word` or `dword` to the `[]`. In the case above that would be `mov byte[si],2` or `mov word[si],2`. It's also true that `mov al,[si]` is the same thing as `mov al,byte[si]`. 430 | 431 | Also an important note.
When you load a multibyte value from memory, for example `mov ax,word[si]`, the low 8 bits of `ax` will be loaded from the address `si`, and the high 8 bits will be loaded from the address `si+1`.

Returning to the code, the next instruction is `add si,1`. This is self-explanatory: we increment `si` by 1, so the next time it will point to the next character in the string.

`cmp al,0` compares `al` with 0. It doesn't do anything besides set the FLAGS register, one of the registers we haven't talked about yet. The FLAGS register contains the status of the previous instruction, like whether it overflowed, as well as the result of a comparison, like whether it came out "greater", "less" or "equal". This is important for the next instruction.

Which is `je end`. `je` stands for "Jump if Equal". Here `end` is a label, meaning it has the value of the memory address of the instruction we're going to jump to. And the "if equal" part is taken from the result of the previous comparison. Reading the pair of instructions together, the meaning is "jump to `end` if `al` is 0".

Next is `int 0x10`. INT stands for interrupt. Explaining interrupts fully here would suck all the energy out of you, so I will just ask you to think of an interrupt here as a "function call". In particular, this is a BIOS call that prints a character. You can find many different interrupt tables online, including on Wikipedia.

The number after `int` specifies the general kind of function you're calling. The number `0x10` (16) means you're calling a display function; if it were `0x13`, you'd be calling disk services. There are also numbers for keyboard, mouse, timers.

The function is defined not only by the interrupt number, but also by the value of `ah` at the time we're calling the interrupt. If you remember, we set `ah` to be `0x0e`.
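
One detail the loop quietly relies on: `ah` is set to `0x0e` once, before the loop, while `mov al,[si]` runs on every iteration. Since `al` and `ah` are just the two halves of `ax`, writing one half leaves the other alone. Here is a rough sketch of that in C; the `Regs` struct and the `set_ah`, `set_al`, `get_ah`, `get_al` helpers are hypothetical names made up for this illustration, not something the CPU or NASM provides.

```c
#include <stdint.h>

/* Hypothetical model of the 16-bit ax register (illustration only). */
typedef struct {
    uint16_t ax;
} Regs;

/* Models `mov ah, value`: replaces bits 8..15, keeps bits 0..7. */
void set_ah(Regs *r, uint8_t value) {
    r->ax = (uint16_t)((r->ax & 0x00FFu) | ((uint16_t)value << 8));
}

/* Models `mov al, value`: replaces bits 0..7, keeps bits 8..15. */
void set_al(Regs *r, uint8_t value) {
    r->ax = (uint16_t)((r->ax & 0xFF00u) | value);
}

/* Read back the high and the low half of ax. */
uint8_t get_ah(const Regs *r) { return (uint8_t)(r->ax >> 8); }
uint8_t get_al(const Regs *r) { return (uint8_t)(r->ax & 0xFFu); }
```

For instance, after `set_ah(&r, 0x0e)` followed by `set_al(&r, 'H')`, `r.ax` holds `0x0e48`, and loading the next character with `set_al` still leaves `get_ah(&r)` at `0x0e`.
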
If you look up the Wikipedia article for int 0x10 with ah=0x0e, you can find a function like this:

```
INT 0x10 / AH = 0x0e
Print character function
Parameters:
    al = character
    bh = page number
    bl = color
Returns:
    none
```

So the parameters are stored in the registers. If you remember, `al` holds the character we just loaded from the address in `si`, and `bx` is always `0x0007`. This means that `bh` is 0, and `bl` is always `0x07`. I won't bother you with the meaning of these two parameters, but just note that the color 0x07 is light grey, and it's fine to just leave the page number at 0.

So this `int` instruction prints the character in `al` to the screen.

Finally, the last instruction of this block is `jmp`. JMP stands for jump. `jmp` doesn't check any flags in the FLAGS register; in this case it just jumps back to the point where we read the next character from the string.

So overall this block of code reads the characters from the string, one by one, until it encounters `0x00`, and prints them to the screen.

And the final block of code is this:

```
end:
    hlt
    jmp end
```

You can see that this is an infinite loop that runs a single instruction, `hlt`. HLT stands for Halt. It puts the processor to sleep until the next interrupt, which reduces power consumption. I like thinking that I'm saving the planet from the apocalypse by inserting a HLT instruction into infinite loops, so this is why I didn't just write:

```
end:
    jmp end
```

The last part is more or less familiar to us, just written differently.

```
times 510 - ($-$$) db 0
db 0x55
db 0xaa
```

Here, if you read `$-$$` as "the number of bytes in the file up until this point", you can clearly see how `510-($-$$)` is simply the number of bytes needed to reach offset 510. In our case the number of bytes before that huge block of zeroes is 43, so 510-43 is 467. We will insert zero 467 times. And the 43 code bytes + 467 zeroes leave us at offset 510.

Then we simply insert the remaining bytes, `0x55` and `0xaa`, our boot signature. This concludes the explanation of the assembly. I encourage you to read through the full code again, making sure you understand it and what it does.

--------------------------------------------------------------------------------
/build:
--------------------------------------------------------------------------------
#!/bin/sh

extensions="-markdown_in_html_blocks-smart+tex_math_dollars+implicit_figures+link_attributes+tex_math_single_backslash+header_attributes"

cat book/*.md > html/book.md
pandoc html/book.md -fmarkdown$extensions -o html/book.html --ascii
cat html/template.html html/book.html > html/index.html
rm html/book.html
rm html/book.md

--------------------------------------------------------------------------------
/build.bat:
--------------------------------------------------------------------------------
@echo off
rem /b joins the files as binary, so no end-of-file marker gets appended
copy /b book\*.md html\book.md

set extensions="-smart+tex_math_dollars+implicit_figures+link_attributes+tex_math_single_backslash+header_attributes"

pandoc html\book.md -fmarkdown%extensions% -o html\book.html --ascii
rem cmd.exe has no cat or rm: "copy a+b dest" concatenates, "del" removes
copy /b html\template.html+html\book.html html\index.html
del html\book.html
del html\book.md

--------------------------------------------------------------------------------
/code/day00/boot.asm:
--------------------------------------------------------------------------------
org 0x7c00

start:
    mov ax, 0           ; segment registers can't take an immediate, so go through ax
    mov ds, ax          ; ds = 0, so [si] reads from physical address si
print:
    mov si, string      ; si = address of the string
    mov ah, 0x0e        ; BIOS function: print character
    mov bx, 0x0007      ; bh = page 0, bl = color 0x07 (light grey)
putchar:
    mov al, [si]        ; load the next character
    add si, 1
    cmp al, 0           ; a zero byte ends the string
    je end
    int 0x10            ; BIOS display services
    jmp putchar
end:
    hlt                 ; sleep until the next interrupt
    jmp end

string:
    db 'Hello, World!', 0

times 510 - ($-$$) db 0
db 0x55
db 0xaa

--------------------------------------------------------------------------------
/html/fonts/Alegreya.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/handmade-osdev/os-in-30-days/2849e468968a402f4f8bdd63581b0852dc4b348d/html/fonts/Alegreya.ttf

--------------------------------------------------------------------------------
/html/fonts/Cutive.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/handmade-osdev/os-in-30-days/2849e468968a402f4f8bdd63581b0852dc4b348d/html/fonts/Cutive.ttf

--------------------------------------------------------------------------------
/html/fonts/EBGaramond.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/handmade-osdev/os-in-30-days/2849e468968a402f4f8bdd63581b0852dc4b348d/html/fonts/EBGaramond.ttf

--------------------------------------------------------------------------------
/html/media/00_qemu_result.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/handmade-osdev/os-in-30-days/2849e468968a402f4f8bdd63581b0852dc4b348d/html/media/00_qemu_result.png

--------------------------------------------------------------------------------
/html/reset.css:
--------------------------------------------------------------------------------
html {
    font-family: serif;
    font-size: 16px;
    margin: 0;
    padding: 0;
    border: 0;
}

body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote,
pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
b, u, i, center,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td,
article, aside, canvas, details, embed,
figure, figcaption, footer, header, hgroup,
menu, nav, output, ruby, section, summary,
time, mark, audio, video {
    margin: 0;
    padding: 0;
    border: 0;
    font: inherit;
    vertical-align: baseline;
}
/* HTML5 display-role reset for older browsers */
article, aside, details, figcaption, figure,
footer, header, hgroup, menu, nav, section {
    display: block;
}
body {
    line-height: 1;
}
ol, ul {
    list-style: none;
}
blockquote, q {
    quotes: none;
}
blockquote:before, blockquote:after,
q:before, q:after {
    content: '';
    content: none;
}
table {
    border-collapse: collapse;
    border-spacing: 0;
}

--------------------------------------------------------------------------------
/html/style.css:
--------------------------------------------------------------------------------
@font-face {
    src: url(fonts/Alegreya.ttf);
    font-family: Alegreya;
}

@font-face {
    src: url(fonts/EBGaramond.ttf);
    font-family: EBGaramond;
}

@font-face {
    src: url(fonts/Cutive.ttf);
    font-family: Cutive;
}

body {
    font-family: 'Alegreya';
    font-feature-settings: frac;
    font-variant: common-ligatures tabular-nums oldstyle-nums;
}

.num {
    font-variant: diagonal-fractions;
}

article {
    max-width: 80ch;
    margin: 0 auto;
}

h1,h2,h3,h4,h5,h6 {
    font-family: 'EBGaramond';
    font-variant: lining-nums tabular-nums;
}

h1 {
    font-size: 2rem;
    font-variant: small-caps;
    margin:
        15px 0;
    text-align: center;
}

h2 {
    font-size: 1.5rem;
    margin: 15px 0;
}

h3 {
    font-size: 1.3rem;
    margin: 15px 0;
}

p {
    font-size: 1rem;
    line-height: 1.07;
    margin: 15px 0;
}

small {
    font-size: 0.6em;
}

i {font-style: italic;}
b {font-weight: bold;}
strong {font-weight: bold;}
em {font-style: italic;}

ul {
    list-style: circle inside none;
}

ol {
    list-style: decimal inside none;
}

ol ol {
    margin-left: 4ch;
}

img {
    display: block;
    margin: auto;
    max-width: 100%;
}

td {
    padding-right: 10px;
}

th {
    padding-right: 10px;
    font-weight: 600;
    text-align: left;
    border-bottom: 1px solid black;
}

code {
    font-family: 'Cutive';
    letter-spacing: -1px;
}

pre code {
    display: block;
    margin: 0 4ch 5px 2ch;
    overflow: auto;
}

--------------------------------------------------------------------------------
/html/template.html:
--------------------------------------------------------------------------------

Handmade OS in 30 days

Handmade OS in 30 days

--------------------------------------------------------------------------------
/readme:
--------------------------------------------------------------------------------

This is a book which will teach you how to create your own OS in just 30 days,
assuming only a small bit of prior experience in programming and nothing else.
This book is inspired by a Japanese book with a similar title:

    「30日でできる! OS自作入門」
    ("Make it in 30 days! Introduction to self-made OS")

This book assumes that you are using a Windows operating system. Linux users
will need to find alternatives to the software used in this book themselves.
Well, I try to use tools that are portable to begin with, but for some
software it becomes impossible to find such tools. In the future I'll probably
have to write these cross-platform tools myself, but for the time being please
cope.

Contributing

You can join the discord server to receive help if you have any trouble or
questions. We generally try to be helpful to beginners so don't be afraid to
ask if you don't understand something:

    https://discord.gg/TkxhtTGVAu

If you encounter a factual error/mistranslation/bad explanation/typographic error
you can either file an issue at:

    https://github.com/handmade-osdev/os-in-30-days/issues/new

Or create a pull request. How to do this is described below.

All contributions are welcome. To submit a fix, fork this repository and
clone it to your hard drive. Then navigate to the book/ subdirectory and find
the .md file corresponding to the chapter number. Note that filenames starting
with "!" do not correspond to chapters of the book, but rather to other
sections of the book. Edit the markdown file to fix the error, then you will
have to rebuild the HTML file. When editing .md files make sure to
enable some kind of wrapping in your text editor.
Otherwise you may
experience difficulty editing the contents of the book. Avoid putting manual
line breaks.

To rebuild the book, make sure you've got pandoc installed. If you haven't got
pandoc installed on your machine follow this URL:

    https://pandoc.org/installing.html

Now you can build the book into an HTML document. Open your shell and type
the command described next. Depending on your shell the command will look
a bit different.

Windows (cmd.exe):
    build

POSIX shell or Powershell:
    ./build

Now html/index.html should be updated. You can open this file in your browser
and verify that the changes are correct and that there are no errors. After
that you may commit your changes and create a pull request.

Please check the following section to better understand the organization of
the project.

Notes:

1/ I have tried organizing the project in a way that would work for multiple
OS's but it requires some assumptions, mainly that your OS sorts file names
alphabetically. It is important that the sort order in the book/ folder is
alphabetical. To check, run the `ls` (Linux) or `dir` (Windows) command and
check whether the order is correct. It should look something like this:

    !000!-start.md
    !001!-preface.md
    !002!-toc.md
    00.md
    01.md
    02.md
    ...
    31.md

If it does not, after you build the project the resulting HTML will have
content in the wrong order. Note that different programs have different
opinions on what constitutes "alphabetical sorting". The exclamation mark
is a character whose codepoint is lower than that of any letter or digit, and
its position within the filenames means that different sorting algorithms
sort them correctly no matter what.
If this causes any further issues I
believe I'll have to write a utility that will concatenate files in the
correct order.

2/ When dealing with footnotes note that footnote numbers are shared across
all files in the book/ directory. The structure here is the following.

    files whose names start with ! don't contain footnotes.
    chapter 0 contains footnote numbers 0..9
    chapter 1 contains footnote numbers 10..19
    chapter 2 contains footnote numbers 20..29
    ...
    chapter 31 contains footnote numbers 310..319

This means that every chapter is limited to 10 footnotes. This simplifies the
insertion and removal of footnotes.

3/ Make sure that every file starts and ends with one or two empty lines.
This is to make sure that after concatenation different files occupy at least
different paragraphs.

--------------------------------------------------------------------------------
/simple.css:
--------------------------------------------------------------------------------
body {
    font-family: 'Times New Roman';
    width: 84ch;
    max-width: 100%;
}

pre code {
    margin-left: 4ch;
    width: 80ch;
    display: block;
    overflow-x: auto;
    white-space: pre;
}

img {
    max-width: 100%;
    display: block;
    margin-left: auto;
    margin-right: auto;
}
--------------------------------------------------------------------------------