├── .gitignore ├── LICENSE ├── README.md ├── build.sh ├── dis.sh ├── examples ├── hello.c ├── sinwave.c └── twinkle.c ├── img └── sinwave.png ├── lint └── lint.c ├── rt ├── _start.c └── lib.c ├── run.sh ├── run_raw.sh └── sectorc.s /.gitignore: -------------------------------------------------------------------------------- 1 | build/ 2 | *~ 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | CC0 1.0 Universal 2 | 3 | Statement of Purpose 4 | 5 | The laws of most jurisdictions throughout the world automatically confer 6 | exclusive Copyright and Related Rights (defined below) upon the creator and 7 | subsequent owner(s) (each and all, an "owner") of an original work of 8 | authorship and/or a database (each, a "Work"). 9 | 10 | Certain owners wish to permanently relinquish those rights to a Work for the 11 | purpose of contributing to a commons of creative, cultural and scientific 12 | works ("Commons") that the public can reliably and without fear of later 13 | claims of infringement build upon, modify, incorporate in other works, reuse 14 | and redistribute as freely as possible in any form whatsoever and for any 15 | purposes, including without limitation commercial purposes. These owners may 16 | contribute to the Commons to promote the ideal of a free culture and the 17 | further production of creative, cultural and scientific works, or to gain 18 | reputation or greater distribution for their Work in part through the use and 19 | efforts of others. 
20 | 21 | For these and/or other purposes and motivations, and without any expectation 22 | of additional consideration or compensation, the person associating CC0 with a 23 | Work (the "Affirmer"), to the extent that he or she is an owner of Copyright 24 | and Related Rights in the Work, voluntarily elects to apply CC0 to the Work 25 | and publicly distribute the Work under its terms, with knowledge of his or her 26 | Copyright and Related Rights in the Work and the meaning and intended legal 27 | effect of CC0 on those rights. 28 | 29 | 1. Copyright and Related Rights. A Work made available under CC0 may be 30 | protected by copyright and related or neighboring rights ("Copyright and 31 | Related Rights"). Copyright and Related Rights include, but are not limited 32 | to, the following: 33 | 34 | i. the right to reproduce, adapt, distribute, perform, display, communicate, 35 | and translate a Work; 36 | 37 | ii. moral rights retained by the original author(s) and/or performer(s); 38 | 39 | iii. publicity and privacy rights pertaining to a person's image or likeness 40 | depicted in a Work; 41 | 42 | iv. rights protecting against unfair competition in regards to a Work, 43 | subject to the limitations in paragraph 4(a), below; 44 | 45 | v. rights protecting the extraction, dissemination, use and reuse of data in 46 | a Work; 47 | 48 | vi. database rights (such as those arising under Directive 96/9/EC of the 49 | European Parliament and of the Council of 11 March 1996 on the legal 50 | protection of databases, and under any national implementation thereof, 51 | including any amended or successor version of such directive); and 52 | 53 | vii. other similar, equivalent or corresponding rights throughout the world 54 | based on applicable law or treaty, and any national implementations thereof. 55 | 56 | 2. Waiver. 
To the greatest extent permitted by, but not in contravention of, 57 | applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and 58 | unconditionally waives, abandons, and surrenders all of Affirmer's Copyright 59 | and Related Rights and associated claims and causes of action, whether now 60 | known or unknown (including existing as well as future claims and causes of 61 | action), in the Work (i) in all territories worldwide, (ii) for the maximum 62 | duration provided by applicable law or treaty (including future time 63 | extensions), (iii) in any current or future medium and for any number of 64 | copies, and (iv) for any purpose whatsoever, including without limitation 65 | commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes 66 | the Waiver for the benefit of each member of the public at large and to the 67 | detriment of Affirmer's heirs and successors, fully intending that such Waiver 68 | shall not be subject to revocation, rescission, cancellation, termination, or 69 | any other legal or equitable action to disrupt the quiet enjoyment of the Work 70 | by the public as contemplated by Affirmer's express Statement of Purpose. 71 | 72 | 3. Public License Fallback. Should any part of the Waiver for any reason be 73 | judged legally invalid or ineffective under applicable law, then the Waiver 74 | shall be preserved to the maximum extent permitted taking into account 75 | Affirmer's express Statement of Purpose. 
In addition, to the extent the Waiver 76 | is so judged Affirmer hereby grants to each affected person a royalty-free, 77 | non transferable, non sublicensable, non exclusive, irrevocable and 78 | unconditional license to exercise Affirmer's Copyright and Related Rights in 79 | the Work (i) in all territories worldwide, (ii) for the maximum duration 80 | provided by applicable law or treaty (including future time extensions), (iii) 81 | in any current or future medium and for any number of copies, and (iv) for any 82 | purpose whatsoever, including without limitation commercial, advertising or 83 | promotional purposes (the "License"). The License shall be deemed effective as 84 | of the date CC0 was applied by Affirmer to the Work. Should any part of the 85 | License for any reason be judged legally invalid or ineffective under 86 | applicable law, such partial invalidity or ineffectiveness shall not 87 | invalidate the remainder of the License, and in such case Affirmer hereby 88 | affirms that he or she will not (i) exercise any of his or her remaining 89 | Copyright and Related Rights in the Work or (ii) assert any associated claims 90 | and causes of action with respect to the Work, in either case contrary to 91 | Affirmer's express Statement of Purpose. 92 | 93 | 4. Limitations and Disclaimers. 94 | 95 | a. No trademark or patent rights held by Affirmer are waived, abandoned, 96 | surrendered, licensed or otherwise affected by this document. 97 | 98 | b. Affirmer offers the Work as-is and makes no representations or warranties 99 | of any kind concerning the Work, express, implied, statutory or otherwise, 100 | including without limitation warranties of title, merchantability, fitness 101 | for a particular purpose, non infringement, or the absence of latent or 102 | other defects, accuracy, or the present or absence of errors, whether or not 103 | discoverable, all to the greatest extent permissible under applicable law. 104 | 105 | c. 
Affirmer disclaims responsibility for clearing rights of other persons 106 | that may apply to the Work or any use thereof, including without limitation 107 | any person's Copyright and Related Rights in the Work. Further, Affirmer 108 | disclaims responsibility for obtaining any necessary consents, permissions 109 | or other rights required for any use of the Work. 110 | 111 | d. Affirmer understands and acknowledges that Creative Commons is not a 112 | party to this document and has no duty or obligation with respect to this 113 | CC0 or use of the Work. 114 | 115 | For more information, please see 116 | 117 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # SectorC 2 | SectorC is a C compiler written in x86-16 assembly that fits within the 512 byte boot sector of an x86 machine. It supports a 3 | subset of C that is large enough to write real and interesting programs. It is quite likely the smallest C compiler ever written. 
4 | 5 | In a base64 encoding, it looks like this: 6 | 7 | ``` 8 | 6gUAwAdoADAfaAAgBzH/6DABPfQYdQXoJQHr8+gjAVOJP+gSALDDqluB+9lQdeAG/zdoAEAfy+gI 9 | AegFAYnYg/hNdFuE9nQNsOiqiwcp+IPoAqvr4j3/FXUG6OUAquvXPVgYdQXoJgDrGj0C2nUGV+gb 10 | AOsF6CgA68Ow6apYKfiD6AKrifgp8CaJRP7rrOg4ALiFwKu4D4Srq1fonP9ewz2N/HUV6JoA6BkA 11 | ieu4iQRQuIs26IAAWKvD6AcAieu4iQbrc4nd6HkA6HYA6DgAHg4fvq8Bra052HQGhcB19h/DrVCw 12 | UKroWQDoGwC4WZGrW4D/wHUMuDnIq7i4AKu4AA+ridirH8M9jfx1COgzALiLBOucg/j4dQXorf/r 13 | JIP49nUI6BwAuI0G6wyE0nQFsLiq6wa4iwarAduJ2KvrA+gAAOhLADwgfvkx2zHJPDkPnsI8IH4S 14 | weEIiMFr2wqD6DABw+gqAOvqicg9Ly90Dj0qL3QSPSkoD5TGidjD6BAAPAp1+eu86Ln/g/jDdfjr 15 | slIx9osEMQQ8O3QUuAACMdLNFIDkgHX0PDt1BIkEMcBaw/v/A8H9/yvB+v/34fb/I8FMAAvBLgAz 16 | wYQA0+CaANP4jwCUwHf/lcAMAJzADgCfwIUAnsCZAJ3AAAAAAAAAAAAAAAAAAAAAAAAAAAAAVao= 17 | ``` 18 | 19 | ## Supported language 20 | 21 | A fairly large subset is supported: global variables, functions, if statements, while statements, lots of operators, pointer dereference, inline machine-code, comments, etc. 22 | All of these features make it quite capable. 
23 | 24 | For example, the following program animates a moving sine-wave: 25 | 26 | ``` 27 | int y; 28 | int x; 29 | int x_0; 30 | void sin_positive_approx() 31 | { 32 | y = ( x_0 * ( 157 - x_0 ) ) >> 7; 33 | } 34 | void sin() 35 | { 36 | x_0 = x; 37 | while( x_0 > 314 ){ 38 | x_0 = x_0 - 314; 39 | } 40 | if( x_0 <= 157 ){ 41 | sin_positive_approx(); 42 | } 43 | if( x_0 > 157 ){ 44 | x_0 = x_0 - 157; 45 | sin_positive_approx(); 46 | y = 0 - y; 47 | } 48 | y = 100 + y; 49 | } 50 | 51 | 52 | int offset; 53 | int x_end; 54 | void draw_sine_wave() 55 | { 56 | x = offset; 57 | x_end = x + 314; 58 | while( x <= x_end ){ 59 | sin(); 60 | pixel_x = x - offset; 61 | pixel_y = y; 62 | vga_set_pixel(); 63 | x = x + 1; 64 | } 65 | } 66 | 67 | int v_1; 68 | int v_2; 69 | void delay() 70 | { 71 | v_1 = 0; 72 | while( v_1 < 50 ){ 73 | v_2 = 0; 74 | while( v_2 < 10000 ){ 75 | v_2 = v_2 + 1; 76 | } 77 | v_1 = v_1 + 1; 78 | } 79 | } 80 | 81 | void main() 82 | { 83 | vga_init(); 84 | 85 | offset = 0; 86 | while( 1 ){ 87 | vga_clear(); 88 | draw_sine_wave(); 89 | 90 | delay(); 91 | offset = offset + 1; 92 | if( offset >= 314 ){ // mod the value to avoid 2^16 integer overflow 93 | offset = offset - 314; 94 | } 95 | } 96 | } 97 | ``` 98 | 99 | ### Screenshot 100 | 101 | ![Moving Sinwave](img/sinwave.png) 102 | 103 | ## Provided Example Code 104 | 105 | A few examples are provided that leverage the unique hardware aspects of the x86-16 IBM PC: 106 | - `examples/hello.c:` Print a text greeting on the screen writing to memory at 0xB8000 107 | - `examples/sinwave.c:` Draw a moving sine wave animation with VGA Mode 0x13 using an appropriately bad approximation of sin(x) 108 | - `examples/twinkle.c:` Play “Twinkle Twinkle Little Star” through the PC Speaker (Warning: LOUD) 109 | 110 | ## Grammar 111 | 112 | The following grammar is accepted and compiled by sectorc: 113 | 114 | ``` 115 | program = (var_decl | func_decl)+ 116 | var_decl = "int" identifier ";" 117 | func_decl = "void" func_name 
"{" statement* "}" 118 | func_name = 119 | statement = "if(" expr "){" statement* "}" 120 | | "while(" expr "){" statement* "}" 121 | | "asm" integer ";" 122 | | func_name ";" 123 | | assign_expr ";" 124 | assign_expr = deref? identifier "=" expr 125 | deref = "*(int*)" 126 | expr = unary (op unary)? 127 | unary = deref identifier 128 | | "&" identifier 129 | | "(" expr ")" 130 | | identifier 131 | | integer 132 | op = "+" | "-" | "&" | "|" | "^" | "<<" | ">>" 133 | | "==" | "!=" | "<" | ">" | "<=" | ">=" 134 | ``` 135 | 136 | In addition, both `// comment` and `/* multi-line comment */` styles are supported. 137 | 138 | (NOTE: This grammar is 704 bytes in ascii, 38% larger than its implementation!) 139 | 140 | ## How? 141 | 142 | See blog post: [SectorC: A C Compiler in 512 bytes](https://xorvoid.com/sectorc.html) 143 | 144 | ## Why? 145 | 146 | In 2020, cesarblum wrote a Forth that fits in a bootsector: ([sectorforth](https://github.com/cesarblum/sectorforth)) 147 | 148 | In 2021, jart et. al. wrote a Lisp that fits in the bootsector: ([sectorlisp](https://github.com/jart/sectorlisp)) 149 | 150 | Naturally, C always needs to come and crash (literally) every low-level systems party regardless of whether it was even invited. 151 | 152 | ## Running 153 | 154 | Dependencies: 155 | - `nasm` for assembling (I used v2.16.01) 156 | - `qemu-system-i386` for emulating x86-16 (I used v8.0.0) 157 | 158 | Build: `./build.sh` 159 | 160 | Run: `./run.sh your_source.c` 161 | 162 | NOTE: Tested only on a MacBook M1 163 | 164 | ## What is this useful for? 165 | 166 | Probably Nothing. 167 | 168 | Or at least that's what I thought when starting out. But, I didn't think I'd get such a feature set. Now, I'd say that it **might** be 169 | useful for someone that wants to explore x86-16 bios functions and machine model w/o having to learn lots of x86 assembly first. But, then again, you 170 | should just use a proper C compiler and write a tiny bootloader to execute it. 
171 | -------------------------------------------------------------------------------- /build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e 3 | THISDIR=$(dirname $(realpath $0)) 4 | cd $THISDIR 5 | 6 | SRC=sectorc.s 7 | BIN=build/sectorc.bin 8 | 9 | ## output dir for build artifacts 10 | mkdir -p build 11 | 12 | ## assemble sectorc 13 | nasm -f bin -o $BIN $SRC 14 | 15 | ## build a helpful linter 16 | gcc -std=c11 -Wall -Werror -O2 -g -o build/lint lint/lint.c 17 | -------------------------------------------------------------------------------- /dis.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e 3 | objdump -D -b binary -m i386 -Maddr16,data16 -M intel "$1" 4 | -------------------------------------------------------------------------------- /examples/hello.c: -------------------------------------------------------------------------------- 1 | int buf; 2 | int ptr; 3 | int len; 4 | void vga_write() 5 | { 6 | /* Text vga is located at b800:0000 */ 7 | store_far_seg = 47104; // segment: 0xb800 8 | store_far_off = idx << 1; 9 | store_far_val = ( 15 << 8 ) | ( ch & 255 ); // white fg and black bg 10 | store_far(); 11 | } 12 | 13 | int x_off; 14 | int y_off; 15 | void vga_write_ch() 16 | { 17 | if( ch != 10 ){ 18 | idx = y_off + x_off; 19 | vga_write(); 20 | x_off = x_off + 1; 21 | } 22 | if( ( ch == 10 ) | ( x_off == 80 ) ){ 23 | y_off = y_off + 80; 24 | x_off = 0; 25 | } 26 | } 27 | 28 | int idx; 29 | void vga_clear() 30 | { 31 | idx = 0; 32 | while( idx < 2000 ){ // 80x25 33 | ch = 32; // char: ' ' 34 | vga_write(); 35 | idx = idx + 1; 36 | } 37 | pos = 0; 38 | } 39 | 40 | void main() 41 | { 42 | // dump_code_segment_and_shutdown(); 43 | 44 | vga_clear(); 45 | 46 | ch = 72; vga_write_ch(); 47 | ch = 101; vga_write_ch(); 48 | ch = 108; vga_write_ch(); 49 | ch = 108; vga_write_ch(); 50 | ch = 111; vga_write_ch(); 51 | ch = 10; 
vga_write_ch(); 52 | ch = 32; vga_write_ch(); 53 | ch = 102; vga_write_ch(); 54 | ch = 114; vga_write_ch(); 55 | ch = 111; vga_write_ch(); 56 | ch = 109; vga_write_ch(); 57 | ch = 10; vga_write_ch(); 58 | ch = 32; vga_write_ch(); 59 | ch = 32; vga_write_ch(); 60 | ch = 83; vga_write_ch(); 61 | ch = 101; vga_write_ch(); 62 | ch = 99; vga_write_ch(); 63 | ch = 116; vga_write_ch(); 64 | ch = 111; vga_write_ch(); 65 | ch = 114; vga_write_ch(); 66 | ch = 67; vga_write_ch(); 67 | ch = 10; vga_write_ch(); 68 | ch = 32; vga_write_ch(); 69 | ch = 32; vga_write_ch(); 70 | ch = 32; vga_write_ch(); 71 | 72 | i = 0; 73 | while( i < 10 ){ 74 | ch = 33; vga_write_ch(); 75 | i = i + 1; 76 | } 77 | 78 | while( 1 ){ } 79 | } 80 | -------------------------------------------------------------------------------- /examples/sinwave.c: -------------------------------------------------------------------------------- 1 | /* A Sine-wave Animation 2 | 3 | Math time: 4 | --------------------------- 5 | Along the range [0, pi] we can approximate sin(x) very crudely with a 2nd order quadratic 6 | That is: y = a * x^2 + b * x + c 7 | 8 | Three unknowns need three constraints, so picking the easy ones: 9 | x = 0, y = 0 10 | x = pi/2, y = 1 11 | x = pi, y = 0 12 | 13 | Solving the linear system: 14 | 15 | | 0 0 1 | | a | | 0 | 16 | | pi^2/4 pi/2 1 | * | b | = | 1 | 17 | | pi^2 pi 1 | | c | | 0 | 18 | 19 | We get: 20 | 21 | a = -4 / pi^2 22 | b = 4 / pi 23 | c = 0 24 | 25 | And: 26 | 27 | y = 4x(pi - x)/(pi^2) 28 | 29 | Engineering time: 30 | --------------------------- 31 | We are working with a 320x200 vga. We also don't have floating-point math. So, the 32 | goal here is to do all the math in integer screen coordinates and accept some pixel 33 | approximation error. 
34 | 35 | First, we want to center the wave in the middle, y = 100 36 | We'll let y vary +-50 pixels to remain on the screen, so [50, 150] 37 | We want to show an entire cycle (2pi) on the x-axis, so *50 gives us [0, ~314] 38 | This implies that the "x-origin" is at x = 157 39 | 40 | Substituting in everything, we get: 41 | 42 | y ~= 100 + x*(157 - x)/125 43 | 44 | The division by 125 is problematic as we don't have division. But luckily 128 is close enough. 45 | 46 | Thus, we get: 47 | 48 | y ~= 100 + (x*(157 - x)) >> 7 49 | 50 | The rest is just adjusting for the [0, pi] range reduction by negating the approximation 51 | along [pi, 2pi] 52 | 53 | NOTE: the screen coordinate system is upside-down and I don't bother to correct for that. 54 | it simply means that the animation starts at a +pi phase offset 55 | */ 56 | 57 | int y; 58 | int x; 59 | int x_0; 60 | void sin_positive_approx() 61 | { 62 | y = ( x_0 * ( 157 - x_0 ) ) >> 7; 63 | } 64 | void sin() 65 | { 66 | x_0 = x; 67 | while( x_0 > 314 ){ 68 | x_0 = x_0 - 314; 69 | } 70 | if( x_0 <= 157 ){ 71 | sin_positive_approx(); 72 | } 73 | if( x_0 > 157 ){ 74 | x_0 = x_0 - 157; 75 | sin_positive_approx(); 76 | y = 0 - y; 77 | } 78 | y = 100 + y; 79 | } 80 | 81 | 82 | int offset; 83 | int x_end; 84 | void draw_sine_wave() 85 | { 86 | x = offset; 87 | x_end = x + 314; 88 | while( x <= x_end ){ 89 | sin(); 90 | pixel_x = x - offset; 91 | pixel_y = y; 92 | vga_set_pixel(); 93 | x = x + 1; 94 | } 95 | } 96 | 97 | int v_1; 98 | int v_2; 99 | void delay() 100 | { 101 | v_1 = 0; 102 | while( v_1 < 50 ){ 103 | v_2 = 0; 104 | while( v_2 < 10000 ){ 105 | v_2 = v_2 + 1; 106 | } 107 | v_1 = v_1 + 1; 108 | } 109 | } 110 | 111 | void main() 112 | { 113 | vga_init(); 114 | 115 | offset = 0; 116 | while( 1 ){ 117 | vga_clear(); 118 | draw_sine_wave(); 119 | 120 | delay(); 121 | offset = offset + 1; 122 | if( offset >= 314 ){ // mod the value to avoid 2^16 integer overflow 123 | offset = offset - 314; 124 | } 125 | } 126 | } 127 | 
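The integer arithmetic derived in the comment of `examples/sinwave.c` can be checked on a host machine. The following is a minimal sketch in standard C rather than the SectorC dialect. The function `sin_pixel` and its argument-passing style are illustrative assumptions, since the example itself communicates through globals.

```c
/* Half-wave approximation from the example; intended for 0 <= x0 <= 157. */
static int sin_positive_approx(int x0)
{
    return (x0 * (157 - x0)) >> 7;   /* >> 7 stands in for the division by ~125 */
}

/* x is in screen units where 314 units ~ 2*pi; returns the pixel row. */
static int sin_pixel(int x)
{
    int x0 = x;
    int y;

    while (x0 > 314) {               /* range-reduce to one cycle */
        x0 = x0 - 314;
    }
    if (x0 <= 157) {
        y = sin_positive_approx(x0);           /* first half-wave */
    } else {
        y = 0 - sin_positive_approx(x0 - 157); /* second half-wave, negated */
    }
    return 100 + y;                  /* center the wave at row 100 */
}
```

The half-wave peaks at 48 pixels (78 * 79 = 6162, and 6162 >> 7 = 48), so the curve stays inside the intended [50, 150] band, and the largest intermediate product fits comfortably in a 16-bit signed integer.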
-------------------------------------------------------------------------------- /examples/twinkle.c: -------------------------------------------------------------------------------- 1 | /* References: 2 | http://muruganad.com/8086/8086-assembly-language-program-to-play-sound-using-pc-speaker.html 3 | https://en.wikipedia.org/wiki/Twinkle,_Twinkle,_Little_Star 4 | */ 5 | 6 | void delay_1() 7 | { 8 | v_1 = 0; 9 | while( v_1 < 4000 ){ 10 | v_2 = 0; 11 | while( v_2 < 10000 ){ 12 | v_2 = v_2 + 1; 13 | } 14 | v_1 = v_1 + 1; 15 | } 16 | } 17 | 18 | void delay_2() 19 | { 20 | v_1 = 0; 21 | while( v_1 < 300 ){ 22 | v_2 = 0; 23 | while( v_2 < 10000 ){ 24 | v_2 = v_2 + 1; 25 | } 26 | v_1 = v_1 + 1; 27 | } 28 | } 29 | 30 | void audio_init() 31 | { 32 | // Configure PIT channel 2 mode 33 | port_num = 67; 34 | port_val = 182; 35 | port_outb(); 36 | } 37 | 38 | void audio_enable() 39 | { 40 | // Set bits 0 and 1 to enable 41 | port_num = 97; 42 | port_inb(); 43 | port_val = port_val | 3; 44 | port_outb(); 45 | } 46 | 47 | void audio_disable() 48 | { 49 | // Clear bits 0 and 1 to disable 50 | port_num = 97; 51 | port_inb(); 52 | port_val = port_val & 65532; 53 | port_outb(); 54 | } 55 | 56 | int audio_freq; 57 | void audio_freq_set() 58 | { 59 | // Set frequency 60 | port_num = 66; 61 | port_val = audio_freq & 255; 62 | port_outb(); 63 | port_val = ( audio_freq >> 8 ) & 255; 64 | port_outb(); 65 | } 66 | 67 | int note; 68 | void play_quarter_note() 69 | { 70 | audio_freq = note; 71 | audio_freq_set(); 72 | audio_enable(); 73 | delay_1(); 74 | audio_disable(); 75 | delay_2(); 76 | } 77 | void play_half_note() 78 | { 79 | audio_freq = note; 80 | audio_freq_set(); 81 | audio_enable(); 82 | delay_1(); 83 | delay_1(); 84 | audio_disable(); 85 | delay_2(); 86 | } 87 | 88 | void play_section_1() 89 | { 90 | note = C; play_quarter_note(); 91 | note = C; play_quarter_note(); 92 | note = G; play_quarter_note(); 93 | note = G; play_quarter_note(); 94 | note = A; play_quarter_note(); 95 | note = A; 
play_quarter_note(); 96 | note = G; play_half_note(); 97 | 98 | note = F; play_quarter_note(); 99 | note = F; play_quarter_note(); 100 | note = E; play_quarter_note(); 101 | note = E; play_quarter_note(); 102 | note = D; play_quarter_note(); 103 | note = D; play_quarter_note(); 104 | note = C; play_half_note(); 105 | } 106 | 107 | void play_section_2() 108 | { 109 | note = G; play_quarter_note(); 110 | note = G; play_quarter_note(); 111 | note = F; play_quarter_note(); 112 | note = F; play_quarter_note(); 113 | note = E; play_quarter_note(); 114 | note = E; play_quarter_note(); 115 | note = D; play_half_note(); 116 | 117 | note = G; play_quarter_note(); 118 | note = G; play_quarter_note(); 119 | note = F; play_quarter_note(); 120 | note = F; play_quarter_note(); 121 | note = E; play_quarter_note(); 122 | note = E; play_quarter_note(); 123 | note = D; play_half_note(); 124 | } 125 | 126 | void main() 127 | { 128 | audio_init(); 129 | audio_enable(); 130 | 131 | C = 4560; 132 | D = 4063; 133 | E = 3619; 134 | F = 3416; 135 | G = 3043; 136 | A = 2711; 137 | 138 | play_section_1(); 139 | play_section_2(); 140 | play_section_1(); 141 | 142 | audio_disable(); 143 | } 144 | -------------------------------------------------------------------------------- /img/sinwave.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xorvoid/sectorc/bc6218b750415f69dceb139b9f8f02d57926d7dd/img/sinwave.png -------------------------------------------------------------------------------- /lint/lint.c: -------------------------------------------------------------------------------- 1 | #include <stdbool.h> 2 | #include <stdint.h> 3 | #include <stdio.h> 4 | #include <stdlib.h> 5 | #include <string.h> 6 | 7 | typedef uint8_t u8; 8 | typedef uint16_t u16; 9 | typedef uint32_t u32; 10 | typedef uint64_t u64; 11 | typedef int8_t i8; 12 | typedef int16_t i16; 13 | typedef int32_t i32; 14 | typedef int64_t i64; 15 | 16 | #define error(msg, ...) 
do { \ 17 | fprintf(stderr, "Fatal Error: "); \ 18 | fprintf(stderr, msg, ##__VA_ARGS__); \ 19 | fprintf(stderr, ": (%zu, %zu) at %s:%d\n", tok->line_num, tok->line_off, __FUNCTION__, __LINE__); \ 20 | fprintf(stderr, "\n"); \ 21 | tok_report_line_error();\ 22 | exit(1); \ 23 | } while(0) 24 | 25 | #define TOK_TYPE_EOF 0 26 | #define TOK_TYPE_NUM 1 27 | #define TOK_TYPE_SYM 2 28 | #define TOK_TYPE_FUNC 3 29 | 30 | #define TOKENS(_)\ 31 | /* enum-symbol token-literal token-value */\ 32 | _( TOK_INT, "int", 6388 )\ 33 | _( TOK_VOID, "void", 11386 )\ 34 | _( TOK_ASM, "asm", 5631 )\ 35 | _( TOK_START, "_start()", 33977 )\ 36 | _( TOK_SEMI, ";", 11 )\ 37 | _( TOK_DEREF, "*(int*)", 64653 )\ 38 | _( TOK_WHILE_BEGIN, "while(", 55810 )\ 39 | _( TOK_IF_BEGIN, "if(", 6232 )\ 40 | _( TOK_BODY_BEGIN, "){", 5 )\ 41 | _( TOK_LPAREN, "(", 65528 )\ 42 | _( TOK_RPAREN, ")", 65529 )\ 43 | _( TOK_BLK_BEGIN, "{", 75 )\ 44 | _( TOK_BLK_END, "}", 77 )\ 45 | _( TOK_ASSIGN, "=", 13 )\ 46 | _( TOK_ADDR, "&", 65526 )\ 47 | _( TOK_SUB, "-", 65533 )\ 48 | _( TOK_ADD, "+", 65531 )\ 49 | _( TOK_MUL, "*", 65530 )\ 50 | _( TOK_OR, "|", 76 )\ 51 | _( TOK_XOR, "^", 46 )\ 52 | _( TOK_SHL, "<<", 132 )\ 53 | _( TOK_SHR, ">>", 154 )\ 54 | _( TOK_EQ, "==", 143 )\ 55 | _( TOK_NE, "!=", 65399 )\ 56 | _( TOK_LT, "<", 12 )\ 57 | _( TOK_GT, ">", 14 )\ 58 | _( TOK_LE, "<=", 133 )\ 59 | _( TOK_GE, ">=", 153 )\ 60 | 61 | enum { 62 | #define ELT(enum_sym, _2, token_val) enum_sym = ((u16)token_val), 63 | TOKENS(ELT) 64 | #undef ELT 65 | }; 66 | 67 | static const char *token_str(int val) 68 | { 69 | switch (val) { 70 | #define ELT(enum_sym, str, _3) case enum_sym: return str; 71 | TOKENS(ELT) 72 | #undef ELT 73 | } 74 | return NULL; 75 | } 76 | 77 | static bool token_is_kw(int val) 78 | { 79 | switch (val) { 80 | #define ELT(enum_sym, str, _3) case enum_sym: return true; 81 | TOKENS(ELT) 82 | #undef ELT 83 | } 84 | return false; 85 | } 86 | 87 | typedef struct token token_t; 88 | struct token 89 | { 90 | int type; 
91 | u16 val; 92 | 93 | const char * text; 94 | size_t len; 95 | 96 | const char * line_start; 97 | size_t line_num; 98 | size_t line_off; 99 | }; 100 | 101 | #define TOK_FMT "'%.*s'" 102 | #define TOK_ARG(t) (int)((t)->len), (t)->text 103 | 104 | static char * input_buf = NULL; 105 | static char * input_ptr = NULL; 106 | static size_t input_len = 0; 107 | static token_t tok[1]; 108 | 109 | static void input_append_source_file(const char *path) 110 | { 111 | FILE *fp = fopen(path, "r"); 112 | if (!fp) { 113 | fprintf(stderr, "Failed to open file: %s\n", path); 114 | exit(1); 115 | } 116 | 117 | fseek(fp, 0, SEEK_END); 118 | size_t file_sz = ftell(fp); 119 | fseek(fp, 0, SEEK_SET); 120 | 121 | size_t old_sz = input_len; 122 | size_t new_sz = old_sz + file_sz; 123 | 124 | input_buf = realloc(input_buf, new_sz + 1); // +1: the tokenizer relies on a NUL terminator for EOF 125 | if (!input_buf) { 126 | fprintf(stderr, "Failed to allocate buffer for source text\n"); 127 | exit(2); 128 | } 129 | 130 | size_t n = fread(input_buf + input_len, 1, file_sz, fp); 131 | if (n != file_sz) { 132 | fprintf(stderr, "Failed to read source file completely\n"); 133 | exit(3); 134 | } 135 | 136 | fclose(fp); 137 | 138 | input_ptr = input_buf; 139 | input_len = new_sz; input_buf[input_len] = '\0'; 140 | tok->line_start = input_buf; 141 | } 142 | 143 | static void tok_report_line_error(void) 144 | { 145 | size_t line_len; 146 | char *s = strchr(tok->line_start, '\n'); 147 | if (s) { 148 | line_len = s - tok->line_start; 149 | } else { 150 | line_len = strlen(tok->line_start); 151 | } 152 | 153 | fprintf(stderr, "%.*s\n", (int)line_len, tok->line_start); 154 | fprintf(stderr, "%*s^\n", (int)(tok->line_off - tok->len), ""); 155 | } 156 | 157 | __attribute__((unused)) void tok_print() 158 | { 159 | fprintf(stderr, "tok (%zu:%zu) | type %d | text '%.*s'\n", tok->line_num, tok->line_off, tok->type, (int)tok->len, tok->text); 160 | } 161 | 162 | int next_is_semi = 0; 163 | 164 | char tok_char_get() 165 | { 166 | if (next_is_semi) return ' '; 167 | else return *input_ptr; 168 | } 
169 | 170 | void tok_char_next() 171 | { 172 | if (next_is_semi) { 173 | next_is_semi = 0; 174 | return; 175 | } 176 | 177 | char last = *input_ptr; 178 | if (last == '\n') { 179 | tok->line_start = input_ptr + 1; 180 | tok->line_num++; 181 | tok->line_off = 0; 182 | } else { 183 | tok->line_off++; 184 | } 185 | input_ptr++; 186 | 187 | if (*input_ptr == ';') { 188 | next_is_semi = 1; 189 | } 190 | } 191 | 192 | static bool tok_lint_number(const char *text, size_t len) 193 | { 194 | for (size_t i = 0; i < len; i++) { 195 | char c = text[i]; 196 | if (!('0' <= c && c <= '9')) { 197 | return false; 198 | } 199 | } 200 | return true; 201 | } 202 | 203 | static bool tok_lint_ident(const char *text, size_t len) 204 | { 205 | for (size_t i = 0; i < len; i++) { 206 | char c = text[i]; 207 | if (!(c == '_' || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') || (i != 0 && '0' <= c && c <= '9'))) { 208 | return false; 209 | } 210 | } 211 | return true; 212 | } 213 | 214 | static bool tok_lint_func_name(const char *text, size_t len) 215 | { 216 | if (len <= 2 || 0 != memcmp(&text[len-2], "()", 2)) { 217 | return false; 218 | } 219 | return tok_lint_ident(text, len-2); 220 | } 221 | 222 | static void tok_lint_matches(const char *str) 223 | { 224 | size_t len = strlen(str); 225 | if (tok->len != len || 0 != memcmp(tok->text, str, len)) { 226 | error("token (%u) collision between " TOK_FMT " and '%s'", tok->val, TOK_ARG(tok), str); 227 | } 228 | } 229 | 230 | static void tok_lint() 231 | { 232 | switch (tok->type) { 233 | case TOK_TYPE_EOF: break; 234 | case TOK_TYPE_NUM: { 235 | if (!tok_lint_number(tok->text, tok->len)) { 236 | error("token (%u) parsed as int, but it is not: " TOK_FMT, tok->val, TOK_ARG(tok)); 237 | } 238 | } break; 239 | case TOK_TYPE_SYM: { 240 | switch (tok->val) { 241 | #define ELT(enum_sym, str, _3) case enum_sym: tok_lint_matches(str); return; 242 | TOKENS(ELT) 243 | #undef ELT 244 | } 245 | // If we've reached this point, it's not a special token so it 
must be an ordinary variable identifier 246 | if (!tok_lint_ident(tok->text, tok->len)) { 247 | error("token (%u) parsed as variable identifier, but it is not: " TOK_FMT, tok->val, TOK_ARG(tok)); 248 | } 249 | } break; 250 | case TOK_TYPE_FUNC: { 251 | if (!tok_lint_func_name(tok->text, tok->len)) { 252 | error("token (%u) parsed as func name, but it is not: " TOK_FMT, tok->val, TOK_ARG(tok)); 253 | } 254 | } break; 255 | } 256 | } 257 | 258 | static void tok_next(void) 259 | { 260 | while (1) { 261 | char c = tok_char_get(); 262 | 263 | // Handle EOF 264 | if (c == 0) { 265 | tok->type = TOK_TYPE_EOF; 266 | tok->len = 0; 267 | return; 268 | } 269 | 270 | // Skip whitespace 271 | if (c <= ' ') { 272 | tok_char_next(); 273 | continue; 274 | } 275 | 276 | tok->type = ('0' <= c && c <= '9') ? TOK_TYPE_NUM : TOK_TYPE_SYM; 277 | tok->text = input_ptr; 278 | tok->len = 1; 279 | tok->val = c - '0'; 280 | 281 | bool slash = c == '/'; 282 | tok_char_next(); 283 | c = tok_char_get(); 284 | 285 | // single-line comment? 286 | if (slash && c == '/') { 287 | tok_char_next(); 288 | c = tok_char_get(); 289 | if (c > ' ') error("expected space after // comment"); 290 | while (1) { 291 | if (c == '\n') break; 292 | tok_char_next(); 293 | c = tok_char_get(); 294 | } 295 | continue; // try tok_next() again 296 | } 297 | 298 | // multi-line comment? 299 | if (slash && c == '*') { 300 | tok_char_next(); 301 | c = tok_char_get(); 302 | if (c > ' ') error("expected space after /* comment"); 303 | 304 | // ending "*/" must be preceded by a space 305 | int last_space_off = 0; 306 | bool last_is_asterisk = false; 307 | while (1) { 308 | tok_char_next(); 309 | c = tok_char_get(); 310 | if (last_is_asterisk && c == '/') { 311 | if (last_space_off != 1) error("expected space before */ comment terminator"); 312 | /* Found, done! 
*/ 313 | break; 314 | } 315 | last_space_off++; 316 | last_is_asterisk = c == '*'; 317 | if (c <= ' ') { 318 | last_space_off = 0; 319 | } 320 | } 321 | tok_char_next(); 322 | continue; 323 | } 324 | 325 | // normal token 326 | while (c > ' ') { 327 | tok->len++; 328 | tok->val = 10 * tok->val + c - '0'; 329 | tok_char_next(); 330 | c = tok_char_get(); 331 | } 332 | 333 | // special logic to detect function names 334 | if (tok->len > 2 && 0 == memcmp(&tok->text[tok->len-2], "()", 2)) { 335 | tok->type = TOK_TYPE_FUNC; 336 | } 337 | 338 | // lint the token to detect errors 339 | tok_lint(); 340 | return; 341 | } 342 | } 343 | 344 | static bool tok_num_is(void) 345 | { 346 | return tok->type == TOK_TYPE_NUM; 347 | } 348 | 349 | static void tok_num_expect(void) 350 | { 351 | if (!tok_num_is()) { 352 | error("expected num token"); 353 | } 354 | tok_next(); 355 | } 356 | 357 | static bool tok_kw_is(u16 val) 358 | { 359 | return 360 | tok->type == TOK_TYPE_SYM && 361 | tok->val == val && 362 | token_is_kw(val); 363 | } 364 | 365 | static void tok_kw_expect(u16 val) 366 | { 367 | if (!tok_kw_is(val)) { 368 | error("expected keyword token '%s' (%u)", token_str(val), val); 369 | } 370 | tok_next(); 371 | } 372 | 373 | static bool tok_ident_is(void) 374 | { 375 | return 376 | tok->type == TOK_TYPE_SYM && 377 | !token_is_kw(tok->val) && 378 | tok_lint_ident(tok->text, tok->len); 379 | } 380 | 381 | static void tok_ident_expect(void) 382 | { 383 | if (!tok_ident_is()) { 384 | error("expected identifier token"); 385 | } 386 | tok_next(); 387 | } 388 | 389 | static bool tok_func_is(void) 390 | { 391 | return tok->type == TOK_TYPE_FUNC; 392 | } 393 | 394 | static void tok_func_expect(void) 395 | { 396 | if (!tok_func_is()) { 397 | error("expected func token"); 398 | } 399 | tok_next(); 400 | } 401 | 402 | static bool tok_oper_is(void) 403 | { 404 | return 405 | tok_kw_is(TOK_ADD) || 406 | tok_kw_is(TOK_SUB) || 407 | tok_kw_is(TOK_MUL) || 408 | tok_kw_is(TOK_ADDR) || // "AND" in 
this context 409 | tok_kw_is(TOK_OR) || 410 | tok_kw_is(TOK_XOR) || 411 | tok_kw_is(TOK_SHL) || 412 | tok_kw_is(TOK_SHR) || 413 | tok_kw_is(TOK_EQ) || 414 | tok_kw_is(TOK_NE) || 415 | tok_kw_is(TOK_LT) || 416 | tok_kw_is(TOK_GT) || 417 | tok_kw_is(TOK_LE) || 418 | tok_kw_is(TOK_GE); 419 | } 420 | 421 | // fwd decl needed for paren grouping recursion 422 | static void parse_expr(void); 423 | 424 | // unary = deref identifier 425 | // | "&" identifier 426 | // | "(" expr ")" 427 | // | identifier 428 | // | integer 429 | static void parse_unary(void) 430 | { 431 | if (tok_kw_is(TOK_DEREF)) { 432 | tok_next(); // consume TOK_DEREF 433 | tok_ident_expect(); 434 | } 435 | 436 | else if (tok_kw_is(TOK_ADDR)) { 437 | tok_next(); // consume TOK_ADDR 438 | tok_ident_expect(); 439 | } 440 | 441 | else if (tok_kw_is(TOK_LPAREN)) { 442 | tok_next(); // consume TOK_LPAREN 443 | parse_expr(); 444 | tok_kw_expect(TOK_RPAREN); 445 | } 446 | 447 | else if (tok_ident_is()) { 448 | tok_next(); // consume identifier 449 | } 450 | 451 | else if (tok_num_is()) { 452 | tok_next(); // consume number 453 | } 454 | 455 | else { 456 | error("expected unary expression"); 457 | } 458 | } 459 | 460 | // expr = unary (op unary)? 461 | static void parse_expr(void) 462 | { 463 | parse_unary(); 464 | if (tok_oper_is()) { 465 | tok_next(); // consume the operator 466 | parse_unary(); 467 | } 468 | } 469 | 470 | // assign_expr = deref? 
identifier "=" expr 471 | static void parse_assign_expr(void) 472 | { 473 | // optionally we have a deref token 474 | if (tok_kw_is(TOK_DEREF)) { 475 | tok_next(); 476 | } 477 | 478 | tok_ident_expect(); 479 | tok_kw_expect(TOK_ASSIGN); 480 | parse_expr(); 481 | } 482 | 483 | // statement = "if(" expr "){" statement* "}" 484 | // | "while(" expr "){" statement* "}" 485 | // | "asm" integer ";" 486 | // | func_name ";" 487 | // | assign_expr ";" 488 | static void parse_statement(void) 489 | { 490 | if (tok_kw_is(TOK_IF_BEGIN)) { 491 | tok_next(); // consume TOK_IF_BEGIN 492 | parse_expr(); 493 | tok_kw_expect(TOK_BODY_BEGIN); 494 | while (!tok_kw_is(TOK_BLK_END)) { 495 | parse_statement(); 496 | } 497 | tok_next(); // consume TOK_BLK_END 498 | } 499 | 500 | else if (tok_kw_is(TOK_WHILE_BEGIN)) { 501 | tok_next(); // consume TOK_WHILE_BEGIN 502 | parse_expr(); 503 | tok_kw_expect(TOK_BODY_BEGIN); 504 | while (!tok_kw_is(TOK_BLK_END)) { 505 | parse_statement(); 506 | } 507 | tok_next(); // consume TOK_BLK_END 508 | } 509 | 510 | else if (tok_kw_is(TOK_ASM)) { 511 | tok_next(); // consume TOK_ASM 512 | tok_num_expect(); 513 | tok_kw_expect(TOK_SEMI); 514 | } 515 | 516 | else if (tok_func_is()) { 517 | tok_next(); // consume func name 518 | // TODO: validate against a symbol table 519 | tok_kw_expect(TOK_SEMI); 520 | } 521 | 522 | else { // default case: an assignment expression 523 | parse_assign_expr(); 524 | tok_kw_expect(TOK_SEMI); 525 | } 526 | } 527 | 528 | // var_decl = "int" identifier ";" 529 | static void parse_var_decl(void) 530 | { 531 | tok_kw_expect(TOK_INT); 532 | tok_ident_expect(); 533 | // TODO: Build and check a symbol table of globals 534 | tok_kw_expect(TOK_SEMI); 535 | } 536 | 537 | // func_decl = "void" func_name "{" statement* "}" 538 | static void parse_func_decl(void) 539 | { 540 | tok_kw_expect(TOK_VOID); 541 | tok_func_expect(); 542 | // TODO: Build and check a symbol table of functions 543 | tok_kw_expect(TOK_BLK_BEGIN); 544 | while 
(!tok_kw_is(TOK_BLK_END)) { 545 | parse_statement(); 546 | } 547 | tok_kw_expect(TOK_BLK_END); 548 | } 549 | 550 | // program = (var_decl | func_decl)+ 551 | static void parse_program(void) 552 | { 553 | while (tok->type != TOK_TYPE_EOF) { 554 | if (tok_kw_is(TOK_INT)) { 555 | parse_var_decl(); 556 | } 557 | else if (tok_kw_is(TOK_VOID)) { 558 | parse_func_decl(); 559 | } 560 | else { 561 | error("expected var decl or func decl"); 562 | } 563 | } 564 | } 565 | 566 | int main(int argc, char *argv[]) 567 | { 568 | if (argc < 2) { 569 | fprintf(stderr, "usage: %s ...\n", argv[0]); 570 | return 1; 571 | } 572 | 573 | for (size_t i = 1; i < (size_t)argc; i++) { 574 | input_append_source_file(argv[i]); 575 | } 576 | 577 | tok_next(); 578 | parse_program(); 579 | 580 | free(input_buf); 581 | return 0; 582 | } 583 | -------------------------------------------------------------------------------- /rt/_start.c: -------------------------------------------------------------------------------- 1 | 2 | void _start() 3 | { 4 | main(); 5 | shutdown(); 6 | } 7 | -------------------------------------------------------------------------------- /rt/lib.c: -------------------------------------------------------------------------------- 1 | int tmp1; 2 | int tmp2; 3 | 4 | void shutdown() 5 | { 6 | /* Shutdown via APM: coded in asm machine code directly */ 7 | 8 | // Check for APM 9 | // | mov ah,0x53; mov al,0x00; xor bx,bx; int 0x15; jc error 10 | asm 180; asm 83; asm 176; asm 0; asm 49; asm 219; 11 | asm 205; asm 21; asm 114; asm 55; 12 | 13 | // Disconnect from any APM interface 14 | // | mov ah,0x53; mov al,0x04; xor bx,bx; int 0x15 15 | // | jc maybe_error; jmp no_error 16 | asm 180; asm 83; asm 176; asm 4; asm 49; asm 219; 17 | asm 205; asm 21; asm 114; asm 2; asm 235; asm 5; 18 | 19 | // Label: maybe_error 20 | // | cmp ah,0x03; jne error 21 | asm 128; asm 252; asm 3; asm 117; asm 38; 22 | // Label: no_error 23 | 24 | // Connect to APM interface 25 | // | mov ah,0x53; mov 
al,0x01; xor bx,bx; int 0x15; jc error 26 | asm 180; asm 83; asm 176; asm 1; asm 49; asm 219; 27 | asm 205; asm 21; asm 114; asm 28; 28 | 29 | // Enable power management for all devices 30 | // | mov ah,0x53; mov al,0x08; mov bx,0x0001; mov cx,0x0001 31 | // | int 0x15; jc error 32 | asm 180; asm 83; asm 176; asm 8; 33 | asm 187; asm 1; asm 0; asm 185; asm 1; asm 0; 34 | asm 205; asm 21; asm 114; asm 14; 35 | 36 | // Set the power state for all devices 37 | // | mov ah,0x53; mov al,0x7; mov bx,0x0001; mov cx,0x0003 38 | // | int 0x15; jc error 39 | asm 180; asm 83; asm 176; asm 7; 40 | asm 187; asm 1; asm 0; asm 185; asm 3; asm 0; 41 | asm 205; asm 21; asm 114; asm 0; 42 | 43 | // Label: error 44 | // | hlt; jmp error 45 | asm 244; asm 235; asm 253; 46 | } 47 | 48 | int store_far_seg; 49 | int store_far_off; 50 | int store_far_val; 51 | void store_far() 52 | { 53 | // mov es, store_far_seg 54 | store_far_seg = store_far_seg; 55 | asm 142; asm 192; 56 | 57 | // mov si, store_far_off 58 | store_far_off = store_far_off; 59 | asm 137; asm 198; 60 | 61 | // mov es:[si], store_far_val 62 | store_far_val = store_far_val; 63 | asm 38; asm 137; asm 4; 64 | } 65 | 66 | int div10_unsigned_n; 67 | int div10_unsigned_q; 68 | int div10_unsigned_r; 69 | void div10_unsigned() 70 | { 71 | /* Taken from "Hacker's Delight", modified to "fit your screen" */ 72 | 73 | tmp1 = ( div10_unsigned_n >> 1 ) & 32767; // unsigned 74 | tmp2 = ( div10_unsigned_n >> 2 ) & 16383; // unsigned 75 | div10_unsigned_q = tmp1 + tmp2; 76 | 77 | tmp1 = ( div10_unsigned_q >> 4 ) & 4095; // unsigned 78 | div10_unsigned_q = div10_unsigned_q + tmp1; 79 | 80 | tmp1 = ( div10_unsigned_q >> 8 ) & 255; // unsigned 81 | div10_unsigned_q = div10_unsigned_q + tmp1; 82 | 83 | div10_unsigned_q = ( div10_unsigned_q >> 3 ) & 8191; // unsigned 84 | 85 | div10_unsigned_r = div10_unsigned_n 86 | - ( ( div10_unsigned_q << 3 ) + ( div10_unsigned_q << 1 ) ); 87 | 88 | if( div10_unsigned_r > 9 ){ 89 | div10_unsigned_q = 
div10_unsigned_q + 1; 90 | div10_unsigned_r = div10_unsigned_r - 10; 91 | } 92 | } 93 | 94 | int print_ch; 95 | void print_char() 96 | { 97 | /* Implement print char via serial port bios function accessed via int 0x14 */ 98 | 99 | print_ch = print_ch; // mov ax,[&print_ch] 100 | asm 180; asm 1; // mov ah,1 101 | asm 186; asm 0; asm 0 ; // mov dx,0 102 | asm 205; asm 20; // int 0x14 103 | } 104 | 105 | // uses 'print_ch' 106 | void print_newline() 107 | { 108 | print_ch = 10; 109 | print_char(); 110 | } 111 | 112 | int print_num; // input 113 | int print_u16_bufptr; 114 | int print_u16_cur; 115 | void print_u16() 116 | { 117 | print_u16_bufptr = 30000; // buffer for ascii digits 118 | 119 | if( print_num == 0 ){ 120 | print_ch = 48; 121 | print_char(); 122 | } 123 | 124 | print_u16_cur = print_num; 125 | while( print_u16_cur != 0 ){ 126 | div10_unsigned_n = print_u16_cur; 127 | div10_unsigned(); 128 | 129 | *(int*) print_u16_bufptr = div10_unsigned_r; 130 | print_u16_bufptr = print_u16_bufptr + 1; 131 | 132 | print_u16_cur = div10_unsigned_q; 133 | } 134 | 135 | while( print_u16_bufptr != 30000 ){ // emit them in reverse order 136 | print_u16_bufptr = print_u16_bufptr - 1; 137 | print_ch = ( *(int*) print_u16_bufptr & 255 ) + 48; 138 | print_char(); 139 | } 140 | } 141 | 142 | // uses 'print_num' and 'print_ch' 143 | void print_i16() 144 | { 145 | if( print_num < 0 ){ 146 | print_ch = 45; print_char(); // '-' 147 | print_num = 0 - print_num; 148 | } 149 | print_u16(); 150 | } 151 | 152 | void vga_init() 153 | { 154 | // mov ah,0; mov al,0x13; int 0x10 155 | asm 180; asm 0; asm 176; asm 19; asm 205; asm 16; 156 | } 157 | 158 | void vga_clear() 159 | { 160 | // push di; xor di,di; mov bx,0xa000; mov es,bx; 161 | // mov cx,0x7d00; xor ax,ax; rep stos; pop di 162 | asm 87 ; asm 49 ; asm 255; asm 187; asm 0; asm 160; 163 | asm 142; asm 195; asm 185; asm 0; asm 125; asm 49; 164 | asm 192; asm 243; asm 171; asm 95; 165 | } 166 | 167 | int pixel_x; 168 | int pixel_y; 169 | 
void vga_set_pixel() 170 | { 171 | // need to multiply pixel_y by 320 = 256 + 64 172 | // use 'tmp1' for pixel index 173 | tmp1 = ( ( pixel_y << 8 ) + ( pixel_y << 6 ) ) + pixel_x; 174 | 175 | // store to 0xa000:pixel_idx 176 | // mov bx,0xa000; mov es,bx; mov bx,ax; mov BYTE PTR es:[bx],0xf 177 | tmp1 = tmp1; 178 | asm 187; asm 0; asm 160; asm 142; asm 195; 179 | asm 137; asm 195; asm 38; asm 198; asm 7; asm 15; 180 | } 181 | 182 | int port_num; 183 | int port_val; 184 | void port_inb() 185 | { 186 | dx = port_num; 187 | // mov dx,WORD PTR [0x4a0]; in al,dx 188 | asm 139; asm 22; asm 160; asm 4; asm 236; 189 | 190 | // mov WORD PTR [0x464],ax 191 | asm 137; asm 6; asm 100; asm 4; 192 | port_val = ax; 193 | } 194 | void port_inw() 195 | { 196 | // mov dx,WORD PTR [0x4a0]; in ax,dx 197 | dx = port_num; 198 | asm 139; asm 22; asm 160; asm 4; asm 237; 199 | 200 | // mov WORD PTR [0x464],ax 201 | asm 137; asm 6; asm 100; asm 4; 202 | port_val = ax; 203 | } 204 | void port_outb() 205 | { 206 | dx = port_num; 207 | ax = port_val; 208 | 209 | // mov dx,WORD PTR [0x4a0] 210 | asm 139; asm 22; asm 160; asm 4; 211 | 212 | // mov ax,WORD PTR [0x464] 213 | asm 139; asm 6; asm 100; asm 4; 214 | 215 | // outb dx,al 216 | asm 238; 217 | } 218 | void port_outw() 219 | { 220 | dx = port_num; 221 | ax = port_val; 222 | 223 | // mov dx,WORD PTR [0x4a0] 224 | asm 139; asm 22; asm 160; asm 4; 225 | 226 | // mov ax,WORD PTR [0x464] 227 | asm 139; asm 6; asm 100; asm 4; 228 | 229 | // outw dx,ax 230 | asm 239; 231 | } 232 | 233 | void dump_code_segment_and_shutdown() 234 | { 235 | /* NOTE: This code is in a different segment from data, and our compiled pointer accesses 236 | do not leave the data segment, so we need a little machine code to grab data from the 237 | code segment and stash it in a variable for C */ 238 | 239 | i = 0; 240 | while( i < 8192 ){ /* Just assuming 8K is enough.. 
might not be true */ 241 | 242 | // (put "i" in ax); mov si,ax; mov ax,cs:[si]; mov [&a],ax 243 | i = i; asm 137; asm 198; asm 46; asm 139; asm 4; asm 137; asm 133; asm 98; asm 0; 244 | 245 | print_ch = a; 246 | print_char(); 247 | i = i + 1; 248 | } 249 | shutdown(); 250 | } 251 | -------------------------------------------------------------------------------- /run.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e 3 | THISDIR=$(dirname $(realpath $0)) 4 | cd $THISDIR 5 | 6 | if [ "$#" != 1 ]; then 7 | echo "usage: $0 " 8 | exit 1 9 | fi 10 | 11 | input="rt/lib.c $1 rt/_start.c" 12 | 13 | ./build/lint $input 14 | ./run_raw.sh $input 15 | -------------------------------------------------------------------------------- /run_raw.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e 3 | THISDIR=$(dirname $(realpath $0)) 4 | cd $THISDIR 5 | 6 | if [ "$#" -lt 1 ]; then 7 | echo "usage: $0 [ ...]" 8 | exit 1 9 | fi 10 | 11 | cat $@ | qemu-system-i386 -hda build/sectorc.bin -serial stdio -audiodev coreaudio,id=audio0 -machine pcspk-audiodev=audio0 12 | -------------------------------------------------------------------------------- /sectorc.s: -------------------------------------------------------------------------------- 1 | bits 16 2 | cpu 386 3 | 4 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 5 | ;;; Token values as computed by the tokenizer's 6 | ;;; atoi() calculation 7 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 8 | %define TOK_INT 6388 9 | %define TOK_VOID 11386 10 | %define TOK_ASM 5631 11 | %define TOK_COMM 65532 12 | %define TOK_SEMI 11 13 | %define TOK_LPAREN 65528 14 | %define TOK_RPAREN 65529 15 | %define TOK_START 20697 16 | %define TOK_DEREF 64653 17 | %define TOK_WHILE_BEGIN 55810 18 | %define TOK_IF_BEGIN 6232 19 | %define TOK_BODY_BEGIN 5 20 | %define TOK_BLK_BEGIN 75 21 | %define 
TOK_BLK_END 77 22 | %define TOK_ASSIGN 13 23 | %define TOK_ADDR 65526 24 | %define TOK_SUB 65533 25 | %define TOK_ADD 65531 26 | %define TOK_MUL 65530 27 | %define TOK_AND 65526 28 | %define TOK_OR 76 29 | %define TOK_XOR 46 30 | %define TOK_SHL 132 31 | %define TOK_SHR 154 32 | %define TOK_EQ 143 33 | %define TOK_NE 65399 34 | %define TOK_LT 12 35 | %define TOK_GT 14 36 | %define TOK_LE 133 37 | %define TOK_GE 153 38 | 39 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 40 | ;;; Common register uses 41 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 42 | ;;; ax: current token / scratch register / emit val for stosw 43 | ;;; bx: current token 44 | ;;; cx: used by tok_next for trailing 2 bytes 45 | ;;; dl: flag for "tok_is_num" 46 | ;;; dh: flags for "tok_is_call", trailing "()" 47 | ;;; bp: saved token for assigned variable 48 | ;;; sp: stack pointer, we don't mess with this 49 | ;;; si: used with lodsw for table scans 50 | ;;; ds: fn symbol table segment (occasionally set to "cs" to access binary_oper_tbl) 51 | ;;; di: codegen destination offset 52 | ;;; es: codegen destination segment 53 | ;;; cs: always 0x07c0 54 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 55 | 56 | jmp 0x07c0:entry 57 | entry: 58 | push 0x3000 ; segment 0x3000 is used for fn symbol table 59 | pop ds 60 | push 0x2000 ; segment 0x2000 is used for codegen output buffer 61 | pop es 62 | xor di,di ; codegen index, zero'd 63 | ;; [fall-through] 64 | 65 | ;; main loop for parsing all decls 66 | compile: 67 | ;; advance to either "int" or "void" 68 | call tok_next 69 | 70 | ;; if "int" then skip a variable 71 | cmp ax,TOK_INT 72 | jne compile_function 73 | call tok_next2 ; consume "int" and 74 | jmp compile 75 | 76 | compile_function: ; parse and compile a function decl 77 | call tok_next ; consume "void" 78 | push bx ; save function name token 79 | mov [bx],di ; record function address in symtbl 80 | call 
compile_stmts_tok_next2 ; compile function body 81 | 82 | mov al,0xc3 ; emit "ret" instruction 83 | stosb 84 | 85 | pop bx ; if the function is _start(), we're done 86 | cmp bx,TOK_START 87 | jne compile ; otherwise, loop and compile another declaration 88 | ;; [fall-through] 89 | 90 | ;; done compiling, execute the binary 91 | execute: 92 | push es ; push the codegen segment 93 | push word [bx] ; push the offset to "_start()" 94 | push 0x4000 ; load new segment for variable data 95 | pop ds 96 | retf ; jump into it via "retf" 97 | 98 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 99 | ;;; compile statements (optionally advancing tokens beforehand) 100 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 101 | compile_stmts_tok_next2: 102 | call tok_next 103 | compile_stmts_tok_next: 104 | call tok_next 105 | compile_stmts: 106 | mov ax,bx 107 | cmp ax,TOK_BLK_END ; if we reach '}' then return 108 | je return 109 | 110 | test dh,dh ; if dh is 0, it's not a call 111 | je _not_call 112 | mov al,0xe8 ; emit "call" instruction 113 | stosb 114 | 115 | mov ax,[bx] ; load function offset from symbol-table 116 | sub ax,di ; compute relative to this location: "dest - cur - 2" 117 | sub ax,2 118 | stosw ; emit target 119 | 120 | jmp compile_stmts_tok_next2 ; loop to compile next statement 121 | 122 | _not_call: 123 | cmp ax,TOK_ASM ; check for "asm" 124 | jne _not_asm 125 | call tok_next ; tok_next to get literal byte 126 | stosb ; emit the literal 127 | jmp compile_stmts_tok_next2 ; loop to compile next statement 128 | 129 | _not_asm: 130 | cmp ax,TOK_IF_BEGIN ; check for "if" 131 | jne _not_if 132 | call _control_flow_block ; compile control-flow block 133 | jmp _patch_fwd ; patch up forward jump of if-stmt 134 | 135 | _not_if: 136 | cmp ax,TOK_WHILE_BEGIN ; check for "while" 137 | jne _not_while 138 | push di ; save loop start location 139 | call _control_flow_block ; compile control-flow block 140 | jmp _patch_back ; patch up 
backward and forward jumps of while-stmt 141 | 142 | _not_while: 143 | call compile_assign ; handle an assignment statement 144 | jmp compile_stmts ; loop to compile next statement 145 | 146 | _patch_back: 147 | mov al,0xe9 ; emit "jmp" instruction (backwards) 148 | stosb 149 | pop ax ; restore loop start location 150 | sub ax,di ; compute relative to this location: "dest - cur - 2" 151 | sub ax,2 152 | stosw ; emit target 153 | ;; [fall-through] 154 | _patch_fwd: 155 | mov ax,di ; compute relative fwd jump to this location: "dest - src" 156 | sub ax,si 157 | mov es:[si-2],ax ; patch "src - 2" 158 | jmp compile_stmts_tok_next ; loop to compile next statement 159 | 160 | _control_flow_block: 161 | call compile_expr_tok_next ; compile loop or if condition expr 162 | 163 | ;; emit forward jump 164 | mov ax,0xc085 ; emit "test ax,ax" 165 | stosw 166 | mov ax,0x840f ; emit "je" instruction 167 | stosw 168 | stosw ; emit placeholder for target 169 | 170 | push di ; save forward patch location 171 | call compile_stmts_tok_next ; compile a block of statements 172 | pop si ; restore forward patch location 173 | 174 | return: ; this label gives us a way to do conditional returns 175 | ret ; (e.g. 
"jne return") 176 | 177 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 178 | ;;; compile assignment statement 179 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 180 | compile_assign: 181 | cmp ax,TOK_DEREF ; check for "*(int*)" 182 | jne _not_deref_store 183 | call tok_next ; consume "*(int*)" 184 | call save_var_and_compile_expr ; compile rhs first 185 | ;; [fall-through] 186 | 187 | compile_store_deref: 188 | mov bx,bp ; restore dest var token 189 | mov ax,0x0489 ; code for "mov [si],ax" 190 | ;; [fall-through] 191 | 192 | emit_common_ptr_op: 193 | push ax 194 | mov ax,0x368b ; emit "mov si,[imm]" 195 | call emit_var 196 | pop ax 197 | stosw ; emit 198 | ret 199 | 200 | _not_deref_store: 201 | call save_var_and_compile_expr ; compile rhs first 202 | ;; [fall-through] 203 | 204 | compile_store: 205 | mov bx,bp ; restore dest var token 206 | mov ax,0x0689 ; code for "mov [imm],ax" 207 | jmp emit_var ; [tail-call] 208 | 209 | save_var_and_compile_expr: 210 | mov bp,bx ; save dest to bp 211 | call tok_next ; consume dest 212 | ;; [fall-through] ; fall-through will consume "=" before compiling expr 213 | 214 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 215 | ;;; compile expression 216 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 217 | compile_expr_tok_next: 218 | call tok_next 219 | compile_expr: 220 | call compile_unary ; compile left-hand side 221 | 222 | push ds ; need to swap out 'ds' to scan the table with lodsw 223 | push cs 224 | pop ds 225 | 226 | mov si,binary_oper_tbl - 2 ; load ptr to operator table (biased backwards) 227 | _check_next: 228 | lodsw ; discard 16-bit of machine-code 229 | lodsw ; load 16-bit token value 230 | cmp ax,bx ; matches token? 231 | je _found 232 | test ax,ax ; end of table? 
233 | jne _check_next 234 | 235 | pop ds 236 | ret ; all-done, not found 237 | 238 | _found: 239 | lodsw ; load 16-bit of machine-code 240 | push ax ; save it to the stack 241 | mov al,0x50 ; code for "push ax" 242 | stosb ; emit 243 | call tok_next ; consume operator token 244 | call compile_unary ; compile right-hand side 245 | mov ax,0x9159 ; code for "pop cx; xchg ax,cx" 246 | stosw ; emit 247 | 248 | pop bx ; restore 16-bit of machine-code 249 | cmp bh,0xc0 ; detect the special case for comparison ops 250 | jne emit_op 251 | emit_cmp_op: 252 | mov ax,0xc839 ; code for "cmp ax,cx" 253 | stosw ; emit 254 | mov ax,0x00b8 ; code for "mov ax,0x00" 255 | stosw ; emit 256 | mov ax,0x0f00 ; code for the rest of imm and prefix for "setX" instrs 257 | stosw ; emit 258 | ;; [fall-through] 259 | 260 | emit_op: 261 | mov ax,bx 262 | stosw ; emit machine code for op 263 | pop ds 264 | ret 265 | 266 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 267 | ;;; compile unary 268 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 269 | compile_unary: 270 | cmp ax,TOK_DEREF ; check for "*(int*)" 271 | jne _not_deref 272 | ;; compile deref (load) 273 | call tok_next ; consume "*(int*)" 274 | mov ax,0x048b ; code for "mov ax,[si]" 275 | jmp emit_common_ptr_op ; [tail-call] 276 | 277 | _not_deref: 278 | cmp ax,TOK_LPAREN ; check for "(" 279 | jne _not_paren 280 | call compile_expr_tok_next ; consume "(" and compile expr 281 | jmp tok_next ; [tail-call] to consume ")" 282 | 283 | _not_paren: 284 | cmp ax,TOK_ADDR ; check for "&" 285 | jne _not_addr 286 | call tok_next ; consume "&" 287 | mov ax,0x068d ; code for "lea ax,[imm]" 288 | jmp emit_var ; [tail-call] to emit code 289 | 290 | _not_addr: 291 | test dl,dl ; check for tok_is_num 292 | je _not_int 293 | mov al,0xb8 ; code for "mov ax,imm" 294 | stosb ; emit 295 | jmp emit_tok ; [tail-call] to emit imm 296 | 297 | _not_int: 298 | ;; compile var 299 | mov ax,0x068b ; code for "mov 
ax,[imm]" 300 | ;; [fall-through] 301 | 302 | emit_var: 303 | stosw ; emit 304 | add bx,bx ; bx = 2*bx (scale up for 16-bit) 305 | ;; [fall-through] 306 | 307 | emit_tok: 308 | mov ax,bx 309 | stosw ; emit token value 310 | jmp tok_next ; [tail-call] 311 | 312 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 313 | ;;; get next token, setting the following: 314 | ;;; ax: token 315 | ;;; bx: token 316 | ;;; dl: tok_is_num 317 | ;;; dh: tok_is_call 318 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 319 | tok_next2: 320 | call tok_next 321 | ;; [fall-through] 322 | tok_next: 323 | call getch 324 | cmp al,32 ; skip spaces (anything <= ' ' is considered space) 325 | jle tok_next 326 | 327 | xor bx,bx ; zero token reg 328 | xor cx,cx ; zero last-two chars reg 329 | 330 | cmp al,57 331 | setle dl ; tok_is_num = (al <= '9') 332 | 333 | _nextch: 334 | cmp al,32 335 | jle _done ; if char is space then break 336 | 337 | shl cx,8 338 | mov cl,al ; shift this char into cx 339 | 340 | imul bx,10 341 | sub ax,48 342 | add bx,ax ; atoi computation: bx = 10 * bx + (ax - '0') 343 | 344 | call getch 345 | jmp _nextch ; [loop] 346 | 347 | _done: 348 | mov ax,cx 349 | cmp ax,0x2f2f ; check for single-line comment "//" 350 | je _comment_double_slash 351 | cmp ax,0x2f2a ; check for multi-line comment "/*" 352 | je _comment_multi_line 353 | cmp ax,0x2829 ; check for call parens "()" 354 | sete dh 355 | 356 | mov ax,bx ; return token in ax also 357 | ret 358 | 359 | _comment_double_slash: 360 | call getch ; get next char 361 | cmp al,10 ; check for newline '\n' 362 | jne _comment_double_slash ; [loop] 363 | jmp tok_next ; [tail-call] 364 | 365 | _comment_multi_line: 366 | call tok_next ; get next token 367 | cmp ax,65475 ; check for token "*/" 368 | jne _comment_multi_line ; [loop] 369 | jmp tok_next ; [tail-call] 370 | 371 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 372 | ;;; get next char: returned in ax (ah == 
0, al == ch) 373 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 374 | getch: 375 | push dx ; need to save dx because tok_next uses it for flags 376 | xor si,si ; use ds:0 as a semi-colon buffer, encodes smaller via si 377 | mov ax,[si] ; load the semi-colon buffer 378 | xor [si],ax ; zero the buffer 379 | cmp al,59 ; check for ';' 380 | je getch_done ; if ';' return it 381 | 382 | getch_tryagain: 383 | mov ax,0x0200 384 | xor dx,dx 385 | int 0x14 ; get a char from serial (bios function) 386 | 387 | and ah,0x80 ; check for failure and clear ah as a side-effect 388 | jne getch_tryagain ; failed, try again later 389 | 390 | cmp al,59 ; check for ';' 391 | jne getch_done ; if not ';' return it 392 | mov [si],ax ; save the ';' 393 | xor ax,ax ; return 0 instead, treated as whitespace 394 | 395 | getch_done: 396 | pop dx 397 | ret 398 | 399 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 400 | ;;; binary operator table 401 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 402 | binary_oper_tbl: 403 | dw TOK_ADD,0xc103 ; add ax,cx 404 | dw TOK_SUB,0xc12b ; sub ax,cx 405 | dw TOK_MUL,0xe1f7 ; mul cx 406 | dw TOK_AND,0xc123 ; and ax,cx 407 | dw TOK_OR,0xc10b ; or ax,cx 408 | dw TOK_XOR,0xc133 ; xor ax,cx 409 | dw TOK_SHL,0xe0d3 ; shl ax,cl 410 | dw TOK_SHR,0xf8d3 ; sar ax,cl (arithmetic shift) 411 | dw TOK_EQ,0xc094 ; sete al 412 | dw TOK_NE,0xc095 ; setne al 413 | dw TOK_LT,0xc09c ; setl al 414 | dw TOK_GT,0xc09f ; setg al 415 | dw TOK_LE,0xc09e ; setle al 416 | dw TOK_GE,0xc09d ; setge al 417 | dw 0 ; [sentinel] 418 | 419 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 420 | ;;; boot signature 421 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 422 | times 510-($-$$) db 0 423 | db 0x55, 0xaa 424 | --------------------------------------------------------------------------------