├── .gitignore
├── LICENSE
├── README.md
├── build.sh
├── dis.sh
├── examples
│   ├── hello.c
│   ├── sinwave.c
│   └── twinkle.c
├── img
│   └── sinwave.png
├── lint
│   └── lint.c
├── rt
│   ├── _start.c
│   └── lib.c
├── run.sh
├── run_raw.sh
└── sectorc.s
/.gitignore:
--------------------------------------------------------------------------------
1 | build/
2 | *~
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | CC0 1.0 Universal
2 |
3 | Statement of Purpose
4 |
5 | The laws of most jurisdictions throughout the world automatically confer
6 | exclusive Copyright and Related Rights (defined below) upon the creator and
7 | subsequent owner(s) (each and all, an "owner") of an original work of
8 | authorship and/or a database (each, a "Work").
9 |
10 | Certain owners wish to permanently relinquish those rights to a Work for the
11 | purpose of contributing to a commons of creative, cultural and scientific
12 | works ("Commons") that the public can reliably and without fear of later
13 | claims of infringement build upon, modify, incorporate in other works, reuse
14 | and redistribute as freely as possible in any form whatsoever and for any
15 | purposes, including without limitation commercial purposes. These owners may
16 | contribute to the Commons to promote the ideal of a free culture and the
17 | further production of creative, cultural and scientific works, or to gain
18 | reputation or greater distribution for their Work in part through the use and
19 | efforts of others.
20 |
21 | For these and/or other purposes and motivations, and without any expectation
22 | of additional consideration or compensation, the person associating CC0 with a
23 | Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
24 | and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
25 | and publicly distribute the Work under its terms, with knowledge of his or her
26 | Copyright and Related Rights in the Work and the meaning and intended legal
27 | effect of CC0 on those rights.
28 |
29 | 1. Copyright and Related Rights. A Work made available under CC0 may be
30 | protected by copyright and related or neighboring rights ("Copyright and
31 | Related Rights"). Copyright and Related Rights include, but are not limited
32 | to, the following:
33 |
34 | i. the right to reproduce, adapt, distribute, perform, display, communicate,
35 | and translate a Work;
36 |
37 | ii. moral rights retained by the original author(s) and/or performer(s);
38 |
39 | iii. publicity and privacy rights pertaining to a person's image or likeness
40 | depicted in a Work;
41 |
42 | iv. rights protecting against unfair competition in regards to a Work,
43 | subject to the limitations in paragraph 4(a), below;
44 |
45 | v. rights protecting the extraction, dissemination, use and reuse of data in
46 | a Work;
47 |
48 | vi. database rights (such as those arising under Directive 96/9/EC of the
49 | European Parliament and of the Council of 11 March 1996 on the legal
50 | protection of databases, and under any national implementation thereof,
51 | including any amended or successor version of such directive); and
52 |
53 | vii. other similar, equivalent or corresponding rights throughout the world
54 | based on applicable law or treaty, and any national implementations thereof.
55 |
56 | 2. Waiver. To the greatest extent permitted by, but not in contravention of,
57 | applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
58 | unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
59 | and Related Rights and associated claims and causes of action, whether now
60 | known or unknown (including existing as well as future claims and causes of
61 | action), in the Work (i) in all territories worldwide, (ii) for the maximum
62 | duration provided by applicable law or treaty (including future time
63 | extensions), (iii) in any current or future medium and for any number of
64 | copies, and (iv) for any purpose whatsoever, including without limitation
65 | commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
66 | the Waiver for the benefit of each member of the public at large and to the
67 | detriment of Affirmer's heirs and successors, fully intending that such Waiver
68 | shall not be subject to revocation, rescission, cancellation, termination, or
69 | any other legal or equitable action to disrupt the quiet enjoyment of the Work
70 | by the public as contemplated by Affirmer's express Statement of Purpose.
71 |
72 | 3. Public License Fallback. Should any part of the Waiver for any reason be
73 | judged legally invalid or ineffective under applicable law, then the Waiver
74 | shall be preserved to the maximum extent permitted taking into account
75 | Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
76 | is so judged Affirmer hereby grants to each affected person a royalty-free,
77 | non transferable, non sublicensable, non exclusive, irrevocable and
78 | unconditional license to exercise Affirmer's Copyright and Related Rights in
79 | the Work (i) in all territories worldwide, (ii) for the maximum duration
80 | provided by applicable law or treaty (including future time extensions), (iii)
81 | in any current or future medium and for any number of copies, and (iv) for any
82 | purpose whatsoever, including without limitation commercial, advertising or
83 | promotional purposes (the "License"). The License shall be deemed effective as
84 | of the date CC0 was applied by Affirmer to the Work. Should any part of the
85 | License for any reason be judged legally invalid or ineffective under
86 | applicable law, such partial invalidity or ineffectiveness shall not
87 | invalidate the remainder of the License, and in such case Affirmer hereby
88 | affirms that he or she will not (i) exercise any of his or her remaining
89 | Copyright and Related Rights in the Work or (ii) assert any associated claims
90 | and causes of action with respect to the Work, in either case contrary to
91 | Affirmer's express Statement of Purpose.
92 |
93 | 4. Limitations and Disclaimers.
94 |
95 | a. No trademark or patent rights held by Affirmer are waived, abandoned,
96 | surrendered, licensed or otherwise affected by this document.
97 |
98 | b. Affirmer offers the Work as-is and makes no representations or warranties
99 | of any kind concerning the Work, express, implied, statutory or otherwise,
100 | including without limitation warranties of title, merchantability, fitness
101 | for a particular purpose, non infringement, or the absence of latent or
102 | other defects, accuracy, or the present or absence of errors, whether or not
103 | discoverable, all to the greatest extent permissible under applicable law.
104 |
105 | c. Affirmer disclaims responsibility for clearing rights of other persons
106 | that may apply to the Work or any use thereof, including without limitation
107 | any person's Copyright and Related Rights in the Work. Further, Affirmer
108 | disclaims responsibility for obtaining any necessary consents, permissions
109 | or other rights required for any use of the Work.
110 |
111 | d. Affirmer understands and acknowledges that Creative Commons is not a
112 | party to this document and has no duty or obligation with respect to this
113 | CC0 or use of the Work.
114 |
115 | For more information, please see
116 |
117 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # SectorC
2 | SectorC is a C compiler written in x86-16 assembly that fits within the 512 byte boot sector of an x86 machine. It supports a
3 | subset of C that is large enough to write real and interesting programs. It is quite likely the smallest C compiler ever written.
4 |
5 | In a base64 encoding, it looks like this:
6 |
7 | ```
8 | 6gUAwAdoADAfaAAgBzH/6DABPfQYdQXoJQHr8+gjAVOJP+gSALDDqluB+9lQdeAG/zdoAEAfy+gI
9 | AegFAYnYg/hNdFuE9nQNsOiqiwcp+IPoAqvr4j3/FXUG6OUAquvXPVgYdQXoJgDrGj0C2nUGV+gb
10 | AOsF6CgA68Ow6apYKfiD6AKrifgp8CaJRP7rrOg4ALiFwKu4D4Srq1fonP9ewz2N/HUV6JoA6BkA
11 | ieu4iQRQuIs26IAAWKvD6AcAieu4iQbrc4nd6HkA6HYA6DgAHg4fvq8Bra052HQGhcB19h/DrVCw
12 | UKroWQDoGwC4WZGrW4D/wHUMuDnIq7i4AKu4AA+ridirH8M9jfx1COgzALiLBOucg/j4dQXorf/r
13 | JIP49nUI6BwAuI0G6wyE0nQFsLiq6wa4iwarAduJ2KvrA+gAAOhLADwgfvkx2zHJPDkPnsI8IH4S
14 | weEIiMFr2wqD6DABw+gqAOvqicg9Ly90Dj0qL3QSPSkoD5TGidjD6BAAPAp1+eu86Ln/g/jDdfjr
15 | slIx9osEMQQ8O3QUuAACMdLNFIDkgHX0PDt1BIkEMcBaw/v/A8H9/yvB+v/34fb/I8FMAAvBLgAz
16 | wYQA0+CaANP4jwCUwHf/lcAMAJzADgCfwIUAnsCZAJ3AAAAAAAAAAAAAAAAAAAAAAAAAAAAAVao=
17 | ```
18 |
19 | ## Supported language
20 |
21 | A fairly large subset is supported: global variables, functions, if statements, while statements, lots of operators, pointer dereference, inline machine-code, comments, etc.
22 | All of these features make it quite capable.
23 |
24 | For example, the following program animates a moving sine-wave:
25 |
26 | ```
27 | int y;
28 | int x;
29 | int x_0;
30 | void sin_positive_approx()
31 | {
32 |   y = ( x_0 * ( 157 - x_0 ) ) >> 7;
33 | }
34 | void sin()
35 | {
36 |   x_0 = x;
37 |   while( x_0 > 314 ){
38 |     x_0 = x_0 - 314;
39 |   }
40 |   if( x_0 <= 157 ){
41 |     sin_positive_approx();
42 |   }
43 |   if( x_0 > 157 ){
44 |     x_0 = x_0 - 157;
45 |     sin_positive_approx();
46 |     y = 0 - y;
47 |   }
48 |   y = 100 + y;
49 | }
50 |
51 |
52 | int offset;
53 | int x_end;
54 | void draw_sine_wave()
55 | {
56 |   x = offset;
57 |   x_end = x + 314;
58 |   while( x <= x_end ){
59 |     sin();
60 |     pixel_x = x - offset;
61 |     pixel_y = y;
62 |     vga_set_pixel();
63 |     x = x + 1;
64 |   }
65 | }
66 |
67 | int v_1;
68 | int v_2;
69 | void delay()
70 | {
71 |   v_1 = 0;
72 |   while( v_1 < 50 ){
73 |     v_2 = 0;
74 |     while( v_2 < 10000 ){
75 |       v_2 = v_2 + 1;
76 |     }
77 |     v_1 = v_1 + 1;
78 |   }
79 | }
80 |
81 | void main()
82 | {
83 |   vga_init();
84 |
85 |   offset = 0;
86 |   while( 1 ){
87 |     vga_clear();
88 |     draw_sine_wave();
89 |
90 |     delay();
91 |     offset = offset + 1;
92 |     if( offset >= 314 ){ // mod the value to avoid 2^16 integer overflow
93 |       offset = offset - 314;
94 |     }
95 |   }
96 | }
97 | ```
98 |
99 | ### Screenshot
100 |
101 | 
102 |
103 | ## Provided Example Code
104 |
105 | A few examples are provided that leverage the unique hardware aspects of the x86-16 IBM PC:
106 | - `examples/hello.c`: Print a text greeting on the screen by writing to memory at 0xB8000
107 | - `examples/sinwave.c`: Draw a moving sine wave animation with VGA Mode 0x13 using an appropriately bad approximation of sin(x)
108 | - `examples/twinkle.c`: Play “Twinkle Twinkle Little Star” through the PC Speaker (Warning: LOUD)
109 |
110 | ## Grammar
111 |
112 | The following grammar is accepted and compiled by sectorc:
113 |
114 | ```
115 | program = (var_decl | func_decl)+
116 | var_decl = "int" identifier ";"
117 | func_decl = "void" func_name "{" statement* "}"
118 | func_name = identifier"()"
119 | statement = "if(" expr "){" statement* "}"
120 | | "while(" expr "){" statement* "}"
121 | | "asm" integer ";"
122 | | func_name ";"
123 | | assign_expr ";"
124 | assign_expr = deref? identifier "=" expr
125 | deref = "*(int*)"
126 | expr = unary (op unary)?
127 | unary = deref identifier
128 | | "&" identifier
129 | | "(" expr ")"
130 | | identifier
131 | | integer
132 | op = "+" | "-" | "*" | "&" | "|" | "^" | "<<" | ">>"
133 | | "==" | "!=" | "<" | ">" | "<=" | ">="
134 | ```
135 |
136 | In addition, both `// comment` and `/* multi-line comment */` styles are supported.
137 |
138 | (NOTE: This grammar is 704 bytes in ascii, 38% larger than its implementation!)
139 |
140 | ## How?
141 |
142 | See blog post: [SectorC: A C Compiler in 512 bytes](https://xorvoid.com/sectorc.html)
143 |
144 | ## Why?
145 |
146 | In 2020, cesarblum wrote a Forth that fits in a bootsector: ([sectorforth](https://github.com/cesarblum/sectorforth))
147 |
148 | In 2021, jart et al. wrote a Lisp that fits in the bootsector: ([sectorlisp](https://github.com/jart/sectorlisp))
149 |
150 | Naturally, C always needs to come and crash (literally) every low-level systems party regardless of whether it was even invited.
151 |
152 | ## Running
153 |
154 | Dependencies:
155 | - `nasm` for assembling (I used v2.16.01)
156 | - `qemu-system-i386` for emulating x86-16 (I used v8.0.0)
157 |
158 | Build: `./build.sh`
159 |
160 | Run: `./run.sh your_source.c`
161 |
162 | NOTE: Tested only on a MacBook M1
163 |
164 | ## What is this useful for?
165 |
166 | Probably Nothing.
167 |
168 | Or at least that's what I thought when starting out. But, I didn't think I'd get such a feature set. Now, I'd say that it **might** be
169 | useful for someone that wants to explore x86-16 bios functions and machine model w/o having to learn lots of x86 assembly first. But, then again, you
170 | should just use a proper C compiler and write a tiny bootloader to execute it.
171 |
--------------------------------------------------------------------------------
/build.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | set -e
3 | THISDIR=$(dirname "$(realpath "$0")")
4 | cd "$THISDIR"
5 |
6 | SRC=sectorc.s
7 | BIN=build/sectorc.bin
8 |
9 | ## output dir for build artifacts
10 | mkdir -p build
11 |
12 | ## assemble sectorc
13 | nasm -f bin -o $BIN $SRC
14 |
15 | ## build a helpful linter
16 | gcc -std=c11 -Wall -Werror -O2 -g -o build/lint lint/lint.c
17 |
--------------------------------------------------------------------------------
/dis.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | set -e
3 | objdump -D -b binary -m i386 -Maddr16,data16 -M intel "$1"
4 |
--------------------------------------------------------------------------------
/examples/hello.c:
--------------------------------------------------------------------------------
1 | int buf;
2 | int ptr;
3 | int len;
4 | void vga_write()
5 | {
6 |   /* Text vga is located at b800:0000 */
7 |   store_far_seg = 47104; // segment: 0xb800
8 |   store_far_off = idx << 1;
9 |   store_far_val = ( 15 << 8 ) | ( ch & 255 ); // white fg and black bg
10 |   store_far();
11 | }
12 |
13 | int x_off;
14 | int y_off;
15 | void vga_write_ch()
16 | {
17 |   if( ch != 10 ){
18 |     idx = y_off + x_off;
19 |     vga_write();
20 |     x_off = x_off + 1;
21 |   }
22 |   if( ( ch == 10 ) | ( x_off == 80 ) ){
23 |     y_off = y_off + 80;
24 |     x_off = 0;
25 |   }
26 | }
27 |
28 | int idx;
29 | void vga_clear()
30 | {
31 |   idx = 0;
32 |   while( idx < 2000 ){ // 80x25
33 |     ch = 32; // char: ' '
34 |     vga_write();
35 |     idx = idx + 1;
36 |   }
37 |   pos = 0;
38 | }
39 |
40 | void main()
41 | {
42 |   // dump_code_segment_and_shutdown();
43 |
44 |   vga_clear();
45 |
46 |   ch = 72; vga_write_ch();
47 |   ch = 101; vga_write_ch();
48 |   ch = 108; vga_write_ch();
49 |   ch = 108; vga_write_ch();
50 |   ch = 111; vga_write_ch();
51 |   ch = 10; vga_write_ch();
52 |   ch = 32; vga_write_ch();
53 |   ch = 102; vga_write_ch();
54 |   ch = 114; vga_write_ch();
55 |   ch = 111; vga_write_ch();
56 |   ch = 109; vga_write_ch();
57 |   ch = 10; vga_write_ch();
58 |   ch = 32; vga_write_ch();
59 |   ch = 32; vga_write_ch();
60 |   ch = 83; vga_write_ch();
61 |   ch = 101; vga_write_ch();
62 |   ch = 99; vga_write_ch();
63 |   ch = 116; vga_write_ch();
64 |   ch = 111; vga_write_ch();
65 |   ch = 114; vga_write_ch();
66 |   ch = 67; vga_write_ch();
67 |   ch = 10; vga_write_ch();
68 |   ch = 32; vga_write_ch();
69 |   ch = 32; vga_write_ch();
70 |   ch = 32; vga_write_ch();
71 |
72 |   i = 0;
73 |   while( i < 10 ){
74 |     ch = 33; vga_write_ch();
75 |     i = i + 1;
76 |   }
77 |
78 |   while( 1 ){ }
79 | }
80 |
--------------------------------------------------------------------------------
/examples/sinwave.c:
--------------------------------------------------------------------------------
1 | /* A Sine-wave Animation
2 |
3 |    Math time:
4 |    ---------------------------
5 |    Along the range [0, pi] we can approximate sin(x) very crudely with a quadratic.
6 |    That is: y = a * x^2 + b * x + c
7 |
8 |    Three unknowns need three constraints, so picking the easy ones:
9 |      x = 0,    y = 0
10 |      x = pi/2, y = 1
11 |      x = pi,   y = 0
12 |
13 |    Solving the linear system:
14 |
15 |      | 0       0     1 |   | a |   | 0 |
16 |      | pi^2/4  pi/2  1 | * | b | = | 1 |
17 |      | pi^2    pi    1 |   | c |   | 0 |
18 |
19 |    We get:
20 |
21 |      a = -4 / pi^2
22 |      b = 4 / pi
23 |      c = 0
24 |
25 |    And:
26 |
27 |      y = 4x(pi - x)/(pi^2)
28 |
29 |    Engineering time:
30 |    ---------------------------
31 |    We are working with a 320x200 VGA mode. We also don't have floating-point math. So, the
32 |    goal here is to do all the math in integer screen coordinates and accept some pixel
33 |    approximation error.
34 |
35 |    First, we want to center the wave in the middle, y = 100
36 |    We'll let y vary +-50 pixels to remain on the screen, so [50, 150]
37 |    We want to show an entire cycle (2pi) on the x-axis, so scaling x by 50 gives us [0, ~314]
38 |    This implies that the "x-origin" is at x = 157
39 |
40 |    Substituting in everything, we get:
41 |
42 |      y ~= 100 + x*(157 - x)/125
43 |
44 |    The division by 125 is problematic as we don't have division. But luckily 128 is close enough.
45 |
46 |    Thus, we get:
47 |
48 |      y ~= 100 + ((x*(157 - x)) >> 7)
49 |
50 |    The rest is just adjusting for the [0, pi] range reduction by negating the approximation
51 |    along [pi, 2pi]
52 |
53 |    NOTE: the screen coordinate system is upside-down and I don't bother to correct for that.
54 |          It simply means that the animation starts at a +pi phase offset.
55 | */
56 |
57 | int y;
58 | int x;
59 | int x_0;
60 | void sin_positive_approx()
61 | {
62 |   y = ( x_0 * ( 157 - x_0 ) ) >> 7;
63 | }
64 | void sin()
65 | {
66 |   x_0 = x;
67 |   while( x_0 > 314 ){
68 |     x_0 = x_0 - 314;
69 |   }
70 |   if( x_0 <= 157 ){
71 |     sin_positive_approx();
72 |   }
73 |   if( x_0 > 157 ){
74 |     x_0 = x_0 - 157;
75 |     sin_positive_approx();
76 |     y = 0 - y;
77 |   }
78 |   y = 100 + y;
79 | }
80 |
81 |
82 | int offset;
83 | int x_end;
84 | void draw_sine_wave()
85 | {
86 |   x = offset;
87 |   x_end = x + 314;
88 |   while( x <= x_end ){
89 |     sin();
90 |     pixel_x = x - offset;
91 |     pixel_y = y;
92 |     vga_set_pixel();
93 |     x = x + 1;
94 |   }
95 | }
96 |
97 | int v_1;
98 | int v_2;
99 | void delay()
100 | {
101 |   v_1 = 0;
102 |   while( v_1 < 50 ){
103 |     v_2 = 0;
104 |     while( v_2 < 10000 ){
105 |       v_2 = v_2 + 1;
106 |     }
107 |     v_1 = v_1 + 1;
108 |   }
109 | }
110 |
111 | void main()
112 | {
113 |   vga_init();
114 |
115 |   offset = 0;
116 |   while( 1 ){
117 |     vga_clear();
118 |     draw_sine_wave();
119 |
120 |     delay();
121 |     offset = offset + 1;
122 |     if( offset >= 314 ){ // mod the value to avoid 2^16 integer overflow
123 |       offset = offset - 314;
124 |     }
125 |   }
126 | }
127 |
--------------------------------------------------------------------------------
/examples/twinkle.c:
--------------------------------------------------------------------------------
1 | /* References:
2 | http://muruganad.com/8086/8086-assembly-language-program-to-play-sound-using-pc-speaker.html
3 | https://en.wikipedia.org/wiki/Twinkle,_Twinkle,_Little_Star
4 | */
5 |
6 | void delay_1()
7 | {
8 |   v_1 = 0;
9 |   while( v_1 < 4000 ){
10 |     v_2 = 0;
11 |     while( v_2 < 10000 ){
12 |       v_2 = v_2 + 1;
13 |     }
14 |     v_1 = v_1 + 1;
15 |   }
16 | }
17 |
18 | void delay_2()
19 | {
20 |   v_1 = 0;
21 |   while( v_1 < 300 ){
22 |     v_2 = 0;
23 |     while( v_2 < 10000 ){
24 |       v_2 = v_2 + 1;
25 |     }
26 |     v_1 = v_1 + 1;
27 |   }
28 | }
29 |
30 | void audio_init()
31 | {
32 |   // Configure PIT channel 2 mode (command port 0x43, value 0xB6)
33 |   port_num = 67;
34 |   port_val = 182;
35 |   port_outb();
36 | }
37 |
38 | void audio_enable()
39 | {
40 |   // Set bits 0 and 1 to enable
41 |   port_num = 97;
42 |   port_inb();
43 |   port_val = port_val | 3;
44 |   port_outb();
45 | }
46 |
47 | void audio_disable()
48 | {
49 |   // Clear bits 0 and 1 to disable
50 |   port_num = 97;
51 |   port_inb();
52 |   port_val = port_val & 65532;
53 |   port_outb();
54 | }
55 |
56 | int audio_freq;
57 | void audio_freq_set()
58 | {
59 |   // Set frequency (low byte first, then high byte)
60 |   port_num = 66;
61 |   port_val = audio_freq & 255;
62 |   port_outb();
63 |   port_val = ( audio_freq >> 8 ) & 255;
64 |   port_outb();
65 | }
66 |
67 | int note;
68 | void play_quarter_note()
69 | {
70 |   audio_freq = note;
71 |   audio_freq_set();
72 |   audio_enable();
73 |   delay_1();
74 |   audio_disable();
75 |   delay_2();
76 | }
77 | void play_half_note()
78 | {
79 |   audio_freq = note;
80 |   audio_freq_set();
81 |   audio_enable();
82 |   delay_1();
83 |   delay_1();
84 |   audio_disable();
85 |   delay_2();
86 | }
87 |
88 | void play_section_1()
89 | {
90 |   note = C; play_quarter_note();
91 |   note = C; play_quarter_note();
92 |   note = G; play_quarter_note();
93 |   note = G; play_quarter_note();
94 |   note = A; play_quarter_note();
95 |   note = A; play_quarter_note();
96 |   note = G; play_half_note();
97 |
98 |   note = F; play_quarter_note();
99 |   note = F; play_quarter_note();
100 |   note = E; play_quarter_note();
101 |   note = E; play_quarter_note();
102 |   note = D; play_quarter_note();
103 |   note = D; play_quarter_note();
104 |   note = C; play_half_note();
105 | }
106 |
107 | void play_section_2()
108 | {
109 |   note = G; play_quarter_note();
110 |   note = G; play_quarter_note();
111 |   note = F; play_quarter_note();
112 |   note = F; play_quarter_note();
113 |   note = E; play_quarter_note();
114 |   note = E; play_quarter_note();
115 |   note = D; play_half_note();
116 |
117 |   note = G; play_quarter_note();
118 |   note = G; play_quarter_note();
119 |   note = F; play_quarter_note();
120 |   note = F; play_quarter_note();
121 |   note = E; play_quarter_note();
122 |   note = E; play_quarter_note();
123 |   note = D; play_half_note();
124 | }
125 |
126 | void main()
127 | {
128 |   audio_init();
129 |   audio_enable();
130 |
131 |   C = 4560;
132 |   D = 4063;
133 |   E = 3619;
134 |   F = 3416;
135 |   G = 3043;
136 |   A = 2711;
137 |
138 |   play_section_1();
139 |   play_section_2();
140 |   play_section_1();
141 |
142 |   audio_disable();
143 | }
144 |
--------------------------------------------------------------------------------
/img/sinwave.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xorvoid/sectorc/bc6218b750415f69dceb139b9f8f02d57926d7dd/img/sinwave.png
--------------------------------------------------------------------------------
/lint/lint.c:
--------------------------------------------------------------------------------
1 | #include <stdio.h>
2 | #include <stdlib.h>
3 | #include <stdint.h>
4 | #include <stdbool.h>
5 | #include <string.h>
6 |
7 | typedef uint8_t u8;
8 | typedef uint16_t u16;
9 | typedef uint32_t u32;
10 | typedef uint64_t u64;
11 | typedef int8_t i8;
12 | typedef int16_t i16;
13 | typedef int32_t i32;
14 | typedef int64_t i64;
15 |
16 | #define error(msg, ...) do { \
17 |     fprintf(stderr, "Fatal Error: "); \
18 |     fprintf(stderr, msg, ##__VA_ARGS__); \
19 |     fprintf(stderr, ": (%zu, %zu) at %s:%d\n", tok->line_num, tok->line_off, __FUNCTION__, __LINE__); \
20 |     fprintf(stderr, "\n"); \
21 |     tok_report_line_error();\
22 |     exit(1); \
23 |   } while(0)
24 |
25 | #define TOK_TYPE_EOF 0
26 | #define TOK_TYPE_NUM 1
27 | #define TOK_TYPE_SYM 2
28 | #define TOK_TYPE_FUNC 3
29 |
30 | #define TOKENS(_)\
31 |   /* enum-symbol         token-literal   token-value */\
32 |   _( TOK_INT,            "int",          6388  )\
33 |   _( TOK_VOID,           "void",         11386 )\
34 |   _( TOK_ASM,            "asm",          5631  )\
35 |   _( TOK_START,          "_start()",     33977 )\
36 |   _( TOK_SEMI,           ";",            11    )\
37 |   _( TOK_DEREF,          "*(int*)",      64653 )\
38 |   _( TOK_WHILE_BEGIN,    "while(",       55810 )\
39 |   _( TOK_IF_BEGIN,       "if(",          6232  )\
40 |   _( TOK_BODY_BEGIN,     "){",           5     )\
41 |   _( TOK_LPAREN,         "(",            65528 )\
42 |   _( TOK_RPAREN,         ")",            65529 )\
43 |   _( TOK_BLK_BEGIN,      "{",            75    )\
44 |   _( TOK_BLK_END,        "}",            77    )\
45 |   _( TOK_ASSIGN,         "=",            13    )\
46 |   _( TOK_ADDR,           "&",            65526 )\
47 |   _( TOK_SUB,            "-",            65533 )\
48 |   _( TOK_ADD,            "+",            65531 )\
49 |   _( TOK_MUL,            "*",            65530 )\
50 |   _( TOK_OR,             "|",            76    )\
51 |   _( TOK_XOR,            "^",            46    )\
52 |   _( TOK_SHL,            "<<",           132   )\
53 |   _( TOK_SHR,            ">>",           154   )\
54 |   _( TOK_EQ,             "==",           143   )\
55 |   _( TOK_NE,             "!=",           65399 )\
56 |   _( TOK_LT,             "<",            12    )\
57 |   _( TOK_GT,             ">",            14    )\
58 |   _( TOK_LE,             "<=",           133   )\
59 |   _( TOK_GE,             ">=",           153   )\
60 |
61 | enum {
62 | #define ELT(enum_sym, _2, token_val) enum_sym = ((u16)token_val),
63 |   TOKENS(ELT)
64 | #undef ELT
65 | };
66 |
67 | static const char *token_str(int val)
68 | {
69 |   switch (val) {
70 | #define ELT(enum_sym, str, _3) case enum_sym: return str;
71 |     TOKENS(ELT)
72 | #undef ELT
73 |   }
74 |   return NULL;
75 | }
76 |
77 | static bool token_is_kw(int val)
78 | {
79 |   switch (val) {
80 | #define ELT(enum_sym, str, _3) case enum_sym: return true;
81 |     TOKENS(ELT)
82 | #undef ELT
83 |   }
84 |   return false;
85 | }
86 |
87 | typedef struct token token_t;
88 | struct token
89 | {
90 |   int          type;
91 |   u16          val;
92 |
93 |   const char * text;
94 |   size_t       len;
95 |
96 |   const char * line_start;
97 |   size_t       line_num;
98 |   size_t       line_off;
99 | };
100 |
101 | #define TOK_FMT "'%.*s'"
102 | #define TOK_ARG(t) (int)((t)->len), (t)->text
103 |
104 | static char * input_buf = NULL;
105 | static char * input_ptr = NULL;
106 | static size_t input_len = 0;
107 | static token_t tok[1];
108 |
109 | static void input_append_source_file(const char *path)
110 | {
111 |   FILE *fp = fopen(path, "r");
112 |   if (!fp) {
113 |     fprintf(stderr, "Failed to open file: %s\n", path);
114 |     exit(1);
115 |   }
116 |
117 |   fseek(fp, 0, SEEK_END);
118 |   size_t file_sz = ftell(fp);
119 |   fseek(fp, 0, SEEK_SET);
120 |
121 |   size_t old_sz = input_len;
122 |   size_t new_sz = old_sz + file_sz;
123 |
124 |   input_buf = realloc(input_buf, new_sz + 1); // +1 for the NUL terminator
125 |   if (!input_buf) {
126 |     fprintf(stderr, "Failed to allocate buffer for source text\n");
127 |     exit(2);
128 |   }
129 |
130 |   size_t n = fread(input_buf + input_len, 1, file_sz, fp);
131 |   if (n != file_sz) {
132 |     fprintf(stderr, "Failed to read source file completely\n");
133 |     exit(3);
134 |   }
135 |
136 |   fclose(fp);
137 |   input_buf[new_sz] = '\0'; // the tokenizer treats a 0 byte as EOF
138 |   input_ptr = input_buf;
139 |   input_len = new_sz;
140 |   tok->line_start = input_buf;
141 | }
142 |
143 | static void tok_report_line_error(void)
144 | {
145 |   size_t line_len;
146 |   char *s = strchr(tok->line_start, '\n');
147 |   if (s) {
148 |     line_len = s - tok->line_start;
149 |   } else {
150 |     line_len = strlen(tok->line_start);
151 |   }
152 |
153 |   fprintf(stderr, "%.*s\n", (int)line_len, tok->line_start);
154 |   fprintf(stderr, "%*s^\n", (int)(tok->line_off - tok->len), "");
155 | }
156 |
157 | __attribute__((unused)) void tok_print()
158 | {
159 |   fprintf(stderr, "tok (%zu:%zu) | type %d | text '%.*s'\n", tok->line_num, tok->line_off, tok->type, (int)tok->len, tok->text);
160 | }
161 |
162 | int next_is_semi = 0;
163 |
164 | char tok_char_get()
165 | {
166 |   if (next_is_semi) return ' ';
167 |   else return *input_ptr;
168 | }
169 |
170 | void tok_char_next()
171 | {
172 |   if (next_is_semi) {
173 |     next_is_semi = 0;
174 |     return;
175 |   }
176 |
177 |   char last = *input_ptr;
178 |   if (last == '\n') {
179 |     tok->line_start = input_ptr + 1;
180 |     tok->line_num++;
181 |     tok->line_off = 0;
182 |   } else {
183 |     tok->line_off++;
184 |   }
185 |   input_ptr++;
186 |
187 |   if (*input_ptr == ';') {
188 |     next_is_semi = 1;
189 |   }
190 | }
191 |
192 | static bool tok_lint_number(const char *text, size_t len)
193 | {
194 |   for (size_t i = 0; i < len; i++) {
195 |     char c = text[i];
196 |     if (!('0' <= c && c <= '9')) {
197 |       return false;
198 |     }
199 |   }
200 |   return true;
201 | }
202 |
203 | static bool tok_lint_ident(const char *text, size_t len)
204 | {
205 |   for (size_t i = 0; i < len; i++) {
206 |     char c = text[i];
207 |     if (!(c == '_' || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') || (i != 0 && '0' <= c && c <= '9'))) {
208 |       return false;
209 |     }
210 |   }
211 |   return true;
212 | }
213 |
214 | static bool tok_lint_func_name(const char *text, size_t len)
215 | {
216 |   if (len <= 2 || 0 != memcmp(&text[len-2], "()", 2)) {
217 |     return false;
218 |   }
219 |   return tok_lint_ident(text, len-2);
220 | }
221 |
222 | static void tok_lint_matches(const char *str)
223 | {
224 |   size_t len = strlen(str);
225 |   if (tok->len != len || 0 != memcmp(tok->text, str, len)) {
226 |     error("token (%u) collision between " TOK_FMT " and '%s'", tok->val, TOK_ARG(tok), str);
227 |   }
228 | }
229 |
230 | static void tok_lint()
231 | {
232 |   switch (tok->type) {
233 |     case TOK_TYPE_EOF: break;
234 |     case TOK_TYPE_NUM: {
235 |       if (!tok_lint_number(tok->text, tok->len)) {
236 |         error("token (%u) parsed as int, but it is not: " TOK_FMT, tok->val, TOK_ARG(tok));
237 |       }
238 |     } break;
239 |     case TOK_TYPE_SYM: {
240 |       switch (tok->val) {
241 | #define ELT(enum_sym, str, _3) case enum_sym: tok_lint_matches(str); return;
242 |         TOKENS(ELT)
243 | #undef ELT
244 |       }
245 |       // If we've reached this point, it's not a special token so it must be an ordinary variable identifier
246 |       if (!tok_lint_ident(tok->text, tok->len)) {
247 |         error("token (%u) parsed as variable identifier, but it is not: " TOK_FMT, tok->val, TOK_ARG(tok));
248 |       }
249 |     } break;
250 |     case TOK_TYPE_FUNC: {
251 |       if (!tok_lint_func_name(tok->text, tok->len)) {
252 |         error("token (%u) parsed as func name, but it is not: " TOK_FMT, tok->val, TOK_ARG(tok));
253 |       }
254 |     } break;
255 |   }
256 | }
257 |
258 | static void tok_next(void)
259 | {
260 |   while (1) {
261 |     char c = tok_char_get();
262 |
263 |     // Handle EOF
264 |     if (c == 0) {
265 |       tok->type = TOK_TYPE_EOF;
266 |       tok->len = 0;
267 |       return;
268 |     }
269 |
270 |     // Skip whitespace
271 |     if (c <= ' ') {
272 |       tok_char_next();
273 |       continue;
274 |     }
275 |
276 |     tok->type = ('0' <= c && c <= '9') ? TOK_TYPE_NUM : TOK_TYPE_SYM;
277 |     tok->text = input_ptr;
278 |     tok->len = 1;
279 |     tok->val = c - '0';
280 |
281 |     bool slash = c == '/';
282 |     tok_char_next();
283 |     c = tok_char_get();
284 |
285 |     // single-line comment?
286 |     if (slash && c == '/') {
287 |       tok_char_next();
288 |       c = tok_char_get();
289 |       if (c > ' ') error("expected space after // comment");
290 |       while (1) {
291 |         if (c == '\n' || c == 0) break; // also stop at EOF
292 |         tok_char_next();
293 |         c = tok_char_get();
294 |       }
295 |       continue; // try tok_next() again
296 |     }
297 |
298 |     // multi-line comment?
299 |     if (slash && c == '*') {
300 |       tok_char_next();
301 |       c = tok_char_get();
302 |       if (c > ' ') error("expected space after /* comment");
303 |
304 |       // ending "*/" must be preceded by a space
305 |       int last_space_off = 0;
306 |       bool last_is_asterisk = false;
307 |       while (1) {
308 |         tok_char_next();
309 |         c = tok_char_get();
310 |         if (last_is_asterisk && c == '/') {
311 |           if (last_space_off != 1) error("expected space before */ comment terminator");
312 |           /* Found, done! */
313 |           break;
314 |         }
315 |         last_space_off++;
316 |         last_is_asterisk = c == '*';
317 |         if (c <= ' ') {
318 |           last_space_off = 0;
319 |         }
320 |       }
321 |       tok_char_next();
322 |       continue;
323 |     }
324 |
325 |     // normal token
326 |     while (c > ' ') {
327 |       tok->len++;
328 |       tok->val = 10 * tok->val + c - '0';
329 |       tok_char_next();
330 |       c = tok_char_get();
331 |     }
332 |
333 |     // special logic to detect function names
334 |     if (tok->len > 2 && 0 == memcmp(&tok->text[tok->len-2], "()", 2)) {
335 |       tok->type = TOK_TYPE_FUNC;
336 |     }
337 |
338 |     // lint the token to detect errors
339 |     tok_lint();
340 |     return;
341 |   }
342 | }
343 |
344 | static bool tok_num_is(void)
345 | {
346 |   return tok->type == TOK_TYPE_NUM;
347 | }
348 |
349 | static void tok_num_expect(void)
350 | {
351 |   if (!tok_num_is()) {
352 |     error("expected num token");
353 |   }
354 |   tok_next();
355 | }
356 |
357 | static bool tok_kw_is(u16 val)
358 | {
359 |   return
360 |     tok->type == TOK_TYPE_SYM &&
361 |     tok->val == val &&
362 |     token_is_kw(val);
363 | }
364 |
365 | static void tok_kw_expect(u16 val)
366 | {
367 |   if (!tok_kw_is(val)) {
368 |     error("expected keyword token '%s' (%u)", token_str(val), val);
369 |   }
370 |   tok_next();
371 | }
372 |
373 | static bool tok_ident_is(void)
374 | {
375 |   return
376 |     tok->type == TOK_TYPE_SYM &&
377 |     !token_is_kw(tok->val) &&
378 |     tok_lint_ident(tok->text, tok->len);
379 | }
380 |
381 | static void tok_ident_expect(void)
382 | {
383 |   if (!tok_ident_is()) {
384 |     error("expected identifier token");
385 |   }
386 |   tok_next();
387 | }
388 |
389 | static bool tok_func_is(void)
390 | {
391 |   return tok->type == TOK_TYPE_FUNC;
392 | }
393 |
394 | static void tok_func_expect(void)
395 | {
396 |   if (!tok_func_is()) {
397 |     error("expected func token");
398 |   }
399 |   tok_next();
400 | }
401 |
402 | static bool tok_oper_is(void)
403 | {
404 |   return
405 |     tok_kw_is(TOK_ADD) ||
406 |     tok_kw_is(TOK_SUB) ||
407 |     tok_kw_is(TOK_MUL) ||
408 |     tok_kw_is(TOK_ADDR) || // "AND" in this context
409 |     tok_kw_is(TOK_OR) ||
410 |     tok_kw_is(TOK_XOR) ||
411 |     tok_kw_is(TOK_SHL) ||
412 |     tok_kw_is(TOK_SHR) ||
413 |     tok_kw_is(TOK_EQ) ||
414 |     tok_kw_is(TOK_NE) ||
415 |     tok_kw_is(TOK_LT) ||
416 |     tok_kw_is(TOK_GT) ||
417 |     tok_kw_is(TOK_LE) ||
418 |     tok_kw_is(TOK_GE);
419 | }
420 |
421 | // fwd decl needed for paren grouping recursion
422 | static void parse_expr(void);
423 |
424 | // unary = deref identifier
425 | // | "&" identifier
426 | // | "(" expr ")"
427 | // | identifier
428 | // | integer
429 | static void parse_unary(void)
430 | {
431 | if (tok_kw_is(TOK_DEREF)) {
432 | tok_next(); // consume TOK_DEREF
433 | tok_ident_expect();
434 | }
435 |
436 | else if (tok_kw_is(TOK_ADDR)) {
437 | tok_next(); // consume TOK_ADDR
438 | tok_ident_expect();
439 | }
440 |
441 | else if (tok_kw_is(TOK_LPAREN)) {
442 | tok_next(); // consume TOK_LPAREN
443 | parse_expr();
444 | tok_kw_expect(TOK_RPAREN);
445 | }
446 |
447 | else if (tok_ident_is()) {
448 | tok_next(); // consume identifier
449 | }
450 |
451 | else if (tok_num_is()) {
452 | tok_next(); // consume number
453 | }
454 |
455 | else {
456 | error("expected unary expression");
457 | }
458 | }
459 |
460 | // expr = unary (op unary)?
461 | static void parse_expr(void)
462 | {
463 | parse_unary();
464 | if (tok_oper_is()) {
465 | tok_next(); // consume the operator
466 | parse_unary();
467 | }
468 | }
469 |
470 | // assign_expr = deref? identifier "=" expr
471 | static void parse_assign_expr(void)
472 | {
473 | // optionally we have a deref token
474 | if (tok_kw_is(TOK_DEREF)) {
475 | tok_next();
476 | }
477 |
478 | tok_ident_expect();
479 | tok_kw_expect(TOK_ASSIGN);
480 | parse_expr();
481 | }
482 |
483 | // statement = "if(" expr "){" statement* "}"
484 | // | "while(" expr "){" statement* "}"
485 | // | "asm" integer ";"
486 | // | func_name ";"
487 | // | assign_expr ";"
488 | static void parse_statement(void)
489 | {
490 | if (tok_kw_is(TOK_IF_BEGIN)) {
491 | tok_next(); // consume TOK_IF_BEGIN
492 | parse_expr();
493 | tok_kw_expect(TOK_BODY_BEGIN);
494 | while (!tok_kw_is(TOK_BLK_END)) {
495 | parse_statement();
496 | }
497 | tok_next(); // consume TOK_BLK_END
498 | }
499 |
500 | else if (tok_kw_is(TOK_WHILE_BEGIN)) {
501 | tok_next(); // consume TOK_WHILE_BEGIN
502 | parse_expr();
503 | tok_kw_expect(TOK_BODY_BEGIN);
504 | while (!tok_kw_is(TOK_BLK_END)) {
505 | parse_statement();
506 | }
507 | tok_next(); // consume TOK_BLK_END
508 | }
509 |
510 | else if (tok_kw_is(TOK_ASM)) {
511 | tok_next(); // consume TOK_ASM
512 | tok_num_expect();
513 | tok_kw_expect(TOK_SEMI);
514 | }
515 |
516 | else if (tok_func_is()) {
517 | tok_next(); // consume func name
518 | // TODO: validate against a symbol table
519 | tok_kw_expect(TOK_SEMI);
520 | }
521 |
522 | else { // default case: an assignment expression
523 | parse_assign_expr();
524 | tok_kw_expect(TOK_SEMI);
525 | }
526 | }
527 |
528 | // var_decl = "int" identifier ";"
529 | static void parse_var_decl(void)
530 | {
531 | tok_kw_expect(TOK_INT);
532 | tok_ident_expect();
533 | // TODO: Build and check a symbol table of globals
534 | tok_kw_expect(TOK_SEMI);
535 | }
536 |
537 | // func_decl = "void" func_name "{" statement* "}"
538 | static void parse_func_decl(void)
539 | {
540 | tok_kw_expect(TOK_VOID);
541 | tok_func_expect();
542 | // TODO: Build and check a symbol table of functions
543 | tok_kw_expect(TOK_BLK_BEGIN);
544 | while (!tok_kw_is(TOK_BLK_END)) {
545 | parse_statement();
546 | }
547 | tok_kw_expect(TOK_BLK_END);
548 | }
549 |
550 | // program = (var_decl | func_decl)+
551 | static void parse_program(void)
552 | {
553 | while (tok->type != TOK_TYPE_EOF) {
554 | if (tok_kw_is(TOK_INT)) {
555 | parse_var_decl();
556 | }
557 | else if (tok_kw_is(TOK_VOID)) {
558 | parse_func_decl();
559 | }
560 | else {
561 | error("expected var decl or func decl");
562 | }
563 | }
564 | }
565 |
566 | int main(int argc, char *argv[])
567 | {
568 | if (argc < 2) {
569 | fprintf(stderr, "usage: %s ...\n", argv[0]);
570 | return 1;
571 | }
572 |
573 | for (size_t i = 1; i < (size_t)argc; i++) {
574 | input_append_source_file(argv[i]);
575 | }
576 |
577 | tok_next();
578 | parse_program();
579 |
580 | free(input_buf);
581 | return 0;
582 | }
583 |
--------------------------------------------------------------------------------
/rt/_start.c:
--------------------------------------------------------------------------------
1 |
2 | void _start()
3 | {
4 | main();
5 | shutdown();
6 | }
7 |
--------------------------------------------------------------------------------
/rt/lib.c:
--------------------------------------------------------------------------------
1 | int tmp1;
2 | int tmp2;
3 |
4 | void shutdown()
5 | {
6 | /* Shutdown via APM: coded in asm machine code directly */
7 |
8 | // Check for APM
9 | // | mov ah,0x53; mov al,0x00; xor bx,bx; int 0x15; jc error
10 | asm 180; asm 83; asm 176; asm 0; asm 49; asm 219;
11 | asm 205; asm 21; asm 114; asm 55;
12 |
13 | // Disconnect from any APM interface
14 | // | mov ah,0x53; mov al,0x04; xor bx,bx; int 0x15
15 | // | jc maybe_error; jmp no_error
16 | asm 180; asm 83; asm 176; asm 4; asm 49; asm 219;
17 | asm 205; asm 21; asm 114; asm 2; asm 235; asm 5;
18 |
19 | // Label: maybe_error
20 | // | cmp ah,0x03; jne error
21 | asm 128; asm 252; asm 3; asm 117; asm 38;
22 | // Label: no_error
23 |
24 | // Connect to APM interface
25 | // | mov ah,0x53; mov al,0x01; xor bx,bx; int 0x15; jc error
26 | asm 180; asm 83; asm 176; asm 1; asm 49; asm 219;
27 | asm 205; asm 21; asm 114; asm 28;
28 |
29 | // Enable power management for all devices
30 | // | mov ah,0x53; mov al,0x08; mov bx,0x0001; mov cx,0x0001
31 | // | int 0x15; jc error
32 | asm 180; asm 83; asm 176; asm 8;
33 | asm 187; asm 1; asm 0; asm 185; asm 1; asm 0;
34 | asm 205; asm 21; asm 114; asm 14;
35 |
36 | // Set the power state for all devices
37 | // | mov ah,0x53; mov al,0x7; mov bx,0x0001; mov cx,0x0003
38 | // | int 0x15; jc error
39 | asm 180; asm 83; asm 176; asm 7;
40 | asm 187; asm 1; asm 0; asm 185; asm 3; asm 0;
41 | asm 205; asm 21; asm 114; asm 0;
42 |
43 | // Label: error
44 | // | hlt; jmp error
45 | asm 244; asm 235; asm 253;
46 | }
47 |
48 | int store_far_seg;
49 | int store_far_off;
50 | int store_far_val;
51 | void store_far()
52 | {
53 | // mov es, store_far_seg
54 | store_far_seg = store_far_seg;
55 | asm 142; asm 192;
56 |
57 | // mov si, store_far_off
58 | store_far_off = store_far_off;
59 | asm 137; asm 198;
60 |
61 | // mov es:[si], store_far_val
62 | store_far_val = store_far_val;
63 | asm 38; asm 137; asm 4;
64 | }
65 |
66 | int div10_unsigned_n;
67 | int div10_unsigned_q;
68 | int div10_unsigned_r;
69 | void div10_unsigned()
70 | {
71 | /* Taken from "Hacker's Delight", modified to "fit your screen" */
72 |
73 | tmp1 = ( div10_unsigned_n >> 1 ) & 32767; // unsigned
74 | tmp2 = ( div10_unsigned_n >> 2 ) & 16383; // unsigned
75 | div10_unsigned_q = tmp1 + tmp2;
76 |
77 | tmp1 = ( div10_unsigned_q >> 4 ) & 4095; // unsigned
78 | div10_unsigned_q = div10_unsigned_q + tmp1;
79 |
80 | tmp1 = ( div10_unsigned_q >> 8 ) & 255; // unsigned
81 | div10_unsigned_q = div10_unsigned_q + tmp1;
82 |
83 | div10_unsigned_q = ( div10_unsigned_q >> 3 ) & 8191; // unsigned
84 |
85 | div10_unsigned_r = div10_unsigned_n
86 | - ( ( div10_unsigned_q << 3 ) + ( div10_unsigned_q << 1 ) );
87 |
88 | if( div10_unsigned_r > 9 ){
89 | div10_unsigned_q = div10_unsigned_q + 1;
90 | div10_unsigned_r = div10_unsigned_r - 10;
91 | }
92 | }
93 |
94 | int print_ch;
95 | void print_char()
96 | {
97 | /* Implement print char via serial port bios function accessed via int 0x14 */
98 |
99 | print_ch = print_ch; // mov ax,[&print_ch]
100 | asm 180; asm 1; // mov ah,1
101 | asm 186; asm 0; asm 0 ; // mov dx,0
102 | asm 205; asm 20; // int 0x14
103 | }
104 |
105 | // uses 'print_ch'
106 | void print_newline()
107 | {
108 | print_ch = 10;
109 | print_char();
110 | }
111 |
112 | int print_num; // input
113 | int print_u16_bufptr;
114 | int print_u16_cur;
115 | void print_u16()
116 | {
117 | print_u16_bufptr = 30000; // buffer for ascii digits
118 |
119 | if( print_num == 0 ){
120 | print_ch = 48;
121 | print_char();
122 | }
123 |
124 | print_u16_cur = print_num;
125 | while( print_u16_cur != 0 ){
126 | div10_unsigned_n = print_u16_cur;
127 | div10_unsigned();
128 |
129 | *(int*) print_u16_bufptr = div10_unsigned_r;
130 | print_u16_bufptr = print_u16_bufptr + 1;
131 |
132 | print_u16_cur = div10_unsigned_q;
133 | }
134 |
135 | while( print_u16_bufptr != 30000 ){ // emit them in reverse order
136 | print_u16_bufptr = print_u16_bufptr - 1;
137 | print_ch = ( *(int*) print_u16_bufptr & 255 ) + 48;
138 | print_char();
139 | }
140 | }
141 |
142 | // uses 'print_num' and 'print_ch'
143 | void print_i16()
144 | {
145 | if( print_num < 0 ){
146 | print_ch = 45; print_char(); // '-'
147 | print_num = 0 - print_num;
148 | }
149 | print_u16();
150 | }
151 |
152 | void vga_init()
153 | {
154 | // mov ah,0; mov al,0x13; int 0x10
155 | asm 180; asm 0; asm 176; asm 19; asm 205; asm 16;
156 | }
157 |
158 | void vga_clear()
159 | {
160 | // push di; xor di,di; mov bx,0xa000; mov es,bx;
161 | // mov cx,0x7d00; xor ax,ax; rep stos; pop di
162 | asm 87 ; asm 49 ; asm 255; asm 187; asm 0; asm 160;
163 | asm 142; asm 195; asm 185; asm 0; asm 125; asm 49;
164 | asm 192; asm 243; asm 171; asm 95;
165 | }
166 |
167 | int pixel_x;
168 | int pixel_y;
169 | void vga_set_pixel()
170 | {
171 | // need to multiply pixel_y by 320 = 256 + 64
172 | // use 'tmp1' for pixel index
173 | tmp1 = ( ( pixel_y << 8 ) + ( pixel_y << 6 ) ) + pixel_x;
174 |
175 | // store to 0xa000:pixel_idx
176 | // mov bx,0xa000; mov es,bx; mov bx,ax; mov BYTE PTR es:[bx],0xf
177 | tmp1 = tmp1;
178 | asm 187; asm 0; asm 160; asm 142; asm 195;
179 | asm 137; asm 195; asm 38; asm 198; asm 7; asm 15;
180 | }
181 |
182 | int port_num;
183 | int port_val;
184 | void port_inb()
185 | {
186 | dx = port_num;
187 | // mov dx,WORD PTR [0x4a0]; in al,dx
188 | asm 139; asm 22; asm 160; asm 4; asm 236;
189 |
190 | // mov WORD PTR [0x464],ax
191 | asm 137; asm 6; asm 100; asm 4;
192 | port_val = ax;
193 | }
194 | void port_inw()
195 | {
196 | // mov dx,WORD PTR [0x4a0]; in ax,dx
197 | dx = port_num;
198 | asm 139; asm 22; asm 160; asm 4; asm 237;
199 |
200 | // mov WORD PTR [0x464],ax
201 | asm 137; asm 6; asm 100; asm 4;
202 | port_val = ax;
203 | }
204 | void port_outb()
205 | {
206 | dx = port_num;
207 | ax = port_val;
208 |
209 | // mov dx,WORD PTR [0x4a0]
210 | asm 139; asm 22; asm 160; asm 4;
211 |
212 | // mov ax,WORD PTR [0x464]
213 | asm 139; asm 6; asm 100; asm 4;
214 |
215 | // outb dx,al
216 | asm 238;
217 | }
218 | void port_outw()
219 | {
220 | dx = port_num;
221 | ax = port_val;
222 |
223 | // mov dx,WORD PTR [0x4a0]
224 | asm 139; asm 22; asm 160; asm 4;
225 |
226 | // mov ax,WORD PTR [0x464]
227 | asm 139; asm 6; asm 100; asm 4;
228 |
229 | // outw dx,ax
230 | asm 239;
231 | }
232 |
233 | void dump_code_segment_and_shutdown()
234 | {
235 | /* NOTE: This code is in a different segment from data, and our compiled pointer accesses
236 | do not leave the data segment, so we need a little machine code to grab data from the
237 | code segment and stash it in a variable for C */
238 |
239 | i = 0;
240 | while( i < 8192 ){ /* Just assuming 8K is enough.. might not be true */
241 |
242 | // (put "i" in ax); mov si,ax; mov ax,cs:[si]; mov [&a],ax
243 | i = i; asm 137; asm 198; asm 46; asm 139; asm 4; asm 137; asm 133; asm 98; asm 0;
244 |
245 | print_ch = a;
246 | print_char();
247 | i = i + 1;
248 | }
249 | shutdown();
250 | }
251 |
--------------------------------------------------------------------------------
/run.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | set -e
3 | THISDIR=$(dirname "$(realpath "$0")")
4 | cd "$THISDIR"
5 |
6 | if [ "$#" != 1 ]; then
7 | echo "usage: $0 "
8 | exit 1
9 | fi
10 |
11 | input="rt/lib.c $1 rt/_start.c"
12 |
13 | ./build/lint $input
14 | ./run_raw.sh $input
15 |
--------------------------------------------------------------------------------
/run_raw.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | set -e
3 | THISDIR=$(dirname "$(realpath "$0")")
4 | cd "$THISDIR"
5 |
6 | if [ "$#" -lt 1 ]; then
7 | echo "usage: $0 [ ...]"
8 | exit 1
9 | fi
10 |
11 | cat "$@" | qemu-system-i386 -hda build/sectorc.bin -serial stdio -audiodev coreaudio,id=audio0 -machine pcspk-audiodev=audio0
12 |
--------------------------------------------------------------------------------
/sectorc.s:
--------------------------------------------------------------------------------
1 | bits 16
2 | cpu 386
3 |
4 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
5 | ;;; Token values as computed by the tokenizer's
6 | ;;; atoi() calculation
7 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
8 | %define TOK_INT 6388
9 | %define TOK_VOID 11386
10 | %define TOK_ASM 5631
11 | %define TOK_COMM 65532
12 | %define TOK_SEMI 11
13 | %define TOK_LPAREN 65528
14 | %define TOK_RPAREN 65529
15 | %define TOK_START 20697
16 | %define TOK_DEREF 64653
17 | %define TOK_WHILE_BEGIN 55810
18 | %define TOK_IF_BEGIN 6232
19 | %define TOK_BODY_BEGIN 5
20 | %define TOK_BLK_BEGIN 75
21 | %define TOK_BLK_END 77
22 | %define TOK_ASSIGN 13
23 | %define TOK_ADDR 65526
24 | %define TOK_SUB 65533
25 | %define TOK_ADD 65531
26 | %define TOK_MUL 65530
27 | %define TOK_AND 65526
28 | %define TOK_OR 76
29 | %define TOK_XOR 46
30 | %define TOK_SHL 132
31 | %define TOK_SHR 154
32 | %define TOK_EQ 143
33 | %define TOK_NE 65399
34 | %define TOK_LT 12
35 | %define TOK_GT 14
36 | %define TOK_LE 133
37 | %define TOK_GE 153
38 |
39 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
40 | ;;; Common register uses
41 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
42 | ;;; ax: current token / scratch register / emit val for stosw
43 | ;;; bx: current token
44 | ;;; cx: used by tok_next for trailing 2 bytes
45 | ;;; dl: flag for "tok_is_num"
46 | ;;; dh: flags for "tok_is_call", trailing "()"
47 | ;;; bp: saved token for assigned variable
48 | ;;; sp: stack pointer, we don't mess with this
49 | ;;; si: used with lodsw for table scans
50 | ;;; ds: fn symbol table segment (occasionally set to "cs" to access binary_oper_tbl)
51 | ;;; di: codegen destination offset
52 | ;;; es: codegen destination segment
53 | ;;; cs: always 0x07c0
54 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
55 |
56 | jmp 0x07c0:entry
57 | entry:
58 | push 0x3000 ; segment 0x3000 is used for fn symbol table
59 | pop ds
60 | push 0x2000 ; segment 0x2000 is used for codegen output buffer
61 | pop es
62 | xor di,di ; codegen index, zero'd
63 | ;; [fall-through]
64 |
65 | ;; main loop for parsing all decls
66 | compile:
67 | ;; advance to either "int" or "void"
68 | call tok_next
69 |
70 | ;; if "int" then skip a variable
71 | cmp ax,TOK_INT
72 | jne compile_function
73 | call tok_next2 ; consume "int" and the variable name
74 | jmp compile
75 |
76 | compile_function: ; parse and compile a function decl
77 | call tok_next ; consume "void"
78 | push bx ; save function name token
79 | mov [bx],di ; record function address in symtbl
80 | call compile_stmts_tok_next2 ; compile function body
81 |
82 | mov al,0xc3 ; emit "ret" instruction
83 | stosb
84 |
85 | pop bx ; if the function is _start(), we're done
86 | cmp bx,TOK_START
87 | jne compile ; otherwise, loop and compile another declaration
88 | ;; [fall-through]
89 |
90 | ;; done compiling, execute the binary
91 | execute:
92 | push es ; push the codegen segment
93 | push word [bx] ; push the offset to "_start()"
94 | push 0x4000 ; load new segment for variable data
95 | pop ds
96 | retf ; jump into it via "retf"
97 |
98 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
99 | ;;; compile statements (optionally advancing tokens beforehand)
100 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
101 | compile_stmts_tok_next2:
102 | call tok_next
103 | compile_stmts_tok_next:
104 | call tok_next
105 | compile_stmts:
106 | mov ax,bx
107 | cmp ax,TOK_BLK_END ; if we reach '}' then return
108 | je return
109 |
110 | test dh,dh ; if dh is 0, it's not a call
111 | je _not_call
112 | mov al,0xe8 ; emit "call" instruction
113 | stosb
114 |
115 | mov ax,[bx] ; load function offset from symbol-table
116 | sub ax,di ; compute relative to this location: "dest - cur - 2"
117 | sub ax,2
118 | stosw ; emit target
119 |
120 | jmp compile_stmts_tok_next2 ; loop to compile next statement
121 |
122 | _not_call:
123 | cmp ax,TOK_ASM ; check for "asm"
124 | jne _not_asm
125 | call tok_next ; tok_next to get literal byte
126 | stosb ; emit the literal
127 | jmp compile_stmts_tok_next2 ; loop to compile next statement
128 |
129 | _not_asm:
130 | cmp ax,TOK_IF_BEGIN ; check for "if"
131 | jne _not_if
132 | call _control_flow_block ; compile control-flow block
133 | jmp _patch_fwd ; patch up forward jump of if-stmt
134 |
135 | _not_if:
136 | cmp ax,TOK_WHILE_BEGIN ; check for "while"
137 | jne _not_while
138 | push di ; save loop start location
139 | call _control_flow_block ; compile control-flow block
140 | jmp _patch_back ; patch up backward and forward jumps of while-stmt
141 |
142 | _not_while:
143 | call compile_assign ; handle an assignment statement
144 | jmp compile_stmts ; loop to compile next statement
145 |
146 | _patch_back:
147 | mov al,0xe9 ; emit "jmp" instruction (backwards)
148 | stosb
149 | pop ax ; restore loop start location
150 | sub ax,di ; compute relative to this location: "dest - cur - 2"
151 | sub ax,2
152 | stosw ; emit target
153 | ;; [fall-through]
154 | _patch_fwd:
155 | mov ax,di ; compute relative fwd jump to this location: "dest - src"
156 | sub ax,si
157 | mov es:[si-2],ax ; patch "src - 2"
158 | jmp compile_stmts_tok_next ; loop to compile next statement
159 |
160 | _control_flow_block:
161 | call compile_expr_tok_next ; compile loop or if condition expr
162 |
163 | ;; emit forward jump
164 | mov ax,0xc085 ; emit "test ax,ax"
165 | stosw
166 | mov ax,0x840f ; emit "je" instruction
167 | stosw
168 | stosw ; emit placeholder for target
169 |
170 | push di ; save forward patch location
171 | call compile_stmts_tok_next ; compile a block of statements
172 | pop si ; restore forward patch location
173 |
174 | return: ; this label gives us a way to do conditional returns
175 | ret ; (e.g. "jne return")
176 |
177 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
178 | ;;; compile assignment statement
179 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
180 | compile_assign:
181 | cmp ax,TOK_DEREF ; check for "*(int*)"
182 | jne _not_deref_store
183 | call tok_next ; consume "*(int*)"
184 | call save_var_and_compile_expr ; compile rhs first
185 | ;; [fall-through]
186 |
187 | compile_store_deref:
188 | mov bx,bp ; restore dest var token
189 | mov ax,0x0489 ; code for "mov [si],ax"
190 | ;; [fall-through]
191 |
192 | emit_common_ptr_op:
193 | push ax
194 | mov ax,0x368b ; emit "mov si,[imm]"
195 | call emit_var
196 | pop ax
197 | stosw ; emit
198 | ret
199 |
200 | _not_deref_store:
201 | call save_var_and_compile_expr ; compile rhs first
202 | ;; [fall-through]
203 |
204 | compile_store:
205 | mov bx,bp ; restore dest var token
206 | mov ax,0x0689 ; code for "mov [imm],ax"
207 | jmp emit_var ; [tail-call]
208 |
209 | save_var_and_compile_expr:
210 | mov bp,bx ; save dest to bp
211 | call tok_next ; consume dest
212 | ;; [fall-through] ; fall-through will consume "=" before compiling expr
213 |
214 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
215 | ;;; compile expression
216 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
217 | compile_expr_tok_next:
218 | call tok_next
219 | compile_expr:
220 | call compile_unary ; compile left-hand side
221 |
222 | push ds ; need to swap out 'ds' to scan the table with lodsw
223 | push cs
224 | pop ds
225 |
226 | mov si,binary_oper_tbl - 2 ; load ptr to operator table (biased backwards)
227 | _check_next:
228 | lodsw ; discard 16-bit of machine-code
229 | lodsw ; load 16-bit token value
230 | cmp ax,bx ; matches token?
231 | je _found
232 | test ax,ax ; end of table?
233 | jne _check_next
234 |
235 | pop ds
236 | ret ; all-done, not found
237 |
238 | _found:
239 | lodsw ; load 16-bit of machine-code
240 | push ax ; save it to the stack
241 | mov al,0x50 ; code for "push ax"
242 | stosb ; emit
243 | call tok_next ; consume operator token
244 | call compile_unary ; compile right-hand side
245 | mov ax,0x9159 ; code for "pop cx; xchg ax,cx"
246 | stosw ; emit
247 |
248 | pop bx ; restore 16-bit of machine-code
249 | cmp bh,0xc0 ; detect the special case for comparison ops
250 | jne emit_op
251 | emit_cmp_op:
252 | mov ax,0xc839 ; code for "cmp ax,cx"
253 | stosw ; emit
254 | mov ax,0x00b8 ; code for "mov ax,0x00"
255 | stosw ; emit
256 | mov ax,0x0f00 ; code for the rest of imm and prefix for "setX" instrs
257 | stosw ; emit
258 | ;; [fall-through]
259 |
260 | emit_op:
261 | mov ax,bx
262 | stosw ; emit machine code for op
263 | pop ds
264 | ret
265 |
266 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
267 | ;;; compile unary
268 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
269 | compile_unary:
270 | cmp ax,TOK_DEREF ; check for "*(int*)"
271 | jne _not_deref
272 | ;; compile deref (load)
273 | call tok_next ; consume "*(int*)"
274 | mov ax,0x048b ; code for "mov ax,[si]"
275 | jmp emit_common_ptr_op ; [tail-call]
276 |
277 | _not_deref:
278 | cmp ax,TOK_LPAREN ; check for "("
279 | jne _not_paren
280 | call compile_expr_tok_next ; consume "(" and compile expr
281 | jmp tok_next ; [tail-call] to consume ")"
282 |
283 | _not_paren:
284 | cmp ax,TOK_ADDR ; check for "&"
285 | jne _not_addr
286 | call tok_next ; consume "&"
287 | mov ax,0x068d ; code for "lea ax,[imm]"
288 | jmp emit_var ; [tail-call] to emit code
289 |
290 | _not_addr:
291 | test dl,dl ; check for tok_is_num
292 | je _not_int
293 | mov al,0xb8 ; code for "mov ax,imm"
294 | stosb ; emit
295 | jmp emit_tok ; [tail-call] to emit imm
296 |
297 | _not_int:
298 | ;; compile var
299 | mov ax,0x068b ; code for "mov ax,[imm]"
300 | ;; [fall-through]
301 |
302 | emit_var:
303 | stosw ; emit
304 | add bx,bx ; bx = 2*bx (scale up for 16-bit)
305 | ;; [fall-through]
306 |
307 | emit_tok:
308 | mov ax,bx
309 | stosw ; emit token value
310 | jmp tok_next ; [tail-call]
311 |
312 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
313 | ;;; get next token, setting the following:
314 | ;;; ax: token
315 | ;;; bx: token
316 | ;;; dl: tok_is_num
317 | ;;; dh: tok_is_call
318 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
319 | tok_next2:
320 | call tok_next
321 | ;; [fall-through]
322 | tok_next:
323 | call getch
324 | cmp al,32 ; skip spaces (anything <= ' ' is considered space)
325 | jle tok_next
326 |
327 | xor bx,bx ; zero token reg
328 | xor cx,cx ; zero last-two chars reg
329 |
330 | cmp al,57
331 | setle dl ; tok_is_num = (al <= '9')
332 |
333 | _nextch:
334 | cmp al,32
335 | jle _done ; if char is space then break
336 |
337 | shl cx,8
338 | mov cl,al ; shift this char into cx
339 |
340 | imul bx,10
341 | sub ax,48
342 | add bx,ax ; atoi computation: bx = 10 * bx + (ax - '0')
343 |
344 | call getch
345 | jmp _nextch ; [loop]
346 |
347 | _done:
348 | mov ax,cx
349 | cmp ax,0x2f2f ; check for single-line comment "//"
350 | je _comment_double_slash
351 | cmp ax,0x2f2a ; check for multi-line comment "/*"
352 | je _comment_multi_line
353 | cmp ax,0x2829 ; check for call parens "()"
354 | sete dh
355 |
356 | mov ax,bx ; return token in ax also
357 | ret
358 |
359 | _comment_double_slash:
360 | call getch ; get next char
361 | cmp al,10 ; check for newline '\n'
362 | jne _comment_double_slash ; [loop]
363 | jmp tok_next ; [tail-call]
364 |
365 | _comment_multi_line:
366 | call tok_next ; get next token
367 | cmp ax,65475 ; check for token "*/"
368 | jne _comment_multi_line ; [loop]
369 | jmp tok_next ; [tail-call]
370 |
371 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
372 | ;;; get next char: returned in ax (ah == 0, al == ch)
373 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
374 | getch:
375 | push dx ; need to save dx because tok_next uses it for flags
376 | xor si,si ; use ds:0 as a semi-colon buffer, encodes smaller via si
377 | mov ax,[si] ; load the semi-colon buffer
378 | xor [si],ax ; zero the buffer
379 | cmp al,59 ; check for ';'
380 | je getch_done ; if ';' return it
381 |
382 | getch_tryagain:
383 | mov ax,0x0200
384 | xor dx,dx
385 | int 0x14 ; get a char from serial (bios function)
386 |
387 | and ah,0x80 ; check for failure and clear ah as a side-effect
388 | jne getch_tryagain ; failed, try again later
389 |
390 | cmp al,59 ; check for ';'
391 | jne getch_done ; if not ';' return it
392 | mov [si],ax ; save the ';'
393 | xor ax,ax ; return 0 instead, treated as whitespace
394 |
395 | getch_done:
396 | pop dx
397 | ret
398 |
399 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
400 | ;;; binary operator table
401 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
402 | binary_oper_tbl:
403 | dw TOK_ADD,0xc103 ; add ax,cx
404 | dw TOK_SUB,0xc12b ; sub ax,cx
405 | dw TOK_MUL,0xe1f7 ; mul cx (dx:ax = ax * cx)
406 | dw TOK_AND,0xc123 ; and ax,cx
407 | dw TOK_OR,0xc10b ; or ax,cx
408 | dw TOK_XOR,0xc133 ; xor ax,cx
409 | dw TOK_SHL,0xe0d3 ; shl ax,cl
410 | dw TOK_SHR,0xf8d3 ; sar ax,cl (arithmetic right shift)
411 | dw TOK_EQ,0xc094 ; sete al
412 | dw TOK_NE,0xc095 ; setne al
413 | dw TOK_LT,0xc09c ; setl al
414 | dw TOK_GT,0xc09f ; setg al
415 | dw TOK_LE,0xc09e ; setle al
416 | dw TOK_GE,0xc09d ; setge al
417 | dw 0 ; [sentinel]
418 |
419 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
420 | ;;; boot signature
421 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
422 | times 510-($-$$) db 0
423 | db 0x55, 0xaa
424 |
--------------------------------------------------------------------------------