├── ReadMe.md ├── ReleaseNotes.md ├── ToDo.md └── src ├── Makefile ├── compiler.c ├── compiler.h ├── defines.h ├── functiontable.c ├── functiontable.h ├── identifiertypes.c ├── identifiertypes.h ├── list.c ├── list.h ├── make.bat ├── optimizer.c ├── optimizer.h ├── parsetree.c ├── parsetree.h ├── prunefunctions.c ├── prunefunctions.h ├── stringqueue.c ├── stringqueue.h ├── stringtable.c ├── stringtable.h ├── strutil.c ├── strutil.h ├── symbolstack.c ├── symbolstack.h ├── symboltable.c ├── symboltable.h ├── u.l ├── u.y ├── upre.l └── upre.y /ReadMe.md: -------------------------------------------------------------------------------- 1 | U Programming Language 2 | ================================================================================ 3 | 4 | Introduction 5 | -------------------------------------------------------------------------------- 6 | 7 | Around 10 years ago, I became obsessed with computer operating systems. I'm not 8 | entirely sure why, but I think the idea of controlling the entire computer, the 9 | individual bits and bytes, appealed to me. At the time, I had been programming 10 | for a couple years and, being young and naïve, I decided I would take on the 11 | challenge of writing my own hobby system. 12 | 13 | I had heard that one needed assembly language (at least at some level) to write 14 | such programs. From here, I searched the Internet (not as simple back then with 15 | my finicky dialup connection) for free x86 assemblers and documentation relating 16 | to them. I found a couple good ones, and proceeded to write my system. 17 | 18 | ...but assembly was HARD! It was a lot of fun to write small optimized blocks 19 | of code (and I still get a good deal of enjoyment from this) but a few years 20 | later and with only meager results (I managed to develop a control program with 21 | a basic command prompt and the beginnings of a file system) I put the project on 22 | hold indefinitely. 
I attempted to rewrite the system in C, but constantly got 23 | fed up with linkers and some of the housekeeping inherent to the language. The 24 | OS I wanted to write was just for fun, and using a language like this would be 25 | sort of like pheasant hunting with a howitzer, to borrow a phrase. 26 | 27 | Recently, during my senior year as an undergrad, I took a programming languages 28 | and compilers course. In it, I was exposed to the tools Flex and Bison and 29 | they instantly became my new favorite software toys. Prior to this, I had been 30 | thinking of things that I would like in my ideal language, particularly a 31 | simple, albeit low-level one. After graduating, I was able to implement a 32 | number of these ideas in the compiler for this language that I dubbed, “U”. 33 | Granted, not all of the features that would make it an ideal language are, or 34 | may ever be, added to it (I'm quite lazy). But, in the words of Bjarne 35 | Stroustrup, such endeavors constitute a “sterile quest for perfection”. 36 | 37 | Some things I wanted U to be: 38 | ----------------------------- 39 | * a hobby language - writing a compiler for the sake of writing a compiler 40 | * tightly integrated with x86 assembly -> easy to read / write low-level 41 | code for real mode 42 | * easy to learn / read overall -> no curly braces, small list of reserved 43 | words, simple grammar (granted, some of these things are due to my 44 | inherent laziness in coding it up) 45 | * education -> maybe it can be helpful for learning low-level programming, 46 | assembly, and how compilers work in general 47 | 48 | Some things that it's not meant to be / isn't: 49 | ---------------------------------------------- 50 | * well tested and suited for production code 51 | * portable (it's not C) 52 | * completely elegant and efficient (take a look at my source files - you'll 53 | understand) 54 | 55 | At any rate, feel free to play around with the language. 
I can't promise it's 56 | bug free (in fact, I know it's not), but hopefully it will get improved further 57 | over time. Like Larry Wall, I reserve the right to be its "benevolent 58 | dictator", but feel free to make suggestions and use the source code for your 59 | own projects. It should finally be noted that small portions of the code (mostly 60 | in the parsetree and symboltable files) were written by Dr. Brian Turnquist at 61 | Bethel University and I have him to thank for the excellent intro course he 62 | taught on compiler construction. 63 | 64 | With that, here's some sparse documentation. Happy coding! 65 | 66 | -------------------------------------------------------------------------------- 67 | 68 | License 69 | -------------------------------------------------------------------------------- 70 | The MIT License (MIT) 71 | 72 | Copyright (c) 2012 Rob Upcraft 73 | 74 | Permission is hereby granted, free of charge, to any person obtaining a copy 75 | of this software and associated documentation files (the "Software"), to deal 76 | in the Software without restriction, including without limitation the rights 77 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 78 | copies of the Software, and to permit persons to whom the Software is 79 | furnished to do so, subject to the following conditions: 80 | 81 | The above copyright notice and this permission notice shall be included in 82 | all copies or substantial portions of the Software. 83 | 84 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 85 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 86 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 87 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 88 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 89 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 90 | THE SOFTWARE. 

--------------------------------------------------------------------------------
Building
--------------------------------------------------------------------------------
To make a working executable, you will need the following installed on your
development machine:
  * flex (http://flex.sourceforge.net)
  * bison v3 (http://gnu.org/software/bison)
  * a C compiler (I used GCC on an Ubuntu box)

In theory, other compilers and platforms should work but I haven't done
extensive testing.

To build the compiler, change to the directory containing the source code and
type "make". This will build an executable using the included Makefile.

--------------------------------------------------------------------------------
Compiler
--------------------------------------------------------------------------------
The U compiler compiles source code to a single output file containing Intel x86
assembly code. This can then be assembled to executable machine code with
assemblers such as FASM or NASM.

Invoking the compiler:

    u [input file] [optional output file] [flags]

Invoking the compiler without any input arguments will present the user with
a help screen outlining the options offered.


--------------------------------------------------------------------------------

Language Features & Examples
--------------------------------------------------------------------------------

### General Program Structure ###

Every program in U, like C, contains a main() function that returns 'void'.
Note that instead of opening and closing braces, each function (and every other
code block in the language) is terminated with the 'end' keyword:

    void main()
        // code here
    end

Comments, as in C, are expressed with either single line (//) or multi-line (/*
and */) syntax. Every program is composed of a collection of functions like the
following example. Note that the compiler takes multiple passes through the
input files, so header files, prototypes, and function ordering are neither
necessary nor relevant. Also, 'import' statements can be used outside of
function blocks to import external U source files into the compiler:

    // Import some other source files
    import "somefile.u"
    import "anotherfile.u"

    /* Main function */
    void main()
        print("Hello, world")
        putc('!')
    end

    /* Print function */
    void print(byte[] str)
        // some code for printing the input string . . .
    end

    /* Character printing function */
    void putc(byte c)
        // code for printing a single character . . .
    end

### Variables and Data Types ###

U includes 5 data types (in addition to the 'void' type) that can be used to
declare variables within a program:

    byte   - An 8 bit byte.
    word   - A 16 bit word.
    bool   - Data type dedicated to storing boolean (true/false) values.
    byte[] - A pointer to an array of bytes. The pointer itself requires 32 bits
             of memory.
    word[] - A pointer to an array of words. The pointer itself requires 32 bits
             of memory.

Variables are declared and initialized in a manner very similar to C:

    void main()
        byte a = 'A'    // store the character 'A' in the variable, a
        byte c          // declare the variable, c
        c = a           // set c to a's value

        // Here we declare a variable called x and initialize it to point to
        // the location 10:5 in memory. In U, the value 10:5 represents the
        // 10th segment in memory with an offset of 5 bytes (seg:off). This
        // ':' operator can be used this way in other non-constant expressions
        // as well
        byte[] x = 10:5
        byte[] y        // declare the byte pointer, y
        y = x           // set y's pointer value to x's pointer value

        // Point str to the string "hello, world!" in memory. Note that this
        // string is stored in a single location, so modifying it will modify
        // it in every other location that points to the identical string.
        byte[] str = "hello, world!"
    end

As in C, basic mathematical operators (+, -, *, and /) as well as % (modulus)
are supported. Increment (++) and decrement (--), as well as other shortcuts
like the '+=' operator, do not currently exist in the language.


### Conditional Statements ###

At present, the only conditional control structure supported in U is the if-else
block. Note that the words 'true' and 'false' are reserved and have their
conventional boolean values:

    void main()
        byte a = 'A'
        byte b = 'B'

        // simple if-else
        if (a == b)
            // this code shouldn't be executed
        else
            // this code should be executed
        end

        // if (without else)
        if (false)
            // never execute this block
        end

        // if with else-if blocks
        if (a == 'C')
            // don't execute this
        else if (a == 'B')
            // don't execute this either
        else if (a == 'A')
            // this looks right!
        else
            // an extra condition, just in case
        end
    end


### Iteration ###

Currently, the only loop structure supported in U is the 'while' block:

    import "someio.u"
    void main()
        word i = 0

        // print 'X' 10 times
        while (i < 10)
            putc('X')
            i = i + 1
        end
    end


### Inline Assembly ###

One of the more distinctive features of U is that it is designed to support
inline Intel x86 assembly. At present, only the 'mov' and 'int' instructions are
supported (to take advantage of BIOS interrupt calls) but more are planned to be
added as the language is further developed:

    /* Main function */
    void main()
        putc('A')       // print character
        putc(getc())    // print a character that the user types
    end

    /* Function that actually prints a character using a BIOS call */
    void putc(byte c)
        asm             // start of assembly block
            mov ah, 0Eh
            mov al, [c]
            int 10h
        end             // end of assembly block
    end

    /* Function that gets a character from the keyboard using a BIOS call */
    byte getc()
        byte c
        asm
            mov ah, 0
            int 16h
            mov [c], al
        end
        return c
    end

--------------------------------------------------------------------------------
/ReleaseNotes.md:
--------------------------------------------------------------------------------
# RELEASE NOTES

## 10/10/2013 - v0.0.6

* added segment() and offset() built-in functions for extracting
  the segment and offset from a pointer expression

## 9/21/2013 - v0.0.4

* added assembly call instruction with integer-type arguments

## 5/9/2013 - v0.0.3

* added license
* cleaned up lexer and parser files (u.l, upre.l, u.y, upre.y)

## 2/23/2013 - v0.0.2

* added integer constant folding

## 12/7/2012 - v0.0.1

* initial release

--------------------------------------------------------------------------------
/ToDo.md:
--------------------------------------------------------------------------------
U Compiler Wish List
====================
This is a non-comprehensive list of features I'm planning to add to the compiler
if I ever have time and become less lazy:

* a "define" preprocessor feature (this would work much like #define in C)
* global variables - I have a bit of a personal vendetta against these from my
  training and experience in software engineering, but for a language like
  this, they seem almost (gasp) inevitable
* more assembly code in asm blocks - Right now only "mov" and "int" are supported,
  and I'd like to support a greater subset of the syntax

--------------------------------------------------------------------------------
/src/Makefile:
--------------------------------------------------------------------------------
LEX = flex
YACC = bison

CC = gcc

u: ulex.c uprelex.c u.tab.c upre.tab.c symboltable.c symbolstack.c functiontable.c parsetree.c compiler.c strutil.c stringtable.c
	$(CC) -o u ulex.c uprelex.c u.tab.c upre.tab.c symboltable.c symbolstack.c functiontable.c parsetree.c compiler.c strutil.c stringtable.c list.c identifiertypes.c stringqueue.c prunefunctions.c optimizer.c -ll
	rm ulex.c uprelex.c upre.tab.c upre.tab.h u.tab.c u.tab.h

u.tab.c u.tab.h: u.y
	rm -f u.tab.h u.tab.c upre.tab.h upre.tab.c
	$(YACC) -d u.y
	$(YACC) -d upre.y

ulex.c uprelex.c: u.l u.tab.h upre.l upre.tab.h
	rm -f ulex.c uprelex.c
	$(LEX) u.l
	$(LEX) upre.l

clean:
	rm -f u u.tab.h u.tab.c upre.tab.c upre.tab.h ulex.c uprelex.c

--------------------------------------------------------------------------------
/src/compiler.c:
--------------------------------------------------------------------------------
// Includes
#include <stdio.h>
#include <string.h>
#include "defines.h"
#include "parsetree.h"
#include "compiler.h"
#include "strutil.h"
#include "functiontable.h"
#include "stringtable.h"
#include "optimizer.h"

// Global variables
FILE* fp;
extern function_table* fTable;
extern string_table* strTable;

// Defines
#define CALLOFFSET 4

// Append string literals
// Each literal becomes a "strlit_N db ..." line: printable runs are quoted,
// newline characters are written as the number 10, and a 0 terminates the string.
void AppendStringLiterals()
{
    int i;
    for (i = 0; i < strTable->size; i++)
    {
        //fprintf(fp, "strlit_%d db '%s', 0\n", i, strTable->strings[i]);
        int prevChar = FALSE;   // TRUE while inside an open quoted run
        int j;
        char* str = strTable->strings[i];
        int len = strlen(str);
        fprintf(fp, "strlit_%d db ", i);
        for (j = 0; j < len; j++)
        {
            if (str[j] == 10)
            {
                if (prevChar)
                    fprintf(fp, "', ");

                fprintf(fp, "10");
                if (j != len - 1)
                    fprintf(fp, ", ");
                prevChar = FALSE;
            } else {
                if (!prevChar)
                    fprintf(fp, "'");
                fprintf(fp, "%c", str[j]);
                if (j == len - 1)
                    fprintf(fp, "'");
                prevChar = TRUE;
            }
        }
        fprintf(fp, ", 0\n");
    }
}

// Recursive helper function for NASM Emit()
void EmitHelper(struct tree_node* node)
{
    if (node->type == TN_FUNCTION)
    {
        // Function root
        EmitHelper(node->operands[0]);
        if (node->numOperands > 1)
        {
            EmitHelper(node->operands[1]);
        }
        fprintf(fp, "pop bp\n");
        fprintf(fp, "ret\n");
    } else if (node->type == TN_FDEF) {
        // Function definition
        fprintf(fp, "\n_%s:\n", node->sval);
        fprintf(fp, "push bp\nmov bp, sp\n");
    } else if (node->type == TN_FUNCTIONCALL) {
        // Function call
        // Reserve memory on stack
        function* f = LookupFunction(fTable, node->sval);  // function struct

        // Reserve stack space for local variables
        if (f->frameSize - f->paramSize > 0)
            fprintf(fp, "sub sp, %d\n", f->frameSize -
f->paramSize); 81 | 82 | // "Push" arguments to stack 83 | if (node->operands[0] != NULL) 84 | EmitHelper(node->operands[0]); 85 | 86 | // Call function 87 | fprintf(fp, "call _%s\n", node->sval); 88 | if (f->frameSize > 0) 89 | fprintf(fp, "add sp, %d\n", f->frameSize); // restore stack memory 90 | 91 | // Push return value 92 | if (f->type != IT_VOID) 93 | { 94 | if (f->type == IT_BYTEP || f->type == IT_WORDP) 95 | { 96 | fprintf(fp, "push bx\npush ax\n"); 97 | } else { 98 | fprintf(fp, "push ax\n"); 99 | } 100 | } 101 | } else if (node->type == TN_ARGLIST) { 102 | // Parameter list 103 | int i; 104 | 105 | // Push arguments onto stack in reverse order 106 | int argSize = 0; 107 | for (i = node->numOperands - 1; i >= 0; i--) 108 | { 109 | EmitHelper(node->operands[i]); 110 | 111 | node_type t = node->operands[i]->type; 112 | if (t == TN_PTR_IDENT || t == TN_REF || t == TN_STRING_LITERAL) 113 | { 114 | // Reverse segment/offset order for pointers 115 | fprintf(fp, "pop ax\npop bx\npush ax\npush bx\n"); 116 | } 117 | } 118 | } else if (node->type == TN_RET_INT) { 119 | // Return int value 120 | EmitHelper(node->operands[0]); 121 | fprintf(fp, "pop ax\npop bp\nret\n"); 122 | } else if (node->type == TN_RET_BOOL) { 123 | // Return bool value 124 | EmitHelper(node->operands[0]); 125 | fprintf(fp, "pop ax\npop bp\nret\n"); 126 | } else if (node->type == TN_RET_PTR) { 127 | // Return pointer value 128 | EmitHelper(node->operands[0]); 129 | fprintf(fp, "pop ax\npop bx\npop bp\nret\n"); 130 | } else if (node->type == TN_BYTE_ASSIGN) { 131 | // Byte variable assignment 132 | EmitHelper(node->operands[1]); 133 | fprintf(fp, "pop ax\nmov [bp+%d], al\t; %s\n", node->operands[0]->ival + CALLOFFSET, node->operands[0]->sval); 134 | } else if (node->type == TN_WORD_ASSIGN) { 135 | // Word variable assignment 136 | EmitHelper(node->operands[1]); 137 | fprintf(fp, "pop ax\nmov [bp+%d], ax\t; %s\n", node->operands[0]->ival + CALLOFFSET, node->operands[0]->sval); 138 | } else if 
(node->type == TN_BOOL_ASSIGN) { 139 | // Boolean variable assignment 140 | EmitHelper(node->operands[1]); 141 | fprintf(fp, "pop ax\nmov [bp+%d], al\t; %s\n", node->operands[0]->ival + CALLOFFSET, node->operands[0]->sval); 142 | } else if (node->type == TN_PTR_ASSIGN) { 143 | // Pointer variable assignment 144 | EmitHelper(node->operands[1]); 145 | fprintf(fp, "pop ax\nmov [bp+%d], ax\t; %s (offset)\n", node->operands[0]->ival + CALLOFFSET + 2, node->operands[0]->sval); 146 | fprintf(fp, "pop ax\nmov [bp+%d], ax\t; %s (segment)\n", node->operands[0]->ival + CALLOFFSET, node->operands[0]->sval); 147 | } else if (node->type == TN_PTR_BYTE_ASSIGN) { 148 | // Byte pointer element assignment 149 | EmitHelper(node->operands[1]); // Value 150 | EmitHelper(node->operands[0]); // Index 151 | fprintf(fp, "mov es, [bp+%d]\n", node->ival + CALLOFFSET); 152 | fprintf(fp, "mov si, [bp+%d]\n", node->ival + CALLOFFSET + 2); 153 | fprintf(fp, "pop ax\nadd si, ax\n"); 154 | fprintf(fp, "pop ax\n"); 155 | fprintf(fp, "mov [es:si], al\n"); 156 | } else if (node->type == TN_PTR_WORD_ASSIGN) { 157 | // Word pointer element assignment 158 | EmitHelper(node->operands[1]); // Value 159 | EmitHelper(node->operands[0]); // Index 160 | fprintf(fp, "mov es, [bp+%d]\n", node->ival + CALLOFFSET); 161 | fprintf(fp, "mov si, [bp+%d]\n", node->ival + CALLOFFSET + 2); 162 | fprintf(fp, "pop ax\nadd si, ax\n"); 163 | fprintf(fp, "pop ax\n"); 164 | fprintf(fp, "mov [es:si], ax\n"); 165 | } else if (node->type == TN_BYTE_IDENT) { 166 | // Byte identifier 167 | fprintf(fp, "mov ax, [bp+%d]\nxor ah, ah\npush ax\t; %s\n", node->ival + CALLOFFSET, node->sval); 168 | } else if (node->type == TN_WORD_IDENT) { 169 | // Word identifier 170 | fprintf(fp, "mov ax, [bp+%d]\npush ax\t; %s\n", node->ival + CALLOFFSET, node->sval); 171 | } else if (node->type == TN_BOOL_IDENT) { 172 | // Boolean identifier 173 | fprintf(fp, "mov al, [bp+%d]\npush ax\t; %s\n", node->ival + CALLOFFSET, node->sval); 174 | } else if 
(node->type == TN_PTR_IDENT) { 175 | // Pointer identifier 176 | fprintf(fp, "mov ax, [bp+%d]\npush ax\t; %s (segment)\nmov ax, [bp+%d]\npush ax\t; %s (offset)\n", node->ival + CALLOFFSET, node->sval, node->ival + CALLOFFSET + 2, node->sval); 177 | } else if (node->type == TN_INTEGER) { 178 | // Integer constant 179 | fprintf(fp, "push %d\n", node->ival); 180 | } else if (node->type == TN_CHAR) { 181 | // Character constant 182 | fprintf(fp, "push '%c'\n", node->sval[0]); 183 | } else if (node->type == TN_PTR_BYTE) { 184 | // Get byte value in an array 185 | EmitHelper(node->operands[0]); 186 | fprintf(fp, "mov es, [bp+%d]\n", node->ival + CALLOFFSET); 187 | fprintf(fp, "mov si, [bp+%d]\n", node->ival + CALLOFFSET + 2); 188 | fprintf(fp, "pop ax\nadd si, ax\n"); 189 | fprintf(fp, "xor bh, bh\n"); 190 | fprintf(fp, "mov bl, [es:si]\n"); 191 | fprintf(fp, "push bx\n"); 192 | } else if (node->type == TN_PTR_WORD) { 193 | // Get word value in an array 194 | EmitHelper(node->operands[0]); 195 | fprintf(fp, "mov es, [bp+%d]\n", node->ival + CALLOFFSET); 196 | fprintf(fp, "mov si, [bp+%d]\n", node->ival + CALLOFFSET + 2); 197 | fprintf(fp, "pop ax\nadd si, ax\n"); 198 | fprintf(fp, "mov bx, [es:si]\n"); 199 | fprintf(fp, "push bx\n"); 200 | } else if (node->type == TN_TRUE) { 201 | // Logical true 202 | fprintf(fp, "push 1\n"); 203 | } else if (node->type == TN_FALSE) { 204 | fprintf(fp, "push 0\n"); 205 | } else if (node->type == TN_WHILE) { 206 | // While statement 207 | fprintf(fp, "while_%d:\n", node->id); 208 | EmitHelper(node->operands[0]); 209 | fprintf(fp, "pop ax\n"); 210 | fprintf(fp, "cmp al, 0\n"); 211 | fprintf(fp, "jne begin_while_%d\n", node->id); 212 | fprintf(fp, "jmp end_while_%d\n", node->id); 213 | fprintf(fp, "begin_while_%d:\n", node->id); 214 | EmitHelper(node->operands[1]); 215 | fprintf(fp, "jmp while_%d\n", node->id); 216 | fprintf(fp, "end_while_%d:\n", node->id); 217 | } else if (node->type == TN_IF) { 218 | // If statement 219 | 
EmitHelper(node->operands[0]); 220 | fprintf(fp, "pop ax\n"); 221 | fprintf(fp, "cmp al, 1\n"); 222 | fprintf(fp, "je begin_if_%d\n", node->id); 223 | fprintf(fp, "jmp else_%d\n", node->id); 224 | fprintf(fp, "begin_if_%d:\n", node->id); 225 | EmitHelper(node->operands[1]); 226 | fprintf(fp, "jmp endif_%d\nelse_%d:\n", node->id, node->id); 227 | if (node->operands[2] != NULL) 228 | EmitHelper(node->operands[2]); 229 | fprintf(fp, "endif_%d:\n", node->id); 230 | } else if (node->type == TN_SEGCALL) { 231 | // segment() built-in function 232 | EmitHelper(node->operands[0]); 233 | fprintf(fp, "pop ax\npop bx\n"); 234 | fprintf(fp, "push bx\n"); 235 | } else if (node->type == TN_OFFCALL) { 236 | // offset() built-in function 237 | EmitHelper(node->operands[0]); 238 | fprintf(fp, "pop ax\npop bx\n"); 239 | fprintf(fp, "push ax\n"); 240 | } else if (node->type == TN_ASM) { 241 | // Assembly code 242 | EmitHelper(node->operands[0]); 243 | } else if (node->type == TN_AMOV) { 244 | // mov instruction 245 | struct tree_node* op1 = node->operands[0]; 246 | struct tree_node* op2 = node->operands[1]; 247 | 248 | if (op1->type == TN_ASMREG) 249 | fprintf(fp, "mov %s, ", regStr(op1->ival)); 250 | else if (op1->type == TN_ASMLOC) 251 | fprintf(fp, "mov [bp+%d], ", op1->operands[0]->ival + CALLOFFSET); 252 | 253 | if (op2->type == TN_INTEGER) 254 | fprintf(fp, "%d\n", op2->ival); 255 | else if (op2->type == TN_ASMREG) 256 | fprintf(fp, "%s\n", regStr(op2->ival)); 257 | else if (op2->type == TN_ASMLOC) 258 | fprintf(fp, "[bp+%d]\n", op2->operands[0]->ival + CALLOFFSET); 259 | else if (op2->type == TN_CHAR) 260 | fprintf(fp, "'%s'\n", op2->sval); 261 | } else if (node->type == TN_AINT) { 262 | // int instruction 263 | fprintf(fp, "int %d\n", node->operands[0]->ival); 264 | } else if (node->type == TN_ACALL) { 265 | // call instruction 266 | fprintf(fp, "call %d\n", node->operands[0]->ival); 267 | } else if (node->type == TN_IADD) { 268 | // Integer addition 269 | 
EmitHelper(node->operands[0]); 270 | EmitHelper(node->operands[1]); 271 | fprintf(fp, "pop bx\npop ax\nadd ax, bx\npush ax\n"); 272 | } else if (node->type == TN_ISUB) { 273 | // Integer subtraction 274 | EmitHelper(node->operands[0]); 275 | EmitHelper(node->operands[1]); 276 | fprintf(fp, "pop bx\npop ax\nsub ax, bx\npush ax\n"); 277 | } else if (node->type == TN_UMINUS) { 278 | // Unary minus 279 | EmitHelper(node->operands[0]); 280 | fprintf(fp, "pop ax\nmov bx, 0\nsub bx, ax\npush bx\n"); 281 | } else if (node->type == TN_IMUL) { 282 | // Integer multiplication 283 | EmitHelper(node->operands[0]); 284 | EmitHelper(node->operands[1]); 285 | fprintf(fp, "pop bx\npop ax\nimul ax, bx\npush ax\n"); 286 | } else if (node->type == TN_IDIV) { 287 | // Integer division 288 | EmitHelper(node->operands[0]); 289 | EmitHelper(node->operands[1]); 290 | fprintf(fp, "pop bx\npop ax\nxor dx, dx\nidiv bx\npush ax\n"); 291 | } else if (node->type == TN_IMOD) { 292 | // Integer modulus 293 | EmitHelper(node->operands[0]); 294 | EmitHelper(node->operands[1]); 295 | fprintf(fp, "pop bx\npop ax\nxor dx, dx\nidiv bx\npush dx\n"); 296 | } else if (node->type == TN_IEQ || node->type == TN_BEQ) { 297 | // Integer equality 298 | EmitHelper(node->operands[0]); 299 | EmitHelper(node->operands[1]); 300 | fprintf(fp, "pop bx\npop ax\nmov dx, 1\ncmp ax, bx\nje eq_%d\nmov dx, 0\neq_%d:\npush dx\n", node->id, node->id); 301 | } else if (node->type == TN_INEQ || node->type == TN_BNEQ) { 302 | // Integer inequality 303 | EmitHelper(node->operands[0]); 304 | EmitHelper(node->operands[1]); 305 | fprintf(fp, "pop bx\npop ax\nmov dx, 1\ncmp ax, bx\njne eq_%d\nmov dx, 0\neq_%d:\npush dx\n", node->id, node->id); 306 | } else if (node->type == TN_IGT) { 307 | // Integer greater than 308 | EmitHelper(node->operands[0]); 309 | EmitHelper(node->operands[1]); 310 | fprintf(fp, "pop bx\npop ax\nmov dx, 1\ncmp ax, bx\njg eq_%d\nmov dx, 0\neq_%d:\npush dx\n", node->id, node->id); 311 | } else if (node->type == 
TN_ILT) {
        // Integer less than
        EmitHelper(node->operands[0]);
        EmitHelper(node->operands[1]);
        fprintf(fp, "pop bx\npop ax\nmov dx, 1\ncmp ax, bx\njl eq_%d\nmov dx, 0\neq_%d:\npush dx\n", node->id, node->id);
    } else if (node->type == TN_IGTE) {
        // Integer greater than or equal to
        EmitHelper(node->operands[0]);
        EmitHelper(node->operands[1]);
        fprintf(fp, "pop bx\npop ax\nmov dx, 1\ncmp ax, bx\njge eq_%d\nmov dx, 0\neq_%d:\npush dx\n", node->id, node->id);
    } else if (node->type == TN_ILTE) {
        // Integer less than or equal to
        EmitHelper(node->operands[0]);
        EmitHelper(node->operands[1]);
        fprintf(fp, "pop bx\npop ax\nmov dx, 1\ncmp ax, bx\njle eq_%d\nmov dx, 0\neq_%d:\npush dx\n", node->id, node->id);
    } else if (node->type == TN_PEQ) {
        // Pointer equality: push 1 only if both segment and offset match
        EmitHelper(node->operands[0]);
        EmitHelper(node->operands[1]);
        fprintf(fp, "pop dx\npop cx\npop bx\npop ax\ncmp ax, cx\njne eq_%d\ncmp bx, dx\njne eq_%d\npush 1\njmp eeq_%d\neq_%d:\npush 0\neeq_%d:\n",
            node->id, node->id, node->id, node->id, node->id);
    } else if (node->type == TN_PNEQ) {
        // Pointer inequality: push 1 if either the segment or the offset differs
        EmitHelper(node->operands[0]);
        EmitHelper(node->operands[1]);
        fprintf(fp, "pop dx\npop cx\npop bx\npop ax\ncmp ax, cx\njne eq_%d\ncmp bx, dx\njne eq_%d\npush 0\njmp eeq_%d\neq_%d:\npush 1\neeq_%d:\n",
            node->id, node->id, node->id, node->id, node->id);
    } else if (node->type == TN_UBNEQ) {
        // Unary boolean inequality
        EmitHelper(node->operands[0]);
        fprintf(fp, "mov dx, 1\npop ax\ncmp ax, 0\nje eq_%d\nmov dx, 0\neq_%d:\npush dx\n", node->id, node->id);
    } else if (node->type == TN_STRING_LITERAL) {
        // Get pointer to string
        fprintf(fp, "push cs\nlea ax, [strlit_%d]\npush ax\n", node->ival);
    } else if (node->type == TN_NULL) {
        // Push null reference onto stack
        fprintf(fp, "push -1\npush 0\n");
    } else if
(node->type == TN_REF) { 349 | // Memory reference (':' operator) 350 | EmitHelper(node->operands[0]); 351 | EmitHelper(node->operands[1]); 352 | } else { 353 | // Unknown node 354 | if (node->sval != NULL) 355 | fprintf(fp, "UNKNOWN NODE: %s\n", node->sval); 356 | else 357 | fprintf(fp, "UNKNOWN NODE\n"); 358 | } 359 | 360 | // Call next statement if it exists 361 | if (node->pNextStatement != NULL) 362 | EmitHelper(node->pNextStatement); 363 | } 364 | 365 | // Emit assembly code compatible with the FASM assembler 366 | void EmitFasm(char* filename, struct tree_node* root, char* org) 367 | { 368 | // Open file 369 | fp = fopen(filename, "w+"); 370 | 371 | // Add org statement if memory placement specified 372 | if (!streq(org, "")) 373 | { 374 | fprintf(fp, "org %s\n", org); 375 | } 376 | 377 | // Call main 378 | function* f = LookupFunction(fTable, "main"); // function struct 379 | 380 | // Call function 381 | if (f->frameSize - f->paramSize > 0) 382 | fprintf(fp, "sub sp, %d\n", f->frameSize - f->paramSize); // reserve local variable stack space 383 | fprintf(fp, "call _main\n"); 384 | if (f->frameSize > 0) 385 | fprintf(fp, "add sp, %d\n", f->frameSize); // restore stack memory 386 | fprintf(fp, "ret\n"); 387 | 388 | // Write the rest of the program code 389 | EmitHelper(root); 390 | fprintf(fp, "\n"); 391 | 392 | // Add string literal code 393 | AppendStringLiterals(); 394 | 395 | // Close file 396 | fclose(fp); 397 | } 398 | 399 | // Emit assembly code compatible with the NASM assembler 400 | void EmitNasm(char* filename, struct tree_node* root) 401 | { 402 | // Open file 403 | fp = fopen(filename, "w+"); 404 | 405 | // Recurse and emit code 406 | fprintf(fp, "[BITS 16]\norg 100h\ncall main\nret\n"); // REMOVE ORG LATER 407 | EmitHelper(root); 408 | fprintf(fp, "\n"); 409 | 410 | // Close file 411 | fclose(fp); 412 | } 413 | 414 | 415 | -------------------------------------------------------------------------------- /src/compiler.h: 
-------------------------------------------------------------------------------- 1 | // Emit code compatible with the NASM assembler 2 | void EmitNasm(char* filename, struct tree_node* root); 3 | 4 | // Emit code compatible with the FASM assembler 5 | void EmitFasm(char* filename, struct tree_node* root, char* org); 6 | 7 | 8 | -------------------------------------------------------------------------------- /src/defines.h: -------------------------------------------------------------------------------- 1 | 2 | 3 | #pragma once 4 | 5 | // Standard defines 6 | #define TRUE 1 7 | #define FALSE 0 8 | 9 | // Register constants 10 | #define AX 1 11 | #define BX 2 12 | #define CX 3 13 | #define DX 4 14 | #define AH 5 15 | #define AL 6 16 | #define BH 7 17 | #define BL 8 18 | #define CH 9 19 | #define CL 10 20 | #define DH 11 21 | #define DL 12 22 | #define ES 13 23 | #define SI 14 24 | #define DI 15 25 | 26 | // Parser states 27 | #define OUTF 1 28 | #define INF 2 29 | 30 | -------------------------------------------------------------------------------- /src/functiontable.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Function table code. 
3 | */ 4 | 5 | #include "defines.h" 6 | #include 7 | #include 8 | #include "functiontable.h" 9 | #include "identifiertypes.h" 10 | #include "strutil.h" 11 | #include "assert.h" 12 | #include 13 | 14 | // Free a function 15 | void FreeFunction(function* f) 16 | { 17 | int i; 18 | free(f->lexeme); 19 | free(f->params); 20 | free(f); 21 | } 22 | 23 | // Create a new function table 24 | function_table* CreateFunctionTable() 25 | { 26 | function_table* t = (function_table*) malloc(sizeof(function_table)); 27 | t->size = 0; 28 | t->cap = 10; 29 | t->funcs = (function**) malloc(sizeof(function*) * t->cap); 30 | return t; 31 | } 32 | 33 | // Free the function table and every function in it 34 | void FreeFunctionTable(function_table* table) 35 | { 36 | int i; 37 | for (i = 0; i < table->size; i++) 38 | FreeFunction(table->funcs[i]); 39 | free(table->funcs); 40 | free(table); 41 | } 42 | 43 | // Add a parameter type to a function's parameter list 44 | void AddParameter(function_table* table, char* fname, identifier_type type) 45 | { 46 | int i; 47 | function* f = LookupFunction(table, fname); 48 | if (f == NULL) 49 | return; 50 | 51 | if (f->numParams >= f->maxParams) 52 | { 53 | // Expand array 54 | f->maxParams *= 2; 55 | identifier_type* tmp = (identifier_type*) malloc(sizeof(identifier_type) * f->maxParams); 56 | for (i = 0; i < f->numParams; i++) 57 | tmp[i] = f->params[i]; 58 | free(f->params); 59 | f->params = tmp; 60 | } 61 | 62 | // Update parameter stack size (values take 2 bytes, pointers take 4) 63 | switch (type) 64 | { 65 | case IT_BYTE: 66 | case IT_BOOL: 67 | case IT_WORD: 68 | f->paramSize += 2; 69 | break; 70 | case IT_BYTEP: 71 | case IT_WORDP: 72 | f->paramSize += 4; 73 | break; 74 | default: 75 | printf("error - invalid parameter\n"); 76 | } 77 | 78 | f->params[f->numParams++] = type; 79 | } 80 | 81 | // Add a function 82 | void AddFunction(function_table* table, identifier_type type, char* lexeme) 83 | { 84 | int i; 85 | function* f = (function*) malloc(sizeof(function)); 86 | f->type = type; 87 | f->lexeme = strdup(lexeme);
88 | f->numParams = 0; 89 | f->maxParams = 10; 90 | f->params = (identifier_type*) malloc(sizeof(identifier_type) * f->maxParams); 91 | f->frameSize = 0; 92 | f->paramSize = 0; 93 | f->called = FALSE; 94 | 95 | if (table->size >= table->cap) 96 | { 97 | // Expand array 98 | table->cap *= 2; 99 | function** tmp = (function**) malloc(sizeof(function*) * table->cap); 100 | for (i = 0; i < table->size; i++) 101 | tmp[i] = table->funcs[i]; 102 | free(table->funcs); 103 | table->funcs = tmp; 104 | } 105 | 106 | table->funcs[table->size++] = f; 107 | } 108 | 109 | // Lookup function 110 | function* LookupFunction(function_table* table, char* lexeme) 111 | { 112 | int i; 113 | for (i = 0; i < table->size; i++) 114 | if (streq(lexeme, table->funcs[i]->lexeme)) 115 | return table->funcs[i]; 116 | return NULL; 117 | } 118 | 119 | // Print function table for debugging 120 | void PrintFunctionTable(function_table* table) 121 | { 122 | int i, j; 123 | printf("FUNCTION TABLE:\n"); 124 | for (i = 0; i < table->size; i++) 125 | { 126 | function* f = table->funcs[i]; 127 | printf("%s (%s): FRAMESIZE: %d PARAMSIZE: %d\n", f->lexeme, getTypeString(f->type), f->frameSize, f->paramSize); 128 | 129 | for (j = 0; j < f->numParams; j++) 130 | { 131 | printf("\t%s\n", getTypeString(f->params[j])); 132 | } 133 | } 134 | } 135 | 136 | // TESTS 137 | /* 138 | int main() 139 | { 140 | function_table* t = CreateFunctionTable(); 141 | 142 | AddFunction(t, IT_BYTE, "func1", FALSE); 143 | AddFunction(t, IT_WORD, "func2", TRUE); 144 | 145 | function* f = LookupFunction(t, "fulkj"); 146 | assert(f == NULL); 147 | f = LookupFunction(t, "func1"); 148 | assert(strcmp("func1", f->lexeme) == 0); 149 | assert(f->type == IT_BYTE); 150 | 151 | AddParameter(t, "func2", IT_BYTE); 152 | AddParameter(t, "func2", IT_WORD); 153 | AddParameter(t, "func2", IT_BYTEP); 154 | assert(LookupFunction(t, "func2")->numParams == 3); 155 | assert(LookupFunction(t, "func2")->params[0] == IT_BYTE); 156 | 
assert(LookupFunction(t, "func2")->params[2] == IT_BYTEP); 157 | 158 | FreeFunctionTable(t); 159 | return 0; 160 | } 161 | */ 162 | 163 | 164 | -------------------------------------------------------------------------------- /src/functiontable.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Function table code. 3 | */ 4 | 5 | #pragma once 6 | 7 | #include "identifiertypes.h" 8 | 9 | // Function structure 10 | typedef struct 11 | { 12 | identifier_type type; 13 | char* lexeme; 14 | int numParams; 15 | int maxParams; 16 | identifier_type* params; 17 | int frameSize; 18 | int paramSize; 19 | int called; 20 | } function; 21 | 22 | // Function table structure 23 | typedef struct 24 | { 25 | int cap; 26 | int size; 27 | function** funcs; 28 | } function_table; 29 | 30 | // Create a new function table 31 | function_table* CreateFunctionTable(); 32 | 33 | // Free the function table and every function in it 34 | void FreeFunctionTable(function_table* table); 35 | 36 | // Add a parameter type to a function's parameter list 37 | void AddParameter(function_table* table, char* fname, identifier_type type); 38 | 39 | // Add a function 40 | void AddFunction(function_table* table, identifier_type type, char* lexeme); 41 | 42 | // Determine if a function has been defined 43 | int FunctionDefined(function_table* table, char* lexeme); 44 | 45 | // Set a function as defined 46 | void DefineFunction(function_table* table, char* lexeme); 47 | 48 | // Lookup a function by name, returns NULL if not found 49 | function* LookupFunction(function_table* table, char* lexeme); 50 | 51 | // Print function table for debugging 52 | void PrintFunctionTable(function_table* table); 53 | 54 | 55 | -------------------------------------------------------------------------------- /src/identifiertypes.c: -------------------------------------------------------------------------------- 1 | #include "identifiertypes.h" 2 | 3 | // Get string for an identifier type 4 | char* getTypeString(identifier_type t) 5 | { 6 | switch (t) 7 |
{ 8 | case IT_VOID: 9 | return "IT_VOID"; 10 | case IT_BYTE: 11 | return "IT_BYTE"; 12 | case IT_WORD: 13 | return "IT_WORD"; 14 | case IT_BOOL: 15 | return "IT_BOOL"; 16 | case IT_BYTEP: 17 | return "IT_BYTEP"; 18 | case IT_WORDP: 19 | return "IT_WORDP"; 20 | default: 21 | return "UNKNOWN TYPE"; 22 | } 23 | } 24 | 25 | -------------------------------------------------------------------------------- /src/identifiertypes.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Identifier types. 3 | */ 4 | 5 | #pragma once 6 | 7 | typedef enum identifier_type_tag 8 | { 9 | IT_VOID, 10 | IT_BYTE, 11 | IT_WORD, 12 | IT_BOOL, 13 | IT_BYTEP, 14 | IT_WORDP 15 | } identifier_type; 16 | 17 | char* getTypeString(identifier_type t); 18 | 19 | -------------------------------------------------------------------------------- /src/list.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Generic list functions. 3 | */ 4 | 5 | // Includes 6 | #include 7 | #include "list.h" 8 | #ifndef NULL 9 | #define NULL 0 10 | #endif 11 | 12 | /* Create a new list. */ 13 | List* newList() 14 | { 15 | List* l = (List*) malloc(sizeof(List)); 16 | l->cap = 10; 17 | l->size = 0; 18 | l->arr = (void**) malloc(sizeof(void*) * 10); 19 | return l; 20 | } 21 | 22 | /* Free list. */ 23 | void freeList(List* l) 24 | { 25 | free(l->arr); 26 | free(l); 27 | } 28 | 29 | /* Add an element to the list. */ 30 | void listAdd(List* l, void* e) 31 | { 32 | if (l->size == l->cap) 33 | { 34 | // Expand array 35 | l->cap *= 2; 36 | void** newArr = (void**) malloc(sizeof(void*) * l->cap); 37 | int i; 38 | for (i = 0; i < l->size; i++) 39 | newArr[i] = l->arr[i]; 40 | free(l->arr); 41 | l->arr = newArr; 42 | } 43 | 44 | l->arr[l->size++] = e; 45 | } 46 | 47 | /* Get an element from the list.
*/ 48 | void* listGet(List* l, int index) 49 | { 50 | if (index < 0 || index >= l->size) 51 | return NULL; 52 | return l->arr[index]; 53 | } 54 | 55 | /* Remove an element from the list. */ 56 | void listRemove(List* l, int index) 57 | { 58 | int i; 59 | l->size--; 60 | for (i = index; i < l->size; i++) 61 | l->arr[i] = l->arr[i + 1]; 62 | } 63 | 64 | 65 | -------------------------------------------------------------------------------- /src/list.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Generic list functions. 3 | */ 4 | 5 | /* List structure (a growable array of void pointers). */ 6 | typedef struct 7 | { 8 | int cap; 9 | int size; 10 | void** arr; 11 | } List; 12 | 13 | /* Create a new list. */ 14 | List* newList(); 15 | 16 | /* Free a list. */ 17 | void freeList(List* l); 18 | 19 | /* Add an element to the list. */ 20 | void listAdd(List* l, void* e); 21 | 22 | /* Get an element from the list. */ 23 | void* listGet(List* l, int index); 24 | 25 | /* Remove an element from the list. */ 26 | void listRemove(List* l, int index); 27 | 28 | -------------------------------------------------------------------------------- /src/make.bat: -------------------------------------------------------------------------------- 1 | flex u.l 2 | flex upre.l 3 | bison -d u.y 4 | bison -d upre.y 5 | gcc -o u ulex.c uprelex.c u.tab.c upre.tab.c symboltable.c symbolstack.c functiontable.c parsetree.c compiler.c strutil.c stringtable.c list.c identifiertypes.c stringqueue.c prunefunctions.c -------------------------------------------------------------------------------- /src/optimizer.c: -------------------------------------------------------------------------------- 1 | /* 2 | * optimizer.c 3 | * Functions for optimizing the parse-tree.
4 | */ 5 | 6 | #include 7 | #include "parsetree.h" 8 | #include 9 | 10 | struct tree_node* FoldConstants(struct tree_node* node, struct tree_node* parent, int parent_op) 11 | { 12 | int i; 13 | for (i = 0; i < node->numOperands; i++) 14 | FoldConstants(node->operands[i], node, i); 15 | 16 | if (node->type == TN_IADD 17 | || node->type == TN_ISUB 18 | || node->type == TN_IMUL 19 | || node->type == TN_IDIV 20 | || node->type == TN_IMOD) 21 | { 22 | // This is a math operation node 23 | if (node->operands[0]->type == TN_INTEGER 24 | && node->operands[1]->type == TN_INTEGER && !((node->type == TN_IDIV || node->type == TN_IMOD) && node->operands[1]->ival == 0)) 25 | { 26 | // This node has two constant children (and is not a division by zero), fold 27 | struct tree_node* tmp = node; 28 | node = newTreeNode(); 29 | if (parent != NULL) parent->operands[parent_op] = node; 30 | node->type = TN_INTEGER; 31 | node->id = tmp->id; 32 | 33 | // Fold 34 | if (tmp->type == TN_IADD) 35 | node->ival = tmp->operands[0]->ival + tmp->operands[1]->ival; 36 | else if (tmp->type == TN_ISUB) 37 | node->ival = tmp->operands[0]->ival - tmp->operands[1]->ival; 38 | else if (tmp->type == TN_IMUL) 39 | node->ival = tmp->operands[0]->ival * tmp->operands[1]->ival; 40 | else if (tmp->type == TN_IDIV) 41 | node->ival = tmp->operands[0]->ival / tmp->operands[1]->ival; 42 | else if (tmp->type == TN_IMOD) 43 | node->ival = tmp->operands[0]->ival % tmp->operands[1]->ival; 44 | 45 | // Free old subtree 46 | FreeTree(tmp); 47 | } 48 | } 49 | return node; 50 | } 51 | -------------------------------------------------------------------------------- /src/optimizer.h: -------------------------------------------------------------------------------- 1 | /* 2 | * optimizer.h 3 | * Functions for optimizing the parse-tree.
4 | */ 5 | 6 | #include "parsetree.h" 7 | 8 | struct tree_node* FoldConstants(struct tree_node* root, struct tree_node* parent, int parent_op); 9 | -------------------------------------------------------------------------------- /src/parsetree.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include "parsetree.h" 4 | #include 5 | #include "defines.h" 6 | #include "identifiertypes.h" 7 | #include "strutil.h" 8 | 9 | // Node id counter 10 | int nodeIdCounter = 1; 11 | 12 | // Free entire parse tree, using this node as the root 13 | void FreeTree(struct tree_node* pNode) 14 | { 15 | // Recursively free any non-NULL operands 16 | if (pNode != NULL) 17 | { 18 | int i; 19 | FreeTree(pNode->pNextStatement); 20 | 21 | for (i = 0; i < pNode->numOperands; i++) 22 | FreeTree(pNode->operands[i]); 23 | free(pNode->operands); 24 | 25 | if (pNode->sval != NULL) 26 | free(pNode->sval); 27 | 28 | free(pNode); 29 | } 30 | } 31 | 32 | // Free this node and any below it, but not any successive nodes 33 | void FreeNode(struct tree_node* pNode) 34 | { 35 | if (pNode != NULL) 36 | { 37 | int i; 38 | for (i = 0; i < pNode->numOperands; i++) 39 | FreeTree(pNode->operands[i]); 40 | free(pNode->operands); 41 | 42 | if (pNode->sval != NULL) 43 | free(pNode->sval); 44 | free(pNode); 45 | } 46 | } 47 | 48 | struct tree_node* newTreeNode() 49 | { 50 | struct tree_node* n = (struct tree_node*) malloc(sizeof(struct tree_node)); 51 | n->type = TN_NOTYPE; 52 | n->id = nodeIdCounter++; 53 | n->numOperands = 0; 54 | n->ival = 0; 55 | n->sval = NULL; 56 | n->pNextStatement = NULL; 57 | n->operands = (struct tree_node**) malloc(sizeof(struct tree_node*) * MAX_OPERANDS); 58 | int i; 59 | for (i = 0; i < MAX_OPERANDS; i++) 60 | n->operands[i] = NULL; 61 | return n; 62 | } 63 | 64 | // Get type string 65 | const char* identStr(int type) 66 | { 67 | if (type == IT_VOID) 68 | return "VOID"; 69 | else if (type == IT_BYTE) 70 | return "BYTE"; 71 
| else if (type == IT_WORD) 72 | return "WORD"; 73 | else if (type == IT_BYTEP) 74 | return "BYTE[]"; 75 | else if (type == IT_WORDP) 76 | return "WORD[]"; 77 | else 78 | return "NOTYPE"; 79 | } 80 | 81 | // Print the tree for debugging 82 | void PrintParseTree(struct tree_node* pNode, int depth) 83 | { 84 | int i; 85 | for (i = 0; i < depth; i++) 86 | printf(" "); 87 | 88 | printf("NODE (%d): ", pNode->id); 89 | if (pNode->type == TN_INTEGER) 90 | printf("INTEGER (%d)\n", pNode->ival); 91 | else if (pNode->type == TN_QSTRING) 92 | printf("QSTRING\n"); 93 | else if (pNode->type == TN_CHAR) 94 | printf("CHAR ('%s')\n", pNode->sval); 95 | else if (pNode->type == TN_BYTE_IDENT) 96 | printf("BYTE IDENTIFIER (\"%s\" offset: %d)\n", pNode->sval, pNode->ival); 97 | else if (pNode->type == TN_WORD_IDENT) 98 | printf("WORD IDENTIFIER (\"%s\" offset: %d)\n", pNode->sval, pNode->ival); 99 | else if (pNode->type == TN_BOOL_IDENT) 100 | printf("BOOLEAN IDENTIFIER (\"%s\" offset: %d)\n", pNode->sval, pNode->ival); 101 | else if (pNode->type == TN_PTR_IDENT) 102 | printf("PTR IDENTIFIER (\"%s\" offset: %d)\n", pNode->sval, pNode->ival); 103 | else if (pNode->type == TN_PTR_BYTE_ASSIGN) 104 | printf("BYTE POINTER ELEMENT ASSIGNMENT (\"%s\")\n", pNode->sval); 105 | else if (pNode->type == TN_PTR_WORD_ASSIGN) 106 | printf("WORD POINTER ELEMENT ASSIGNMENT (\"%s\")\n", pNode->sval); 107 | else if (pNode->type == TN_PTR_BYTE) 108 | printf("BYTE PTR DEREFERENCE (\"%s\")\n", pNode->sval); 109 | else if (pNode->type == TN_PTR_WORD) 110 | printf("WORD PTR DEREFERENCE (\"%s\")\n", pNode->sval); 111 | else if (pNode->type == TN_STRING_LITERAL) 112 | printf("STRING LITERAL (\"%s\")\n", pNode->sval); 113 | else if (pNode->type == TN_BYTE_BLOCK) 114 | printf("BYTE BLOCK\n"); 115 | else if (pNode->type == TN_WORD_BLOCK) 116 | printf("WORD BLOCK\n"); 117 | else if (pNode->type == TN_FUNCTION) 118 | printf("FUNCTION\n"); 119 | else if (pNode->type == TN_FUNCTIONCALL) 120 | printf("FUNCTION CALL 
(\"%s\")\n", pNode->sval); 121 | else if (pNode->type == TN_SEGCALL) 122 | printf("SEGMENT() CALL\n"); 123 | else if (pNode->type == TN_OFFCALL) 124 | printf("OFFSET() CALL\n"); 125 | else if (pNode->type == TN_BYTE_ASSIGN) 126 | printf("BYTE ASSIGN\n"); 127 | else if (pNode->type == TN_WORD_ASSIGN) 128 | printf("WORD ASSIGN\n"); 129 | else if (pNode->type == TN_BOOL_ASSIGN) 130 | printf("BOOL ASSIGN\n"); 131 | else if (pNode->type == TN_PTR_ASSIGN) 132 | printf("PTR ASSIGN\n"); 133 | else if (pNode->type == TN_RET_INT) 134 | printf("RETURN INT\n"); 135 | else if (pNode->type == TN_RET_BOOL) 136 | printf("RETURN BOOL\n"); 137 | else if (pNode->type == TN_RET_PTR) 138 | printf("RETURN PTR\n"); 139 | else if (pNode->type == TN_WHILE) 140 | printf("WHILE\n"); 141 | else if (pNode->type == TN_TRUE) 142 | printf("TRUE\n"); 143 | else if (pNode->type == TN_FALSE) 144 | printf("FALSE\n"); 145 | else if (pNode->type == TN_NULL) 146 | printf("NULL\n"); 147 | else if (pNode->type == TN_IF) 148 | printf("IF\n"); 149 | else if (pNode->type == TN_ARGLIST) 150 | printf("ARGUMENT LIST\n"); 151 | else if (pNode->type == TN_PARAM) 152 | printf("PARAM (%s %s)\n", identStr(pNode->ival), pNode->sval); 153 | else if (pNode->type == TN_PARAMLIST) 154 | printf("PARAMLIST\n"); 155 | else if (pNode->type == TN_FDEF) 156 | printf("FUNCTION DEFINITION (%s)\n", pNode->sval); 157 | else if (pNode->type == TN_ASM) 158 | printf("ASM\n"); 159 | else if (pNode->type == TN_ASMLOC) 160 | printf("MEMLOC\n"); 161 | else if (pNode->type == TN_ASMREG) 162 | printf("REGISTER (%s)\n", regStr(pNode->ival)); 163 | else if (pNode->type == TN_AMOV) 164 | printf("MOV\n"); 165 | else if (pNode->type == TN_ACALL) 166 | printf("CALL\n"); 167 | else if (pNode->type == TN_AINT) 168 | printf("INT\n"); 169 | else if (pNode->type == TN_IADD) 170 | printf("ADD\n"); 171 | else if (pNode->type == TN_ISUB) 172 | printf("SUB\n"); 173 | else if (pNode->type == TN_UMINUS) 174 | printf("UNARY MINUS\n"); 175 | else if 
(pNode->type == TN_IMUL) 176 | printf("MUL\n"); 177 | else if (pNode->type == TN_IDIV) 178 | printf("DIV\n"); 179 | else if (pNode->type == TN_IMOD) 180 | printf("MOD\n"); 181 | else if (pNode->type == TN_IEQ || pNode->type == TN_BEQ || pNode->type == TN_PEQ) 182 | printf("EQ\n"); 183 | else if (pNode->type == TN_INEQ || pNode->type == TN_BNEQ || pNode->type == TN_PNEQ || pNode->type == TN_UBNEQ) 184 | printf("NEQ\n"); 185 | else if (pNode->type == TN_ILT) 186 | printf("LT\n"); 187 | else if (pNode->type == TN_IGT) 188 | printf("GT\n"); 189 | else if (pNode->type == TN_ILTE) 190 | printf("LTE\n"); 191 | else if (pNode->type == TN_IGTE) 192 | printf("GTE\n"); 193 | else if (pNode->type == TN_REF) 194 | printf("MEMORY ADDRESS\n"); 195 | else 196 | printf("NO TYPE\n"); 197 | 198 | for (i = 0; i < pNode->numOperands; i++) 199 | if (pNode->operands[i] != NULL) 200 | PrintParseTree(pNode->operands[i], depth + 1); 201 | else 202 | printf("WARNING: UNEXPECTED NULL OPERAND!\n"); 203 | 204 | if (pNode->pNextStatement != NULL) 205 | PrintParseTree(pNode->pNextStatement, depth); 206 | } 207 | 208 | // Tests 209 | /* 210 | int main() 211 | { 212 | struct tree_node* a = newTreeNode(); 213 | struct tree_node* b = newTreeNode(); 214 | struct tree_node* c = newTreeNode(); 215 | struct tree_node* d = newTreeNode(); 216 | a->numOperands = 2; 217 | b->numOperands = 1; 218 | a->operands[0] = b; 219 | a->operands[1] = c; 220 | b->operands[0] = d; 221 | FreeTree(a); 222 | return 0; 223 | }*/ 224 | 225 | 226 | -------------------------------------------------------------------------------- /src/parsetree.h: -------------------------------------------------------------------------------- 1 | ////////////////////////////////////////////////////////////////////////// 2 | // These are starter data types for parse tree nodes. You will need 3 | // to add to them and modify their structure to suit your purposes. 
4 | ////////////////////////////////////////////////////////////////////////// 5 | 6 | #pragma once 7 | 8 | #include "identifiertypes.h" 9 | 10 | // Maximum number of operands 11 | #define MAX_OPERANDS 250 12 | 13 | // The values in this enumerated type will roughly correspond to the tokens 14 | // and rules in your grammar 15 | typedef enum node_type_tag 16 | { 17 | TN_FUNCTION, 18 | TN_FUNCTIONCALL, 19 | TN_SEGCALL, 20 | TN_OFFCALL, 21 | TN_PARAM, 22 | TN_PARAMLIST, 23 | TN_FDEF, 24 | TN_ARGLIST, 25 | TN_BYTE_ASSIGN, 26 | TN_WORD_ASSIGN, 27 | TN_BOOL_ASSIGN, 28 | TN_PTR_ASSIGN, 29 | TN_PTR_BYTE_ASSIGN, 30 | TN_PTR_WORD_ASSIGN, 31 | TN_WHILE, 32 | TN_IF, 33 | TN_TRUE, 34 | TN_FALSE, 35 | TN_NULL, 36 | TN_BYTE_IDENT, 37 | TN_WORD_IDENT, 38 | TN_BOOL_IDENT, 39 | TN_PTR_IDENT, 40 | TN_STRING_LITERAL, 41 | TN_BYTE_BLOCK, 42 | TN_WORD_BLOCK, 43 | TN_PTR_BYTE, 44 | TN_PTR_WORD, 45 | TN_RET_INT, 46 | TN_RET_BOOL, 47 | TN_RET_PTR, 48 | TN_ARRELEM, 49 | TN_CHAR, 50 | TN_ASM, 51 | TN_ASMLOC, 52 | TN_ASMREG, 53 | TN_AMOV, 54 | TN_AINT, 55 | TN_ACALL, 56 | TN_IADD, 57 | TN_ISUB, 58 | TN_UMINUS, 59 | TN_IMUL, 60 | TN_IDIV, 61 | TN_IMOD, 62 | TN_IEQ, 63 | TN_INEQ, 64 | TN_ILT, 65 | TN_IGT, 66 | TN_ILTE, 67 | TN_IGTE, 68 | TN_BEQ, 69 | TN_BNEQ, 70 | TN_PEQ, 71 | TN_PNEQ, 72 | TN_UBNEQ, 73 | TN_REF, 74 | TN_QSTRING, 75 | TN_INTEGER, 76 | TN_NOTYPE 77 | } node_type; 78 | 79 | struct tree_node 80 | { 81 | // The parser will assign this variable one of the enumerated type values 82 | // above according to the kind of node it was created to be.
83 | node_type type; 84 | 85 | // This connects to the next statement in the program 86 | struct tree_node* pNextStatement; 87 | 88 | // Node id 89 | int id; 90 | 91 | // Number of operands used in this node 92 | int numOperands; 93 | 94 | // Integer value of this tree node, if needed 95 | int ival; 96 | 97 | // String value of this tree node, if needed 98 | char* sval; 99 | 100 | // Operand pointers 101 | struct tree_node** operands; 102 | }; 103 | 104 | void FreeTree(struct tree_node* pNode); 105 | 106 | void FreeNode(struct tree_node* pNode); 107 | 108 | struct tree_node* newTreeNode(); 109 | 110 | // Print the tree for debugging 111 | void PrintParseTree(struct tree_node* pNode, int depth); 112 | 113 | 114 | -------------------------------------------------------------------------------- /src/prunefunctions.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Code for pruning unused functions from the parse tree. 3 | */ 4 | 5 | #include "prunefunctions.h" 6 | #include "functiontable.h" 7 | #include "parsetree.h" 8 | #include "defines.h" 9 | #include 10 | #include 11 | 12 | struct tree_node* pruneUnusedFunctions(function_table* fTable, struct tree_node* root) 13 | { 14 | struct tree_node* current = root; 15 | struct tree_node* prev = NULL; 16 | while (current != NULL) 17 | { 18 | if (current->type == TN_FUNCTION) 19 | { 20 | function* f = LookupFunction(fTable, current->operands[0]->sval); 21 | if (f != NULL && f->called == FALSE) 22 | { 23 | // function was not called in the program, prune it from the parse tree 24 | if (prev == NULL) 25 | { 26 | // remove this node and make the next one the root 27 | root = current->pNextStatement; 28 | FreeNode(current); 29 | current = root; 30 | } else { 31 | // remove this node and skip over it 32 | prev->pNextStatement = current->pNextStatement; 33 | FreeNode(current); 34 | current = prev->pNextStatement; 35 | } 36 | } else { 37 | prev = current; 38 | current = current->pNextStatement; 39 | } 40 | } else {
41 | /* non-function nodes are always kept, so they become the previous node */ prev = current; current = current->pNextStatement; 42 | } 43 | } 44 | 45 | return root; 46 | } 47 | 48 | 49 | -------------------------------------------------------------------------------- /src/prunefunctions.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Code for pruning unused functions from the parse tree. 3 | */ 4 | 5 | #include "functiontable.h" 6 | #include "parsetree.h" 7 | 8 | struct tree_node* pruneUnusedFunctions(function_table* fTable, struct tree_node* root); 9 | 10 | -------------------------------------------------------------------------------- /src/stringqueue.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Functions for adding/removing strings from a queue. 3 | */ 4 | #include "stringqueue.h" 5 | #include 6 | #include 7 | #include 8 | 9 | // Enqueue a string 10 | void EnqueueString(string_queue* q, char* str) 11 | { 12 | // Exit if string already exists in queue 13 | int i; 14 | for (i = 0; i < q->size; i++) 15 | if (strcmp(q->strings[i], str) == 0) 16 | return; 17 | 18 | // Add to queue 19 | if (q->size == q->cap) 20 | { 21 | // Expand array 22 | q->cap *= 2; 23 | char** tmp = (char**) malloc(sizeof(char*) * q->cap); 24 | for (i = 0; i < q->size; i++) 25 | tmp[i] = q->strings[i]; 26 | free(q->strings); 27 | q->strings = tmp; 28 | } 29 | q->strings[q->size++] = strdup(str); 30 | } 31 | 32 | // Dequeue a string 33 | char* DequeueString(string_queue* q) 34 | { 35 | if (q->size > 0) 36 | { 37 | char* str = q->strings[0]; 38 | int i; 39 | for (i = 0; i < q->size - 1; i++) 40 | q->strings[i] = q->strings[i + 1]; 41 | q->size--; 42 | return str; 43 | } else { 44 | return 0; 45 | } 46 | } 47 | 48 | // Create a new string queue 49 | string_queue* CreateStringQueue() 50 | { 51 | string_queue* q = (string_queue*) malloc(sizeof(string_queue)); 52 | q->size = 0; 53 | q->cap = 10; 54 | q->strings = (char**) malloc(sizeof(char*) * q->cap); 55 | return q; 56 | } 57 | 58 | //
Free string queue memory 59 | void FreeStringQueue(string_queue* q) 60 | { 61 | int i; 62 | for (i = 0; i < q->size; i++) 63 | if (q->strings[i] != NULL) 64 | free(q->strings[i]); 65 | free(q->strings); 66 | free(q); 67 | } 68 | 69 | // Tests 70 | /*int main() 71 | { 72 | string_queue* q = CreateStringQueue(); 73 | EnqueueString(q, "goodbye"); 74 | EnqueueString(q, "cruel"); 75 | EnqueueString(q, "world"); 76 | assert(q->size == 3); 77 | char* str = DequeueString(q); 78 | assert(strcmp("goodbye", str) == 0); 79 | free(str); 80 | str = DequeueString(q); 81 | assert(strcmp("cruel", str) == 0); 82 | free(str); 83 | str = DequeueString(q); 84 | assert(strcmp("world", str) == 0); 85 | free(str); 86 | assert(q->size == 0); 87 | FreeStringQueue(q); 88 | return 0; 89 | }*/ 90 | 91 | 92 | -------------------------------------------------------------------------------- /src/stringqueue.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Functions for adding/removing strings from a queue. 3 | */ 4 | 5 | typedef struct 6 | { 7 | int cap; 8 | int size; 9 | char** strings; 10 | } string_queue; 11 | 12 | void EnqueueString(string_queue* q, char* str); 13 | 14 | char* DequeueString(string_queue* q); 15 | 16 | string_queue* CreateStringQueue(); 17 | 18 | void FreeStringQueue(string_queue* q); 19 | 20 | -------------------------------------------------------------------------------- /src/stringtable.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Table for storing string literals. 
3 | */ 4 | 5 | #include 6 | #include 7 | #include 8 | #include "stringtable.h" 9 | #include 10 | 11 | // Create a string table 12 | string_table* CreateStringTable() 13 | { 14 | string_table* t = (string_table*) malloc(sizeof(string_table)); 15 | t->cap = 10; 16 | t->size = 0; 17 | t->strings = (char**) malloc(sizeof(char*) * t->cap); 18 | return t; 19 | } 20 | 21 | // Free string table 22 | void FreeStringTable(string_table* table) 23 | { 24 | int i; 25 | for (i = 0; i < table->size; i++) 26 | free(table->strings[i]); 27 | free(table->strings); 28 | free(table); 29 | } 30 | 31 | // Add a string 32 | void AddString(string_table* table, char* str) 33 | { 34 | // replace escape sequences in place with their ASCII values (str must be writable) 35 | // BACKSLASH ('\\') -> 92 36 | // NL ('\n') -> 10 37 | int i, j; 38 | int len = strlen(str); 39 | for (i = 0; i < len - 1; i++) 40 | { 41 | if (str[i] == '\\' && str[i + 1] == 'n') 42 | { 43 | str[i] = 10; 44 | for (j = i + 1; j < len; j++) 45 | str[j] = str[j + 1]; 46 | } else if (str[i] == '\\' && str[i + 1] == '\\') { 47 | str[i] = '\\'; 48 | for (j = i + 1; j < len; j++) 49 | str[j] = str[j + 1]; 50 | } 51 | } 52 | 53 | // look up the string and skip it if it already exists 54 | if (LookupString(table, str) != -1) 55 | return; 56 | 57 | // add to table 58 | if (table->size >= table->cap) 59 | { 60 | // Expand array 61 | table->cap *= 2; 62 | char** tmp = (char**) malloc(sizeof(char*) * table->cap); 63 | for (i = 0; i < table->size; i++) 64 | tmp[i] = table->strings[i]; 65 | free(table->strings); 66 | table->strings = tmp; 67 | } 68 | 69 | table->strings[table->size++] = strdup(str); 70 | } 71 | 72 | // Lookup the index of a string, returns -1 if not found 73 | int LookupString(string_table* table, char* str) 74 | { 75 | int i; 76 | for (i = 0; i < table->size; i++) 77 | if (strcmp(str, table->strings[i]) == 0) 78 | return i; 79 | return -1; 80 | } 81 | 82 | // TESTS 83 | /* 84 | int main() 85 | { 86 | string_table* t = CreateStringTable(); 87 | 88 |
AddString(t, "hello"); 89 | AddString(t, "world"); 90 | AddString(t, "something else"); 91 | assert(LookupString(t, "hello") == 0); 92 | assert(LookupString(t, "kljf") == -1); 93 | assert(LookupString(t, "something else") == 2); 94 | 95 | FreeStringTable(t); 96 | return 0; 97 | }*/ 98 | 99 | 100 | -------------------------------------------------------------------------------- /src/stringtable.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Table for storing string literals. 3 | */ 4 | 5 | typedef struct 6 | { 7 | int cap; 8 | int size; 9 | char** strings; 10 | } string_table; 11 | 12 | // Create a string table 13 | string_table* CreateStringTable(); 14 | 15 | // Free string table 16 | void FreeStringTable(string_table* table); 17 | 18 | // Add a string 19 | void AddString(string_table* table, char* str); 20 | 21 | // Lookup the index of a string, returns -1 if not found 22 | int LookupString(string_table* table, char* str); 23 | 24 | 25 | -------------------------------------------------------------------------------- /src/strutil.c: -------------------------------------------------------------------------------- 1 | /* 2 | * String utility functions. 
3 | */ 4 | #include 5 | #include 6 | #include 7 | #include "strutil.h" 8 | #include "defines.h" 9 | 10 | int streq(char* a, char* b) 11 | { 12 | return strcmp(a, b) == 0; 13 | } 14 | 15 | 16 | // Get register string 17 | const char* regStr(int reg) 18 | { 19 | switch (reg) 20 | { 21 | case AX: 22 | return "ax"; 23 | case BX: 24 | return "bx"; 25 | case CX: 26 | return "cx"; 27 | case DX: 28 | return "dx"; 29 | case AH: 30 | return "ah"; 31 | case AL: 32 | return "al"; 33 | case BH: 34 | return "bh"; 35 | case BL: 36 | return "bl"; 37 | case CH: 38 | return "ch"; 39 | case CL: 40 | return "cl"; 41 | case DH: 42 | return "dh"; 43 | case DL: 44 | return "dl"; 45 | case SI: 46 | return "si"; 47 | case DI: 48 | return "di"; 49 | default: 50 | return ""; 51 | } 52 | } 53 | 54 | // Convert a string to lower case 55 | void strToLower(char* str) 56 | { 57 | int i; 58 | for (i = 0; str[i] != '\0'; i++) 59 | str[i] = tolower(str[i]); 60 | } 61 | 62 | /* Get directory path from input filepath (with trailing '/' character) */ 63 | char* directoryPath(char* filePath) 64 | { 65 | // Find ending '/' 66 | int len = strlen(filePath); 67 | int i; 68 | int last; 69 | for (i = len - 1; i >= 0; i--) 70 | { 71 | if (filePath[i] == '/') 72 | { 73 | last = i; 74 | break; 75 | } 76 | } 77 | 78 | if (i < 0) 79 | return strdup("./"); 80 | 81 | char* str = (char*) malloc(last + 2); 82 | str[last + 1] = '\0'; 83 | for (i = 0; i <= last; i++) 84 | str[i] = filePath[i]; 85 | return str; 86 | } 87 | 88 | /* Get file basename. 
*/ 89 | char* getBasename(char* filePath) 90 | { 91 | int len = strlen(filePath); 92 | int i, j; 93 | int start = 0; 94 | for (i = len - 1; i >= 0; i--) 95 | { 96 | if (filePath[i] == '/') 97 | { 98 | start = i + 1; 99 | break; 100 | } 101 | } 102 | 103 | char* str = (char*) malloc(len - start + 1); 104 | str[len - start] = '\0'; 105 | for (i = start, j = 0; i < len; i++, j++) 106 | str[j] = filePath[i]; 107 | return str; 108 | } 109 | 110 | /* Replaces the extension of a filename with ".asm" (appends it if there is no extension). */ 111 | char* outputPath(char* filePath) 112 | { 113 | int len = strlen(filePath); 114 | int i, end; 115 | for (end = len - 1; end >= 0; end--) 116 | if (filePath[end] == '.') 117 | break; 118 | if (end < 0) end = len; /* no '.' found: keep the whole name so the buffer is large enough */ char* str = (char*) malloc(end + 5); 119 | for (i = 0; i < end; i++) 120 | str[i] = filePath[i]; 121 | str[i++] = '.'; 122 | str[i++] = 'a'; 123 | str[i++] = 's'; 124 | str[i++] = 'm'; 125 | str[i] = '\0'; 126 | return str; 127 | } 128 | 129 | /* Converts a string in [01]+b or [01]+ format to an int value. */ 130 | int bintoint(char* str) 131 | { 132 | int len = strlen(str); 133 | int i; 134 | int value = 0; 135 | int power = 1; 136 | for (i = (len > 0 && str[len - 1] == 'b') ? len - 2 : len - 1; i >= 0; i--) /* skip the trailing 'b' only when present */ 137 | { 138 | if (str[i] == '1') 139 | value += power; 140 | power *= 2; 141 | } 142 | return value; 143 | } 144 | 145 | 146 | -------------------------------------------------------------------------------- /src/strutil.h: -------------------------------------------------------------------------------- 1 | /* 2 | * String utility functions.
 */

#pragma once

int streq(char* a, char* b);

const char* regStr(int reg);

void strToLower(char* str);

char* getBasename(char* filePath);

char* directoryPath(char* filePath);

char* outputPath(char* filePath);

int bintoint(char* binStr);


--------------------------------------------------------------------------------
/src/symbolstack.c:
--------------------------------------------------------------------------------
/*
 * Stack of symbol tables.
 */

#include <stdlib.h>     // malloc, free, NULL
#include "symboltable.h"
#include "identifiertypes.h"
#include "symbolstack.h"

// Reset offset counter
void ResetOffsetCounter(symbol_stack* stack)
{
    stack->offsetCounter = 0;
}

// Add a symbol to the top level
void AddSymbol(symbol_stack* stack, char* lexeme, identifier_type type)
{
    int offset;
    switch (type)
    {
        case IT_BYTE:
        case IT_BOOL:
        case IT_WORD:
            offset = stack->offsetCounter;
            stack->offsetCounter += 2;
            break;
        case IT_BYTEP:
        case IT_WORDP:
            offset = stack->offsetCounter;
            stack->offsetCounter += 4;
            break;
        default:
            offset = 0;
    }

    if (stack->size > 0)
        Add(stack->tlist[stack->size - 1], lexeme, type, offset);
}

// Lookup a symbol starting with the top level and progressing downward
symtab_entry* LookupSymbol(symbol_stack* stack, char* lexeme)
{
    int i;
    symtab_entry* entry;
    for (i = stack->size - 1; i >= 0; i--)
    {
        entry = Lookup(stack->tlist[i], lexeme);
        if (entry != NULL)
            return entry;
    }

    return NULL;
}

// Push a new symbol table onto the stack
void PushTable(symbol_stack* stack)
{
    int i;

    if (stack->size >= stack->cap)
    {
        // Expand stack
        stack->cap *= 2;
        symbol_table** newList =
            (symbol_table**) malloc(sizeof(symbol_table*) * stack->cap);

        for (i = 0; i < stack->size; i++)
            newList[i] = stack->tlist[i];

        free(stack->tlist);
        stack->tlist = newList;
    }

    stack->tlist[stack->size++] = CreateSymbolTable();
}

// Pop a symbol table from the stack
void PopTable(symbol_stack* stack)
{
    if (stack->size > 0)
    {
        stack->size--;
        FreeSymbolTable(stack->tlist[stack->size]);
    }
}

// Create a symbol table stack
symbol_stack* CreateSymbolStack()
{
    symbol_stack* s = (symbol_stack*) malloc(sizeof(symbol_stack));
    s->size = 0;
    s->cap = 10;
    s->tlist = (symbol_table**) malloc(sizeof(symbol_table*) * s->cap);
    s->offsetCounter = 0;
    return s;
}

// Free a symbol table stack
void FreeSymbolStack(symbol_stack* stack)
{
    int i;
    for (i = 0; i < stack->size; i++)
        FreeSymbolTable(stack->tlist[i]);
    free(stack->tlist);
    free(stack);
}


// Tests
/*
#include <assert.h>
int main()
{
    symbol_stack* stack = CreateSymbolStack();
    PushTable(stack);

    AddSymbol(stack, "myVar", IT_BYTE);
    AddSymbol(stack, "myVar2", IT_BYTE);

    PushTable(stack);

    AddSymbol(stack, "myVar", IT_WORD);

    symtab_entry* entry;
    entry = LookupSymbol(stack, "myVar");

    assert(entry->type == IT_WORD);

    entry = LookupSymbol(stack, "myVar2");
    assert(entry->type == IT_BYTE);
    PopTable(stack);

    entry = LookupSymbol(stack, "myVar");
    assert(entry->type == IT_BYTE);

    PopTable(stack);

    entry = LookupSymbol(stack, "myVar");
    assert(entry == NULL);

    FreeSymbolStack(stack);
}*/



--------------------------------------------------------------------------------
/src/symbolstack.h:
--------------------------------------------------------------------------------
/*
 * Stack of symbol tables.
 */
#pragma once

#include "identifiertypes.h"
#include "symboltable.h"    // symbol_table, symtab_entry

// Stack structure
typedef struct
{
    int size;
    int cap;
    symbol_table** tlist;
    int offsetCounter;
} symbol_stack;

// Reset offset counter
void ResetOffsetCounter(symbol_stack* stack);

// Add a symbol to the top level
void AddSymbol(symbol_stack* stack, char* lexeme, identifier_type type);

// Lookup a symbol starting with the top level and progressing downward
symtab_entry* LookupSymbol(symbol_stack* stack, char* lexeme);

// Push a new symbol table onto the stack
void PushTable(symbol_stack* stack);

// Pop a symbol table from the stack
void PopTable(symbol_stack* stack);

// Create a symbol table stack
symbol_stack* CreateSymbolStack();

// Free a symbol table stack
void FreeSymbolStack(symbol_stack* stack);


--------------------------------------------------------------------------------
/src/symboltable.c:
--------------------------------------------------------------------------------
#include <stdlib.h>     // malloc, free
#include <string.h>     // strcmp, strdup, strlen
#include <assert.h>     // assert
#include "symboltable.h"
#include "identifiertypes.h"

symbol_table* CreateSymbolTable()
{
    symbol_table* pST = malloc(sizeof(symbol_table));
    pST->pSymbol = NULL;    // only NULL for the root of the tree
                            // and until the first symbol is Add()ed
    pST->pNextNode = NULL;  // it's a linked list

    return pST;
}

void FreeSymbolTable(symbol_table *pSymbolTable)
{
    if ( pSymbolTable == NULL )
        return;

    FreeSymbolTable(pSymbolTable->pNextNode);
    // all the nodes beyond this one are gone so
    // now deallocate this one
    if ( pSymbolTable->pSymbol )
    {
        free(pSymbolTable->pSymbol->lexeme);
        free(pSymbolTable->pSymbol);
    }
    free(pSymbolTable);     // we really are just freeing this current
                            // symbol table node...but when the
                            // recursion is over, they're all gone.
}

symtab_entry* Lookup(symbol_table* pST, char* Lexeme)
{
    // Search the symbol table for an entry matching Lexeme
    while ( pST )
    {
        if ( pST->pSymbol )
        {
            if ( strcmp(pST->pSymbol->lexeme, Lexeme) == 0 )
                return pST->pSymbol;    // found it
        }
        pST = pST->pNextNode;
    }
    return NULL;    // didn't find it
}

symtab_entry* Add(symbol_table *pST, char* Lexeme, identifier_type Type, int Offset)
{
    // The symbol should not already be in this table
    assert(Lookup(pST, Lexeme) == NULL);
    assert(strlen(Lexeme) > 0);

    // not already in the table so create a new one
    symtab_entry* pSymbol = malloc(sizeof(symtab_entry));
    pSymbol->lexeme = strdup(Lexeme);
    pSymbol->type = Type;
    pSymbol->offset = Offset;

    // Now place this symbol into a symbol table node in the linked list
    if ( pST->pSymbol == NULL )
    {   // there are no symbols in the table yet, just the default
        // root node
        assert(pST->pNextNode == NULL);
        pST->pSymbol = pSymbol;
    }
    else
    {
        struct symtab_list_node *pNewNode =
            malloc(sizeof(struct symtab_list_node));
        pNewNode->pSymbol = pSymbol;
        pNewNode->pNextNode = NULL; // this one will go at the end

        // track to end of the table and add it there
        while ( pST->pNextNode )
            pST = pST->pNextNode;
        pST->pNextNode = pNewNode;
    }
    return pSymbol; // return the symbol
}


//////////////////////////////////////////////////////////////////
// Unit test code
//////////////////////////////////////////////////////////////////
/*
#include <stdio.h>
int main()
{
    symbol_table* pST = CreateSymbolTable();

    printf("Checking for symbols in the empty symbol table...\n");
    assert(Lookup(pST, "Hello") == NULL);

    printf("Adding a set to the symbol table...\n");
    Add(pST, "SetVar", IT_BYTE, 0);     // Add() takes an offset as its fourth argument

    printf("Adding a string to the symbol table...\n");
    Add(pST, "StrVar", IT_WORD, 2);

    printf("Looking up the two symbols just added...\n");
    symtab_entry* pS1 = Lookup(pST, "SetVar");
    assert(pS1);
    assert(pS1->type == IT_BYTE);

    symtab_entry* pS2 = Lookup(pST, "StrVar");
    assert(pS2);
    assert(pS2->type == IT_WORD);

    printf("All tests passed.\n");

    FreeSymbolTable(pST);

    return 0;
}
*/


--------------------------------------------------------------------------------
/src/symboltable.h:
--------------------------------------------------------------------------------
#pragma once

#include "identifiertypes.h"

/////////////////////////////////////////////////
// Elements in the symbol table are of this type
/////////////////////////////////////////////////
typedef struct symtab_entry_tag
{
    char* lexeme;           // The lexeme for the identifier
    identifier_type type;   // Identifier type
    int offset;             // Offset passed to Add( ) for this symbol
} symtab_entry;

struct symtab_list_node
{
    symtab_entry* pSymbol;
    struct symtab_list_node *pNextNode;
};

typedef struct symtab_list_node symbol_table;

///////////////////////////////////////////////////////
// There should be one symbol table for each "scope" of
// variables.  You call this function to create the symbol
// table the first time.  Thereafter, you pass it to
// Lookup( ) and Add( ) to fetch and add symbols to the
// symbol table.
//
// When you are done with the symbol table don't forget
// to call FreeSymbolTable( ) on the pointer you obtained
// from CreateSymbolTable( ).
///////////////////////////////////////////////////////
symbol_table* CreateSymbolTable();
void FreeSymbolTable(symbol_table *pSymbolTable);

///////////////////////////////////////////////////////////////
// This looks in pSymbolTable for the identifier whose name
// is Lexeme.  If it finds it, the symtab_entry for that
// identifier is returned.  Otherwise, NULL is returned.
// NOTE: When an identifier that you Lookup( ) is found
// in the table, you must still check its Type to be sure it
// matches its current usage.
///////////////////////////////////////////////////////////////
symtab_entry* Lookup(symbol_table *pSymbolTable, char* Lexeme);

/////////////////////////////////////////////////////////////////
// This attempts to add a new identifier to pSymbolTable.
// The new symbol will be associated with the identifier name
// Lexeme, be of the type specified by Type, and record the
// offset given by Offset.
//
// Before Adding a symbol to the symbol table you must first call
// Lookup( ) to verify that
// the symbol is not already in the symbol table (if it is, just
// use the return value of Lookup( )...no need to Add( ) it)
/////////////////////////////////////////////////////////////////
symtab_entry* Add(symbol_table *pSymbolTable, char* Lexeme, identifier_type Type, int Offset);




--------------------------------------------------------------------------------
/src/u.l:
--------------------------------------------------------------------------------
/*
 * u.l
 * (f)lex lexer token definition file for the U programming language.
 */

%{
#include "defines.h"
#include "u.tab.h"
#include "upre.tab.h"
#include "symboltable.h"
#include "symbolstack.h"
#include "strutil.h"
#include "functiontable.h"
#include "parsetree.h"
#include <stdio.h>      // printf, fopen, fclose
#include <stdlib.h>     // exit, free, strtol, atoi
#include "stringtable.h"
#include "stringqueue.h"
#include <string.h>     // strcmp, strdup, strlen

symbol_stack* symStack;
function_table* fTable;
string_table* strTable;
string_table* fileTable;
string_queue* fileQueue;    // file queue for the main compiler
string_queue* pFileQueue;   // file queue for the pre-processor
int currentLine = 1;
char* currentFile;
int errCount = 0;
struct tree_node* treeRoot;

// Wrap function
int yywrap(void)
{
    if (fileQueue->size == 0)
    {
        // No more files to process
        fclose(uin);
        free(currentFile);
        return 1;
    } else {
        // Close previous file
        fclose(uin);

        // Open next file
        free(currentFile);
        currentFile = DequeueString(fileQueue);
        uin = fopen(currentFile, "r");
        if (uin == NULL)
        {
            char* bn = getBasename(currentFile);
            printf("imported file '%s' does not exist\n", bn);
            free(bn);
            free(currentFile);
            exit(1);
        }

        // Return code to continue
        return 0;
    }
}
%}

%option prefix="u"
%option outfile="ulex.c"

%%

[/][\*] {
    int i = currentLine;
    while (TRUE)
    {
        int v = yylex();
        if (v == ENDMULTICOMMENT)
            break;

        if (v == 0)
        {
            printf("unexpected end of file, unmatched '/*' on line %d\n", i);
            exit(1);
        } else if (strcmp(yytext, "\n") == 0) {
            currentLine++;
        }
    }
}

[\*][/] {
    return ENDMULTICOMMENT;
}

import {
    return IMPORT;
}

while {
    return WHILE;
}

else[ \t]+if {
    return ELSEIF;
}

if {
    return IF;
}

else {
    return ELSE;
}

true {
    return LTRUE;
}

null {
    return LNULL;
}

false {
    return LFALSE;
}

asm {
    return ASM;
}

return {
    return RETURN;
}

void {
    return VOID;
}

byte {
    return BYTE;
}

word {
    return WORD;
}

bool {
    return BOOL;
}

byte\[\] {
    return BYTEP;
}

word\[\] {
    return WORDP;
}

end {
    return END;
}

segment {
    return SEGMENT;
}

offset {
    return OFFSET;
}

[\"][^\n]*[\"] {
    char* litStr = yytext + 1;
    litStr = strdup(litStr);
    litStr[strlen(litStr) - 1] = '\0';
    ulval.sval = litStr;

    return STRING_LITERAL;
}

'.' {
    char* litChar = yytext + 1;
    litChar = strdup(litChar);
    litChar[1] = '\0';
    ulval.sval = litChar;
    return CHAR;
}

[abcd][hlx] {
    strToLower(yytext);
    ulval.ival = 0;
    if (strcmp(yytext, "ax") == 0)
        ulval.ival = AX;
    else if (strcmp(yytext, "bx") == 0)
        ulval.ival = BX;
    else if (strcmp(yytext, "cx") == 0)
        ulval.ival = CX;
    else if (strcmp(yytext, "dx") == 0)
        ulval.ival = DX;
    else if (strcmp(yytext, "ah") == 0)
        ulval.ival = AH;
    else if (strcmp(yytext, "al") == 0)
        ulval.ival = AL;
    else if (strcmp(yytext, "bh") == 0)
        ulval.ival = BH;
    else if (strcmp(yytext, "bl") == 0)
        ulval.ival = BL;
    else if (strcmp(yytext, "ch") == 0)
        ulval.ival = CH;
    else if (strcmp(yytext, "cl") == 0)
        ulval.ival = CL;
    else if (strcmp(yytext, "dh") == 0)
        ulval.ival = DH;
    else if (strcmp(yytext, "dl") == 0)
        ulval.ival = DL;
    return G_REG;
}

[sd][i] {
    strToLower(yytext);
    ulval.ival = 0;
    if (strcmp(yytext, "si") == 0)
        ulval.ival = SI;
    else if (strcmp(yytext, "di") == 0)
        ulval.ival = DI;
    return G_REG;
}

[0-9][0-9A-Fa-f]*[h] {
    ulval.ival = strtol(yytext, NULL, 16);
    return HEX;
}

[01]+[bB] {
    ulval.ival = bintoint(yytext);
    return BIN;
}

int {
    return INT;
}

mov {
    return MOV;
}

call {
    return CALL;
}

[=][=] {
    return EQ;
}

[!][=] {
    return NEQ;
}

[>][=] {
    return GTE;
}

[<][=] {
    return LTE;
}

[>] {
    return GT;
}

[<] {
    return LT;
}

[a-zA-Z][a-z_0-9A-Z]* {
    ulval.sval = strdup(yytext);
    symtab_entry* entry = LookupSymbol(symStack, yytext);
    if (entry == NULL)
    {
        // Entry doesn't exist in symbol table, check function table
        function* f = LookupFunction(fTable, yytext);
        if (f == NULL)
        {
            // Identifier not defined
            return IDENT_UNDEC;
        } else if (f->type == IT_VOID) {
            // Void identifier
            return FIDENT_VOID;
        } else if (f->type == IT_BYTE) {
            // Byte identifier
            return FIDENT_BYTE;
        } else if (f->type == IT_WORD) {
            // Word identifier
            return FIDENT_WORD;
        } else if (f->type == IT_BOOL) {
            // Bool identifier
            return FIDENT_BOOL;
        } else if (f->type == IT_BYTEP) {
            // Byte[] identifier
            return FIDENT_BYTEP;
        } else {
            // Word[] identifier
            return FIDENT_WORDP;
        }
    } else if (entry->type == IT_BYTE) {
        // Byte identifier
        return IDENT_BYTE;
    } else if (entry->type == IT_WORD) {
        // Word identifier
        return IDENT_WORD;
    } else if (entry->type == IT_BOOL) {
        // Bool identifier
        return IDENT_BOOL;
    } else if
        (entry->type == IT_BYTEP) {
        // Byte[] identifier
        return IDENT_BYTEP;
    } else {
        // Word[] identifier
        return IDENT_WORDP;
    }
}

[0-9]+ {
    ulval.ival = atoi(yytext);
    return INTEGER;
}

[/][/][^\n]*[\n] {
    currentLine++;
}

[ \t]+ ;

\r?\n {
    currentLine++;
}

. {
    return yytext[0];
}

%%

--------------------------------------------------------------------------------
/src/u.y:
--------------------------------------------------------------------------------
/*
 * u.y
 * Main Bison grammar file for U programming language.
 */

%{
// General defines
#define YYDEBUG 1
#define VERSION "0.0.7"

// Includes
#include <stdio.h>      // sprintf
#include <stdlib.h>     // free
#include "defines.h"
#include "identifiertypes.h"
#include "symboltable.h"
#include "symbolstack.h"
#include "functiontable.h"
#include "parsetree.h"
#include "strutil.h"
#include "stringtable.h"
#include "stringqueue.h"
#include "prunefunctions.h"
#include <string.h>     // strdup
#include "optimizer.h"
#include "compiler.h"

extern char* currentFile;
extern symbol_stack* symStack;
extern function_table* fTable;
extern struct tree_node* treeRoot;
extern string_table* strTable;
extern string_table* fileTable;
extern string_queue* fileQueue;
extern string_queue* pFileQueue;

/* Added lone prototype so prototypes in u.tab.h and upre.tab.h won't conflict */
int upreparse(void);
%}

%name-prefix "u"

%union
{
    int ival;
    char* sval;
    int typeval;
    struct tree_node* tnode;
}

%error-verbose

/* Tokens from scanner.
 */
%token <ival> INTEGER
%token <sval> IDENT_UNDEC
%token <sval> IDENT_BYTE
%token <sval> IDENT_WORD
%token <sval> IDENT_BOOL
%token <sval> IDENT_BYTEP
%token <sval> IDENT_WORDP
%token <sval> FIDENT_VOID
%token <sval> FIDENT_BYTE
%token <sval> FIDENT_WORD
%token <sval> FIDENT_BOOL
%token <sval> FIDENT_BYTEP
%token <sval> FIDENT_WORDP
%token <sval> CHAR
%token IMPORT
%token WHILE
%token IF
%token ELSE
%token ELSEIF
%token LTRUE
%token LFALSE
%token LNULL
%token ASM
%token RETURN
%token VOID
%token BYTE
%token WORD
%token BOOL
%token BYTEP
%token WORDP
%token END
%token SEGMENT
%token OFFSET
%token EQ
%token NEQ
%token GTE
%token LTE
%token GT
%token LT
%token ENDMULTICOMMENT
%token <sval> STRING_LITERAL

%token MOV
%token <ival> G_REG
%token <ival> HEX
%token <ival> BIN
%token INT
%token CALL

/* Associativity and precedence. */
%left ','
%left EQ NEQ
%left GT LT GTE LTE
%left '+' '-'
%left '*' '/' '%'
%left '!'
%left ':'

/* Types.
 */
%type <typeval> vtype
%type <typeval> ftype
%type <tnode> program
%type <tnode> component_list
%type <tnode> function
%type <tnode> statement_list
%type <tnode> statement
%type <tnode> int_exp
%type <tnode> bool_exp
%type <tnode> ptr_exp
%type <tnode> expression
%type <tnode> decl_assign_statement
%type <sval> declared_identifier
%type <sval> declared_func_identifier
%type <tnode> assignment_statement
%type <tnode> arg_list
%type <tnode> int_function_call
%type <tnode> bool_function_call
%type <tnode> ptr_function_call
%type <tnode> void_function_call
%type <tnode> segment_function_call
%type <tnode> offset_function_call
%type <tnode> undec_function_call
%type <tnode> while_statement
%type <tnode> while_header
%type <tnode> if_header
%type <tnode> elseif_header
%type <tnode> else_clause
%type <tnode> if_statement
%type <tnode> elseif_block
%type <tnode> return_statement
%type <tnode> asm_line
%type <tnode> asm_list
%type <tnode> asm_op
%type <tnode> int_op
%type <tnode> bool_op
%type <tnode> ptr_op
%type <tnode> asm_statement
%type <tnode> asm_memloc
%type <tnode> parameter
%type <tnode> parameter_list
%type <tnode> function_header
%type <tnode> function_preamble

%%
program:
    component_list {
        treeRoot = $1;
    }
    ;

function_preamble:
    ftype declared_func_identifier '(' {
        $$ = newTreeNode();
        $$->type = TN_FDEF;
        $$->sval = strdup($2);
        PushTable(symStack);
        free($2);
    }
    ;

function_header:
    function_preamble parameter_list ')' {
        $$ = $1;
        $$->numOperands = 1;
        $$->operands[0] = $2;
    }
    | function_preamble ')' {
        $$ = $1;
    }
    ;

function:
    function_header statement_list END {
        $$ = newTreeNode();
        $$->type = TN_FUNCTION;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
        function* f = LookupFunction(fTable, $1->sval);
        f->frameSize = symStack->offsetCounter;
        ResetOffsetCounter(symStack);
        PopTable(symStack);
    }
    |
      function_header END {
        $$ = newTreeNode();
        $$->type = TN_FUNCTION;
        $$->numOperands = 1;
        $$->operands[0] = $1;
        function* f = LookupFunction(fTable, $1->sval);
        f->frameSize = symStack->offsetCounter;
        ResetOffsetCounter(symStack);
        PopTable(symStack);
    }
    ;

import_statement:
    IMPORT STRING_LITERAL {
        free($2);
    }
    ;

component_list:
    function
    | function component_list {
        $1->pNextStatement = $2;
        $$ = $1;
    }
    | import_statement component_list {
        $$ = $2;
    }
    ;

statement:
    decl_assign_statement
    | assignment_statement
    | while_statement
    | if_statement
    | int_function_call
    | bool_function_call
    | void_function_call
    | asm_statement
    | return_statement
    | undec_function_call
    ;

statement_list:
    statement
    | decl_statement statement_list {
        $$ = $2;
    }
    | statement statement_list {
        $1->pNextStatement = $2;
        $$ = $1;
    }
    ;

decl_assign_statement:
    BYTE IDENT_UNDEC '=' int_exp {
        AddSymbol(symStack, $2, IT_BYTE);
        struct tree_node* ident = newTreeNode();
        ident->type = TN_BYTE_IDENT;
        ident->sval = strdup($2);
        ident->ival = LookupSymbol(symStack, $2)->offset;
        $$ = newTreeNode();
        $$->type = TN_BYTE_ASSIGN;
        $$->numOperands = 2;
        $$->operands[0] = ident;
        $$->operands[1] = $4;
        free($2);
    }
    | WORD IDENT_UNDEC '=' int_exp {
        AddSymbol(symStack, $2, IT_WORD);
        struct tree_node* ident = newTreeNode();
        ident->type = TN_WORD_IDENT;
        ident->sval = strdup($2);
        ident->ival = LookupSymbol(symStack, $2)->offset;
        $$ = newTreeNode();
        $$->type = TN_WORD_ASSIGN;
        $$->numOperands = 2;
        $$->operands[0] = ident;
        $$->operands[1] = $4;
        free($2);
    }
    | BOOL IDENT_UNDEC '='
      bool_exp {
        AddSymbol(symStack, $2, IT_BOOL);
        struct tree_node* ident = newTreeNode();
        ident->type = TN_BOOL_IDENT;
        ident->sval = strdup($2);
        ident->ival = LookupSymbol(symStack, $2)->offset;
        $$ = newTreeNode();
        $$->type = TN_BOOL_ASSIGN;
        $$->numOperands = 2;
        $$->operands[0] = ident;
        $$->operands[1] = $4;
        free($2);
    }
    | BYTEP IDENT_UNDEC '=' ptr_exp {
        AddSymbol(symStack, $2, IT_BYTEP);
        struct tree_node* ident = newTreeNode();
        ident->type = TN_PTR_IDENT;
        ident->sval = strdup($2);
        ident->ival = LookupSymbol(symStack, $2)->offset;
        $$ = newTreeNode();
        $$->type = TN_PTR_ASSIGN;
        $$->numOperands = 2;
        $$->operands[0] = ident;
        $$->operands[1] = $4;
        free($2);
    }
    | WORDP IDENT_UNDEC '=' ptr_exp {
        AddSymbol(symStack, $2, IT_WORDP);
        struct tree_node* ident = newTreeNode();
        ident->type = TN_PTR_IDENT;
        ident->sval = strdup($2);
        ident->ival = LookupSymbol(symStack, $2)->offset;
        $$ = newTreeNode();
        $$->type = TN_PTR_ASSIGN;
        $$->numOperands = 2;
        $$->operands[0] = ident;
        $$->operands[1] = $4;
        free($2);
    }
    | vtype declared_identifier '=' expression {
        yyerror("symbol already declared");
        free($2);
        $$ = newTreeNode();
    }
    ;

decl_statement:
    BYTE IDENT_UNDEC {
        AddSymbol(symStack, $2, IT_BYTE);
        free($2);
    }
    | WORD IDENT_UNDEC {
        AddSymbol(symStack, $2, IT_WORD);
        free($2);
    }
    | BOOL IDENT_UNDEC {
        AddSymbol(symStack, $2, IT_BOOL);
        free($2);
    }
    | BYTEP IDENT_UNDEC {
        AddSymbol(symStack, $2, IT_BYTEP);
        free($2);
    }
    | WORDP IDENT_UNDEC {
        AddSymbol(symStack, $2, IT_WORDP);
        free($2);
    }
    | vtype declared_identifier {
        yyerror("symbol already declared");
        free($2);
    }
    ;

assignment_statement:
    IDENT_BYTE '=' int_exp {
        $$ = newTreeNode();
        $$->type = TN_BYTE_ASSIGN;
        struct tree_node* ident = newTreeNode();
        ident->type = TN_BYTE_IDENT;
        ident->sval = strdup($1);
        ident->ival = LookupSymbol(symStack, $1)->offset;
        $$->numOperands = 2;
        $$->operands[0] = ident;
        $$->operands[1] = $3;
        free($1);
    }
    | IDENT_WORD '=' int_exp {
        $$ = newTreeNode();
        $$->type = TN_WORD_ASSIGN;
        struct tree_node* ident = newTreeNode();
        ident->type = TN_WORD_IDENT;
        ident->sval = strdup($1);
        ident->ival = LookupSymbol(symStack, $1)->offset;
        $$->numOperands = 2;
        $$->operands[0] = ident;
        $$->operands[1] = $3;
        free($1);
    }
    | IDENT_BYTEP '=' ptr_exp {
        $$ = newTreeNode();
        $$->type = TN_PTR_ASSIGN;
        $$->numOperands = 2;
        struct tree_node* ident = newTreeNode();
        ident->type = TN_PTR_IDENT;
        ident->sval = strdup($1);
        ident->ival = LookupSymbol(symStack, $1)->offset;
        $$->operands[0] = ident;
        $$->operands[1] = $3;
        free($1);
    }
    | IDENT_WORDP '=' ptr_exp {
        $$ = newTreeNode();
        $$->type = TN_PTR_ASSIGN;
        $$->numOperands = 2;
        struct tree_node* ident = newTreeNode();
        ident->type = TN_PTR_IDENT;
        ident->sval = strdup($1);
        ident->ival = LookupSymbol(symStack, $1)->offset;
        $$->operands[0] = ident;
        $$->operands[1] = $3;
        free($1);
    }
    | IDENT_BYTEP '[' int_exp ']' '=' int_exp {
        $$ = newTreeNode();
        $$->type = TN_PTR_BYTE_ASSIGN;
        $$->numOperands = 2;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        $$->operands[0] = $3;
        $$->operands[1] = $6;
        free($1);
    }
    | IDENT_WORDP '[' int_exp ']' '=' int_exp {
        $$ = newTreeNode();
        $$->type = TN_PTR_WORD_ASSIGN;
        $$->numOperands = 2;
        $$->sval = strdup($1);
        $$->ival =
            LookupSymbol(symStack, $1)->offset;
        $$->operands[0] = $3;
        $$->operands[1] = $6;
        free($1);
    }
    | IDENT_BOOL '=' bool_exp {
        $$ = newTreeNode();
        $$->type = TN_BOOL_ASSIGN;
        $$->numOperands = 2;
        struct tree_node* ident = newTreeNode();
        ident->type = TN_BOOL_IDENT;
        ident->sval = strdup($1);
        ident->ival = LookupSymbol(symStack, $1)->offset;
        $$->operands[0] = ident;
        $$->operands[1] = $3;
        free($1);
    }
    | IDENT_UNDEC '=' expression {
        $$ = newTreeNode();
        char err[500];
        // Bounded write so a very long identifier cannot overflow err
        snprintf(err, sizeof(err), "'%s' undeclared", $1);
        yyerror(err);
        free($1);
    }
    ;

declared_identifier:
    IDENT_BYTE
    | IDENT_WORD
    | IDENT_BOOL
    | IDENT_BYTEP
    | IDENT_WORDP
    ;

declared_func_identifier:
    FIDENT_VOID
    | FIDENT_BYTE
    | FIDENT_WORD
    | FIDENT_BOOL
    | FIDENT_BYTEP
    | FIDENT_WORDP
    ;

while_header:
    WHILE '(' bool_exp ')' {
        PushTable(symStack);
        $$ = $3;
    }
    ;

while_statement:
    while_header statement_list END {
        PopTable(symStack);
        $$ = newTreeNode();
        $$->type = TN_WHILE;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
    }
    ;

if_header:
    IF '(' expression ')' {
        PushTable(symStack);
        $$ = $3;
    }
    ;

else_clause:
    ELSE statement_list END {
        $$ = $2;
    }
    ;

elseif_header:
    ELSEIF '(' expression ')' {
        PushTable(symStack);
        $$ = $3;
    }
    ;

elseif_block:
    elseif_header statement_list END {
        PopTable(symStack);
        $$ = newTreeNode();
        $$->type = TN_IF;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
    }
    | elseif_header statement_list else_clause {
        PopTable(symStack);
        $$ = newTreeNode();
        $$->type = TN_IF;
        $$->numOperands = 3;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
        $$->operands[2] = $3;
    }
    | elseif_header statement_list elseif_block {
        PopTable(symStack);
        $$ = newTreeNode();
        $$->type = TN_IF;
        $$->numOperands = 3;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
        $$->operands[2] = $3;
    }
    ;

if_statement:
    if_header statement_list END {
        PopTable(symStack);
        $$ = newTreeNode();
        $$->type = TN_IF;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
    }
    | if_header statement_list else_clause {
        PopTable(symStack);
        $$ = newTreeNode();
        $$->type = TN_IF;
        $$->numOperands = 3;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
        $$->operands[2] = $3;
    }
    | if_header statement_list elseif_block {
        PopTable(symStack);
        $$ = newTreeNode();
        $$->type = TN_IF;
        $$->numOperands = 3;
        $$->operands[0] = $1;
        $$->operands[1] = $2;
        $$->operands[2] = $3;
    }
    ;

asm_statement:
    ASM asm_list END {
        $$ = newTreeNode();
        $$->type = TN_ASM;
        $$->numOperands = 1;
        $$->operands[0] = $2;
    }
    ;

asm_list:
    asm_line
    | asm_line asm_list {
        $1->pNextStatement = $2;
        $$ = $1;
    }
    ;

asm_line:
    MOV G_REG ',' asm_op {
        $$ = newTreeNode();
        $$->type = TN_AMOV;
        $$->numOperands = 2;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_ASMREG;
        $$->operands[0]->ival = $2;
        $$->operands[1] = $4;

    }
    | MOV asm_memloc ',' G_REG {
        $$ = newTreeNode();
        $$->type = TN_AMOV;
        $$->numOperands = 2;
        $$->operands[0] = $2;
        $$->operands[1] = newTreeNode();
        $$->operands[1]->ival = $4;
        $$->operands[1]->type = TN_ASMREG;
    }
    | INT HEX {
        $$ =
            newTreeNode();
        $$->type = TN_AINT;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    | INT INTEGER {
        $$ = newTreeNode();
        $$->type = TN_AINT;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    | INT BIN {
        $$ = newTreeNode();
        $$->type = TN_AINT;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    | CALL HEX {
        $$ = newTreeNode();
        $$->type = TN_ACALL;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    | CALL INTEGER {
        $$ = newTreeNode();
        $$->type = TN_ACALL;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    | CALL BIN {
        $$ = newTreeNode();
        $$->type = TN_ACALL;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    ;

asm_op:
    INTEGER {
        $$ = newTreeNode();
        $$->type = TN_INTEGER;
        $$->ival = $1;
    }
    | HEX {
        $$ = newTreeNode();
        $$->type = TN_INTEGER;
        $$->ival = $1;
    }
    | BIN {
        $$ = newTreeNode();
        $$->type = TN_INTEGER;
        $$->ival = $1;
    }
    | G_REG {
        $$ = newTreeNode();
        $$->type = TN_ASMREG;
        $$->ival = $1;
    }
    | asm_memloc
    | CHAR {
        $$ = newTreeNode();
        $$->type = TN_CHAR;
        $$->sval = strdup($1);
        free($1);
    }
    ;

asm_memloc:
    '[' declared_identifier ']' {
        $$ = newTreeNode();
        $$->type = TN_ASMLOC;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        symtab_entry* sym = LookupSymbol(symStack, $2);
        identifier_type t = sym->type;
        if (t == IT_BYTE)
            $$->operands[0]->type = TN_BYTE_IDENT;
        else if (t == IT_WORD)
            $$->operands[0]->type = TN_WORD_IDENT;
        else if (t == IT_BOOL)
            $$->operands[0]->type = TN_BOOL_IDENT;
        else
            $$->operands[0]->type = TN_PTR_IDENT;
        $$->operands[0]->ival = sym->offset;
        $$->operands[0]->sval = strdup($2);
        free($2);
    }
    | '[' INTEGER ']' {
        $$ = newTreeNode();
        $$->type = TN_ASMLOC;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    | '[' HEX ']' {
        $$ = newTreeNode();
        $$->type = TN_ASMLOC;
        $$->numOperands = 1;
        $$->operands[0] = newTreeNode();
        $$->operands[0]->type = TN_INTEGER;
        $$->operands[0]->ival = $2;
    }
    | '[' IDENT_UNDEC ']' {
        $$ = newTreeNode();
        yyerror("undeclared identifier");
        free($2);
    }
    ;

int_op:
    INTEGER {
        $$ = newTreeNode();
        $$->type = TN_INTEGER;
        $$->ival = $1;
    }
    | HEX {
        $$ = newTreeNode();
        $$->type = TN_INTEGER;
        $$->ival = $1;
    }
    | BIN {
        $$ = newTreeNode();
        $$->type = TN_INTEGER;
        $$->ival = $1;
    }
    | int_function_call
    | IDENT_BYTE {
        $$ = newTreeNode();
        $$->type = TN_BYTE_IDENT;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        free($1);
    }
    | IDENT_BYTEP '[' int_exp ']' {
        $$ = newTreeNode();
        $$->type = TN_PTR_BYTE;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        $$->operands[0] = $3;
        $$->numOperands = 1;
        free($1);
    }
    | CHAR {
        $$ = newTreeNode();
        $$->type = TN_CHAR;
        $$->sval = strdup($1);
        free($1);
    }
    | IDENT_WORD {
        $$ = newTreeNode();
        $$->type = TN_WORD_IDENT;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        free($1);
    }
    | IDENT_WORDP '[' int_exp ']' {
        $$ = newTreeNode();
        $$->type = TN_PTR_WORD;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        $$->operands[0] = $3;
        $$->numOperands = 1;
        free($1);
    }
    ;

bool_op:
    IDENT_BOOL {
        $$ = newTreeNode();
        $$->type = TN_BOOL_IDENT;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        free($1);
    }
    | LTRUE {
        $$ = newTreeNode();
        $$->type = TN_TRUE;
    }
    | bool_function_call
    | LFALSE {
        $$ = newTreeNode();
        $$->type = TN_FALSE;
    }
    ;

ptr_op:
    STRING_LITERAL {
        $$ = newTreeNode();
        $$->type = TN_STRING_LITERAL;
        $$->sval = strdup($1);
        AddString(strTable, $1);
        $$->ival = LookupString(strTable, $1);
        free($1);
    }
    | IDENT_BYTEP {
        $$ = newTreeNode();
        $$->type = TN_PTR_IDENT;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        free($1);
    }
    | IDENT_WORDP {
        $$ = newTreeNode();
        $$->type = TN_PTR_IDENT;
        $$->sval = strdup($1);
        $$->ival = LookupSymbol(symStack, $1)->offset;
        free($1);
    }
    | ptr_function_call
    | LNULL {
        $$ = newTreeNode();
        $$->type = TN_NULL;
    }
    ;

int_function_call:
    FIDENT_BYTE '(' ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_BYTE '(' arg_list ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);
        $$->numOperands = 1;
        $$->operands[0] = $3;

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_WORD '(' ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_WORD '(' arg_list ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);
        $$->numOperands = 1;
        $$->operands[0] = $3;

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    ;


bool_function_call:
    FIDENT_BOOL '(' ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_BOOL '(' arg_list ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);
        $$->numOperands = 1;
        $$->operands[0] = $3;

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    ;

ptr_function_call:
    FIDENT_BYTEP '(' ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_BYTEP '(' arg_list ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);
        $$->numOperands = 1;
        $$->operands[0] = $3;

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_WORDP '(' ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_WORDP '(' arg_list ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);
        $$->numOperands = 1;
        $$->operands[0] = $3;

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    ;

void_function_call:
    FIDENT_VOID '(' ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    | FIDENT_VOID '(' arg_list ')' {
        $$ = newTreeNode();
        $$->type = TN_FUNCTIONCALL;
        $$->sval = strdup($1);
        $$->numOperands = 1;
        $$->operands[0] = $3;

        function* f = LookupFunction(fTable, $1);
        f->called = TRUE;

        free($1);
    }
    ;

segment_function_call:
    SEGMENT '(' ptr_exp ')' {
        $$ = newTreeNode();
        $$->type = TN_SEGCALL;
        $$->numOperands = 1;
        $$->operands[0] = $3;
    }
    ;

offset_function_call:
    OFFSET '(' ptr_exp ')' {
        $$ = newTreeNode();
        $$->type = TN_OFFCALL;
        $$->numOperands = 1;
        $$->operands[0] = $3;
    }
    ;

undec_function_call:
    IDENT_UNDEC '(' ')' {
        $$ = newTreeNode();
        char err[500];
        // Bounded write: a long identifier cannot overflow the buffer
        snprintf(err, sizeof(err), "function '%s' undeclared", $1);
        yyerror(err);
        free($1);
    }
    | IDENT_UNDEC '(' arg_list ')' {
        $$ = newTreeNode();
        char err[500];
        // Bounded write: a long identifier cannot overflow the buffer
        snprintf(err, sizeof(err), "function '%s' undeclared", $1);
        yyerror(err);
        free($1);
    }
    ;


arg_list:
    int_exp {
        $$ = newTreeNode();
        $$->type = TN_ARGLIST;
        $$->numOperands = 1;
        $$->operands[0] = $1;
    }
    | bool_exp {
        $$ = newTreeNode();
        $$->type = TN_ARGLIST;
        $$->numOperands = 1;
        $$->operands[0] = $1;
    }
    | ptr_exp {
        $$ = newTreeNode();
        $$->type = TN_ARGLIST;
        $$->numOperands = 1;
        $$->operands[0] = $1;
    }
    | arg_list ',' arg_list {
        // Append every operand of the right-hand list, so no argument is
        // dropped if the parser nests this ambiguous rule to the right
        $$ = $1;
        int i;
        for (i = 0; i < $3->numOperands; i++)
            $$->operands[$$->numOperands++] = $3->operands[i];
        free($3->operands);
        free($3);
    }
    ;

return_statement:
    RETURN int_exp {
        $$ = newTreeNode();
        $$->type = TN_RET_INT;
        $$->numOperands = 1;
        $$->operands[0] = $2;
    }
    | RETURN bool_exp {
        $$ = newTreeNode();
        $$->type = TN_RET_BOOL;
        $$->numOperands = 1;
        $$->operands[0] = $2;
    }
    | RETURN ptr_exp {
        $$ = newTreeNode();
        $$->type = TN_RET_PTR;
        $$->numOperands = 1;
        $$->operands[0] = $2;
    }
    ;

expression:
    int_exp
    | bool_exp
    | ptr_exp
    ;

int_exp:
    int_op
    | '-' int_exp {
        $$ = newTreeNode();
        $$->type = TN_UMINUS;
        $$->numOperands = 1;
        $$->operands[0] = $2;
    }
    | int_exp '+' int_exp {
        $$ = newTreeNode();
        $$->type = TN_IADD;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp '-' int_exp {
        $$ = newTreeNode();
        $$->type = TN_ISUB;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp '*' int_exp {
        $$ = newTreeNode();
        $$->type = TN_IMUL;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp '/' int_exp {
        $$ = newTreeNode();
        $$->type = TN_IDIV;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;

        // Check for division by zero
        if ($3->type == TN_INTEGER && $3->ival == 0)
            yyerror("division by zero");
    }
    | int_exp '%' int_exp {
        $$ = newTreeNode();
        $$->type = TN_IMOD;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;

        // Check for division by zero
        if ($3->type == TN_INTEGER && $3->ival == 0)
            yyerror("division by zero");
    }
    | segment_function_call
    | offset_function_call
    | '(' int_exp ')' { $$ = $2; }
    ;

bool_exp:
    bool_op
    | '!' bool_exp {
        $$ = newTreeNode();
        $$->type = TN_UBNEQ;
        $$->numOperands = 1;
        $$->operands[0] = $2;
    }
    | int_exp EQ int_exp {
        $$ = newTreeNode();
        $$->type = TN_IEQ;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp NEQ int_exp {
        $$ = newTreeNode();
        $$->type = TN_INEQ;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp GT int_exp {
        $$ = newTreeNode();
        $$->type = TN_IGT;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp LT int_exp {
        $$ = newTreeNode();
        $$->type = TN_ILT;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp GTE int_exp {
        $$ = newTreeNode();
        $$->type = TN_IGTE;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | int_exp LTE int_exp {
        $$ = newTreeNode();
        $$->type = TN_ILTE;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | bool_exp EQ bool_exp {
        $$ = newTreeNode();
        $$->type = TN_BEQ;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | bool_exp NEQ bool_exp {
        $$ = newTreeNode();
        $$->type = TN_BNEQ;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | ptr_exp EQ ptr_exp {
        $$ =
            newTreeNode();
        $$->type = TN_PEQ;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | ptr_exp NEQ ptr_exp {
        $$ = newTreeNode();
        $$->type = TN_PNEQ;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    | '(' bool_exp ')' {
        $$ = $2;
    }
    ;

ptr_exp:
    ptr_op
    | int_exp ':' int_exp {
        $$ = newTreeNode();
        $$->type = TN_REF;
        $$->numOperands = 2;
        $$->operands[0] = $1;
        $$->operands[1] = $3;
    }
    ;

parameter:
    vtype IDENT_UNDEC {
        $$ = newTreeNode();
        $$->type = TN_PARAM;
        $$->sval = strdup($2);
        $$->ival = $1;
        free($2);
    }
    ;

parameter_list:
    parameter {
        // Create parameter list node
        $$ = newTreeNode();
        $$->type = TN_PARAMLIST;
        $$->numOperands = 1;
        $$->operands[0] = $1;

        // Add parameter to symbol table
        AddSymbol(symStack, $1->sval, $1->ival);

        // Update argument stack size
        if ($1->ival == IT_BYTEP || $1->ival == IT_WORDP)
            $$->ival = 4;
        else
            $$->ival = 2;
    }
    | parameter_list ',' parameter {
        // Add parameter to list
        $$ = $1;
        $$->operands[$$->numOperands++] = $3;

        // Add parameter to symbol table
        AddSymbol(symStack, $3->sval, $3->ival);

        // Update argument stack size
        if ($3->ival == IT_BYTEP || $3->ival == IT_WORDP)
            $$->ival += 4;
        else
            $$->ival += 2;
    }
    ;

ftype:
    VOID {
        $$ = IT_VOID;
    }
    | vtype
    ;

vtype:
    BYTE {
        $$ = IT_BYTE;
    }
    | WORD {
        $$ = IT_WORD;
    }
    | BYTEP {
        $$ = IT_BYTEP;
    }
    | WORDP {
        $$ = IT_WORDP;
    }
    | BOOL {
        $$ = IT_BOOL;
    }
    ;

%%

extern int errCount;
extern int currentLine;
extern char* currentFile;
extern FILE *uin;
extern FILE *uprein;
int main(int argc, char** argv)
{
    char* outPath;

    if (argc > 1)
    {
        currentFile = strdup(argv[1]);
        outPath = outputPath(currentFile);
        uprein = fopen(currentFile, "r");

        if (uprein == NULL)
        {
            printf("failed to open input file: %s\n", currentFile);
            free(currentFile);
            free(outPath);  // allocated above; release it on this early exit
            return 1;
        }
    } else {
        printf("U compiler v%s\n", VERSION);
        printf("usage: u [input file] [flags]\n");
        printf("flags:\n");
        printf("  -p             Print the parse tree.\n");
        printf("  -f             Print the function table.\n");
        printf("  -org [value]   Prepends 'org' statement to assembly output.\n");
        return 0;
    }

    // Set flags
    int i;
    int printParseTreeF = FALSE;
    int printFunctionTableF = FALSE;
    char* org = "";
    for (i = 2; i < argc; i++)
    {
        if (streq(argv[i], "-p"))
        {
            printParseTreeF = TRUE;
        } else if (streq(argv[i], "-f")) {
            printFunctionTableF = TRUE;
        } else if (streq(argv[i], "-org")) {
            if (++i >= argc)
            {
                printf("memory placement value expected after '-org' flag\n");
                return 1;
            } else {
                // Accept decimal digits with an optional trailing 'h' or 'H'
                int valid = TRUE;
                int j;
                int len = strlen(argv[i]);
                char* tmp = argv[i];
                for (j = 0; j < len - 1; j++)
                {
                    if (tmp[j] < '0' || tmp[j] > '9')
                    {
                        valid = FALSE;
                        break;
                    }
                }
                if ((tmp[j] < '0' || tmp[j] > '9') && !(tmp[j] == 'h' || tmp[j] == 'H'))
                    valid = FALSE;
                if (!valid)
                {
                    printf("invalid memory placement value\n");
                    return 1;
                }
                org = tmp;
            }
        }
    }

    // Create core data structures
    symStack =
        CreateSymbolStack();
    fTable = CreateFunctionTable();
    strTable = CreateStringTable();
    fileQueue = CreateStringQueue();
    pFileQueue = CreateStringQueue();

    // Parse input
    upreparse();

    if (errCount == 0)
    {
        currentFile = strdup(argv[1]);
        currentLine = 1;
        uin = fopen(currentFile, "r");
        uparse();

        // Mark main() function as called (skipped if the lookup finds no main)
        function* m = LookupFunction(fTable, "main");
        if (m != NULL)
            m->called = TRUE;

        // Prune unused functions from parse tree
        pruneUnusedFunctions(fTable, treeRoot);
    }

    if (errCount == 0)
    {
        FoldConstants(treeRoot, NULL, 0);

        if (printParseTreeF == TRUE)
        {
            printf("PARSE TREE:\n");
            PrintParseTree(treeRoot, 0);
            printf("\n");
        }

        if (printFunctionTableF == TRUE)
        {
            PrintFunctionTable(fTable);
            printf("\n");
        }
        EmitFasm(outPath, treeRoot, org);
    }

    // Free memory and return (outPath is released even when errors occurred)
    free(outPath);
    FreeStringTable(strTable);
    FreeSymbolStack(symStack);
    FreeFunctionTable(fTable);
    FreeStringQueue(fileQueue);
    FreeStringQueue(pFileQueue);
    FreeTree(treeRoot);
    return 0;
}

/* Error function. */
int yyerror(char* ErrMessage)
{
    printf("%s on line %d of '%s'\n", ErrMessage, currentLine, currentFile);
    errCount++;
    return 0;
}

--------------------------------------------------------------------------------
/src/upre.l:
--------------------------------------------------------------------------------
/*
 * upre.l
 * (f)lex lexer file for first pass tokenizer.
 */

%{
#include "defines.h"
#include "upre.tab.h"
#include "symboltable.h"
#include "symbolstack.h"
#include "strutil.h"
#include "functiontable.h"
#include "parsetree.h"
// Standard header names below are inferred from the calls used in this file
#include <stdio.h>
#include <stdlib.h>
#include "stringtable.h"
#include "stringqueue.h"
#include <string.h>

// External variables
extern char* currentFile;
extern int currentLine;
extern int errCount;
extern string_table* strTable;
extern string_table* fileTable;
extern struct tree_node* treeRoot;
extern symbol_stack* symStack;
extern function_table* fTable;
extern string_queue* fileQueue;
extern string_queue* pFileQueue;

// Lexer variables
int blockDepth = 0;

// Wrap function
int yywrap(void)
{
    if (pFileQueue->size == 0)
    {
        // No more files to preprocess
        fclose(uprein);
        free(currentFile);
        return 1;
    } else {
        // Close previous file
        fclose(uprein);

        // Open next file
        free(currentFile);
        currentFile = DequeueString(pFileQueue);
        uprein = fopen(currentFile, "r");
        if (uprein == NULL)
        {
            char* bn = getBasename(currentFile);
            printf("imported file '%s' does not exist\n", bn);
            free(bn);
            free(currentFile);
            exit(1);
        }

        // Return code to continue
        return 0;
    }
}
%}

%option prefix="upre"
%option outfile="uprelex.c"

%%

[/][\*] {
    int i = currentLine;
    while (TRUE)
    {
        int v = yylex();
        if (v == ENDMULTICOMMENT)
            break;

        if (v == 0)
        {
            printf("unexpected end of file, unmatched '/*' on line %d\n", i);
            exit(1);
        } else if (strcmp(yytext, "\n") == 0) {
            currentLine++;
        }
    }
}

[\*][/] {
    return ENDMULTICOMMENT;
}

import {
    return IMPORT;
}

[\"][^\n]*[\"] {
    char* litStr = yytext + 1;
    litStr =
        strdup(litStr);
    litStr[strlen(litStr) - 1] = '\0';
    uprelval.sval = litStr;
    return STRING_LITERAL;
}

while {
    blockDepth++;
}

else[\t ]+if ;

if {
    blockDepth++;
}

asm {
    blockDepth++;
}

void {
    return VOID;
}

byte {
    return BYTE;
}

word {
    return WORD;
}

bool {
    return BOOL;
}

byte\[\] {
    return BYTEP;
}

word\[\] {
    return WORDP;
}

end {
    if (blockDepth == 0)
        return END;
    else
        blockDepth--;
}

"(" {
    return OPAREN;
}

")" {
    return CPAREN;
}

"," {
    return COMMA;
}

";" {
    return SEMICOLON;
}

[/][/][^\n]*[\n] {
    currentLine++;
}

[a-zA-Z][a-z_0-9A-Z]* {
    uprelval.sval = strdup(yytext);
    return IDENT;
}

[ \t]+ ;

\r?\n {
    // One line ending (LF or CRLF) advances the line counter exactly once;
    // the former character class [\r?\n] also matched a lone '\r' or '?'
    currentLine++;
}

. ;


%%
--------------------------------------------------------------------------------
/src/upre.y:
--------------------------------------------------------------------------------
/*
 * upre.y
 * Bison parser for U programming language first pass.
 */

%{
// General defines
#define YYDEBUG 1

// Includes (standard header names inferred from the calls used in this file)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "defines.h"
#include "identifiertypes.h"
#include "symboltable.h"
#include "symbolstack.h"
#include "functiontable.h"
#include "parsetree.h"
#include "strutil.h"
#include "stringtable.h"
#include "list.h"
#include "stringqueue.h"

// External variables
extern int currentLine;
extern int errCount;
extern char* currentFile;
extern symbol_stack* symStack;
extern function_table* fTable;
extern struct tree_node* treeRoot;
extern string_table* strTable;
extern string_queue* fileQueue;
extern string_queue* pFileQueue;

// Parser variables
identifier_type params[255];    // Holds current parameter list
int numParams;                  // Number of parameters in current parameter list
%}

%name-prefix "upre"

%union
{
    int ival;
    char* sval;
    int typeval;
}

%error-verbose

// Tokens (IDENT and STRING_LITERAL carry the string set by the lexer in sval)
%token <sval> IDENT
%token VOID
%token BYTE
%token WORD
%token BOOL
%token BYTEP
%token WORDP
%token END
%token IF
%token WHILE
%token ASM
%token OPAREN
%token CPAREN
%token COMMA
%token SEMICOLON
%token ENDMULTICOMMENT
%token IMPORT
%token <sval> STRING_LITERAL

// Precedence
%left END
%left IF
%left WHILE
%left IDENT

/* Types.
 */
%type <typeval> vtype
%type <typeval> ftype
%type <sval> garbage

%%
program:
    component_list
    ;

component_list:
    function
    | function component_list
    | import_statement component_list
    ;

import_statement:
    IMPORT STRING_LITERAL SEMICOLON {
        // Build relative file string
        char* curPath = directoryPath(currentFile);
        char* relPath = (char*) malloc(strlen(curPath)
            + strlen($2) + 1);
        strcpy(relPath, curPath);
        strcat(relPath, $2);

        // Add string to file queue
        EnqueueString(fileQueue, relPath);
        EnqueueString(pFileQueue, relPath);

        // Free memory
        free(curPath);
        free(relPath);
        free($2);
    }
    ;

function:
    ftype IDENT OPAREN parameter_list CPAREN garbage END {
        // Add function to table
        AddFunction(fTable, $1, $2);

        // Add parameters
        int i;
        for (i = 0; i < numParams; i++)
        {
            AddParameter(fTable, $2, params[i]);
        }

        free($2);
    }
    | ftype IDENT OPAREN CPAREN garbage END {
        // Add function to table
        AddFunction(fTable, $1, $2);
        free($2);
    }
    ;

parameter_list:
    vtype IDENT {
        numParams = 1;
        params[0] = $1;
        free($2);
    }
    | parameter_list COMMA vtype IDENT {
        params[numParams++] = $3;
        free($4);
    }
    ;

garbage:
    IDENT {
        $$ = "";
        free($1);
    }
    | STRING_LITERAL {
        $$ = "";
        free($1);
    }
    | BYTE {
        $$ = "";
    }
    | WORD {
        $$ = "";
    }
    | BOOL {
        $$ = "";
    }
    | BYTEP {
        $$ = "";
    }
    | WORDP {
        $$ = "";
    }
    | OPAREN {
        $$ = "";
    }
    | CPAREN {
        $$ = "";
    }
    | garbage BYTE {
        $$ = "";
    }
    | garbage WORD {
        $$ = "";
    }
    | garbage BOOL {
        $$ = "";
    }
    | garbage BYTEP {
        $$ =
""; 186 | } 187 | | garbage WORDP { 188 | $$ = ""; 189 | } 190 | | garbage IDENT { 191 | $$ = ""; 192 | free($2); 193 | } 194 | | garbage COMMA { 195 | $$ = ""; 196 | } 197 | | garbage OPAREN { 198 | $$ = ""; 199 | } 200 | | garbage CPAREN { 201 | $$ = ""; 202 | } 203 | | garbage STRING_LITERAL { 204 | $$ = ""; 205 | free($2); 206 | } 207 | | garbage SEMICOLON { 208 | $$ = ""; 209 | } 210 | ; 211 | 212 | ftype: 213 | VOID { 214 | $$ = IT_VOID; 215 | } 216 | | vtype 217 | ; 218 | 219 | vtype: 220 | BYTE { 221 | $$ = IT_BYTE; 222 | } 223 | | WORD { 224 | $$ = IT_WORD; 225 | } 226 | | BYTEP { 227 | $$ = IT_BYTEP; 228 | } 229 | | WORDP { 230 | $$ = IT_WORDP; 231 | } 232 | | BOOL { 233 | $$ = IT_BOOL; 234 | } 235 | ; 236 | %% 237 | 238 | /* Error function. */ 239 | int yyerror(char* ErrMessage) 240 | { 241 | printf("preprocessor: %s on line %d\n", ErrMessage, currentLine); 242 | errCount++; 243 | return 0; 244 | } 245 | 246 | --------------------------------------------------------------------------------