├── Compiler
    ├── GRAMMAR.txt
    ├── Group 27 - Semantic Rules - AST.docx
    ├── Lexer.jpg
    ├── a.out
    ├── ast.c
    ├── ast.h
    ├── astDef.h
    ├── code.asm
    ├── code.o
    ├── code_gen.c
    ├── code_gen.h
    ├── code_genDef.h
    ├── compiler
    ├── driver.c
    ├── error_handler.c
    ├── error_handler.h
    ├── error_handlerDef.h
    ├── interface.h
    ├── keyword_table.c
    ├── keyword_table.h
    ├── keyword_tableDef.h
    ├── lexer.c
    ├── lexer.h
    ├── lexerDef.h
    ├── makefile
    ├── nary_tree.c
    ├── nary_tree.h
    ├── nary_treeDef.h
    ├── parser.c
    ├── parser.h
    ├── parserDef.h
    ├── printer.c
    ├── printer.h
    ├── printerDef.h
    ├── semantic_analyzer.c
    ├── semantic_analyzer.h
    ├── semantic_analyzerDef.h
    ├── stack.c
    ├── stack.h
    ├── stackDef.h
    ├── symbol_table.c
    ├── symbol_table.h
    ├── symbol_tableDef.h
    ├── testcase3.txt
    ├── testcases
    │   ├── main-1.txt
    │   ├── main-2.txt
    │   ├── main-3.txt
    │   ├── main0.txt
    │   ├── main1.txt
    │   ├── main2.txt
    │   ├── main3.txt
    │   ├── main4.txt
    │   ├── main5.txt
    │   ├── stestcase1.txt
    │   ├── stestcase2.txt
    │   ├── stestcase3.txt
    │   ├── testcase-1.txt
    │   ├── testcase-10.txt
    │   ├── testcase-11.txt
    │   ├── testcase-2.txt
    │   ├── testcase-3.txt
    │   ├── testcase-4.txt
    │   ├── testcase-5.txt
    │   ├── testcase-6.txt
    │   ├── testcase-7.txt
    │   ├── testcase-8.txt
    │   ├── testcase-9.txt
    │   ├── testcase0.txt
    │   ├── testcase1.txt
    │   ├── testcase10.txt
    │   ├── testcase11.txt
    │   ├── testcase2.txt
    │   ├── testcase3.txt
    │   ├── testcase4.txt
    │   ├── testcase5.txt
    │   └── testcase6.txt
    ├── type_checker.c
    ├── type_checker.h
    └── type_checkerDef.h
├── Conventions.txt
├── Group_27.zip
├── Language specifications.pdf
└── README.md


/Compiler/GRAMMAR.txt:
--------------------------------------------------------------------------------
program otherFunctions mainFunction
mainFunction TK_MAIN stmts TK_END
otherFunctions function otherFunctions
otherFunctions TK_EPS
function TK_FUNID input_par output_par TK_SEM stmts TK_END
input_par TK_INPUT TK_PARAMETER TK_LIST TK_SQL parameter_list TK_SQR
output_par TK_OUTPUT TK_PARAMETER TK_LIST TK_SQL parameter_list TK_SQR
output_par TK_EPS
parameter_list dataType TK_ID remaining_list
dataType primitiveDatatype
dataType constructedDatatype
primitiveDatatype TK_INT
primitiveDatatype TK_REAL
constructedDatatype TK_RECORD TK_RECORDID
remaining_list TK_COMMA dataType TK_ID remaining_list
remaining_list TK_EPS
stmts typeDefinitions declarations otherStmts returnStmt
typeDefinitions typeDefinition typeDefinitions
typeDefinitions TK_EPS
typeDefinition TK_RECORD TK_RECORDID fieldDefinitions TK_ENDRECORD TK_SEM
fieldDefinitions fieldDefinition fieldDefinition moreFields
fieldDefinition TK_TYPE primitiveDatatype TK_COLON TK_FIELDID TK_SEM
moreFields fieldDefinition moreFields
moreFields TK_EPS
declarations declaration declarations
declarations TK_EPS
declaration TK_TYPE dataType TK_COLON TK_ID global_or_not TK_SEM
global_or_not TK_COLON TK_GLOBAL
global_or_not TK_EPS
otherStmts stmt otherStmts
otherStmts TK_EPS
stmt assignmentStmt
stmt iterativeStmt
stmt conditionalStmt
stmt ioStmt
stmt funCallStmt
assignmentStmt singleOrRecId TK_ASSIGNOP arithmeticExpression TK_SEM
singleOrRecId TK_ID new_24
new_24 TK_DOT TK_FIELDID
new_24 TK_EPS
funCallStmt outputParameters TK_CALL TK_FUNID TK_WITH TK_PARAMETERS inputParameters TK_SEM
outputParameters TK_SQL idList TK_SQR TK_ASSIGNOP
outputParameters TK_EPS
inputParameters TK_SQL idList TK_SQR
iterativeStmt TK_WHILE TK_OP booleanExpression TK_CL stmt otherStmts TK_ENDWHILE
conditionalStmt TK_IF TK_OP booleanExpression TK_CL TK_THEN stmt otherStmts elsePart
elsePart TK_ELSE stmt otherStmts TK_ENDIF
elsePart TK_ENDIF
ioStmt TK_READ TK_OP singleOrRecId TK_CL TK_SEM
ioStmt TK_WRITE TK_OP all TK_CL TK_SEM
arithmeticExpression term expPrime
expPrime lowPrecedenceOperators term expPrime
expPrime TK_EPS
term factor termPrime
termPrime highPrecedenceOperators factor termPrime
termPrime TK_EPS
factor TK_OP arithmeticExpression TK_CL
factor all
highPrecedenceOperators TK_MUL
highPrecedenceOperators TK_DIV
lowPrecedenceOperators TK_PLUS
lowPrecedenceOperators TK_MINUS
all TK_NUM
all TK_RNUM
all TK_ID temp
temp TK_EPS
temp TK_DOT TK_FIELDID
booleanExpression TK_OP booleanExpression TK_CL logicalOp TK_OP booleanExpression TK_CL
booleanExpression var relationalOp var
booleanExpression TK_NOT TK_OP booleanExpression TK_CL
var TK_ID
var TK_NUM
var TK_RNUM
logicalOp TK_AND
logicalOp TK_OR
relationalOp TK_LT
relationalOp TK_LE
relationalOp TK_EQ
relationalOp TK_GT
relationalOp TK_GE
relationalOp TK_NE
returnStmt TK_RETURN optionalReturn TK_SEM
optionalReturn TK_SQL idList TK_SQR
optionalReturn TK_EPS
idList TK_ID more_ids
more_ids TK_COMMA TK_ID more_ids
more_ids TK_EPS


--------------------------------------------------------------------------------
/Compiler/Group 27 - Semantic Rules - AST.docx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Aashish683/Compiler-Project/ff87e1005f4f5e890b7121c3aeb813ba0549b580/Compiler/Group 27 - Semantic Rules - AST.docx


--------------------------------------------------------------------------------
/Compiler/Lexer.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Aashish683/Compiler-Project/ff87e1005f4f5e890b7121c3aeb813ba0549b580/Compiler/Lexer.jpg


--------------------------------------------------------------------------------
/Compiler/a.out:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Aashish683/Compiler-Project/ff87e1005f4f5e890b7121c3aeb813ba0549b580/Compiler/a.out


--------------------------------------------------------------------------------
/Compiler/ast.h:
--------------------------------------------------------------------------------
/* Group 27
Venkat Nalla Siddartha Reddy 2016A7PS0030P
Arnav Sailesh 2016A7PS0054P
Gunraj Singh 2016A7PS0085P
Aashish Singh 2016A7PS0683P */

#include "astDef.h"
#include "nary_treeDef.h"

char* getLabel(int l);

ASTNode* createASTNode(int isLeaf, Label l);
void addASTChild(ASTNode* node, ASTNode* newChild);
void addASTChildren(ASTNode* node, ASTNode* ls);

int isGlobal(NaryTreeNode* parseTreeNode);
int isSingleOrRecord(NaryTreeNode* parseTreeNode);
Token* getType(NaryTreeNode* parseTreeNode);
Token* getIdentifier(NaryTreeNode* parseTreeNode);
Token* getOperator(NaryTreeNode* parseTreeNode);

AST* initializeAST();
AST* constructAST(ParseTree* pt);
ASTNode* constructASTHelper(NaryTreeNode* parseTreeNode);
ASTNode* constructASTHelperInherited(NaryTreeNode* parseTreeNode, ASTNode* inherited);
void printAST(ASTNode* astNode);

int getASTNodeCount();
int getASTMemory();


--------------------------------------------------------------------------------
/Compiler/astDef.h:
--------------------------------------------------------------------------------
/* Group 27
Venkat Nalla Siddartha Reddy 2016A7PS0030P
Arnav Sailesh 2016A7PS0054P
Gunraj Singh 2016A7PS0085P
Aashish Singh 2016A7PS0683P */

#ifndef AST_
#define AST_

#include "lexerDef.h"
#include "symbol_tableDef.h"

typedef struct ASTNode ASTNode;
typedef struct ASTProgram ASTProgram;
typedef struct ASTInputParams ASTInputParams;
typedef struct ASTOutputParams ASTOutputParams;
typedef struct ASTDatatype ASTDatatype;
typedef struct ASTStmts ASTStmts;
typedef struct ASTTypedefinition ASTTypedefinition;
typedef struct ASTFieldDefinition ASTFieldDefinition;
typedef struct ASTDeclaration ASTDeclaration;
typedef struct ASTAssignmentStmt ASTAssignmentStmt;
typedef struct ASTFunCallStmt ASTFunCallStmt;
typedef struct ASTIterativeStmt ASTIterativeStmt;
typedef struct ASTConditionalStmt ASTConditionalStmt;
typedef struct ASTIOStmt ASTIOStmt;
typedef struct ASTStmt ASTStmt;
typedef struct ASTArithmeticExpression ASTArithmeticExpression;
typedef struct ASTBooleanExpression ASTBooleanExpression;
typedef struct ASTID ASTID;

// Defining some snippets to make coding easier

#define AST_STMT_ITERATIVE AST_STMT.AST_STMT_TYPE.AST_ITERATIVE_STMT
#define AST_STMT_ASSIGNMENT AST_STMT.AST_STMT_TYPE.AST_ASSIGNMENT_STMT
#define AST_STMT_FUN_CALL AST_STMT.AST_STMT_TYPE.AST_FUN_CALL_STMT
#define AST_STMT_CONDITIONAL AST_STMT.AST_STMT_TYPE.AST_CONDITIONAL_STMT
#define AST_STMT_IO AST_STMT.AST_STMT_TYPE.AST_IO_STMT

// An enum which is used to label nodes
typedef enum Label {
    astProgram,
    astFunction,
    astInputParams,
    astOutputParams,
    astDatatype,
    astStmts,
    astTypeDefintion,
    astFieldDefinition,
    astDeclaration,
    astAssignmentStmt,
    astFunCallStmt,
    astIterativeStmt,
    astConditionalStmt,
    astElsePart,
    astIOStmtRead,
    astIOStmtWrite,
    astReturnStmt,
    astInputArgs,
    astOutputArgs,
    astArithmeticExpression,
    astBooleanExpression,
    astId,
    astNum,
    astRnum,
} Label;

/**
 * -----Plan-------
 *
 **/
typedef struct ASTProgram {
    // A field for a linked list of ASTFunction (represents otherFunctions non terminal)
    // A field for the main function
    // struct ASTNode* AST_FUNCTION_HEAD;
    // struct ASTNode* AST_MAIN_FUNCTION;

} ASTProgram;

typedef struct ASTFunction {
    // A field to store the name of the function
    Token* FUNCTION_TOKEN;
    // A field for inputparams // Nullable
    // struct ASTNode* INPUT_PARAMS;
    // A field for outputparams // Nullable (output_par may derive TK_EPS)
    // struct ASTNode* OUTPUT_PARAMS;
    // A linked list of ASTStmt
    // struct ASTNode* STMT_HEAD;
    // A pointer to the next ASTFunction node
} ASTFunction;

/*** Rendered obsolete because of ASTID ***/
typedef struct ASTInputParams {
    // A linked list of two-tuples (dataType, TK_ID); represents the parameter_list non terminal
} ASTInputParams;

typedef struct ASTOutputParams {
    // A linked list of two-tuples (dataType, TK_ID); represents the parameter_list non terminal
} ASTOutputParams;
/******/

// Represents the element of a list of parameters
typedef struct ASTID {
    // Stores datatype of TK_ID
    // Stores TK_ID
    Token* DATA_TYPE;
    Token* ID;
    Token* FIELD_ID;
} ASTID;


typedef struct ASTDatatype {
    // A field which stores the values (INT, REAL or RECORD_ID)
} ASTDatatype;

typedef struct ASTStmts {
    // A linked list with each element being a typedefinition
    // A linked list with each element being a declaration

    // A linked list with each element being a stmt (AssignmentStmt, iterative, conditional, funCallStmt and IOStmt)
    // Implement this by making a custom function which checks whether ASTNode has a label matching one of these 5

    // A pointer to the return stmt ASTNode (which is a linked list of the IDs which we are returning, NULL if none)
} ASTStmts;

typedef struct ASTTypedefinition {
    // A field having a RECORD_ID or a pointer to the RECORD_ID node
    // A linked list with each element being a FieldDefinition, linked list has a size of at least 2
    Token* RECORD_ID;
} ASTTypedefinition;

typedef struct ASTFieldDefinition {
    // A field storing either INT or REAL, pointer to ASTDatatype
    // A field storing fieldID
    Token* DATA_TYPE;
    Token* FIELD_ID;
} ASTFieldDefinition;

typedef struct ASTDeclaration {
    // A field storing INT or REAL or RECORDID, pointer to ASTDatatype
    // A field storing the ID
    // A field storing whether it is global or not
    int IS_GLOBAL;
} ASTDeclaration;

typedef struct ASTReturnStmt {
    // Needed for error reporting when nothing is being returned:
    // in that case there are no IDs to take the line number from
    int RETURN_LINE_NO;
} ASTReturnStmt;

typedef struct ASTAssignmentStmt {
    // POSSIBLY a flag to store whether a single variable or a record id
    // A field to store the TK_ID or TK_ID.TK_FIELDID
    // A field to store the arithmeticExpression

    int SINGLE_OR_RECORD;
} ASTAssignmentStmt;

typedef struct ASTFunCallStmt {
    // A field to store Output parameters (i.e. a linked list of ids, idList)
    // A field to store Input Parameters (i.e. a linked list of ids, idList)
    // A field to store the FUN_ID of the function being called
    Token* FUN_ID;
} ASTFunCallStmt;

typedef struct ASTIterativeStmt {
    // A field to store booleanExpression
    // A field to store linked list of ASTStmt of length at least one
    int LINE_NO_START; // Line number where the iterative starts (Needed for error reporting)
    int LINE_NO_END; // Line number where the iterative ends (Needed for error reporting)
} ASTIterativeStmt;

typedef struct ASTConditionalStmt {
    // A field to store the boolean expression
    // A field to store the linked list of ASTStmt of length at least one
    // A field to store linked list of ASTStmt which are in the elsePart, NULL if no elsePart
} ASTConditionalStmt;

typedef struct ASTIOStmt {
    // A flag to indicate whether it is a Read or a Write (will be used prior to accessing other fields)
    // POSSIBLY a flag to store whether a single variable or a record id
    // A field to store the TK_ID or TK_ID.TK_FIELDID
    // A field to store whether the operand being written is 1) Number, 2) Real Number 3) TK_ID 4) TK_ID.TK_FIELDID
    // A field to store the operand being written
    int IS_READ; // 1 if the IOStmt is a read, 0 if it is a write
    int SINGLE_OR_RECORD; // To be accessed only if it is a Read
    char* ID; // To be accessed if it is a read/ a write involving an identifier
    char* FIELD_ID; // To be accessed if it is a read/ a write involving a Record identifier
    int IS_NUMBER; // Field to store if the entity being written is a number
    char* VALUE; // To be accessed if it is a write involving a number
} ASTIOStmt;


typedef struct ASTArithmeticExpression {
    // A flag to store whether it is a single terminal or not (decides access of below 2 fields if it is, otherwise the rest)
    // A field to store whether the operand is 1) Number, 2) Real Number 3) TK_ID 4) TK_ID.TK_FIELDID
    // A field to store the operand
    // A field to store the operator
    // A field which stores the pointer to the ArithmeticExpression on the left
    // A field which stores the pointer to the ArithmeticExpression on the right
    Token* OPERATOR;
} ASTArithmeticExpression;

typedef struct ASTBooleanExpression {
    // A field to store whether it is a 1) Logical 2) Relational 3) Negation
    // If logical, the logical op
    // If logical, a pointer to store the boolean expression on the left
    // If logical, a pointer to store the boolean expression on the right
    // If relational, the relational op
    // If relational, stores the left variable which can be a TK_ID, TK_NUM or TK_RNUM
    // If relational, stores the right variable which can be a TK_ID, TK_NUM or TK_RNUM
    // If negation, a pointer to the boolean expression underneath
    Token* OPERATOR;
} ASTBooleanExpression;

typedef struct ASTNum {
    Token* VALUE;
} ASTNum;

typedef struct ASTRNum {
    Token* VALUE;
} ASTRNum;


typedef union ASTNodeType {
    ASTProgram AST_PROGRAM;
    ASTFunction AST_FUNCTION;
    ASTDatatype AST_DATA_TYPE;
    ASTStmts AST_STMTS;
    ASTTypedefinition AST_TYPE_DEFINITION;
    ASTFieldDefinition AST_FIELD_DEFINITION;
    ASTDeclaration AST_DECLARATION;
    ASTAssignmentStmt AST_ASSIGNMENT_STMT;
    ASTFunCallStmt AST_FUN_CALL_STMT;
    ASTIterativeStmt AST_ITERATIVE_STMT;
    ASTConditionalStmt AST_CONDITIONAL_STMT;
    ASTIOStmt AST_IO_STMT;
    ASTReturnStmt AST_RETURN_STMT;
    ASTArithmeticExpression AST_ARITHMETIC_EXPRESSION;
    ASTBooleanExpression AST_BOOLEAN_EXPRESSION;
    ASTID AST_ID;
    ASTNum AST_NUM;
    ASTRNum AST_RNUM;
} ASTNodeType;

typedef struct ASTNode {
    int IS_LEAF; // Whether the node is a leaf or not
    Label LABEL; // Label of the AST Node
    ASTNodeType AST_NODE_TYPE; // Represents the actual type of ASTNode beneath this generic AST Node
    int CHILDREN_COUNT; // Counts the number of children in a node
    SymbolTable* SCOPED_TABLE; // Stores the pointer to the symbol table this node is scoped in
    struct ASTNode* parent; // Points to the parent of the current node
    struct ASTNode* next; // Points to the next node of the list which this node is a part of
    struct ASTNode* children; // Head of the linked list of the children of this node (To make traversal convenient)
    struct ASTNode* tail; // Tail of the linked list of the children of this node (To make addition of child O(1))
} ASTNode;

typedef struct AST {
    ASTNode* root;
} AST;

#endif


--------------------------------------------------------------------------------
/Compiler/code.asm:
--------------------------------------------------------------------------------
section .text
global main 3 | extern scanf 4 | extern printf 5 | 6 | section .data 7 | inpformat: db "%hd",0 8 | outformat: db "%hd",10,0, 9 | b3: dw 1 10 | b2: dw 1 11 | c2: dw 1 12 | d2: dw 1 13 | 14 | 15 | main: 16 | push rsi 17 | push rdi 18 | push ax 19 | mov rsi, b2 20 | mov rdi, inpformat 21 | mov al, 0 22 | call scanf 23 | pop ax 24 | pop rdi 25 | pop rsi 26 | 27 | 28 | push ax 29 | mov ax, 20 30 | mov [c2], ax 31 | pop ax 32 | 33 | 34 | push rsi 35 | push rdi 36 | push ax 37 | mov rsi, d2 38 | mov rdi, inpformat 39 | mov al, 0 40 | call scanf 41 | pop ax 42 | pop rdi 43 | pop rsi 44 | 45 | 46 | push ax 47 | push bx 48 | push bx 49 | mov ax, [b2] 50 | push ax 51 | mov ax, [c2] 52 | push ax 53 | pop bx 54 | pop ax 55 | add ax,bx 56 | pop bx 57 | push ax 58 | mov ax, [d2] 59 | push ax 60 | pop bx 61 | pop ax 62 | add ax,bx 63 | pop bx 64 | mov [b3], ax 65 | pop ax 66 | 67 | 68 | push rdi 69 | push rsi 70 | push rax 71 | push rcx 72 | push ax 73 | mov rsi, [b3] 74 | mov rdi, outformat 75 | mov al,0 76 | call printf 77 | pop ax 78 | pop rcx 79 | pop rax 80 | pop rsi 81 | pop rdi 82 | 83 | 84 | 85 | 86 | ret 87 | -------------------------------------------------------------------------------- /Compiler/code.o: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Aashish683/Compiler-Project/ff87e1005f4f5e890b7121c3aeb813ba0549b580/Compiler/code.o -------------------------------------------------------------------------------- /Compiler/code_gen.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "code_genDef.h" 8 | 9 | 10 | void printLeaf(ASTNode* leafNode,FILE* f); 11 | 12 | void codeGenerationHelper(ASTNode* node, SymbolTable* st, FILE* f,int recordArithmetic); // recordArithmetic(1) indicates a record 
operation 13 | void codeGeneration(AST* ast, SymbolTable* st, FILE* f); 14 | -------------------------------------------------------------------------------- /Compiler/code_genDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "astDef.h" 8 | #include "symbol_tableDef.h" 9 | -------------------------------------------------------------------------------- /Compiler/compiler: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Aashish683/Compiler-Project/ff87e1005f4f5e890b7121c3aeb813ba0549b580/Compiler/compiler -------------------------------------------------------------------------------- /Compiler/driver.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "keyword_table.h" 9 | #include "lexer.h" 10 | #include "parser.h" 11 | #include "nary_tree.h" 12 | #include "ast.h" 13 | #include "printer.h" 14 | #include "error_handler.h" 15 | #include "symbol_table.h" 16 | #include "semantic_analyzer.h" 17 | #include "type_checker.h" 18 | #include "code_gen.h" 19 | #include 20 | 21 | int main(int argc, char* argv[]) { 22 | int userOption; 23 | 24 | if(argc != 3) { 25 | printf("No of arguments not sufficient \n"); 26 | return 0; 27 | } 28 | 29 | printf("\n\n"); 30 | printf("-------Compiler Project Group 27--------\n"); 31 | 32 | printf("LEVEL 4: "); 33 | printf("Symbol Table has been constructed/ Type checking and Semantic analysis modules work\n"); 34 | printf("------ Status --------\n"); 35 | printf("Both lexer and parser have been implemented\n"); 36 | printf("First 
and follow sets have been automated\n"); 37 | printf("Symbol Table has been constructed/ Type checking and Semantic analysis modules work\n"); 38 | printf("Code generation has been implemented partially, a test case main0.txt has been provided where it works for small inputs\n"); 39 | printf("We used the following commands to execute the .asm file\n"); 40 | printf("nasm -felf64 code.asm && gcc code.o && ./a.out \n"); 41 | printf("----------------------\n\n"); 42 | 43 | 44 | printf("Enter 0 to exit\n"); 45 | printf("Enter 1 to print the list of tokens created by the lexer\n"); 46 | printf("Enter 2 to print the parse tree\n"); 47 | printf("Enter 3 to print the AST, (Please ensure that the input creates a valid AST)\n"); 48 | printf("Enter 4 to calculate the compression ratio\n"); 49 | printf("Enter 5 to print the symbol table\n"); 50 | printf("Enter 6 to print information about the global variables\n"); 51 | printf("Enter 7 to print information about the functions\n"); 52 | printf("Enter 8 to print all the records\n"); 53 | printf("Enter 9 to verify correctness of the program\n"); 54 | printf("Enter 10 to generate NASM code\n"); 55 | 56 | while(1) { 57 | 58 | scanf("%d", &userOption); 59 | 60 | switch(userOption) { 61 | case 0: { 62 | return 0; 63 | break; 64 | } 65 | case 1: { 66 | printf("Commencing printing of token list \n"); 67 | printf("--------\n"); 68 | int f = open(argv[1],O_RDONLY); 69 | initializeLexer(f); 70 | Token* t; 71 | while((t = getToken()) != NULL) { 72 | printf("%s %s %d \n", getTerminal(t->TOKEN_NAME),t->LEXEME,t->LINE_NO); 73 | } 74 | close(f); 75 | 76 | printf("\nFinished printing of token list \n"); 77 | break; 78 | } 79 | case 2: { 80 | printf("Commencing parsing of input source code \n"); 81 | printf("--------\n"); 82 | 83 | Grammar* g = extractGrammar(); 84 | FirstAndFollow* fafl = computeFirstAndFollowSets(g); 85 | ParsingTable* pTable = initialiseParsingTable(); 86 | createParseTable(fafl,pTable); 87 | // Lexer and parser are both 
invoked inside parseInputSourceCode 88 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 89 | printParseTree(pt,NULL); 90 | 91 | printf("\nFinished parsing of input source code \n"); 92 | break; 93 | } 94 | case 3: { 95 | 96 | printf("Printing AST in a level format (Each level being printed)\n"); 97 | printf("Each node will have it's parent being printed"); 98 | // Adding thhe initialisation of structures also in time calculation 99 | Grammar* g = extractGrammar(); 100 | FirstAndFollow* fafl = computeFirstAndFollowSets(g); 101 | ParsingTable* pTable = initialiseParsingTable(); 102 | createParseTable(fafl,pTable); 103 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 104 | 105 | if(getErrorStatus() == 1) { 106 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 107 | break; 108 | } 109 | 110 | AST* ast = constructAST(pt); 111 | 112 | levelPrint(ast->root); 113 | 114 | printf("Completed printing AST\n"); 115 | break; 116 | } 117 | case 4: { 118 | 119 | Grammar* g = extractGrammar(); 120 | FirstAndFollow* fafl = computeFirstAndFollowSets(g); 121 | ParsingTable* pTable = initialiseParsingTable(); 122 | createParseTable(fafl,pTable); 123 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 124 | 125 | if(getErrorStatus() == 1) { 126 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 127 | break; 128 | } 129 | 130 | AST* ast = constructAST(pt); 131 | 132 | printf("Parse tree Number of nodes = %d Allocated Memory = %d Bytes\n", getParseTreeNodeCount(), getParseTreeMemory()); 133 | 134 | printf("AST Number of nodes = %d Allocated Memory = %d Bytes\n", getASTNodeCount(), getASTMemory()); 135 | 136 | float compression = (100*((float)getParseTreeMemory()-(float)getASTMemory()))/((float) getParseTreeMemory()); 137 | printf("Compression is %f\n" ,compression); 138 | break; 139 | } 140 | case 5: { 141 | Grammar* g = extractGrammar(); 142 | FirstAndFollow* fafl = 
computeFirstAndFollowSets(g); 143 | ParsingTable* pTable = initialiseParsingTable(); 144 | createParseTable(fafl,pTable); 145 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 146 | 147 | if(getErrorStatus() == 1) { 148 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 149 | break; 150 | } 151 | 152 | AST* ast = constructAST(pt); 153 | 154 | ErrorList* els = initializeErrorList(); 155 | 156 | SymbolTable* st = constructSymbolTable(ast,els); 157 | 158 | captureErrors(ast,els); 159 | 160 | printSymbolTable(st,1); 161 | break; 162 | } 163 | case 6: { 164 | 165 | 166 | Grammar* g = extractGrammar(); 167 | FirstAndFollow* fafl = computeFirstAndFollowSets(g); 168 | ParsingTable* pTable = initialiseParsingTable(); 169 | createParseTable(fafl,pTable); 170 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 171 | 172 | if(getErrorStatus() == 1) { 173 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 174 | break; 175 | } 176 | 177 | AST* ast = constructAST(pt); 178 | 179 | ErrorList* els = initializeErrorList(); 180 | 181 | SymbolTable* st = constructSymbolTable(ast,els); 182 | 183 | captureErrors(ast,els); 184 | 185 | printGlobals(st); 186 | break; 187 | } 188 | case 7: { 189 | 190 | Grammar* g = extractGrammar(); 191 | FirstAndFollow* fafl = computeFirstAndFollowSets(g); 192 | ParsingTable* pTable = initialiseParsingTable(); 193 | createParseTable(fafl,pTable); 194 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 195 | 196 | if(getErrorStatus() == 1) { 197 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 198 | break; 199 | } 200 | 201 | AST* ast = constructAST(pt); 202 | 203 | ErrorList* els = initializeErrorList(); 204 | 205 | SymbolTable* st = constructSymbolTable(ast,els); 206 | 207 | captureErrors(ast,els); 208 | 209 | printFunctions(st); 210 | break; 211 | } 212 | case 8: { 213 | Grammar* g = 
extractGrammar(); 214 | FirstAndFollow* fafl = computeFirstAndFollowSets(g); 215 | ParsingTable* pTable = initialiseParsingTable(); 216 | createParseTable(fafl,pTable); 217 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 218 | 219 | if(getErrorStatus() == 1) { 220 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 221 | break; 222 | } 223 | 224 | 225 | AST* ast = constructAST(pt); 226 | 227 | ErrorList* els = initializeErrorList(); 228 | 229 | SymbolTable* st = constructSymbolTable(ast,els); 230 | 231 | captureErrors(ast,els); 232 | 233 | printRecords(st); 234 | break; 235 | } 236 | case 9: { 237 | 238 | 239 | clock_t start_time, end_time; 240 | double total_CPU_time, total_CPU_time_in_seconds; 241 | start_time = clock(); 242 | 243 | Grammar* g = extractGrammar(); 244 | FirstAndFollow* fafl = computeFirstAndFollowSets(g); 245 | ParsingTable* pTable = initialiseParsingTable(); 246 | createParseTable(fafl,pTable); 247 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 248 | 249 | if(getErrorStatus() == 1) { 250 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 251 | break; 252 | } 253 | else { 254 | /**First Pass**/ 255 | AST* ast = constructAST(pt); 256 | 257 | ErrorList* els = initializeErrorList(); 258 | 259 | /**Second Pass**/ 260 | SymbolTable* st = constructSymbolTable(ast,els); 261 | 262 | /**Third Pass**/ 263 | captureErrors(ast,els); 264 | 265 | 266 | printf("----PRINTING ERRORS-----\n"); 267 | printErrors(els); 268 | } 269 | 270 | end_time = clock(); 271 | total_CPU_time = (double) (end_time - start_time); 272 | total_CPU_time_in_seconds = total_CPU_time / CLOCKS_PER_SEC; 273 | 274 | printf("Calculated time is %f \n",total_CPU_time); 275 | printf("Calculated time in seconds is %f \n", total_CPU_time_in_seconds); 276 | 277 | break; 278 | } 279 | case 10: { 280 | 281 | Grammar* g = extractGrammar(); 282 | FirstAndFollow* fafl = 
computeFirstAndFollowSets(g); 283 | ParsingTable* pTable = initialiseParsingTable(); 284 | createParseTable(fafl,pTable); 285 | ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 286 | 287 | if(getErrorStatus() == 1) { 288 | printf("PARSING OR LEXICAL ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO AST CONSTRUCTION\n"); 289 | break; 290 | } 291 | else { 292 | /**First Pass**/ 293 | AST* ast = constructAST(pt); 294 | 295 | ErrorList* els = initializeErrorList(); 296 | 297 | /**Second Pass**/ 298 | SymbolTable* st = constructSymbolTable(ast,els); 299 | 300 | /**Third Pass**/ 301 | captureErrors(ast,els); 302 | 303 | 304 | printf("----PRINTING ERRORS-----\n"); 305 | printErrors(els); 306 | 307 | if(els->head != NULL) { 308 | printf("SEMANTIC ANALYSIS YIELDED ERRORS, NOT PROCEEDING TO CODE GENERATION\n"); 309 | break; 310 | } 311 | 312 | FILE* f = fopen(argv[2],"w"); 313 | codeGeneration(ast,st,f); 314 | printf("Completed code generation\n"); 315 | } 316 | break; 317 | } 318 | default: { 319 | continue; 320 | } 321 | 322 | } 323 | 324 | } 325 | 326 | } 327 | -------------------------------------------------------------------------------- /Compiler/error_handler.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | // This file contains the error management modules in semantic analysis 8 | 9 | #include "interface.h" 10 | #include "error_handler.h" 11 | #include "type_checker.h" 12 | #include 13 | 14 | 15 | // Function to initialize error list 16 | ErrorList* initializeErrorList() { 17 | ErrorList* els = (ErrorList*)malloc(sizeof(ErrorList)); 18 | els->head = NULL; 19 | els->tail = NULL; 20 | els->MESSAGE_MAX_LENGTH = ERROR_MAX_LENGTH; 21 | return els; 22 | } 23 | 24 | // Function to generate error at given line number and error message 25 | Error* generateError(int 
lineNumber, int iterativeLineEnd,char* errorMessage) { 26 | Error* error = (Error*)malloc(sizeof(Error)); 27 | error->LINE_NUMBER = lineNumber; 28 | error->ITERATIVE_LINE_END = iterativeLineEnd; 29 | error->ERROR_MESSAGE = errorMessage; 30 | error->next = NULL; 31 | return error; 32 | } 33 | 34 | // Function to add error to the list of errors 35 | void addErrorToList(ErrorList* els, int lineNumber, char* errorMessage) { 36 | 37 | // Add the error only if it has not been reported before 38 | Error* trav = els->head; 39 | int found = 0; 40 | while(trav != NULL) { 41 | if(strcmp(trav->ERROR_MESSAGE,errorMessage) == 0 && trav->LINE_NUMBER == lineNumber) { 42 | found = 1; 43 | break; 44 | } 45 | trav = trav->next; 46 | } 47 | if(found == 1) 48 | return; 49 | 50 | Error* error = generateError(lineNumber,-1,errorMessage); 51 | 52 | // In case the list inside errorList is empty 53 | if(els->head == NULL) { 54 | els->head = error; 55 | els->tail = error; 56 | return; 57 | } 58 | 59 | els->tail->next = error; 60 | els->tail = els->tail->next; 61 | } 62 | 63 | void addIterativeErrorToList(ErrorList* els, int lineStart, int lineEnd, char* errorMessage) { 64 | Error* error = generateError(lineStart,lineEnd,errorMessage); 65 | // In case the list inside errorList is empty 66 | if(els->head == NULL) { 67 | els->head = error; 68 | els->tail = error; 69 | return; 70 | } 71 | 72 | els->tail->next = error; 73 | els->tail = els->tail->next; 74 | } 75 | 76 | void printErrors(ErrorList* els) { 77 | Error* trav = els->head; 78 | if(trav == NULL) { 79 | printf("No errors reported during Semantic Analysis \n"); 80 | printf("Code compiles Successfully\n"); 81 | return; 82 | } 83 | 84 | while(trav != NULL) { 85 | if(trav->ITERATIVE_LINE_END == -1) 86 | printf("Line %d : %s\n" ,trav->LINE_NUMBER,trav->ERROR_MESSAGE); 87 | else 88 | printf("Line %d-%d : %s\n", trav->LINE_NUMBER,trav->ITERATIVE_LINE_END,trav->ERROR_MESSAGE); 89 | trav = trav->next; 90 | } 91 | } 92 | 93 | void 
throwTypeMismatchError(Token* lhsType, Token* rhsType,ErrorList* els,int lineNumber) { 94 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 95 | sprintf(message,"Types %s and %s not compatible for this operation\n", getDataType(lhsType),getDataType(rhsType)); 96 | addErrorToList(els,lineNumber,message); 97 | } 98 | 99 | void throwMissingDeclarationError(Token* errorVariable,ErrorList* els) { 100 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 101 | sprintf(message,"The variable %s is not declared\n" , errorVariable->LEXEME); 102 | addErrorToList(els,errorVariable->LINE_NO,message); 103 | } 104 | 105 | void throwMultipleDefinitionsError(Token* errorIdentifier, ErrorList* els) { 106 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 107 | sprintf(message,"The identifier %s is declared more than once\n" , errorIdentifier->LEXEME); 108 | addErrorToList(els,errorIdentifier->LINE_NO,message); 109 | } 110 | 111 | void throwClashingGlobalDefinitionError(Token* errorIdentifier, ErrorList* els) { 112 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 113 | sprintf(message,"The identifier %s is global and cannot be declared again\n" , errorIdentifier->LEXEME); 114 | addErrorToList(els,errorIdentifier->LINE_NO,message); 115 | } 116 | 117 | void throwMissingFunctionDefinitionError(Token* errorFunCall, ErrorList* els) { 118 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 119 | sprintf(message,"The function %s is undefined\n" ,errorFunCall->LEXEME); 120 | addErrorToList(els,errorFunCall->LINE_NO,message); 121 | } 122 | 123 | void throwMissingRecordDefinitionError(Token* errorRecord,ErrorList* els) { 124 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 125 | sprintf(message,"The record %s is undefined\n" ,errorRecord->LEXEME); 126 | addErrorToList(els,errorRecord->LINE_NO,message); 127 | } 128 | 129 | void throwInvalidNumberOfInputArgsError(Token*
errorFunCall, int actualNumber, int expectedNumber, ErrorList* els) { 130 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 131 | sprintf(message,"The number of input arguments passed is %d , it should be %d \n", actualNumber,expectedNumber); 132 | addErrorToList(els,errorFunCall->LINE_NO,message); 133 | } 134 | 135 | void throwInvalidNumberOfOutputArgsError(Token* errorFunCall, int actualNumber, int expectedNumber, ErrorList* els) { 136 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 137 | sprintf(message,"The number of output arguments passed is %d , it should be %d \n", actualNumber, expectedNumber); 138 | addErrorToList(els,errorFunCall->LINE_NO,message); 139 | } 140 | 141 | void throwInputArgumentTypeMismatchError(Token* errorFunCall, Token* typeExpected, Token* typePassed,int index,ErrorList* els) { 142 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 143 | sprintf(message,"In inputArgs The type at index %d should have been %s rather than %s\n" ,index,typeExpected->LEXEME,typePassed->LEXEME); 144 | addErrorToList(els,errorFunCall->LINE_NO,message); 145 | } 146 | 147 | void throwOutputArgumentTypeMismatchError(Token* errorFunCall, Token* typeExpected, Token* typeRecieved,int index,ErrorList* els) { 148 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 149 | sprintf(message,"In outputArgs The type at index %d should have been %s rather than %s\n" ,index,typeExpected->LEXEME,typeRecieved->LEXEME); 150 | addErrorToList(els,errorFunCall->LINE_NO,message); 151 | } 152 | 153 | void throwInvalidNumberOfReturnVariablesError(int lineNumber,int actualNumber, int expectedNumber,ErrorList* els) { 154 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 155 | sprintf(message,"The number of variables being returned is %d , instead of %d\n" , actualNumber,expectedNumber); 156 | addErrorToList(els,lineNumber,message); 157 | } 158 | 159 | void 
throwReturnTypeMismatchError(Token* errorId, Token* typeReturned, Token* typeExpected, ErrorList* els) { 160 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 161 | sprintf(message,"The type %s of the variable being returned %s does not match with the expected type %s\n" ,typeReturned->LEXEME,errorId->LEXEME,typeExpected->LEXEME); 162 | addErrorToList(els,errorId->LINE_NO,message); 163 | } 164 | 165 | void throwNoIterationUpdateError(int lineStart, int lineEnd, ErrorList* els) { 166 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 167 | sprintf(message,"None of the variables in the condition of the while loop get updated"); 168 | addIterativeErrorToList(els,lineStart,lineEnd,message); 169 | } 170 | 171 | void throwRecursiveFunctionCallError(Token* errorFunCall, ErrorList* els) { 172 | char* message = (char*)malloc(sizeof(char)*els->MESSAGE_MAX_LENGTH); 173 | sprintf(message,"The function call is recursive\n"); 174 | addErrorToList(els,errorFunCall->LINE_NO,message); 175 | } 176 | -------------------------------------------------------------------------------- /Compiler/error_handler.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "error_handlerDef.h" 8 | 9 | ErrorList* initializeErrorList(); 10 | 11 | Error* generateError(int lineNumber, int iterativeLineEnd,char* errorMessage); 12 | void addErrorToList(ErrorList* els, int lineNumber, char* errorMessage); 13 | 14 | // There are 14 types of errors, but in some cases more than one kind of error is captured 15 | // Overall we are at LEVEL 4 of semantic analysis 16 | 17 | // Type mismatch covers 2 errors in itself including the normal typeMismatch, like a division of a scalar by a record 18 | void throwTypeMismatchError(Token* lhsType, Token* rhsType,ErrorList*
els, int lineNumber); 19 | void throwMissingDeclarationError(Token* errorVariable,ErrorList* els); 20 | // Multiple definitions covers all clashing cases eg - functions, variables and records. 21 | void throwMultipleDefinitionsError(Token* errorIdentifier, ErrorList* els); 22 | void throwClashingGlobalDefinitionError(Token* errorIdentifier, ErrorList* els); 23 | void throwMissingFunctionDefinitionError(Token* errorFunCall, ErrorList* els); 24 | void throwRecursiveFunctionCallError(Token* errorFunCall, ErrorList* els); 25 | void throwMissingRecordDefinitionError(Token* errorRecord,ErrorList* els); 26 | void throwInvalidNumberOfInputArgsError(Token* errorFunCall, int actualNumber, int expectedNumber, ErrorList* els); 27 | void throwInvalidNumberOfOutputArgsError(Token* errorFunCall, int actualNumber, int expectedNumber, ErrorList* els); 28 | void throwInputArgumentTypeMismatchError(Token* errorFunCall, Token* typeExpected, Token* typePassed,int index,ErrorList* els); 29 | void throwOutputArgumentTypeMismatchError(Token* errorFunCall, Token* typeExpected, Token* typeRecieved,int index,ErrorList* els); 30 | void throwInvalidNumberOfReturnVariablesError(int lineNumber,int actualNumber, int expectedNumber,ErrorList* els); 31 | void throwReturnTypeMismatchError(Token* errorId, Token* typeReturned, Token* typeExpected, ErrorList* els); 32 | void throwNoIterationUpdateError(int lineStart, int lineEnd, ErrorList* els); 33 | 34 | void printErrors(ErrorList* els); 35 | -------------------------------------------------------------------------------- /Compiler/error_handlerDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #ifndef ERROR_HANDLER_ 8 | #define ERROR_HANDLER_ 9 | 10 | #include "lexerDef.h" 11 | 12 | #define ERROR_MAX_LENGTH 100 13 | 14 | 15 | // The struct stores 
the error and its error message in a linked list 16 | typedef struct Error { 17 | int LINE_NUMBER; 18 | int ITERATIVE_LINE_END; // Field to store the line in which the while loop ends 19 | char* ERROR_MESSAGE; 20 | struct Error* next; 21 | } Error; 22 | 23 | typedef struct ErrorList { 24 | Error* head; 25 | Error* tail; 26 | int MESSAGE_MAX_LENGTH; 27 | } ErrorList; 28 | 29 | 30 | #endif 31 | -------------------------------------------------------------------------------- /Compiler/interface.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | // Contains the basic utilities required by all components of the compiler 8 | #ifndef INTF_ 9 | #define INTF_ 10 | 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | 18 | #endif 19 | -------------------------------------------------------------------------------- /Compiler/keyword_table.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "keyword_table.h" 9 | #include 10 | #define NUMBER_KEYWORDS 24 11 | 12 | // Commenting this line as the keyword table is declared as global in the header file 13 | // Uncommenting it due to compile time errors 14 | 15 | int hashFunction(char* str) { 16 | 17 | /*Hash function string sum and mod */ 18 | // int sum = 0; 19 | // for(int i=0; i < strlen(str); i++) { 20 | // sum += str[i]-'0'; 21 | // } 22 | // return (sum%NUMBER_KEYWORDS); 23 | 24 | /* Hash function djb2 and mod */ 25 | unsigned long hash = 5381; 26 | int c; 27 | while ((c = *str++)) 28 | hash = ((hash << 5) + hash) + c; /* hash * 33 + c */ 29 | return
(hash%NUMBER_KEYWORDS); 30 | } 31 | 32 | void addEntry(KeywordTable* kt,TokenName tn, char* lexeme) { 33 | int hash = hashFunction(lexeme); 34 | // printf("Hash is %d for the keyword %s\n" , hash, lexeme); 35 | kt->KEYWORDS[hash].keywords = addToList(kt->KEYWORDS[hash].keywords,tn,lexeme); 36 | } 37 | 38 | Node* lookUp(KeywordTable* kt,char* lexeme) { 39 | int hash = hashFunction(lexeme); 40 | Node* trav = kt->KEYWORDS[hash].keywords; 41 | while(trav != NULL) { 42 | if(strcmp(lexeme,trav->LEXEME) == 0) 43 | return trav; 44 | trav = trav->next; 45 | } 46 | 47 | return NULL; 48 | } 49 | 50 | KeywordTable* initializeTable() { 51 | KeywordTable* kt = (KeywordTable*)malloc(sizeof(KeywordTable)); 52 | kt->KEYWORDS = (KeywordNode*)malloc(NUMBER_KEYWORDS*sizeof(KeywordNode)); 53 | 54 | // Initialize each keyword slot as NULL initially 55 | for(int i=0; i < NUMBER_KEYWORDS; i++) { 56 | kt->KEYWORDS[i].keywords = NULL; 57 | } 58 | 59 | addEntry(kt,TK_WITH,"with"); 60 | addEntry(kt,TK_PARAMETERS,"parameters"); 61 | addEntry(kt,TK_END,"end"); 62 | addEntry(kt,TK_WHILE,"while"); 63 | addEntry(kt,TK_TYPE,"type"); 64 | addEntry(kt,TK_MAIN,"_main"); 65 | addEntry(kt,TK_GLOBAL,"global"); 66 | addEntry(kt,TK_PARAMETER,"parameter"); 67 | addEntry(kt,TK_LIST,"list"); 68 | addEntry(kt,TK_INPUT,"input"); 69 | addEntry(kt,TK_OUTPUT,"output"); 70 | addEntry(kt,TK_INT,"int"); 71 | addEntry(kt,TK_REAL,"real"); 72 | addEntry(kt,TK_ENDWHILE,"endwhile"); 73 | addEntry(kt,TK_IF,"if"); 74 | addEntry(kt,TK_THEN,"then"); 75 | addEntry(kt,TK_ENDIF,"endif"); 76 | addEntry(kt,TK_READ,"read"); 77 | addEntry(kt,TK_WRITE,"write"); 78 | addEntry(kt,TK_RETURN,"return"); 79 | addEntry(kt,TK_CALL,"call"); 80 | addEntry(kt,TK_RECORD,"record"); 81 | addEntry(kt,TK_ENDRECORD,"endrecord"); 82 | addEntry(kt,TK_ELSE,"else"); 83 | 84 | return kt; 85 | 86 | } 87 | 88 | /*** List functions ***/ 89 | 90 | // Returns the head of the modified list 91 | Node* addToList(Node* ls, TokenName tn, char* lexeme) { 92 | 93 | 
// Case when list is empty 94 | if(ls == NULL) { 95 | Node* n = (Node*)malloc(sizeof(Node)); 96 | n->LEXEME = lexeme; 97 | n->TOKEN_NAME = tn; 98 | n->next = NULL; 99 | return n; 100 | } 101 | 102 | // Insert at front O(1) 103 | Node* n = (Node*)malloc(sizeof(Node)); 104 | n->LEXEME = lexeme; 105 | n->TOKEN_NAME = tn; 106 | n->next = ls; 107 | return n; 108 | } 109 | 110 | 111 | // Function to look up a lexeme in a list; returns 1 if found, 0 otherwise 112 | int searchList(Node* ls, char* lexeme) { 113 | Node* trav = ls; 114 | while(trav != NULL) { 115 | if(strcmp(lexeme,trav->LEXEME) == 0) 116 | return 1; 117 | trav = trav->next; 118 | } 119 | 120 | return 0; 121 | } 122 | 123 | 124 | /** Temporary Utility functions **/ 125 | 126 | // Temporary function to print a list; also returns the number of elements in the list 127 | int printList(Node* ls) { 128 | Node* trav = ls; 129 | int len = 0; 130 | if(ls == NULL) { 131 | printf("This slot is not occupied!\n"); 132 | printf("\n"); 133 | return 0; 134 | } 135 | 136 | while(trav != NULL) { 137 | printf("Keyword: %s " ,trav->LEXEME); 138 | trav = trav->next; 139 | len++; 140 | } 141 | printf("\n"); 142 | printf("\n"); 143 | return len; 144 | } 145 | 146 | // Temporary function to print hash table 147 | void printHashTable(KeywordTable* kt) { 148 | int empty = 0; 149 | int collisions = 0; 150 | for(int i=0; i < NUMBER_KEYWORDS; i++) { 151 | int len = printList(kt->KEYWORDS[i].keywords); 152 | if(len == 0) 153 | empty++; 154 | if(len > 1) 155 | collisions += len-1; 156 | } 157 | 158 | // Aim for as low a load factor as possible, Java 10 specs say 0.75 159 | 160 | printf("\n"); 161 | printf("Calculating load-factor\n"); 162 | printf("%f\n" ,((float)empty)/NUMBER_KEYWORDS); 163 | 164 | // Aim for as few collisions as possible 165 | 166 | printf("\n"); 167 | printf("Calculating total collisions\n"); 168 | printf("%d\n" , collisions); 169 | 170 | 171 | } 172 | -------------------------------------------------------------------------------- 
/Compiler/keyword_table.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "keyword_tableDef.h" 8 | 9 | // Functions for keyword hashtable 10 | int hashFunction(char* str); 11 | void addEntry(KeywordTable* kt,TokenName tn, char* lexeme); 12 | Node* lookUp(KeywordTable* kt,char* lexeme); 13 | KeywordTable* initializeTable(); 14 | void printHashTable(KeywordTable* kt); 15 | 16 | // Functions for List 17 | Node* addToList(Node* ls, TokenName tn, char* lexeme); 18 | int searchList(Node* ls, char* lexeme); 19 | int printList(Node* ls); 20 | -------------------------------------------------------------------------------- /Compiler/keyword_tableDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | #ifndef KEYWORD_TABLE_DEF_ 7 | #define KEYWORD_TABLE_DEF_ 8 | 9 | #include "lexerDef.h" 10 | 11 | // Node actually storing the keyword! 12 | typedef struct Node { 13 | TokenName TOKEN_NAME; 14 | char* LEXEME; 15 | struct Node* next; 16 | } Node; 17 | 18 | // Each slot contains a linked list 19 | // The linked list is empty if the slot does not have any key 20 | typedef struct KeywordNode { 21 | Node* keywords; 22 | } KeywordNode; 23 | 24 | // An array of slots where keywords are hashed. 
25 | // In case of a collision the hashed slot adds the entry as a linked list 26 | typedef struct KeywordTable { 27 | KeywordNode* KEYWORDS; 28 | } KeywordTable; 29 | 30 | #endif 31 | -------------------------------------------------------------------------------- /Compiler/lexer.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "lexerDef.h" 8 | 9 | void initializeBuffers(int f); 10 | void initializeLexer(int f); 11 | void removeComments(char* testCaseFile,char* cleanFile); 12 | int getInputStream(); 13 | char nextChar(); 14 | void retract(int amt); 15 | int rangeMatch(char ch,char start, char end); 16 | int singleMatch(char ch, char chToEqual); 17 | Token* populateToken(Token* t,TokenName tokenName,char* lexeme,int lineNumber,int isNumber,Value* value); 18 | Token* getToken(); 19 | void printBuffers(); 20 | int stringToInteger(char* str); 21 | float stringToFloat(char* str); 22 | -------------------------------------------------------------------------------- /Compiler/lexerDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #ifndef LEX_DEF_ 8 | #define LEX_DEF_ 9 | 10 | typedef enum TokenName { 11 | TK_ASSIGNOP, 12 | TK_COMMENT, 13 | TK_ID, 14 | TK_NUM, 15 | TK_RNUM, 16 | TK_FIELDID, 17 | TK_FUNID, 18 | TK_RECORDID, 19 | TK_WITH, 20 | TK_PARAMETERS, 21 | TK_END, 22 | TK_WHILE, 23 | TK_TYPE, 24 | TK_MAIN, 25 | TK_GLOBAL, 26 | TK_PARAMETER, 27 | TK_LIST, 28 | TK_SQL, 29 | TK_SQR, 30 | TK_INPUT, 31 | TK_OUTPUT, 32 | TK_INT, 33 | TK_REAL, 34 | TK_COMMA, 35 | TK_SEM, 36 | TK_COLON, 37 | TK_DOT, 38 | TK_ENDWHILE, 39 | TK_OP, 40 | TK_CL, 41 | TK_IF, 42 | 
TK_THEN, 43 | TK_ENDIF, 44 | TK_READ, 45 | TK_WRITE, 46 | TK_RETURN, 47 | TK_PLUS, 48 | TK_MINUS, 49 | TK_MUL, 50 | TK_DIV, 51 | TK_CALL, 52 | TK_RECORD, 53 | TK_ENDRECORD, 54 | TK_ELSE, 55 | TK_AND, 56 | TK_OR, 57 | TK_NOT, 58 | TK_LT, 59 | TK_LE, 60 | TK_EQ, 61 | TK_GT, 62 | TK_GE, 63 | TK_NE, 64 | TK_EPS, 65 | TK_DOLLAR, 66 | TK_ERR 67 | } TokenName; 68 | 69 | typedef union Value { 70 | int INT_VALUE; 71 | float FLOAT_VALUE; 72 | } Value; 73 | 74 | typedef struct Token { 75 | TokenName TOKEN_NAME; 76 | char* LEXEME; 77 | int LINE_NO; 78 | 79 | // Stores 0 if not a number, 1 if an integer, 2 if a real number 80 | // Also used to segregate lexical errors, 81 | // 3 to denote an error when token identification fails, 82 | // 4 to denote an error if the token is identified but does not respect the constraints 83 | // 5 to denote an error if two identifiers are declared back to back 84 | // 6 to denote an unknown symbol 85 | int IS_NUMBER; 86 | 87 | Value* VALUE; // Stores NULL if the Token is not a number 88 | } Token; 89 | 90 | #endif 91 | -------------------------------------------------------------------------------- /Compiler/makefile: -------------------------------------------------------------------------------- 1 | #Group 27 2 | #Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | #Arnav Sailesh 2016A7PS0054P 4 | #Gunraj Singh 2016A7PS0085P 5 | #Aashish Singh 2016A7PS0683P 6 | 7 | all: 8 | gcc -g -o compiler driver.c lexer.c parser.c keyword_table.c nary_tree.c stack.c ast.c symbol_table.c type_checker.c semantic_analyzer.c printer.c error_handler.c code_gen.c 9 | -------------------------------------------------------------------------------- /Compiler/nary_tree.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "nary_treeDef.h" 
9 | #include "parser.h" 10 | 11 | int parseTreeNodeCount; 12 | int parseTreeMemory; 13 | 14 | NaryTreeNode* createLeafNode(int enumId) { 15 | NaryTreeNode* ntn = (NaryTreeNode*)malloc(sizeof(NaryTreeNode)); 16 | ntn->IS_LEAF_NODE = 1; 17 | ntn->NODE_TYPE.L.ENUM_ID = enumId; 18 | ntn->next = NULL; 19 | parseTreeMemory += sizeof(NaryTreeNode); 20 | return ntn; 21 | } 22 | 23 | NaryTreeNode* createNonLeafNode(int enumId) { 24 | NaryTreeNode* ntn = (NaryTreeNode*)malloc(sizeof(NaryTreeNode)); 25 | ntn->IS_LEAF_NODE = 0; 26 | ntn->NODE_TYPE.NL.ENUM_ID = enumId; 27 | ntn->NODE_TYPE.NL.NUMBER_CHILDREN = 0; 28 | ntn->next = NULL; 29 | ntn->NODE_TYPE.NL.child = NULL; 30 | parseTreeMemory += sizeof(NaryTreeNode); 31 | return ntn; 32 | } 33 | 34 | NaryTreeNode* createNode(int isTerminal, SymbolType type,NaryTreeNode* parent) { 35 | 36 | NaryTreeNode* ntn; 37 | if(isTerminal == 1) { 38 | ntn = createLeafNode(type.TERMINAL); 39 | ntn->parent = parent; 40 | } 41 | else { 42 | ntn = createNonLeafNode(type.NON_TERMINAL); 43 | ntn->parent = parent; 44 | } 45 | 46 | parseTreeNodeCount += 1; 47 | return ntn; 48 | } 49 | 50 | ParseTree* initialiseParseTree() { 51 | parseTreeNodeCount = 0; // Reset the counters; the previous code read uninitialised heap memory 52 | parseTreeMemory = 0; 53 | ParseTree* pt = (ParseTree*)malloc(sizeof(ParseTree)); 54 | pt->root = createNonLeafNode(program); // Initialising the tree with the root node 55 | pt->root->parent = NULL; 56 | return pt; 57 | } 58 | 59 | void addRuleToParseTree(NaryTreeNode* ntn, Rule* r) { 60 | 61 | // For debugging if such a situation happens 62 | if(ntn->IS_LEAF_NODE == 1) { 63 | printf("TERMINALS CANNOT HAVE CHILDREN! 
\n"); 64 | return; 65 | } 66 | 67 | // Start from RHS of the RULE 68 | int numberChild = 0; 69 | Symbol* trav = r->SYMBOLS->HEAD_SYMBOL->next; 70 | NaryTreeNode* childHead = NULL; 71 | NaryTreeNode* childTrav = NULL; 72 | while(trav != NULL) { 73 | if(childHead == NULL) { 74 | childHead = createNode(trav->IS_TERMINAL,trav->TYPE,ntn); 75 | childTrav = childHead; 76 | } 77 | else { 78 | childTrav->next = createNode(trav->IS_TERMINAL,trav->TYPE,ntn); 79 | childTrav = childTrav->next; 80 | } 81 | numberChild++; 82 | trav = trav->next; 83 | } 84 | 85 | ntn->NODE_TYPE.NL.RULE_NO = r->RULE_NO; 86 | ntn->NODE_TYPE.NL.child = childHead; 87 | ntn->NODE_TYPE.NL.NUMBER_CHILDREN = numberChild; 88 | } 89 | 90 | void printNaryTree(NaryTreeNode* nt) { 91 | if(nt->IS_LEAF_NODE == 1) { 92 | printf("%s " ,getTerminal(nt->NODE_TYPE.L.ENUM_ID)); 93 | return; 94 | } 95 | 96 | printf("%s\n" , getNonTerminal(nt->NODE_TYPE.NL.ENUM_ID)); 97 | 98 | NaryTreeNode* childTrav = nt->NODE_TYPE.NL.child; 99 | while(childTrav != NULL) { 100 | 101 | if(childTrav->IS_LEAF_NODE == 1) 102 | printf("%s " ,getTerminal(childTrav->NODE_TYPE.L.ENUM_ID)); 103 | else 104 | printf("%s " ,getNonTerminal(childTrav->NODE_TYPE.NL.ENUM_ID)); 105 | 106 | childTrav = childTrav->next; 107 | } 108 | 109 | printf("\n"); 110 | 111 | childTrav = nt->NODE_TYPE.NL.child; 112 | while(childTrav != NULL) { 113 | if(childTrav->IS_LEAF_NODE == 0) 114 | printNaryTree(childTrav); 115 | childTrav = childTrav->next; 116 | } 117 | } 118 | 119 | void printTree(ParseTree* pt) { 120 | NaryTreeNode* nt = pt->root; 121 | printNaryTree(nt); 122 | } 123 | 124 | int getParseTreeNodeCount() { 125 | return parseTreeNodeCount; 126 | } 127 | 128 | int getParseTreeMemory() { 129 | return parseTreeMemory; 130 | } 131 | -------------------------------------------------------------------------------- /Compiler/nary_tree.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 
2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "nary_treeDef.h" 8 | 9 | ParseTree* initialiseParseTree(); 10 | NaryTreeNode* createLeafNode(int enumId); 11 | NaryTreeNode* createNonLeafNode(int enumId); 12 | NaryTreeNode* createNode(int isTerminal, SymbolType type,NaryTreeNode* parent); 13 | void addRuleToParseTree(NaryTreeNode* ntn, Rule* r); 14 | void printTree(ParseTree* pt); 15 | void printNaryTree(NaryTreeNode* nt); 16 | 17 | int getParseTreeNodeCount(); 18 | int getParseTreeMemory(); 19 | -------------------------------------------------------------------------------- /Compiler/nary_treeDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #ifndef NARY_ 8 | #define NARY_ 9 | 10 | #include "lexerDef.h" 11 | #include "parserDef.h" 12 | #include "astDef.h" 13 | 14 | typedef struct NaryTreeNode NaryTreeNode; 15 | 16 | typedef struct NonLeafNode { 17 | int ENUM_ID; // The enum identifier of the symbol 18 | int NUMBER_CHILDREN; // Number of children for this non terminal 19 | int RULE_NO; // Rule number used to generate children 20 | NaryTreeNode* child; // Points to the starting child of this node 21 | } NonLeafNode; 22 | 23 | typedef struct LeafNode { 24 | int ENUM_ID; 25 | Token* TK; // This field will be populated when the input is being parsed 26 | } LeafNode; 27 | 28 | typedef union NodeType { 29 | NonLeafNode NL; 30 | LeafNode L; 31 | } NodeType; 32 | 33 | typedef struct NaryTreeNode { 34 | NodeType NODE_TYPE; 35 | int IS_LEAF_NODE; 36 | struct NaryTreeNode* parent; 37 | struct NaryTreeNode* next; // Points to the next node of the list which this node is a part of 38 | 39 | } NaryTreeNode; 40 | 41 | typedef struct ParseTree { 42 | NaryTreeNode* root; 43 | } ParseTree; 44 | 45 | 
#endif 46 | -------------------------------------------------------------------------------- /Compiler/parser.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "parserDef.h" 9 | #include "lexer.h" 10 | #include "nary_tree.h" 11 | #include "stack.h" 12 | #include 13 | #define GRAMMAR_FILE "GRAMMAR.txt" 14 | #define TOTAL_GRAMMAR_NONTERMINALS 49 // TODO get the actual number of nonterminals 15 | #define TOTAL_GRAMMAR_TERMINALS 56 16 | #define TOTAL_GRAMMAR_RULES 87 //TODO actual number of rules 17 | 18 | Grammar* g; // g is a record that keeps track of the Grammar 19 | 20 | NonTerminalRuleRecords** ntrr; // Maintains records of the starting rule number and ending rule number of the non terminal indicated by its array index 21 | int checkIfDone[TOTAL_GRAMMAR_NONTERMINALS] = {0}; // A global structure to check if the First has been calculated for the corresponding non terminal 22 | int vectorSize = TOTAL_GRAMMAR_TERMINALS+1; // Calculate the size of vectors for first and follow 23 | 24 | int syntaxErrorFlag; 25 | int lexicalErrorFlag; 26 | 27 | // An array of strings which stores the terminals in the same order as the enum, this is used to get the enum identifier of the TerminalID 28 | char* TerminalID[] = { 29 | "TK_ASSIGNOP", 30 | "TK_COMMENT", 31 | "TK_ID", 32 | "TK_NUM", 33 | "TK_RNUM", 34 | "TK_FIELDID", 35 | "TK_FUNID", 36 | "TK_RECORDID", 37 | "TK_WITH", 38 | "TK_PARAMETERS", 39 | "TK_END", 40 | "TK_WHILE", 41 | "TK_TYPE", 42 | "TK_MAIN", 43 | "TK_GLOBAL", 44 | "TK_PARAMETER", 45 | "TK_LIST", 46 | "TK_SQL", 47 | "TK_SQR", 48 | "TK_INPUT", 49 | "TK_OUTPUT", 50 | "TK_INT", 51 | "TK_REAL", 52 | "TK_COMMA", 53 | "TK_SEM", 54 | "TK_COLON", 55 | "TK_DOT", 56 | "TK_ENDWHILE", 57 | "TK_OP", 58 | "TK_CL", 59 | "TK_IF", 60 | "TK_THEN", 
61 | "TK_ENDIF", 62 | "TK_READ", 63 | "TK_WRITE", 64 | "TK_RETURN", 65 | "TK_PLUS", 66 | "TK_MINUS", 67 | "TK_MUL", 68 | "TK_DIV", 69 | "TK_CALL", 70 | "TK_RECORD", 71 | "TK_ENDRECORD", 72 | "TK_ELSE", 73 | "TK_AND", 74 | "TK_OR", 75 | "TK_NOT", 76 | "TK_LT", 77 | "TK_LE", 78 | "TK_EQ", 79 | "TK_GT", 80 | "TK_GE", 81 | "TK_NE", 82 | "TK_EPS", 83 | "TK_DOLLAR", 84 | "TK_ERR" 85 | }; 86 | 87 | // An array of strings which stores the non terminals in the same order as the enum, this is used to get the enum identifier of the NonTerminalID 88 | char* NonTerminalID[] = { 89 | "program", 90 | "mainFunction", 91 | "otherFunctions", 92 | "function", 93 | "input_par", 94 | "output_par", 95 | "parameter_list", 96 | "dataType", 97 | "primitiveDatatype", 98 | "constructedDatatype", 99 | "remaining_list", 100 | "stmts", 101 | "typeDefinitions", 102 | "typeDefinition", 103 | "fieldDefinitions", 104 | "fieldDefinition", 105 | "moreFields", 106 | "declarations", 107 | "declaration", 108 | "global_or_not", 109 | "otherStmts", 110 | "stmt", 111 | "assignmentStmt", 112 | "singleOrRecId", 113 | "new_24", 114 | "funCallStmt", 115 | "outputParameters", 116 | "inputParameters", 117 | "iterativeStmt", 118 | "conditionalStmt", 119 | "elsePart", 120 | "ioStmt", 121 | "arithmeticExpression", 122 | "expPrime", 123 | "term", 124 | "termPrime", 125 | "factor", 126 | "highPrecedenceOperators", 127 | "lowPrecedenceOperators", 128 | "all", 129 | "temp", 130 | "booleanExpression", 131 | "var", 132 | "logicalOp", 133 | "relationalOp", 134 | "returnStmt", 135 | "optionalReturn", 136 | "idList", 137 | "more_ids" 138 | }; 139 | 140 | //Utility function to copy a lexeme 141 | char* copyLexeme(char* str) { 142 | int len = strlen(str); 143 | char* lex = (char*)malloc(sizeof(char)*(len+1)); 144 | for(int i=0; i < len; i++) 145 | lex[i] = str[i]; 146 | 147 | lex[len] = '\0'; 148 | return lex; 149 | } 150 | 151 | // Utility function to append a character to symbol string 152 | char* appendToSymbol(char* str, 
char c) { 153 | int len = strlen(str); 154 | char* strConcat = (char*)malloc(sizeof(char)*(len+2)); 155 | for(int i=0; i < len; i++) 156 | strConcat[i] = str[i]; 157 | 158 | strConcat[len] = c; 159 | strConcat[len+1] = '\0'; 160 | return strConcat; 161 | } 162 | 163 | // Returns the Enum ID of the string in the TerminalID map if found, otherwise returns -1 164 | int findInTerminalMap(char* str) { 165 | for(int i=0; i < TOTAL_GRAMMAR_TERMINALS; i++) { 166 | if(strcmp(str,TerminalID[i]) == 0) 167 | return i; 168 | } 169 | 170 | return -1; 171 | } 172 | 173 | // Returns the Enum ID of the string in the NonTerminalID map if found, otherwise returns -1 174 | int findInNonTerminalMap(char* str) { 175 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 176 | if(strcmp(str,NonTerminalID[i]) == 0) 177 | return i; 178 | } 179 | 180 | return -1; 181 | } 182 | 183 | 184 | // Returns the string corresponding to the enumId (Required when printing is to be done outside parser.c) 185 | 186 | char* getTerminal(int enumId) { 187 | return TerminalID[enumId]; 188 | } 189 | 190 | char* getNonTerminal(int enumId) { 191 | return NonTerminalID[enumId]; 192 | } 193 | 194 | ParsingTable* initialiseParsingTable() { 195 | ParsingTable* pt = (ParsingTable*)malloc(sizeof(ParsingTable)); 196 | pt->entries = (int**)malloc(TOTAL_GRAMMAR_NONTERMINALS*sizeof(int*)); 197 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 198 | // Calloc used to initialise with 0 by default, if left empty => error state 199 | pt->entries[i] = (int*)calloc(TOTAL_GRAMMAR_TERMINALS,sizeof(int)); 200 | } 201 | return pt; 202 | } 203 | 204 | // Initialise the Grammar according to the number of non terminals and total rules 205 | int initialiseGrammar() { 206 | 207 | g = (Grammar*)malloc(sizeof(Grammar)); 208 | g->GRAMMAR_RULES_SIZE = TOTAL_GRAMMAR_RULES+1; // 1 added as 0 index is left as NULL to provide direct mapping by rule number to the rule 209 | g->GRAMMAR_RULES = 
(Rule**)malloc(sizeof(Rule*)*g->GRAMMAR_RULES_SIZE); 210 | g->GRAMMAR_RULES[0] = NULL; 211 | } 212 | 213 | // Initialises a symbol structure based on the symbol string extracted from the grammar file 214 | Symbol* intialiseSymbol(char* symbol) { 215 | 216 | Symbol* s = (Symbol*)malloc(sizeof(Symbol)); 217 | // Search for enum IDs in both maps 218 | int idNonTerminal, idTerminal; 219 | idNonTerminal = findInNonTerminalMap(symbol); 220 | // If idNonTerminal is found, assign it as the symbol type 221 | if(idNonTerminal != -1) { 222 | s->TYPE.NON_TERMINAL = idNonTerminal; 223 | s->IS_TERMINAL = 0; 224 | } 225 | else { 226 | idTerminal = findInTerminalMap(symbol); 227 | if(idTerminal != -1) { 228 | s->TYPE.TERMINAL = idTerminal; 229 | s->IS_TERMINAL = 1; 230 | } 231 | } 232 | 233 | s->next = NULL; 234 | 235 | return s; 236 | } 237 | 238 | SymbolList* initialiseSymbolList() { 239 | SymbolList* sl = (SymbolList*)malloc(sizeof(SymbolList)); 240 | sl->HEAD_SYMBOL = NULL; 241 | sl->TAIL_SYMBOL = NULL; 242 | sl->RULE_LENGTH = 0; 243 | return sl; 244 | } 245 | 246 | Rule* initialiseRule(SymbolList* sl, int ruleCount) { 247 | Rule* r = (Rule*)malloc(sizeof(Rule)); 248 | r->SYMBOLS = sl; 249 | r->RULE_NO = ruleCount; 250 | return r; 251 | } 252 | 253 | NonTerminalRuleRecords** intialiseNonTerminalRecords() { 254 | NonTerminalRuleRecords** ntrr = (NonTerminalRuleRecords**)malloc(sizeof(NonTerminalRuleRecords*)*TOTAL_GRAMMAR_NONTERMINALS); 255 | return ntrr; 256 | } 257 | 258 | void initialiseCheckIfDone() { 259 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) 260 | checkIfDone[i] = 0; 261 | } 262 | 263 | FirstAndFollow* initialiseFirstAndFollow() { 264 | FirstAndFollow* fafl = (FirstAndFollow*)malloc(sizeof(FirstAndFollow)); 265 | 266 | // Initialize the array of vectors to be equal to the total number of Non terminals 267 | fafl->FIRST = (int**)malloc(sizeof(int*)*TOTAL_GRAMMAR_NONTERMINALS); 268 | fafl->FOLLOW = (int**)malloc(sizeof(int*)*TOTAL_GRAMMAR_NONTERMINALS); 269 | 
270 | 271 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 272 | // Calloc used to initialize the vectors to 0 273 | fafl->FIRST[i] = (int*)calloc(vectorSize,sizeof(int)); 274 | fafl->FOLLOW[i] = (int*)calloc(vectorSize,sizeof(int)); 275 | } 276 | 277 | return fafl; 278 | 279 | } 280 | 281 | // Calculates the First set of the non terminal identified by enumId and populates its bit vector 282 | void calculateFirst(int** firstVector, int enumId) { 283 | 284 | // printf("Stack overflow being caused by %s\n" , NonTerminalID[enumId]); 285 | int start = ntrr[enumId]->start; 286 | int end = ntrr[enumId]->end; 287 | int producesNull = 0; // Flag which tracks whether the non terminal produces NULL 288 | 289 | for(int i=start; i <= end; i++) { 290 | Rule* r = g->GRAMMAR_RULES[i]; 291 | Symbol* s = r->SYMBOLS->HEAD_SYMBOL; 292 | Symbol* trav = s; 293 | Symbol* nextSymbol = trav->next; 294 | int ruleYieldsEpsilon = 1; 295 | while(nextSymbol != NULL) { 296 | 297 | // Case when a terminal is encountered in the RHS 298 | if(nextSymbol->IS_TERMINAL == 1) { 299 | if(nextSymbol->TYPE.TERMINAL != TK_EPS) { 300 | ruleYieldsEpsilon = 0; 301 | firstVector[enumId][nextSymbol->TYPE.TERMINAL] = 1; 302 | } 303 | break; 304 | } 305 | 306 | // Case when it is a Non-terminal 307 | 308 | // Check if its First has been calculated already, if not calculate it 309 | // In case of stack overflow (left recursion), for debugging add a condition that nextSymbol should not have the same ID as the enumId 310 | if(checkIfDone[nextSymbol->TYPE.NON_TERMINAL] == 0) { 311 | calculateFirst(firstVector,nextSymbol->TYPE.NON_TERMINAL); 312 | } 313 | 314 | for(int j=0; j < vectorSize; j++) { 315 | if(firstVector[nextSymbol->TYPE.NON_TERMINAL][j] == 1) 316 | firstVector[s->TYPE.NON_TERMINAL][j] = 1; 317 | } 318 | 319 | if(firstVector[nextSymbol->TYPE.NON_TERMINAL][TK_EPS] == 0) { 320 | ruleYieldsEpsilon = 0; 321 | break; 322 | } 323 | 324 | nextSymbol = nextSymbol->next; 325 | } 326 | 327 | if(ruleYieldsEpsilon) 
328 | producesNull = 1; 329 | } 330 | 331 | if(producesNull) 332 | firstVector[enumId][TK_EPS] = 1; 333 | else 334 | firstVector[enumId][TK_EPS] = 0; 335 | 336 | checkIfDone[enumId] = 1; 337 | 338 | } 339 | 340 | void populateFirst(int** firstVector, Grammar* g) { 341 | 342 | // Traversal is done by enum_id (which is iterator i in this case) 343 | // Grammar Rules are written in GRAMMAR_FILE in the same order as enum name as per convention 344 | 345 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 346 | if(checkIfDone[i] == 0) 347 | calculateFirst(firstVector,i); 348 | } 349 | } 350 | 351 | void populateFollow(int** followVector, int** firstVector, Grammar* g) { 352 | 353 | 354 | for(int i=1; i <= TOTAL_GRAMMAR_RULES; i++) { 355 | Rule* r = g->GRAMMAR_RULES[i]; 356 | Symbol* head = r->SYMBOLS->HEAD_SYMBOL; 357 | Symbol* trav = head->next; 358 | int epsilonIdentifier = 0; 359 | while(trav != NULL) { 360 | 361 | if(trav->IS_TERMINAL == 0) { 362 | Symbol* followTrav = trav->next; 363 | while(followTrav != NULL) { 364 | if(followTrav->IS_TERMINAL == 1 && followTrav->TYPE.TERMINAL != TK_EPS) { 365 | followVector[trav->TYPE.NON_TERMINAL][followTrav->TYPE.TERMINAL] = 1; 366 | break; 367 | } 368 | else { 369 | 370 | for(int j=0; j < vectorSize; j++) 371 | if(firstVector[followTrav->TYPE.NON_TERMINAL][j] == 1 && j != TK_EPS) 372 | followVector[trav->TYPE.NON_TERMINAL][j] = 1; 373 | 374 | if(firstVector[followTrav->TYPE.NON_TERMINAL][TK_EPS] == 0) 375 | break; 376 | 377 | } 378 | followTrav = followTrav->next; 379 | } 380 | 381 | // Case when we need to take LHS Non terminal 382 | // Venkat => followTrav != NULL && followTrav->next == NULL 383 | if(trav->next == NULL || (followTrav == NULL)) { 384 | for(int j=0; j < vectorSize; j++) 385 | if(followVector[head->TYPE.NON_TERMINAL][j] == 1 && j != TK_EPS) 386 | followVector[trav->TYPE.NON_TERMINAL][j] = 1; 387 | } 388 | 389 | } 390 | 391 | 392 | trav = trav->next; 393 | } 394 | } 395 | } 396 | 397 | // Function to keep 
populating the followVector until it stabilises 398 | void populateFollowTillStable(int** followVector, int** firstVector, Grammar* g) { 399 | int** prevFollowVector = (int**)malloc(TOTAL_GRAMMAR_NONTERMINALS*sizeof(int*)); 400 | 401 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 402 | prevFollowVector[i] = (int*)calloc(vectorSize,sizeof(int)); 403 | } 404 | 405 | followVector[program][TK_DOLLAR] = 1; 406 | prevFollowVector[program][TK_DOLLAR] = 1; 407 | 408 | while(1) { 409 | 410 | populateFollow(followVector,firstVector,g); 411 | int stable = 1; 412 | 413 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 414 | for(int j=0; j < vectorSize; j++) { 415 | if(prevFollowVector[i][j] != followVector[i][j]) 416 | stable = 0; 417 | } 418 | } 419 | 420 | if(stable) 421 | break; 422 | 423 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 424 | for(int j=0; j < vectorSize; j++) 425 | prevFollowVector[i][j] = followVector[i][j]; 426 | } 427 | } 428 | } 429 | 430 | 431 | FirstAndFollow* computeFirstAndFollowSets(Grammar* g) { 432 | FirstAndFollow* fafl = initialiseFirstAndFollow(); 433 | populateFirst(fafl->FIRST,g); 434 | populateFollowTillStable(fafl->FOLLOW,fafl->FIRST,g); 435 | return fafl; 436 | } 437 | 438 | void createParseTable(FirstAndFollow* fafl, ParsingTable* pt) { 439 | 440 | for(int i=1; i <= TOTAL_GRAMMAR_RULES; i++) { 441 | Rule* r = g->GRAMMAR_RULES[i]; 442 | int lhsNonTerminal = r->SYMBOLS->HEAD_SYMBOL->TYPE.NON_TERMINAL; 443 | 444 | // THIS IS INCORRECT AS THE TERMINALS BEING IN FIRST FROM OTHER RULES ALSO GET DIRECTED TO THIS RULE! 
445 | // for(int j=0; j < TOTAL_GRAMMAR_TERMINALS; j++) { 446 | // if(fafl->FIRST[lhsNonTerminal],[j] == 1) { 447 | // // Since it is LL(1), no other non-epsilon producing rule will direct to an entry already filled 448 | // // Ask Venkat to verify 449 | // // Attempt to correct error, epsilon producing rule directs it back to this entry 450 | // if(pt->entries[lhsNonTerminal][j] == 0) 451 | // pt->entries[lhsNonTerminal][j] = r->RULE_NO; 452 | // } 453 | // } 454 | 455 | Symbol* rhsHead = r->SYMBOLS->HEAD_SYMBOL->next; 456 | Symbol* trav = rhsHead; 457 | int epsilonGenerated = 1; 458 | 459 | while(trav != NULL) { 460 | // Terminal encountered in RHS => It cannot generate epsilon, break! 461 | if(trav->IS_TERMINAL == 1 && trav->TYPE.TERMINAL != TK_EPS) { 462 | epsilonGenerated = 0; 463 | pt->entries[lhsNonTerminal][trav->TYPE.TERMINAL] = r->RULE_NO; 464 | break; 465 | } 466 | else if(trav->IS_TERMINAL == 1 && trav->TYPE.TERMINAL == TK_EPS) { 467 | // No action 468 | epsilonGenerated = 1; 469 | break; 470 | } 471 | else { 472 | 473 | // For all the terminals in the first of this Non terminal set the ParsingTable entry 474 | // Note, no special treatment for epsilon as it will not be recieved from the input source code 475 | for(int j=0; j < TOTAL_GRAMMAR_TERMINALS; j++) { 476 | if(fafl->FIRST[trav->TYPE.NON_TERMINAL][j] == 1) 477 | pt->entries[lhsNonTerminal][j] = r->RULE_NO; 478 | } 479 | 480 | // Check if epsilon is generated by the first of this Non terminal, if not break, else continue 481 | if(fafl->FIRST[trav->TYPE.NON_TERMINAL][TK_EPS] == 0) { 482 | epsilonGenerated = 0; 483 | break; 484 | } 485 | } 486 | 487 | trav = trav->next; 488 | } 489 | 490 | // If epsilon is generated by the RHS string then we need to consider follow set of the LHS Non terminal 491 | if(epsilonGenerated) { 492 | for(int j=0; j < TOTAL_GRAMMAR_TERMINALS; j++) { 493 | if(fafl->FOLLOW[lhsNonTerminal][j] == 1) 494 | pt->entries[lhsNonTerminal][j] = r->RULE_NO; 495 | } 496 | } 497 | } 498 
| 499 | } 500 | 501 | 502 | 503 | // Appending at the tail of the list in O(1) using the tail pointer 504 | void addToSymbolList(SymbolList* ls, Symbol* s) { 505 | Symbol* h = ls->HEAD_SYMBOL; 506 | // Case when the List is empty 507 | if(h == NULL) { 508 | ls->HEAD_SYMBOL = s; 509 | ls->TAIL_SYMBOL = s; 510 | ls->RULE_LENGTH = 1; 511 | return; 512 | } 513 | 514 | ls->TAIL_SYMBOL->next = s; 515 | ls->TAIL_SYMBOL = s; 516 | ls->RULE_LENGTH += 1; 517 | } 518 | 519 | 520 | 521 | 522 | 523 | 524 | 525 | 526 | 527 | 528 | // Extracts the grammar from GRAMMAR_FILE and returns the populated Grammar 529 | // Working rationale of the function 530 | // => Identify the LHS Non_terminal 531 | // => Keep making the Symbol List 532 | // => Extract the enum number of the LHS Non terminal. 533 | Grammar* extractGrammar() { 534 | 535 | int ruleCount = 1; // Variable which will be used in assigning the rule numbers to the extracted rules 536 | int fd = open(GRAMMAR_FILE,O_RDONLY); 537 | char c; // Variable to store the character being read 538 | int actualRead; // Variable to store the number of bytes being read in the system call 539 | char* symbol = ""; // This will keep track of the current symbol by appending the read in character 540 | 541 | int symbolsRead = 0; // to keep track of the number of symbols read in a particular line, i.e. a rule 542 | int noOfLinesofNonTerminal = 0; // to keep track of the number of rules read for a particular non terminal 543 | Symbol* currentNonTerminal = NULL; // to keep track of the current Non Terminal 544 | SymbolList* sl = NULL; // Keeps track of the symbol list so far 545 | 546 | initialiseGrammar(); 547 | ntrr = intialiseNonTerminalRecords(); 548 | initialiseCheckIfDone(); 549 | 550 | //create starting symbol list 551 | 552 | while((actualRead = read(fd,&c,sizeof(char))) > 0) { 553 | 554 | // End of file is signalled by read() returning 0 and a failure by -1; the loop condition above handles both 555 | 556 | 557 | 558 | 559 | // If a space is reached, it means a symbol has 
terminated and hence must be extracted 560 | if(c == ' ') { 561 | symbolsRead++; 562 | Symbol* s = intialiseSymbol(symbol); 563 | addToSymbolList(sl,s); 564 | 565 | if(symbolsRead == 1 ) { 566 | 567 | // Case when this is the very first rule of the file, so no non terminal has been seen yet 568 | if(currentNonTerminal == NULL) { 569 | ntrr[s->TYPE.NON_TERMINAL] = (NonTerminalRuleRecords*)malloc(sizeof(NonTerminalRuleRecords)); 570 | ntrr[s->TYPE.NON_TERMINAL]->start = 1; 571 | } 572 | // Case when a new LHS Non terminal arrives, i.e. the rules of the previous non terminal are over 573 | else if(currentNonTerminal != NULL && currentNonTerminal->TYPE.NON_TERMINAL != s->TYPE.NON_TERMINAL) { 574 | ntrr[currentNonTerminal->TYPE.NON_TERMINAL]->end = ruleCount-1; 575 | ntrr[s->TYPE.NON_TERMINAL] = (NonTerminalRuleRecords*)malloc(sizeof(NonTerminalRuleRecords)); 576 | ntrr[s->TYPE.NON_TERMINAL]->start = ruleCount; 577 | } 578 | currentNonTerminal = s; 579 | } 580 | 581 | symbol = ""; // to get this variable ready to accept the next symbol 582 | } 583 | 584 | // A newline indicates the current rule has ended and the next iteration will process a new rule, hence increment rule count 585 | // Also store the symbol extracted till now 586 | else if(c == '\n') { 587 | Symbol* s = intialiseSymbol(symbol); 588 | addToSymbolList(sl,s); 589 | Rule* r = initialiseRule(sl,ruleCount); 590 | g->GRAMMAR_RULES[ruleCount] = r; 591 | ruleCount++; 592 | symbolsRead=0; 593 | symbol = ""; 594 | } 595 | else { 596 | if(symbolsRead == 0){ 597 | // Create a fresh Symbol List. 
598 | sl = initialiseSymbolList(); 599 | } 600 | 601 | // Append character to the symbol string 602 | symbol = appendToSymbol(symbol,c); 603 | } 604 | 605 | 606 | } 607 | 608 | // Capturing the corner case of the last Non terminal record => Note this requires the GRAMMAR_FILE to terminate with a '\n' 609 | ntrr[currentNonTerminal->TYPE.NON_TERMINAL]->end = ruleCount-1; 610 | close(fd); // Release the grammar file descriptor 611 | return g; 612 | } 613 | 614 | 615 | 616 | 617 | 618 | // Function which parses the input from the testCaseFile 619 | ParseTree* parseInputSourceCode(char *testcaseFile, ParsingTable* pTable, FirstAndFollow* fafl) { 620 | 621 | int f = open(testcaseFile,O_RDONLY); 622 | 623 | initializeLexer(f); 624 | ParseTree* pt = initialiseParseTree(); 625 | Stack* st = initialiseStack(pt); 626 | 627 | // The error flags are plain ints, so resetting them needs no allocation 628 | 629 | 630 | syntaxErrorFlag = 0; 631 | lexicalErrorFlag = 0; 632 | Token* missedToken = NULL; 633 | Token* inputToken = getToken(); 634 | // Keep continuing till the lexer returns NULL, which means that the input is exhausted 635 | while(1) { 636 | 637 | // Break if the input has exhausted 638 | if(inputToken == NULL) 639 | break; 640 | 641 | // If the token is a comment, skip it and continue with the next token 642 | if(inputToken->TOKEN_NAME == TK_COMMENT) { 643 | inputToken = getToken(); 644 | continue; 645 | } 646 | 647 | if(inputToken->TOKEN_NAME == TK_ERR) { 648 | lexicalErrorFlag = 1; 649 | } 650 | 651 | NaryTreeNode* stackTop = top(st); 652 | 653 | // printf(" Input token is %s and lexeme is %s \n" , getTerminal(inputToken->TOKEN_NAME), inputToken->LEXEME); 654 | // if(stackTop->IS_LEAF_NODE == 1) 655 | // printf("Stack top is %s \n" ,getTerminal(stackTop->NODE_TYPE.L.ENUM_ID)); 656 | // else 657 | // printf("Stack top is %s \n" , getNonTerminal(stackTop->NODE_TYPE.NL.ENUM_ID)); 658 | 659 | // Case when the top of the stack has a terminal 660 | if(stackTop->IS_LEAF_NODE == 1) { 661 | 662 | // If the token ID of the input 
and the stack top match 663 | if(inputToken->TOKEN_NAME == stackTop->NODE_TYPE.L.ENUM_ID) { 664 | 665 | // Populate the parse tree field 666 | stackTop->NODE_TYPE.L.TK = (Token*)malloc(sizeof(Token)); 667 | stackTop->NODE_TYPE.L.TK->LEXEME = copyLexeme(inputToken->LEXEME); 668 | stackTop->NODE_TYPE.L.TK->LINE_NO = inputToken->LINE_NO; 669 | stackTop->NODE_TYPE.L.TK->TOKEN_NAME = inputToken->TOKEN_NAME; 670 | stackTop->NODE_TYPE.L.TK->IS_NUMBER = inputToken->IS_NUMBER; 671 | stackTop->NODE_TYPE.L.TK->VALUE = inputToken->VALUE; 672 | 673 | pop(st); 674 | inputToken = getToken(); 675 | continue; 676 | } 677 | else { 678 | 679 | // Throw Error 680 | syntaxErrorFlag = 1; 681 | // Terminal-Terminal clash => Just assume that the terminal is the one you wanted 682 | 683 | // No need to print if the token is TK_ERR, lexer handles the printing in this case 684 | if(inputToken->TOKEN_NAME != TK_ERR) 685 | printf("Line %d : The token %s for the lexeme %s does not match with the expected token %s\n" ,inputToken->LINE_NO,getTerminal(inputToken->TOKEN_NAME),inputToken->LEXEME,getTerminal(stackTop->NODE_TYPE.L.ENUM_ID)); 686 | 687 | 688 | 689 | 690 | // If the input token is a token error in which the token was identified but did not respect the constraints, 691 | // then consider that the error token was the one which was expected by the stack top 692 | if(inputToken->TOKEN_NAME == TK_ERR) { 693 | 694 | stackTop->NODE_TYPE.L.TK = (Token*)malloc(sizeof(Token)); 695 | stackTop->NODE_TYPE.L.TK->LEXEME = inputToken->LEXEME; 696 | stackTop->NODE_TYPE.L.TK->LINE_NO = inputToken->LINE_NO; 697 | stackTop->NODE_TYPE.L.TK->TOKEN_NAME = stackTop->NODE_TYPE.L.ENUM_ID; 698 | stackTop->NODE_TYPE.L.TK->IS_NUMBER = 0; 699 | stackTop->NODE_TYPE.L.TK->VALUE = NULL; 700 | inputToken = getToken(); 701 | pop(st); 702 | } 703 | // Othwerwise assume that the token was missed 704 | else { 705 | stackTop->NODE_TYPE.L.TK = (Token*)malloc(sizeof(Token)); 706 | stackTop->NODE_TYPE.L.TK->LEXEME = 
"ERROR_MISSED_LEXEME"; 707 | stackTop->NODE_TYPE.L.TK->LINE_NO = inputToken->LINE_NO; 708 | stackTop->NODE_TYPE.L.TK->TOKEN_NAME = stackTop->NODE_TYPE.L.ENUM_ID; 709 | stackTop->NODE_TYPE.L.TK->IS_NUMBER = 0; 710 | stackTop->NODE_TYPE.L.TK->VALUE = NULL; 711 | missedToken = inputToken; 712 | pop(st); 713 | } 714 | 715 | 716 | continue; 717 | 718 | } 719 | } 720 | else { 721 | 722 | int ruleNumber = pTable->entries[stackTop->NODE_TYPE.NL.ENUM_ID][inputToken->TOKEN_NAME]; 723 | 724 | if(ruleNumber != 0) { 725 | 726 | // printf("Parse table says consult Rule %d \n" ,ruleNumber); 727 | Rule* r = g->GRAMMAR_RULES[ruleNumber]; 728 | addRuleToParseTree(stackTop,r); 729 | 730 | // Pop the stackTop5 731 | pop(st); 732 | 733 | // Push children of the rules on the stack 734 | NaryTreeNode* childNode = stackTop->NODE_TYPE.NL.child; 735 | 736 | // IMPORTANT => DO NOT PUSH EPS ON STACK 737 | if(childNode->IS_LEAF_NODE == 1 && childNode->NODE_TYPE.L.ENUM_ID == TK_EPS); 738 | else 739 | pushTreeChildren(st,childNode); 740 | 741 | } 742 | else { 743 | // Throw error 744 | syntaxErrorFlag = 1; 745 | // Keep iterating the input until the input symbol is in the follow set of the non terminal on the top of the stack 746 | 747 | // If input token is TK_ERR, skip and get the next token 748 | if(inputToken->TOKEN_NAME == TK_ERR) { 749 | // printf("Token causing parsing issue is a lexical error , move input ahead! 
\n"); 750 | inputToken = getToken(); 751 | continue; 752 | } 753 | 754 | // If epsilon lies in the FIRST of the current non terminal, assume the epsilon producing rule of the stack is being used as the default rule 755 | if(fafl->FIRST[stackTop->NODE_TYPE.NL.ENUM_ID][TK_EPS] == 1) { 756 | // printf("Epsilon is present in first of this non terminal, assume that epsilon was generated\n"); 757 | pop(st); 758 | continue; 759 | } 760 | 761 | // Because the error for the missed token has already been reported once in the terminal-terminal error case, no need to print again 762 | if(inputToken != missedToken) 763 | printf("Line %d : The token %s for the lexeme %s does not match with the Non Terminal %s\n" ,inputToken->LINE_NO,getTerminal(inputToken->TOKEN_NAME),inputToken->LEXEME,getNonTerminal(stackTop->NODE_TYPE.NL.ENUM_ID)); 764 | 765 | // Use the follow set of the stackTop to synchronize 766 | while(inputToken != NULL && fafl->FOLLOW[stackTop->NODE_TYPE.NL.ENUM_ID][inputToken->TOKEN_NAME] == 0) { 767 | // printf("Ignoring Token %s\n" , getTerminal(inputToken->TOKEN_NAME)); 768 | inputToken = getToken(); 769 | } 770 | 771 | // If the input gets depleted then break 772 | if(inputToken == NULL) 773 | break; 774 | 775 | // An input Token is found which is in the followSet, so pop it. 
776 | else { 777 | pop(st); 778 | continue; 779 | } 780 | 781 | } 782 | 783 | 784 | } 785 | 786 | // printf("-----\n"); 787 | // printf("\n"); 788 | } 789 | 790 | NaryTreeNode* stackTop = top(st); 791 | if(lexicalErrorFlag == 0 && syntaxErrorFlag == 0 && stackTop->IS_LEAF_NODE == 1 && stackTop->NODE_TYPE.L.ENUM_ID == TK_DOLLAR) { 792 | printf("\n \nSuccessfully Parsed the whole Input\n"); 793 | } 794 | else { 795 | printf("\n \nParsing unsuccessful\n"); 796 | } 797 | 798 | close(f); 799 | 800 | return pt; 801 | } 802 | 803 | void printParseTreeHelper(NaryTreeNode * pt, FILE* f) { 804 | 805 | if(pt == NULL) 806 | return; 807 | 808 | if(pt->IS_LEAF_NODE == 1) { 809 | int tokenEnumID = pt->NODE_TYPE.L.ENUM_ID; 810 | char lexeme[30]; 811 | for(int i=0; i < 29; i++) 812 | lexeme[i] = ' '; 813 | lexeme[29] = '\0'; 814 | 815 | if(tokenEnumID != TK_EPS) { 816 | for(int i=0; i < 29 && i < strlen(pt->NODE_TYPE.L.TK->LEXEME); i++) // Bounded so that the terminating '\0' at index 29 is never overwritten 817 | lexeme[i] = pt->NODE_TYPE.L.TK->LEXEME[i]; 818 | } 819 | else { 820 | char* str = "EPSILON"; 821 | for(int i=0; i < strlen(str); i++) 822 | lexeme[i] = str[i]; 823 | } 824 | 825 | int lineNumber; 826 | int isNumber = 0; 827 | int valueIfInt; 828 | float valueIfFloat; 829 | if(tokenEnumID != TK_EPS) { 830 | lineNumber = pt->NODE_TYPE.L.TK->LINE_NO; 831 | isNumber = pt->NODE_TYPE.L.TK->IS_NUMBER; 832 | if(isNumber == 1) 833 | valueIfInt = pt->NODE_TYPE.L.TK->VALUE->INT_VALUE; 834 | else if(isNumber == 2) 835 | valueIfFloat = pt->NODE_TYPE.L.TK->VALUE->FLOAT_VALUE; 836 | } 837 | else { 838 | lineNumber = -1; 839 | } 840 | 841 | char tokenName[20]; 842 | for(int i=0; i < 19; i++) 843 | tokenName[i] = ' '; 844 | tokenName[19] = '\0'; 845 | 846 | char* obtainedTokenName = getTerminal(pt->NODE_TYPE.L.ENUM_ID); 847 | 848 | for(int i=0; i < strlen(obtainedTokenName); i++) { 849 | tokenName[i] = obtainedTokenName[i]; 850 | } 851 | 852 | char parent[30]; 853 | for(int i=0; i < 29; i++) 854 | parent[i] = ' '; 855 | 856 | parent[29] = '\0'; 857 | char* obtainedParent = 
getNonTerminal(pt->parent->NODE_TYPE.NL.ENUM_ID); 858 | for(int i=0; i < strlen(obtainedParent); i++) 859 | parent[i] = obtainedParent[i]; 860 | 861 | char* isLeafNode = "yes"; 862 | char* currentSymbol = "----"; 863 | // char* spaceString = ""; 864 | 865 | if(tokenEnumID == TK_EPS || isNumber == 0) 866 | fprintf(f,"%s %d %s %s %s %s %s\n" ,lexeme,lineNumber,tokenName,"---- ",parent,isLeafNode,currentSymbol); 867 | else if(isNumber == 1) 868 | fprintf(f,"%s %d %s %d %s %s %s\n" ,lexeme,lineNumber,tokenName,valueIfInt,parent,isLeafNode,currentSymbol); 869 | else 870 | fprintf(f,"%s %d %s %f %s %s %s\n" ,lexeme,lineNumber,tokenName,valueIfFloat,parent,isLeafNode,currentSymbol); 871 | 872 | } 873 | else { 874 | NaryTreeNode* trav = pt->NODE_TYPE.NL.child; 875 | 876 | if(trav!=NULL) { 877 | printParseTreeHelper(pt->NODE_TYPE.NL.child, f); 878 | trav = trav->next; 879 | } 880 | 881 | char lexeme[30]; 882 | for(int i=0; i < 29; i++) 883 | lexeme[i] = ' '; 884 | lexeme[29] = '\0'; 885 | lexeme[0] = '-'; lexeme[1] = '-'; lexeme[2] = '-'; lexeme[3] = '-'; 886 | 887 | int lineNumber = -1; 888 | char* tokenName = "----------- "; 889 | char* valueIfNumber = "---- "; 890 | char* currentSymbol = getNonTerminal(pt->NODE_TYPE.NL.ENUM_ID); 891 | char parent[30]; 892 | for(int i=0; i < 29; i++) 893 | parent[i] = ' '; 894 | 895 | parent[29] = '\0'; 896 | 897 | char* obtainedParent; 898 | if(pt->parent != NULL) 899 | obtainedParent = getNonTerminal(pt->parent->NODE_TYPE.NL.ENUM_ID); 900 | else 901 | obtainedParent = "NULL"; 902 | 903 | for(int i=0; i < strlen(obtainedParent); i++) 904 | parent[i] = obtainedParent[i]; 905 | 906 | char* isLeafNode = "no"; 907 | 908 | fprintf(f,"%s %d %s %s %s %s %s\n" ,lexeme,lineNumber,tokenName,valueIfNumber,parent,isLeafNode,currentSymbol); 909 | 910 | 911 | while(trav!=NULL){ 912 | printParseTreeHelper(trav,f); 913 | trav = trav->next; 914 | } 915 | 916 | } 917 | } 918 | 919 | void printParseTree(ParseTree* pt, char* outfile) { 920 | 921 | FILE* f; 
922 | 923 | // Print on console if no outfile is provided 924 | if(outfile == NULL) 925 | f = stdout; 926 | else 927 | f = fopen(outfile,"wb"); 928 | 929 | if(f == NULL) { 930 | printf("Error opening the outfile\n"); 931 | return; 932 | } 933 | 934 | printParseTreeHelper(pt->root,f); 935 | 936 | // Do not close stdout 937 | if(f != stdout) 938 | fclose(f); 939 | 940 | } 941 | 942 | //Utility function to print all symbols in the list 943 | void printSymbolList(SymbolList* ls) { 944 | Symbol* trav = ls->HEAD_SYMBOL; 945 | while(trav != NULL) { 946 | if(trav->IS_TERMINAL == 1) 947 | printf("%s " ,TerminalID[trav->TYPE.TERMINAL]); 948 | else 949 | printf("%s " ,NonTerminalID[trav->TYPE.NON_TERMINAL]); 950 | 951 | trav = trav->next; 952 | } 953 | } 954 | 955 | //Utility function to print a rule 956 | void printRule(Rule* r) { 957 | 958 | if(r == NULL) { 959 | printf("-------------------------\n"); 960 | return; 961 | } 962 | 963 | SymbolList* ls = r->SYMBOLS; 964 | printSymbolList(ls); 965 | 966 | printf("\n"); 967 | } 968 | 969 | 970 | // Utility function to print the grammar structure 971 | void printGrammarStructure() { 972 | for(int i=0; i < g->GRAMMAR_RULES_SIZE; i++) { 973 | Rule* r = g->GRAMMAR_RULES[i]; 974 | printRule(r); 975 | } 976 | } 977 | 978 | // Utility function to print the NonTerminalRuleRecords 979 | void printNonTerminalRuleRecords() { 980 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 981 | NonTerminalRuleRecords* temp = ntrr[i]; 982 | printf("Rules for Non terminal %s start from %d and end at %d\n" ,NonTerminalID[i],temp->start,temp->end); 983 | } 984 | } 985 | 986 | // Utility function to print the first set 987 | void printFirstSets(FirstAndFollow* fafl) { 988 | int** firstVector = fafl->FIRST; 989 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 990 | printf("First set for non terminals %s are ====> " ,NonTerminalID[i]); 991 | 992 | for(int j=0; j < vectorSize; j++) { 993 | if(firstVector[i][j] == 1) 994 | printf("%s " , 
TerminalID[j]); 995 | } 996 | 997 | printf("\n"); 998 | } 999 | } 1000 | 1001 | // Utility function to print follow sets 1002 | void printFollowSets(FirstAndFollow* fafl) { 1003 | int** followVector = fafl->FOLLOW; 1004 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 1005 | printf("Follow set for non terminals %s are ====> " ,NonTerminalID[i]); 1006 | 1007 | for(int j=0; j < vectorSize; j++) { 1008 | if(followVector[i][j] == 1) 1009 | printf("%s " , TerminalID[j]); 1010 | } 1011 | 1012 | printf("\n"); 1013 | } 1014 | } 1015 | 1016 | // Utility function to print parsing table 1017 | void printParseTable(ParsingTable* pt) { 1018 | for(int i=0; i < TOTAL_GRAMMAR_NONTERMINALS; i++) { 1019 | printf("%s\n" ,NonTerminalID[i]); 1020 | for(int j=0; j < TOTAL_GRAMMAR_TERMINALS; j++) { 1021 | // For pretty printing 1022 | for(int k=0; k < 10; k++) 1023 | printf(" "); 1024 | 1025 | printf("%s ==> " ,TerminalID[j]); 1026 | if(pt->entries[i][j] != 0) { 1027 | printRule(g->GRAMMAR_RULES[pt->entries[i][j]]); 1028 | } 1029 | else 1030 | printf("NULL\n"); 1031 | } 1032 | } 1033 | } 1034 | 1035 | int getErrorStatus() { 1036 | return (lexicalErrorFlag || syntaxErrorFlag); 1037 | } 1038 | -------------------------------------------------------------------------------- /Compiler/parser.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "parserDef.h" 9 | 10 | /**Grammar Functions**/ 11 | int initialiseGrammar(); 12 | Grammar* extractGrammar(); 13 | FirstAndFollow* computeFirstAndFollowSets (Grammar* g); 14 | void populateFirst(int** firstVector, Grammar* g); 15 | void calculateFirst(int** firstVector, int enum_id); 16 | void populateFollow(int** followBitVector, int ** firstSet, Grammar* g); 17 | void populateFollowTillStable(int** followVector, 
int** firstVector, Grammar* g); 18 | ParsingTable* initialiseParsingTable(); 19 | void createParseTable(FirstAndFollow* fafl, ParsingTable* pt); 20 | ParseTree* parseInputSourceCode(char *testcaseFile, ParsingTable* pTable,FirstAndFollow* fafl); 21 | void printParseTree(ParseTree* pt, char* outfile); 22 | void printParseTreeHelper(NaryTreeNode * pt, FILE* f); 23 | 24 | // Function to initialise checkIfDone global variable 25 | void initialiseCheckIfDone(); 26 | 27 | /**Functions to map the string to it's enum id**/ 28 | int findInNonTerminalMap(char* str); 29 | int findInTerminalMap(char* str); 30 | 31 | /**Rule and Rules function**/ 32 | NonTerminalRuleRecords** intialiseNonTerminalRecords(); 33 | Rule* initialiseRule(SymbolList* sl, int ruleCount); 34 | 35 | /**Symbol and SymbolList Functions**/ 36 | Symbol* intialiseSymbol(char* symbol); 37 | SymbolList* initialiseSymbolList(); 38 | void addToSymbolList(SymbolList* ls, Symbol* s); 39 | 40 | /*Utility functions*/ 41 | char* getTerminal(int enumId); 42 | char* getNonTerminal(int enumId); 43 | char* appendToSymbol(char* str, char c); 44 | char* copyLexeme(char* str); 45 | 46 | /**Utility functions to print**/ 47 | void printSymbol(Symbol* ls); 48 | void printRule(Rule* r); 49 | void printGrammarStructure(); 50 | void printNonTerminalRuleRecords(); 51 | void printFirstSets(FirstAndFollow* fafl); 52 | void printFollowSets(FirstAndFollow* fafl); 53 | void printParseTable(ParsingTable* pt); 54 | int getErrorStatus(); 55 | -------------------------------------------------------------------------------- /Compiler/parserDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #ifndef PARSE_DEF_ 8 | #define PARSE_DEF_ 9 | 10 | #include "lexerDef.h" 11 | #include "nary_treeDef.h" 12 | 13 | // Enum which stores the Non-terminals 
14 | typedef enum NonTerminal { 15 | program, 16 | mainFunction, 17 | otherFunctions, 18 | function, 19 | input_par, 20 | output_par, 21 | parameter_list, 22 | dataType, 23 | primitiveDatatype, 24 | constructedDatatype, 25 | remaining_list, 26 | stmts, 27 | typeDefinitions, 28 | typeDefinition, 29 | fieldDefinitions, 30 | fieldDefinition, 31 | moreFields, 32 | declarations, 33 | declaration, 34 | global_or_not, 35 | otherStmts, 36 | stmt, 37 | assignmentStmt, 38 | singleOrRecId, 39 | C, 40 | funCallStmt, 41 | outputParameters, 42 | inputParameters, 43 | iterativeStmt, 44 | conditionalStmt, 45 | B, 46 | ioStmt, 47 | arithmeticExpression, 48 | ex2, 49 | term, 50 | term2, 51 | factor, 52 | op1, 53 | op2, 54 | booleanExpression, 55 | allVar, 56 | logicalOp, 57 | relationalOp, 58 | returnStmt, 59 | optionalReturn, 60 | idList, 61 | more_ids, 62 | } NonTerminal; 63 | 64 | typedef TokenName Terminal; // The tokens will be representing the terminals 65 | 66 | // Struct to store the first and follow sets of each non terminal 67 | typedef struct FirstAndFollow { 68 | int** FIRST; // Bit vector to store first sets of each non terminal 69 | int** FOLLOW; // Bit vector to store follow sets of each non terminal 70 | } FirstAndFollow; 71 | 72 | //UNION has either of the two, so symboltype can be either terminal or non terminal 73 | typedef union SymbolType { 74 | Terminal TERMINAL; // If the symbol is a terminal 75 | NonTerminal NON_TERMINAL; // If the symbol is a non terminal 76 | } SymbolType; 77 | 78 | typedef struct Symbol { 79 | SymbolType TYPE; // Stores the type number of the terminal/non-terminal 80 | int IS_TERMINAL; // Stores whether Symbol is a terminal or not 81 | struct Symbol* next; // Pointer to the next symbol node in the linked list. A Rule is a linked list and Symbol is a node of that linked list 82 | } Symbol; 83 | 84 | //SymbolList stores the head of the linked list and the length of the list i.e the rule. 
85 | typedef struct SymbolList { 86 | Symbol* HEAD_SYMBOL; // Indicates the symbol which represents the start of the rule , the LHS non terminal 87 | Symbol* TAIL_SYMBOL; // Indicates the symbol at the tail, used for appending symbols 88 | int RULE_LENGTH; // Stores the length of the rule i.e. the number of symbols 89 | } SymbolList; 90 | 91 | // Struct which stores the starting and ending rule numbers of a non terminal; the records are indexed in the array by the non terminal's enum id. 92 | typedef struct NonTerminalRuleRecords { 93 | int start; 94 | int end; 95 | } NonTerminalRuleRecords; 96 | 97 | // Struct representing a single rule 98 | typedef struct Rule { 99 | SymbolList* SYMBOLS; // Linked list of symbols (DOUBT => Can make this a dynamically allocated array as well, ask ma'am) 100 | int RULE_NO; // Rule number 101 | } Rule; 102 | 103 | // Struct for the grammar which will be extracted from the txt file 104 | typedef struct Grammar { 105 | int GRAMMAR_RULES_SIZE; // Keep track of the size of the array below 106 | Rule** GRAMMAR_RULES; // An array containing the rules of the grammar 107 | } Grammar; 108 | 109 | // Struct for the parsing table 110 | typedef struct ParsingTable { 111 | int** entries; 112 | } ParsingTable; 113 | 114 | #endif 115 | -------------------------------------------------------------------------------- /Compiler/printer.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | // This file handles the printing utilities required by driver 8 | 9 | #include "interface.h" 10 | #include "keyword_table.h" 11 | #include "lexer.h" 12 | #include "parser.h" 13 | #include "nary_tree.h" 14 | #include "ast.h" 15 | #include "symbol_table.h" 16 | #include "type_checker.h" 17 | #include "error_handler.h" 18 | #include "semantic_analyzer.h" 19 | #include "code_gen.h" 20 | #include "printer.h" 
21 | 22 | Queue* createQueue() { 23 | Queue* q = (Queue*)malloc(sizeof(Queue)); 24 | q->head = NULL; 25 | q->tail = NULL; 26 | q->size = 0; 27 | return q; // Hand the initialised queue back to the caller 28 | } 29 | 30 | node* createNodee(ASTNode* n, int depth,char* parent) { 31 | node* na = (node*)malloc(sizeof(node)); 32 | na->v = n; 33 | na->depth = depth; 34 | na->parent = parent; 35 | na->next = NULL; 36 | return na; 37 | } 38 | 39 | void enqueue(Queue* q,ASTNode* v,int depth,char* parent) { 40 | if(q == NULL) 41 | return; 42 | 43 | node* n = createNodee(v,depth,parent); 44 | 45 | if(q->head == NULL) { 46 | q->head = n; 47 | q->tail = n; 48 | q->size += 1; 49 | return; 50 | } 51 | 52 | q->tail->next = n; 53 | q->tail = n; 54 | q->size += 1; 55 | } 56 | 57 | node* dequeue(Queue* q) { 58 | if(q == NULL) 59 | return NULL; 60 | 61 | if(q->head == NULL) 62 | return NULL; 63 | 64 | 65 | node* v = q->head; 66 | q->head = q->head->next; 67 | q->size -= 1; 68 | return v; 69 | } 70 | 71 | int isEmptyQ(Queue* q) { 72 | if(q->size == 0) 73 | return 1; 74 | return 0; 75 | } 76 | 77 | void levelPrint(ASTNode* root) { 78 | Queue* q = createQueue(); 79 | int currentDepth = 0; 80 | printf("------LEVEL 0-----\n"); 81 | enqueue(q,root,0,NULL); 82 | while(!isEmptyQ(q)) { 83 | node* top = dequeue(q); 84 | ASTNode* n = top->v; 85 | int depth = top->depth; 86 | if(depth > currentDepth) { 87 | printf("\n"); 88 | printf("\n"); 89 | printf("-----LEVEL %d-----\n" ,depth); 90 | currentDepth = depth; 91 | } 92 | Scope scope = (n->SCOPED_TABLE != NULL) ? 
n->SCOPED_TABLE->SCOPE : "-1"; 93 | if(currentDepth == 0) 94 | printf("astProgram , Scope = %s \n" ,scope); 95 | else { 96 | printf("(%s,Parent = %s" , getLabel(top->v->LABEL),top->parent); 97 | if(n->LABEL == astId) { 98 | printf(", ID = %s " ,n->AST_NODE_TYPE.AST_ID.ID->LEXEME); 99 | 100 | if(n->AST_NODE_TYPE.AST_ID.FIELD_ID != NULL) 101 | printf(", FIELD_ID = %s ", n->AST_NODE_TYPE.AST_ID.FIELD_ID->LEXEME); 102 | if(n->AST_NODE_TYPE.AST_ID.DATA_TYPE != NULL) 103 | printf(", DATA_TYPE = %s " ,n->AST_NODE_TYPE.AST_ID.DATA_TYPE->LEXEME); 104 | else 105 | printf(", DATA_TYPE = -1 "); 106 | } 107 | else if(n->LABEL == astNum) { 108 | printf(", NUM = %s " ,n->AST_NODE_TYPE.AST_NUM.VALUE->LEXEME); 109 | } 110 | else if(n->LABEL == astRnum) { 111 | printf(", RNUM = %s ", n->AST_NODE_TYPE.AST_RNUM.VALUE->LEXEME); 112 | } 113 | else if(n->LABEL == astArithmeticExpression) { 114 | printf(", Operator = %s " ,n->AST_NODE_TYPE.AST_ARITHMETIC_EXPRESSION.OPERATOR->LEXEME); 115 | } 116 | else if(n->LABEL == astBooleanExpression) { 117 | printf(", Operator = %s " ,n->AST_NODE_TYPE.AST_BOOLEAN_EXPRESSION.OPERATOR->LEXEME); 118 | } 119 | else if(n->LABEL == astIterativeStmt) { 120 | printf(", START LINE = %d, END LINE = %d " ,n->AST_NODE_TYPE.AST_ITERATIVE_STMT.LINE_NO_START,n->AST_NODE_TYPE.AST_ITERATIVE_STMT.LINE_NO_END); 121 | } 122 | 123 | printf(", Scope = %s\n" ,scope); 124 | } 125 | ASTNode* trav = n->children; 126 | while(trav != NULL) { 127 | enqueue(q,trav,depth+1,getLabel(n->LABEL)); 128 | trav = trav->next; 129 | } 130 | } 131 | } 132 | 133 | 134 | // Utility function to print SymbolEntryList 135 | void printSymbolEntryList(SymbolEntry* ls, int isGlobalTable) { 136 | SymbolEntry* trav = ls; 137 | while(trav != NULL) { 138 | if(trav->SYMBOL_LABEL == symbolVariable) { 139 | printf("Entry is a variable of type %s\n" ,trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE->LEXEME); 140 | printf("Variable is %s, it is %s\n" ,trav->SYMBOL_TOKEN->LEXEME,((isGlobalTable == 1) ? 
"global" : "not global")); 141 | printf("Offset is %d\n" ,trav->SYMBOL_OFFSET); 142 | printf("\n"); 143 | } 144 | else if(trav->SYMBOL_LABEL == symbolRecord) { 145 | printf("Entry is a record of type %s\n", trav->SYMBOL_TOKEN->LEXEME); 146 | Token** dataType = trav->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE; 147 | int numFields = trav->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.NUMBER_FIELDS; 148 | for(int i=0; i < numFields; i++) { 149 | printf("The field is %s of type %s\n" , trav->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.FIELDS[i]->LEXEME,trav->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE[i]->LEXEME); 150 | printf("\n"); 151 | } 152 | printf("\n"); 153 | } 154 | else if(trav->SYMBOL_LABEL == symbolFunction) { 155 | printf("Entry is a function\n"); 156 | printf("Function name is %s\n" , trav->SYMBOL_TOKEN->LEXEME); 157 | printf("The Scope of this function is indicated by %p\n" , trav->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE); 158 | // Recursive call (Indirect) to printSymbolTable 159 | printf("---------------------------------\n"); 160 | printSymbolTable(trav->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE,0); 161 | printf("---------------------------------\n"); 162 | } 163 | else if(trav->SYMBOL_LABEL == symbolParameter) { 164 | printf("Entry is a parameter of type %s\n" ,trav->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE->LEXEME); 165 | printf("Entry is an %s parameter\n" , ((trav->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.IS_INPUT == 1) ? 
"Input" : "Output") ); 166 | printf("Parameter is %s\n" ,trav->SYMBOL_TOKEN->LEXEME); 167 | printf("Offset is %d\n" ,trav->SYMBOL_OFFSET); 168 | printf("\n"); 169 | } 170 | 171 | trav = trav->next; 172 | } 173 | } 174 | 175 | void printGlobals(SymbolTable* st) { 176 | printf("Name Type Offset\n"); 177 | for(int i=0; i < st->NUMBER_SLOTS; i++) { 178 | SymbolEntry* trav = st->SYMBOL_SLOTS[i]; 179 | while(trav != NULL) { 180 | if(trav->SYMBOL_LABEL == symbolVariable) { 181 | printf("%s %s %d\n" ,trav->SYMBOL_TOKEN->LEXEME, trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE->LEXEME, trav->SYMBOL_OFFSET); 182 | } 183 | trav = trav->next; 184 | } 185 | } 186 | } 187 | 188 | void printFunctions(SymbolTable* st) { 189 | 190 | printf("Name Memory\n"); 191 | for(int i=0; i < st->NUMBER_SLOTS; i++) { 192 | SymbolEntry* trav = st->SYMBOL_SLOTS[i]; 193 | while(trav != NULL) { 194 | if(trav->SYMBOL_LABEL == symbolFunction) { 195 | printf("%s %d\n" ,trav->SYMBOL_TOKEN->LEXEME, trav->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE->CURRENT_OFFSET); 196 | } 197 | trav = trav->next; 198 | } 199 | } 200 | 201 | } 202 | 203 | void printRecords(SymbolTable* st) { 204 | printf("Name Data types Width\n"); 205 | for(int i=0; i < st->NUMBER_SLOTS; i++) { 206 | SymbolEntry* trav = st->SYMBOL_SLOTS[i]; 207 | while(trav != NULL) { 208 | if(trav->SYMBOL_LABEL == symbolRecord) { 209 | printf("%s " ,trav->SYMBOL_TOKEN->LEXEME); 210 | Token** dataTypes = trav->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE; 211 | int numberFields = trav->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.NUMBER_FIELDS; 212 | for(int i=0; i < numberFields; i++) { 213 | printf("%s," ,dataTypes[i]->LEXEME); 214 | } 215 | printf(" %d\n" , trav->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.TOTAL_OFFSET); 216 | } 217 | trav = trav->next; 218 | } 219 | } 220 | } 221 | 222 | 223 | void printSymbolTableHelper(SymbolTable* scopedTable) { 224 | for(int i=0; i < scopedTable->NUMBER_SLOTS; i++) { 225 | SymbolEntry* trav = scopedTable->SYMBOL_SLOTS[i]; 226 | while(trav 
!= NULL) { 227 | if(trav->SYMBOL_LABEL == symbolVariable) { 228 | 229 | if(trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE->TOKEN_NAME == TK_RECORDID) { 230 | SymbolEntry* recordEntry = lookupSymbolEntry(scopedTable->parent,trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE); 231 | // No record of that lexeme exists 232 | if(recordEntry == NULL) { 233 | trav = trav->next; 234 | continue; 235 | } 236 | printf("%s ",trav->SYMBOL_TOKEN->LEXEME); 237 | Token** dataTypes = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE; 238 | int numberFields = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.NUMBER_FIELDS; 239 | for(int i=0; i < numberFields; i++) { 240 | printf("%s," ,dataTypes[i]->LEXEME); 241 | } 242 | printf(" %s %d\n" ,scopedTable->SCOPE,trav->SYMBOL_OFFSET); 243 | } 244 | else { 245 | printf("%s ",trav->SYMBOL_TOKEN->LEXEME); 246 | printf("%s" ,trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE->LEXEME); 247 | printf(" %s %d\n" ,scopedTable->SCOPE,trav->SYMBOL_OFFSET); 248 | } 249 | 250 | } 251 | else if(trav->SYMBOL_LABEL == symbolParameter) { 252 | 253 | if(trav->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE->TOKEN_NAME == TK_RECORDID) { 254 | SymbolEntry* recordEntry = lookupSymbolEntry(scopedTable->parent,trav->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE); 255 | // No record of that lexeme exists 256 | if(recordEntry == NULL) { 257 | trav = trav->next; 258 | continue; 259 | } 260 | printf("%s ",trav->SYMBOL_TOKEN->LEXEME); 261 | Token** dataTypes = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE; 262 | int numberFields = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.NUMBER_FIELDS; 263 | for(int i=0; i < numberFields; i++) { 264 | printf("%s," ,dataTypes[i]->LEXEME); 265 | } 266 | printf(" %s %d\n" ,scopedTable->SCOPE,trav->SYMBOL_OFFSET); 267 | } 268 | else { 269 | printf("%s ",trav->SYMBOL_TOKEN->LEXEME); 270 | printf("%s" ,trav->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE->LEXEME); 271 | printf(" %s %d\n" ,scopedTable->SCOPE,trav->SYMBOL_OFFSET); 272 | } 
273 | 274 | } 275 | trav = trav->next; 276 | } 277 | } 278 | } 279 | // Utility function to print Symbol table 280 | void printSymbolTable(SymbolTable* st,int isGlobalTable) { 281 | 282 | printf("Lexeme type scope offset\n"); 283 | for(int i=0; i < st->NUMBER_SLOTS; i++) { 284 | SymbolEntry* trav = st->SYMBOL_SLOTS[i]; 285 | while(trav != NULL) { 286 | if(trav->SYMBOL_LABEL == symbolVariable) { 287 | 288 | if(trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE->TOKEN_NAME == TK_RECORDID) { 289 | SymbolEntry* recordEntry = lookupSymbolEntry(st,trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE); 290 | // No record of that lexeme exists 291 | if(recordEntry == NULL) { 292 | trav = trav->next; 293 | continue; 294 | } 295 | printf("%s ",trav->SYMBOL_TOKEN->LEXEME); 296 | Token** dataTypes = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE; 297 | int numberFields = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.NUMBER_FIELDS; 298 | for(int i=0; i < numberFields; i++) { 299 | printf("%s," ,dataTypes[i]->LEXEME); 300 | } 301 | printf(" global ---\n"); 302 | } 303 | else { 304 | printf("%s ",trav->SYMBOL_TOKEN->LEXEME); 305 | printf("%s" ,trav->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE->LEXEME); 306 | printf(" global ---\n"); 307 | } 308 | 309 | } 310 | trav = trav->next; 311 | } 312 | } 313 | 314 | 315 | for(int i=0; i < st->NUMBER_SLOTS; i++) { 316 | SymbolEntry* trav = st->SYMBOL_SLOTS[i]; 317 | while(trav != NULL) { 318 | if(trav->SYMBOL_LABEL == symbolFunction) { 319 | printSymbolTableHelper(trav->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE); 320 | } 321 | trav = trav->next; 322 | } 323 | } 324 | } 325 | 326 | 327 | // int main(int argc, char* argv[]) { 328 | // Grammar* g = extractGrammar(); 329 | // FirstAndFollow* fafl = computeFirstAndFollowSets(g); 330 | // ParsingTable* pTable = initialiseParsingTable(); 331 | // createParseTable(fafl,pTable); 332 | // ParseTree* pt = parseInputSourceCode(argv[1],pTable,fafl); 333 | // AST* ast = constructAST(pt); 334 | // 
levelPrint(ast->root); 335 | // printf("\n"); 336 | // printf("---PRINTED AST----\n"); 337 | // printf("\n"); 338 | // printf("\n"); 339 | // printf("\n"); 340 | // printf("\n"); 341 | // printf("\n"); 342 | 343 | // printf("-------MAKING AND PRINTING SYMBOL TABLE------\n"); 344 | 345 | // ErrorList* els = initializeErrorList(); 346 | 347 | // SymbolTable* st = constructSymbolTable(ast,els); 348 | 349 | // printSymbolTable(st,1); 350 | 351 | // printf("\n"); 352 | // printf("\n"); 353 | // printf("\n"); 354 | // printf("----PRINTING AST TO SEE IF SCOPE FIELD HAS BEEN POPULATED-----\n"); 355 | 356 | // levelPrint(ast->root); 357 | 358 | // printf("\n"); 359 | // printf("\n"); 360 | // printf("\n"); 361 | // printf("----COMMENCING TYPE CHECKING----\n"); 362 | 363 | 364 | // captureErrors(ast,els); 365 | 366 | // printf("----TYPE CHECKING COMPLETE-----\n"); 367 | 368 | 369 | // printf("\n"); 370 | // printf("\n"); 371 | 372 | // printf("---PRINTING SYMBOL TABLE WITH OFFSETS CALCULATED---\n"); 373 | // printSymbolTable(st,1); 374 | // printf("---PRINTING SYMBOL TABLE COMPLETE---\n"); 375 | 376 | // printf("----PRINTING AST AGAIN-----\n"); 377 | // printf("\n"); 378 | // printf("\n"); 379 | // printf("\n"); 380 | // levelPrint(ast->root); 381 | // printf("\n"); 382 | // printf("----PRINTING AST COMPLETED----\n"); 383 | 384 | // printf("\n"); 385 | // printf("\n"); 386 | // printf("----PRINTING ERRORS-----\n"); 387 | // printErrors(els); 388 | 389 | // printf("---PRINTING ERRORS COMPLETE----\n"); 390 | 391 | // FILE* f = fopen("code1.asm","w"); 392 | // codeGeneration(ast,st,f); 393 | // } 394 | -------------------------------------------------------------------------------- /Compiler/printer.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "printerDef.h" 8 | 9 
| Queue* createQueue(); 10 | node* createNodee(ASTNode* n, int depth,char* parent); 11 | void enqueue(Queue* q,ASTNode* v,int depth,char* parent); 12 | node* dequeue(Queue* q); 13 | int isEmptyQ(Queue* q); 14 | void levelPrint(ASTNode* root); 15 | void printSymbolEntryList(SymbolEntry* ls, int isGlobalTable); 16 | void printSymbolTableHelper(SymbolTable* scopedTable); 17 | void printSymbolTable(SymbolTable* st,int isGlobalTable); 18 | void printGlobals(SymbolTable* st); 19 | void printFunctions(SymbolTable* st); 20 | void printRecords(SymbolTable* st); 21 | -------------------------------------------------------------------------------- /Compiler/printerDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "astDef.h" 8 | 9 | typedef struct node { 10 | ASTNode* v; 11 | char* parent; 12 | int depth; 13 | struct node* next; 14 | }node; 15 | 16 | typedef struct Queue { 17 | node* head; 18 | node* tail; 19 | int size; 20 | }Queue; 21 | -------------------------------------------------------------------------------- /Compiler/semantic_analyzer.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "error_handler.h" 9 | #include "symbol_table.h" 10 | #include "type_checker.h" 11 | #include "semantic_analyzer.h" 12 | #include 13 | 14 | TokenListItem* initializeTokenListItem(Token* tk) { 15 | TokenListItem* tls = (TokenListItem*)malloc(sizeof(TokenListItem)); 16 | tls->TK = tk; 17 | tls->next = NULL; 18 | return tls; 19 | } 20 | 21 | int searchConditionals(TokenListItem* tls, Token* tk) { 22 | TokenListItem* trav = tls; 23 | while(trav 
!= NULL) {
        if(strcmp(trav->TK->LEXEME,tk->LEXEME) == 0)
            return 1;
        trav = trav->next;
    }
    return 0;
}

// Appends rightConditionals to the end of leftConditionals and returns the merged list
TokenListItem* mergeConditionals(TokenListItem* leftConditionals, TokenListItem* rightConditionals) {
    // An empty left list merges to the right list as-is
    if(leftConditionals == NULL)
        return rightConditionals;

    TokenListItem* trav = leftConditionals;
    while(trav->next != NULL) {
        trav = trav->next;
    }

    trav->next = rightConditionals;
    return leftConditionals;
}

// Collects the identifiers that appear in a boolean expression
TokenListItem* getConditionals(ASTNode* astBooleanExpressionNode) {

    if(astBooleanExpressionNode->LABEL != astBooleanExpression) {
        printf("getConditionals called on a node which is not astBooleanExpression\n");
        return NULL;
    }

    ASTNode* lhsNode = astBooleanExpressionNode->children;
    ASTNode* rhsNode = astBooleanExpressionNode->children->next;

    // Case for astBool --> TK_NOT astBool
    if(lhsNode->LABEL == astBooleanExpression && rhsNode == NULL)
        return getConditionals(lhsNode);
    // Case for astBool --> astBool TK_LOGICAL astBool
    else if(lhsNode->LABEL == astBooleanExpression && rhsNode->LABEL == astBooleanExpression) {
        TokenListItem* leftConditionals = getConditionals(lhsNode);
        TokenListItem* rightConditionals = getConditionals(rhsNode);
        return mergeConditionals(leftConditionals,rightConditionals);
    }
    else if(lhsNode->LABEL == astId && rhsNode->LABEL == astId) {
        TokenListItem* tls1 = initializeTokenListItem(lhsNode->AST_NODE_TYPE.AST_ID.ID);
        TokenListItem* tls2 = initializeTokenListItem(rhsNode->AST_NODE_TYPE.AST_ID.ID);
        tls1->next = tls2;
        return tls1;
    }
    else if(lhsNode->LABEL == astId) {
        TokenListItem* tls1 = initializeTokenListItem(lhsNode->AST_NODE_TYPE.AST_ID.ID);
        return tls1;
    }
    else if(rhsNode->LABEL == astId) {
        TokenListItem* tls2 = initializeTokenListItem(rhsNode->AST_NODE_TYPE.AST_ID.ID);
        return tls2;
    }
    else {
        // No astIds involved in the condition, only astNum/astRnum operands
        // Give warning message for this ^^ ask ma'am
        return NULL;
    }
}

// Returns 1 if any identifier in conditionals is updated inside the loop body, 0 otherwise
int searchIterativeChildren(ASTNode* astIterativeStmtNode, TokenListItem* conditionals) {

    if(astIterativeStmtNode->LABEL != astIterativeStmt) {
        printf("Iterative search on a non iterative node, not correct\n");
    }

    // Search the statements beneath the iterative statement (the first child is the condition)
    ASTNode* trav = astIterativeStmtNode->children->next;
    while(trav != NULL) {
        if(trav->LABEL == astAssignmentStmt) {
            ASTNode* astIdNode = trav->children;
            int itrRes = searchConditionals(conditionals,astIdNode->AST_NODE_TYPE.AST_ID.ID);
            if(itrRes == 1)
                return 1;
        }
        else if(trav->LABEL == astFunCallStmt) {
            // Get the starting ID node for the output args
            ASTNode* innerTrav = trav->children->children;
            while(innerTrav != NULL) {
                int itrRes = searchConditionals(conditionals,innerTrav->AST_NODE_TYPE.AST_ID.ID);
                if(itrRes == 1)
                    return 1;
                innerTrav = innerTrav->next;
            }

        }
        else if(trav->LABEL == astIOStmtRead) {
            ASTNode* astIdNode = trav->children;
            int itrRes = searchConditionals(conditionals,astIdNode->AST_NODE_TYPE.AST_ID.ID);
            if(itrRes == 1)
                return 1;
        }
        else if(trav->LABEL == astIterativeStmt) {
            int itrRes = searchIterativeChildren(trav,conditionals);
            if(itrRes == 1)
                return 1;
        }

        trav = trav->next;
    }
    return 0;
}

// Returns 1 if the loop updates a variable of its condition (or the condition has no variables), 0 otherwise
int checkForIterationUpdate(ASTNode* astIterativeStmtNode) {
    ASTNode* astBooleanExpressionNode = astIterativeStmtNode->children;
    TokenListItem* conditionals = getConditionals(astBooleanExpressionNode);

    if(conditionals == NULL)
        return 1;

    int itrRes = searchIterativeChildren(astIterativeStmtNode,conditionals);
    return itrRes;
}

void captureErrorsHelper(ASTNode* node, ErrorList* els) {

    if(node ==
NULL)
        return;

    Label label = node->LABEL;

    switch(label) {
        case astProgram: {
            // No action
            break;
        }
        case astFunction: {
            break;
        }
        case astInputParams: {
            // Populate offsets of all input params
            ASTNode* trav = node->children;
            while(trav != NULL ) {

                if(trav->LABEL != astId) {
                    printf("Child of astInputParams not astId, incorrect\n");
                }

                SymbolEntry* idEntry = lookupSymbolEntry(trav->SCOPED_TABLE,trav->AST_NODE_TYPE.AST_ID.ID);

                if(idEntry == NULL) {
                    printf("Entry of a declaration node not present, detected in type checking, not correct\n");
                }
                populateOffset(trav,idEntry,trav->SCOPED_TABLE,els);
                trav = trav->next;
            }

            break;
        }
        case astOutputParams: {
            // Populate offsets of all output params
            ASTNode* trav = node->children;
            while(trav != NULL ) {

                if(trav->LABEL != astId) {
                    printf("Child of astOutputParams not astId, incorrect\n");
                }

                SymbolEntry* idEntry = lookupSymbolEntry(trav->SCOPED_TABLE,trav->AST_NODE_TYPE.AST_ID.ID);

                if(idEntry == NULL) {
                    printf("Entry of a declaration node not present, detected in type checking, not correct\n");
                }

                populateOffset(trav,idEntry,trav->SCOPED_TABLE,els);
                trav = trav->next;
            }

            break;
        }
        case astDatatype: {
            break;
        }
        case astStmts: {
            // No action
            break;
        }
        case astTypeDefintion: {
            // No action
            break;
        }
        case astFieldDefinition: {
            // No action
            break;
        }
        case astDeclaration: {
            // Store offset
            SymbolEntry* idEntry = lookupSymbolEntry(node->SCOPED_TABLE,node->children->AST_NODE_TYPE.AST_ID.ID);

            if(idEntry == NULL) {
                printf("Entry of a declaration node not present, detected in type checking, not correct\n");
            }


            populateOffset(node->children,idEntry,node->SCOPED_TABLE,els);
            break;
        }
        case astAssignmentStmt: {

            ASTNode* astIdNode = node->children;
            ASTNode* astRightNode = node->children->next;


            SymbolEntry* entry = lookupSymbolEntry(astIdNode->SCOPED_TABLE,astIdNode->AST_NODE_TYPE.AST_ID.ID);

            if(entry == NULL) {
                // Throw a missing declaration error and return
                throwMissingDeclarationError(astIdNode->AST_NODE_TYPE.AST_ID.ID,els);
                return;
            }

            // Initialised to NULL so that branches which assign nothing leave a defined value
            Token* lhsType = NULL;
            Token* rhsType = NULL;

            // Case when the entry on the left is a variable
            if(entry->SYMBOL_LABEL == symbolVariable) {

                Token* fieldId = astIdNode->AST_NODE_TYPE.AST_ID.FIELD_ID;

                // If there is no fieldId, use the declared type of the variable
                if(fieldId == NULL)
                    lhsType = entry->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE;
                // If a field is being accessed, set data type to the field type
                else {
                    Token* recordId = entry->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE;
                    lhsType = extractFieldDataType(astIdNode->SCOPED_TABLE->parent,recordId,fieldId,els);
                }

            }
            else if(entry->SYMBOL_LABEL == symbolParameter) {

                Token* fieldId = astIdNode->AST_NODE_TYPE.AST_ID.FIELD_ID;

                // If there is no fieldId, use the declared type of the parameter
                if(fieldId == NULL)
                    lhsType = entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE;
                // If a field is being accessed, set data type to the field type
                else {
                    Token* recordId = entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE;
                    lhsType = extractFieldDataType(astIdNode->SCOPED_TABLE->parent,recordId,fieldId,els);
                }

            }
            else {
                // LHS is neither a variable nor a parameter, so it is not valid; throw error
            }


            // Case when the rhs is an int number
            if(astRightNode->LABEL == astNum)
                rhsType =
astRightNode->AST_NODE_TYPE.AST_NUM.VALUE; 272 | // Case when the rhs is a real number 273 | else if(astRightNode->LABEL == astRnum) 274 | rhsType = astRightNode->AST_NODE_TYPE.AST_RNUM.VALUE; 275 | // Case when the rhs is a variable 276 | else if(astRightNode->LABEL == astId) { 277 | 278 | SymbolEntry* s = lookupSymbolEntry(astRightNode->SCOPED_TABLE,astRightNode->AST_NODE_TYPE.AST_ID.ID); 279 | 280 | // If s is not found in the symbol table throw a missing declaration error and return 281 | if(s == NULL) { 282 | throwMissingDeclarationError(astRightNode->AST_NODE_TYPE.AST_ID.ID,els); 283 | return; 284 | } 285 | 286 | // Check if the ID has a fieldID 287 | Token* fieldId = astRightNode->AST_NODE_TYPE.AST_ID.FIELD_ID; 288 | 289 | // If there is no fieldId set datatype to the record type 290 | if(fieldId == NULL) 291 | rhsType = s->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 292 | // If a field is being accessed, set data type to the field type 293 | else { 294 | Token* recordId = s->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 295 | rhsType = extractFieldDataType(astRightNode->SCOPED_TABLE->parent,recordId,fieldId,els); 296 | } 297 | 298 | } 299 | // Case when the rhs is an arithmetic expression 300 | else if(astRightNode->LABEL == astArithmeticExpression) 301 | rhsType = getArithmeticExpressionType(astRightNode,els); 302 | else { 303 | printf("type checking on assignment statement does not invole astnum astrnum astid or atrithmeticexpression, not correct\n"); 304 | } 305 | 306 | 307 | // Case when the RHS has an error in it's arithmeticExpression 308 | if(rhsType == NULL) 309 | ; // No action as error would have been reported inside 310 | 311 | // Case when both of them are unequal, throw error 312 | else if(assignableDataTypes(lhsType,rhsType) == 0) { 313 | throwTypeMismatchError(lhsType,rhsType,els,astIdNode->AST_NODE_TYPE.AST_ID.ID->LINE_NO); 314 | } 315 | 316 | break; 317 | } 318 | case astFunCallStmt: { 319 | 320 | // Check if a function calls itself, if it is 
throw error 321 | if(strcmp(node->SCOPED_TABLE->SCOPE,node->AST_NODE_TYPE.AST_FUN_CALL_STMT.FUN_ID->LEXEME) == 0) 322 | throwRecursiveFunctionCallError(node->AST_NODE_TYPE.AST_FUN_CALL_STMT.FUN_ID,els); 323 | 324 | break; 325 | } 326 | case astIterativeStmt: { 327 | int itrRes = checkForIterationUpdate(node); 328 | if(itrRes == 0) 329 | throwNoIterationUpdateError(node->AST_NODE_TYPE.AST_ITERATIVE_STMT.LINE_NO_START,node->AST_NODE_TYPE.AST_ITERATIVE_STMT.LINE_NO_END,els); 330 | 331 | break; 332 | } 333 | case astConditionalStmt: { 334 | 335 | break; 336 | } 337 | case astElsePart: { 338 | break; 339 | } 340 | case astIOStmtRead: { 341 | 342 | break; 343 | } 344 | case astIOStmtWrite: { 345 | 346 | break; 347 | } 348 | case astReturnStmt: { 349 | 350 | // ReturnStmt <-- stmts <-- function 351 | ASTNode* functionNode = node->parent->parent; 352 | 353 | SymbolEntry* functionEntry = lookupSymbolEntry(node->SCOPED_TABLE->parent,functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN); 354 | 355 | if(functionEntry == NULL) { 356 | printf("FUnction in which this return statement belongs has no entry in the symbol table, fishy, not correct\n"); 357 | } 358 | 359 | // Get the number of children being returned 360 | int numberReturns = node->CHILDREN_COUNT; 361 | 362 | if(numberReturns != functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_OUTPUT_PARAMS) { 363 | throwInvalidNumberOfReturnVariablesError(node->AST_NODE_TYPE.AST_RETURN_STMT.RETURN_LINE_NO,numberReturns,functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_OUTPUT_PARAMS,els); 364 | return; 365 | } 366 | 367 | ASTNode* trav = node->children; 368 | int index = 0; 369 | 370 | while(trav != NULL) { 371 | 372 | 373 | if(trav->LABEL != astId) { 374 | printf("Child of astReturnStmt is not astId, detected in type cheking phase, nto correct\n"); 375 | } 376 | 377 | SymbolEntry* idEntry = lookupSymbolEntry(node->SCOPED_TABLE,trav->AST_NODE_TYPE.AST_ID.ID); 378 | 379 | // If ID is not found, throw a missing declaration 
error 380 | if(idEntry == NULL) { 381 | throwMissingDeclarationError(trav->AST_NODE_TYPE.AST_ID.ID,els); 382 | return; 383 | } 384 | 385 | // If present, evaluate 386 | Token* datatype1; 387 | Token* datatype2; 388 | 389 | // Case when the id is a parameter 390 | if(idEntry->SYMBOL_LABEL == symbolParameter) 391 | datatype1 = idEntry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE; 392 | // Case when the id is a variable 393 | else 394 | datatype1 = idEntry->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 395 | 396 | datatype2 = functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.OUTPUT_TYPES[index]; 397 | 398 | // If data type's token heads match 399 | if(datatype1->TOKEN_NAME == datatype2->TOKEN_NAME) { 400 | // If the data type head is a record then check if they are the same record 401 | if(datatype1->TOKEN_NAME == TK_RECORDID && strcmp(datatype1->LEXEME,datatype2->LEXEME) != 0) { 402 | throwReturnTypeMismatchError(trav->AST_NODE_TYPE.AST_ID.ID,datatype1,datatype2,els); 403 | } 404 | else { 405 | ; 406 | } 407 | } 408 | // If the heads themeselves do not match 409 | else { 410 | throwReturnTypeMismatchError(trav->AST_NODE_TYPE.AST_ID.ID,datatype1,datatype2,els); 411 | } 412 | 413 | trav = trav->next; 414 | index++; 415 | } 416 | 417 | break; 418 | } 419 | case astInputArgs: { 420 | ASTNode* functionNode = node->parent; 421 | 422 | if(functionNode->LABEL != astFunCallStmt) { 423 | printf("astInputArgs is a child of a node which is not astFunction, detected in type checking phase, not correct\n"); 424 | } 425 | 426 | SymbolEntry* functionEntry = lookupSymbolEntry(node->SCOPED_TABLE->parent,functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN); 427 | 428 | if(functionEntry == NULL) { 429 | // Entry of function definition not found, this error would have been handled before, so skip 430 | return; 431 | } 432 | 433 | int numberArguments = node->CHILDREN_COUNT; 434 | 435 | if(numberArguments != functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_INPUT_PARAMS) { 436 | 
throwInvalidNumberOfInputArgsError(functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,numberArguments, functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_INPUT_PARAMS,els); 437 | return; 438 | } 439 | 440 | ASTNode* trav = node->children; 441 | int index = 0; 442 | 443 | while(trav != NULL) { 444 | 445 | if(trav->LABEL != astId) { 446 | printf("Child of astInputArgs is not astId, detected in type cheking phase, nto correct\n"); 447 | } 448 | 449 | SymbolEntry* idEntry = lookupSymbolEntry(node->SCOPED_TABLE,trav->AST_NODE_TYPE.AST_ID.ID); 450 | 451 | // If ID is not found, throw a missing declaration error 452 | if(idEntry == NULL) { 453 | throwMissingDeclarationError(trav->AST_NODE_TYPE.AST_ID.ID,els); 454 | return; 455 | } 456 | 457 | // If present, evaluate 458 | Token* datatype1; 459 | Token* datatype2; 460 | 461 | // Case when the id is a parameter 462 | if(idEntry->SYMBOL_LABEL == symbolParameter) 463 | datatype1 = idEntry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE; 464 | // Case when the id is a variable 465 | else 466 | datatype1 = idEntry->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 467 | 468 | datatype2 = functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.INPUT_TYPES[index]; 469 | 470 | // If data type's token heads match 471 | if(datatype1->TOKEN_NAME == datatype2->TOKEN_NAME) { 472 | // If the data type head is a record then check if they are the same record 473 | if(datatype1->TOKEN_NAME == TK_RECORDID && strcmp(datatype1->LEXEME,datatype2->LEXEME) != 0) { 474 | throwInputArgumentTypeMismatchError(functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,datatype2,datatype1,index,els); 475 | } 476 | else { 477 | ; 478 | } 479 | } 480 | // If the heads themeselves do not match 481 | else { 482 | throwInputArgumentTypeMismatchError(functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,datatype2,datatype1,index,els); 483 | } 484 | 485 | trav = trav->next; 486 | index++; 487 | } 488 | 489 | break; 490 | } 491 | case astOutputArgs: { 492 | 493 | 
ASTNode* functionNode = node->parent; 494 | 495 | if(functionNode->LABEL != astFunCallStmt) { 496 | printf("astOutputArgs is a child of a node which is not astFunction, detected in type checking phase, not correct\n"); 497 | } 498 | 499 | 500 | SymbolEntry* functionEntry = lookupSymbolEntry(node->SCOPED_TABLE->parent,functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN); 501 | 502 | if(functionEntry == NULL) { 503 | // Entry of function definition not found, this error would have been handled before, so skip 504 | return; 505 | } 506 | 507 | int numberArguments = node->CHILDREN_COUNT; 508 | 509 | if(numberArguments != functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_OUTPUT_PARAMS) { 510 | throwInvalidNumberOfOutputArgsError(functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,numberArguments, functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_OUTPUT_PARAMS,els); 511 | return; 512 | } 513 | 514 | ASTNode* trav = node->children; 515 | int index = 0; 516 | 517 | while(trav != NULL) { 518 | 519 | if(trav->LABEL != astId) { 520 | printf("Child of astOutputArgs is not astId, detected in type cheking phase, nto correct\n"); 521 | } 522 | 523 | SymbolEntry* idEntry = lookupSymbolEntry(node->SCOPED_TABLE,trav->AST_NODE_TYPE.AST_ID.ID); 524 | 525 | // If ID is not found, throw a missing declaration error 526 | if(idEntry == NULL) { 527 | throwMissingDeclarationError(trav->AST_NODE_TYPE.AST_ID.ID,els); 528 | return; 529 | } 530 | 531 | // If present, evaluate 532 | Token* datatype1; 533 | Token* datatype2; 534 | 535 | // Case when the id is a parameter 536 | if(idEntry->SYMBOL_LABEL == symbolParameter) 537 | datatype1 = idEntry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE; 538 | // Case when the id is a variable 539 | else 540 | datatype1 = idEntry->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 541 | 542 | datatype2 = functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.OUTPUT_TYPES[index]; 543 | 544 | // If data type's token heads match 545 | 
if(datatype1->TOKEN_NAME == datatype2->TOKEN_NAME) { 546 | // If the data type head is a record then check if they are the same record 547 | if(datatype1->TOKEN_NAME == TK_RECORDID && strcmp(datatype1->LEXEME,datatype2->LEXEME) != 0) { 548 | throwOutputArgumentTypeMismatchError(functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,datatype2,datatype1,index,els); 549 | } 550 | else { 551 | ; 552 | } 553 | } 554 | // If the heads themeselves do not match 555 | else { 556 | throwOutputArgumentTypeMismatchError(functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,datatype2,datatype1,index,els); 557 | } 558 | 559 | trav = trav->next; 560 | index++; 561 | } 562 | 563 | break; 564 | } 565 | case astArithmeticExpression: { 566 | 567 | break; 568 | } 569 | case astBooleanExpression: { 570 | checkBooleanExpressionType(node,node->SCOPED_TABLE,els); 571 | break; 572 | } 573 | case astId: { 574 | Token* dataType = extractDataTypeFromSymbolTable(node,els); 575 | 576 | // Error already reported in extractDataTypeFromSymbolTable 577 | if(dataType == NULL) 578 | ; 579 | else { 580 | // Store the data type head so that future passes can get it easily 581 | node->AST_NODE_TYPE.AST_ID.DATA_TYPE = dataType; 582 | } 583 | break; 584 | } 585 | case astNum: { 586 | 587 | break; 588 | } 589 | case astRnum: { 590 | 591 | break; 592 | } 593 | } 594 | 595 | // Consider children 596 | ASTNode* trav = node->children; 597 | while(trav != NULL) { 598 | captureErrorsHelper(trav,els); 599 | trav = trav->next; 600 | } 601 | } 602 | 603 | // Evaluates the types of expressions in AST like ArithmeticExpressions, BooleanExpressions, Function return stmt 604 | void captureErrors(AST* ast, ErrorList* els) { 605 | ASTNode* node = ast->root; 606 | captureErrorsHelper(node,els); 607 | } 608 | 609 | 610 | void semanticAnalysis(AST* ast) { 611 | 612 | 613 | // Initialize Error List 614 | ErrorList* els = initializeErrorList(); 615 | 616 | // Initialize symbol table (First AST Pass) 617 | SymbolTable* st = 
constructSymbolTable(ast,els); 618 | 619 | // Capture errors (Second AST Pass) 620 | captureErrors(ast,els); 621 | 622 | } 623 | -------------------------------------------------------------------------------- /Compiler/semantic_analyzer.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "semantic_analyzerDef.h" 8 | 9 | void semanticAnalysis(AST* ast); 10 | 11 | void captureErrors(AST* ast, ErrorList* els); 12 | void captureErrorsHelper(ASTNode* node, ErrorList* els); 13 | 14 | TokenListItem* initializeTokenListItem(Token* tk); 15 | TokenListItem* getConditionals(ASTNode* astBooleanExpressionNode); 16 | TokenListItem* mergeConditionals(TokenListItem* leftConditionals, TokenListItem* rightConditionals); 17 | int searchConditionals(TokenListItem* tls, Token* tk); 18 | int searchIterativeChildren(ASTNode* astIterativeStmtNode, TokenListItem* conditionals); 19 | -------------------------------------------------------------------------------- /Compiler/semantic_analyzerDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #ifndef SEMANTIC_ANALYZER_ 8 | #define SEMANTIC_ANALYZER_ 9 | 10 | #include "lexerDef.h" 11 | // We work with tokens for type checking 12 | // Actually we use the TOKEN_NAME field inside to compare 13 | // Tokens are chosen because they directly provide the line number. 
14 | 15 | // A structure which is used to form a linked list of tokens 16 | typedef struct TokenListItem { 17 | Token* TK; 18 | struct TokenListItem* next; 19 | } TokenListItem; 20 | 21 | #endif 22 | -------------------------------------------------------------------------------- /Compiler/stack.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "stackDef.h" 9 | #include "nary_tree.h" 10 | 11 | StackNode* createStackNode(NaryTreeNode* ntn) { 12 | StackNode* stn = (StackNode*)malloc(sizeof(StackNode)); 13 | stn->TREE_NODE = ntn; 14 | stn->next = NULL; 15 | return stn; 16 | } 17 | 18 | NaryTreeNode* top(Stack* st) { 19 | if(st->HEAD == NULL) 20 | return NULL; 21 | else 22 | return st->HEAD->TREE_NODE; 23 | } 24 | 25 | void push(Stack* st,NaryTreeNode* ntn) { 26 | StackNode* stn = createStackNode(ntn); 27 | StackNode* head = st->HEAD; 28 | 29 | // Case when stack is empty 30 | if(head == NULL) { 31 | st->HEAD = stn; 32 | st->NUM_NODES++; 33 | return; 34 | } 35 | 36 | stn->next = head; 37 | st->HEAD = stn; 38 | st->NUM_NODES++; 39 | return; 40 | } 41 | 42 | void pop(Stack* st) { 43 | StackNode* head = st->HEAD; 44 | 45 | // Case when stack is already empty 46 | if(head == NULL) 47 | return; 48 | st->HEAD = head->next; 49 | st->NUM_NODES--; 50 | free(head); // Release the popped stack node; the tree node it pointed to is left untouched 51 | } 52 | 53 | // Recursively pushes a node and its following siblings so that the leftmost one ends up on top 54 | void pushTreeChildren(Stack* st,NaryTreeNode* ntn) { 55 | if(ntn == NULL) 56 | return; 57 | pushTreeChildren(st,ntn->next); 58 | push(st,ntn); 59 | } 60 | 61 | // Initialise the stack with TK_DOLLAR at the bottom and program as the starting non-terminal on top 62 | Stack* initialiseStack(ParseTree* pt) { 63 | Stack* st = (Stack*)malloc(sizeof(Stack)); 64 | st->HEAD = NULL; 65 | st->NUM_NODES = 0; 66 | 67 | SymbolType sType; 68 
| sType.TERMINAL = TK_DOLLAR; 69 | NaryTreeNode* ntn = createNode(1,sType,NULL); 70 | push(st,ntn); 71 | push(st,pt->root); 72 | return st; 73 | } 74 | -------------------------------------------------------------------------------- /Compiler/stack.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "stackDef.h" 8 | 9 | StackNode* createStackNode(NaryTreeNode* ntn); 10 | 11 | // Stack operations 12 | void push(Stack* st,NaryTreeNode* ntn); 13 | NaryTreeNode* top(Stack* st); 14 | void pop(Stack* st); 15 | Stack* initialiseStack(ParseTree* pt); 16 | void pushTreeChildren(Stack* st,NaryTreeNode* ntn); 17 | -------------------------------------------------------------------------------- /Compiler/stackDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #ifndef STACK_ 8 | #define STACK_ 9 | #include "nary_treeDef.h" 10 | 11 | typedef struct StackNode { 12 | NaryTreeNode* TREE_NODE; 13 | struct StackNode* next; 14 | } StackNode; 15 | 16 | typedef struct Stack { 17 | StackNode* HEAD; 18 | int NUM_NODES; 19 | } Stack; 20 | 21 | #endif 22 | -------------------------------------------------------------------------------- /Compiler/symbol_table.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "symbol_table.h" 9 | #include "ast.h" 10 | #include "error_handler.h" 11 | #include <string.h> 12 | 13 | // Function to initialize symbol table 14 | 
SymbolTable* initializeSymbolTable(int numberSlots, Scope scope) { 15 | SymbolTable* st = (SymbolTable*)malloc(sizeof(SymbolTable)); 16 | st->SCOPE = scope; 17 | st->NUMBER_SLOTS = numberSlots; 18 | st->SYMBOL_SLOTS = (SymbolEntry**)calloc(st->NUMBER_SLOTS,sizeof(SymbolEntry*)); // Zero-initialised so that every slot starts as an empty list 19 | st->CURRENT_OFFSET = 0; 20 | st->parent = NULL; 21 | return st; 22 | } 23 | 24 | // Hash function which hashes according to the lexeme 25 | int symbolHashFunction(SymbolTable* st ,char* str) { 26 | 27 | /* Hash function djb2 and mod */ 28 | unsigned long hash = 5381; 29 | int c; 30 | while ((c = *str++)) 31 | hash = ((hash << 5) + hash) + c; /* hash * 33 + c */ 32 | return (hash%st->NUMBER_SLOTS); 33 | 34 | } 35 | 36 | // Create a symbol entry 37 | SymbolEntry* createSymbolEntry(Token* symbolToken,SymbolLabel symbolLabel) { 38 | SymbolEntry* symbolEntry = (SymbolEntry*)malloc(sizeof(SymbolEntry)); 39 | symbolEntry->SYMBOL_TOKEN = symbolToken; 40 | symbolEntry->SYMBOL_LABEL = symbolLabel; 41 | symbolEntry->next = NULL; 42 | return symbolEntry; 43 | } 44 | 45 | // Function to add a symbol entry to linked list of entries 46 | SymbolEntry* addEntryToList(SymbolEntry* list, SymbolEntry* s) { 47 | 48 | // Case when list is empty 49 | if(list == NULL) 50 | return s; 51 | 52 | // Case when list is not empty 53 | s->next = list; 54 | return s; 55 | } 56 | 57 | // Add symbol entry to symbol table 58 | void addSymbolEntry(SymbolTable* st, SymbolEntry* entry,ErrorList* els) { 59 | 60 | // Check if the entry already exists in the table or not 61 | SymbolEntry* existingEntry = lookupSymbolEntry(st,entry->SYMBOL_TOKEN); 62 | 63 | // Case when the entry exists in the current table, throw error and return 64 | if(existingEntry != NULL) { 65 | throwMultipleDefinitionsError(entry->SYMBOL_TOKEN,els); 66 | return; 67 | } 68 | // Case when the entry does not exist in the current table 69 | else { 70 | 71 | // If the current table is the global table, then no clashes observed 72 | // Continue with installing 
the entry 73 | if(st->parent == NULL) { 74 | ; 75 | } 76 | // If the current table is a scoped table, then check the global table for any clashes 77 | else { 78 | SymbolEntry* existingGlobalEntry = lookupSymbolEntry(st->parent,entry->SYMBOL_TOKEN); 79 | 80 | // If there is a global entry, then throw error 81 | if(existingGlobalEntry != NULL) { 82 | throwClashingGlobalDefinitionError(entry->SYMBOL_TOKEN,els); 83 | return; 84 | } 85 | // No clashes in the global table as well, continue with installing the entry 86 | else { 87 | ; 88 | } 89 | } 90 | } 91 | 92 | // To be done in a separate pass 93 | // // Set the identifier of the entry to the current identifier 94 | // entry->SYMBOL_OFFSET = st->CURRENT_OFFSET; 95 | 96 | // // Increment current identifier of table 97 | // // st->CURRENT_OFFSET++; 98 | 99 | int hashIndex = symbolHashFunction(st,entry->SYMBOL_TOKEN->LEXEME); 100 | st->SYMBOL_SLOTS[hashIndex] = addEntryToList(st->SYMBOL_SLOTS[hashIndex],entry); 101 | } 102 | 103 | // Searches for a symbol entry in the list which has the same lexeme as the token provided 104 | SymbolEntry* searchSymbolEntry(SymbolEntry* ls, Token* token) { 105 | SymbolEntry* trav = ls; 106 | 107 | while(trav != NULL) { 108 | // If found return pointer to entry 109 | if(strcmp(trav->SYMBOL_TOKEN->LEXEME,token->LEXEME) == 0) 110 | return trav; 111 | trav = trav->next; 112 | } 113 | 114 | // Return NULL otherwise 115 | return NULL; 116 | } 117 | 118 | // Searches the symbol table for an entry which has the same lexeme as the provided token 119 | // If a match is not found in the current table, it searches the global table 120 | SymbolEntry* lookupSymbolEntry(SymbolTable* st, Token* token) { 121 | // Hash according to the lexeme present in the token 122 | int hashIndex = symbolHashFunction(st,token->LEXEME); 123 | // Search in the slot 124 | 125 | SymbolEntry* entry = searchSymbolEntry(st->SYMBOL_SLOTS[hashIndex],token); 126 | 127 | // Case when there is no entry found in the table 128 | 
if(entry == NULL) { 129 | // Check the global table (if the current table is a scoped table) 130 | if(st->parent != NULL) { 131 | return lookupSymbolEntry(st->parent,token); 132 | } 133 | // If the table was itself the global table return NULL 134 | else 135 | return NULL; 136 | } 137 | // Case when the entry is found 138 | else 139 | return entry; 140 | 141 | } 142 | 143 | void constructSymbolTableHelper(ASTNode* node, SymbolTable* st,ErrorList* els) { 144 | 145 | // Case when we have reached NULL 146 | if(node == NULL) 147 | return; 148 | 149 | Label nodeLabel = node->LABEL; 150 | 151 | switch(nodeLabel) { 152 | case astProgram: { 153 | // Global table is set for astProgram 154 | node->SCOPED_TABLE = st; 155 | break; 156 | } 157 | case astFunction: { 158 | SymbolEntry* function = createSymbolEntry(node->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,symbolFunction); 159 | addSymbolEntry(st,function,els); 160 | 161 | // Create new table for the function 162 | function->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE = initializeSymbolTable(SYMBOL_TABLE_SLOTS,node->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN->LEXEME); 163 | // Set the scope of the new table as the function 164 | function->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE->SCOPE = node->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN->LEXEME; 165 | // Set the function's scoped symbol table's parent as the main symbol table 166 | function->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE->parent = st; 167 | // Set the scope of this node to the new table created 168 | node->SCOPED_TABLE = function->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE; 169 | // For all children under astFunction we should populate the new table 170 | st = function->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.SCOPED_TABLE; 171 | break; 172 | 173 | } 174 | case astInputParams: { 175 | 176 | // Get the parent function node 177 | ASTNode* functionNode = node->parent; 178 | // Find the entry of the function definition in the symbol table 179 | SymbolEntry* 
functionEntry = lookupSymbolEntry(st->parent,functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN); 180 | functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_INPUT_PARAMS = node->CHILDREN_COUNT; 181 | functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.INPUT_TYPES = (Token**)malloc(sizeof(Token*)*functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_INPUT_PARAMS); 182 | 183 | int index = 0; 184 | 185 | // Get children (The IDs constituting the input params of the node) 186 | ASTNode* trav = node->children; 187 | while(trav != NULL) { 188 | 189 | if(trav->LABEL != astId) { 190 | printf("Child of astInputParams detected not be an astIdNode in the symbol table construction phase , not correct\n"); 191 | } 192 | 193 | Token* dataType = trav->AST_NODE_TYPE.AST_ID.DATA_TYPE; 194 | Token* variable = trav->AST_NODE_TYPE.AST_ID.ID; 195 | // Create entry 196 | SymbolEntry* entry = createSymbolEntry(variable,symbolParameter); 197 | // Set data type 198 | entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE = dataType; 199 | 200 | // Set input parameter or output parameter 201 | entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.IS_INPUT = 1; 202 | addSymbolEntry(st,entry,els); 203 | 204 | functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.INPUT_TYPES[index] = dataType; 205 | 206 | trav = trav->next; 207 | index++; 208 | } 209 | 210 | node->SCOPED_TABLE = st; 211 | break; 212 | } 213 | case astOutputParams: { 214 | 215 | ASTNode* functionNode = node->parent; 216 | // Find the entry of the function definition in the symbol table 217 | SymbolEntry* functionEntry = lookupSymbolEntry(st->parent,functionNode->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN); 218 | functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_OUTPUT_PARAMS = node->CHILDREN_COUNT; 219 | functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.OUTPUT_TYPES = (Token**)malloc(sizeof(Token*)*functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.NUMBER_OUTPUT_PARAMS); 220 | 221 | int index = 0; 222 | 223 | ASTNode* trav = node->children; 224 | 
225 | while(trav != NULL) { 226 | 227 | if(trav->LABEL != astId) { 228 | printf("Child of astOutputParams detected not be an astIdNode in the symbol table construction phase , not correct\n"); 229 | } 230 | 231 | Token* dataType = trav->AST_NODE_TYPE.AST_ID.DATA_TYPE; 232 | Token* variable = trav->AST_NODE_TYPE.AST_ID.ID; 233 | // Create entry 234 | SymbolEntry* entry = createSymbolEntry(variable,symbolParameter); 235 | // Set data type 236 | entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE = dataType; 237 | 238 | // Set input parameter or output parameter 239 | entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.IS_INPUT = 0; 240 | addSymbolEntry(st,entry,els); 241 | 242 | functionEntry->SYMBOL_ENTRY_TYPE.FUNCTION_ENTRY.OUTPUT_TYPES[index] = dataType; 243 | 244 | trav = trav->next; 245 | index++; 246 | } 247 | 248 | node->SCOPED_TABLE = st; 249 | break; 250 | } 251 | case astDatatype: { 252 | break; 253 | } 254 | case astStmts: { 255 | node->SCOPED_TABLE = st; 256 | break; 257 | } 258 | case astTypeDefintion: { 259 | 260 | // Create symbolEntry for corresponding RECORDID 261 | SymbolEntry* typeDefinition = createSymbolEntry(node->AST_NODE_TYPE.AST_TYPE_DEFINITION.RECORD_ID,symbolRecord); 262 | 263 | // Get the number of field definitions beneath this type definition node and store it 264 | int numberChildren = node->CHILDREN_COUNT; 265 | typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.NUMBER_FIELDS = numberChildren; 266 | 267 | // Allocate space for corresponding number of fields 268 | typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE = (Token**)malloc(numberChildren*sizeof(Token*)); 269 | typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.FIELDS = (Token**)malloc(numberChildren*sizeof(Token*)); 270 | 271 | 272 | // Traverse all children of type definition which should be field definition node 273 | ASTNode* trav = node->children; 274 | int count = 0; 275 | int calculateOffset = 0; 276 | while(trav != NULL) { 277 | 278 | if(trav->LABEL != astFieldDefinition) { 279 | 
printf("Type definition node considering a node other than a field definition, not correct!\n"); 280 | } 281 | 282 | // Construct data type in cartesian representation 283 | typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE[count] = trav->AST_NODE_TYPE.AST_FIELD_DEFINITION.DATA_TYPE; 284 | typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.FIELDS[count] = trav->AST_NODE_TYPE.AST_FIELD_DEFINITION.FIELD_ID; 285 | 286 | if(typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE[count]->TOKEN_NAME == TK_INT) 287 | calculateOffset += 2; 288 | else if(typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE[count]->TOKEN_NAME == TK_REAL) 289 | calculateOffset += 4; 290 | else { 291 | // TODO REPORT ERROR IF IT IS AN ERROR, ASK MA'AM 292 | ; 293 | } 294 | 295 | count++; 296 | trav = trav->next; 297 | } 298 | 299 | 300 | // Set the offset in the entry 301 | typeDefinition->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.TOTAL_OFFSET = calculateOffset; 302 | // Install entry in the global symbol table 303 | addSymbolEntry(st->parent,typeDefinition,els); 304 | node->SCOPED_TABLE = st->parent; 305 | 306 | break; 307 | } 308 | case astFieldDefinition: { 309 | // As field definition will always belong to a record entry which is always in a global table 310 | node->SCOPED_TABLE = st->parent; 311 | break; 312 | } 313 | case astDeclaration: { 314 | 315 | // Create a dummy entry which will be populated in the code below 316 | SymbolEntry* variable = createSymbolEntry(NULL,symbolVariable); 317 | int isGlobal = node->AST_NODE_TYPE.AST_DECLARATION.IS_GLOBAL; 318 | 319 | ASTNode* trav = node->children; 320 | if(trav != NULL) { 321 | // Each child must be an astID node 322 | if(trav->LABEL != astId) { 323 | printf("Declaration not involving an astId node, not correct\n"); 324 | } 325 | 326 | variable->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE = trav->AST_NODE_TYPE.AST_ID.DATA_TYPE; 327 | variable->SYMBOL_TOKEN = trav->AST_NODE_TYPE.AST_ID.ID; 328 | trav = trav->next; 329 | } 330 | 331 | // If it is 
a global variable , add the entry to the global symbol table 332 | if(isGlobal == 1) { 333 | addSymbolEntry(st->parent,variable,els); 334 | node->SCOPED_TABLE = st->parent; 335 | } 336 | // Else add the entry to the function's scoped symbol table 337 | else { 338 | addSymbolEntry(st,variable,els); 339 | node->SCOPED_TABLE = st; 340 | } 341 | 342 | break; 343 | } 344 | case astAssignmentStmt: { 345 | node->SCOPED_TABLE = st; 346 | break; 347 | } 348 | case astFunCallStmt: { 349 | 350 | // Check if the function has been defined prior to being called 351 | // Check for a function definition 352 | SymbolEntry* functionEntry = lookupSymbolEntry(st->parent,node->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN); 353 | 354 | if(functionEntry == NULL) { 355 | throwMissingFunctionDefinitionError(node->AST_NODE_TYPE.AST_FUNCTION.FUNCTION_TOKEN,els); 356 | } 357 | 358 | node->SCOPED_TABLE = st; 359 | break; 360 | } 361 | case astIterativeStmt: { 362 | node->SCOPED_TABLE = st; 363 | break; 364 | } 365 | case astConditionalStmt: { 366 | node->SCOPED_TABLE = st; 367 | break; 368 | } 369 | case astElsePart: { 370 | node->SCOPED_TABLE = st; 371 | break; 372 | } 373 | case astIOStmtRead: { 374 | node->SCOPED_TABLE = st; 375 | break; 376 | } 377 | case astIOStmtWrite: { 378 | node->SCOPED_TABLE = st; 379 | break; 380 | } 381 | case astReturnStmt: { 382 | node->SCOPED_TABLE = st; 383 | break; 384 | } 385 | case astInputArgs: { 386 | node->SCOPED_TABLE = st; 387 | break; 388 | } 389 | case astOutputArgs: { 390 | node->SCOPED_TABLE = st; 391 | break; 392 | } 393 | case astArithmeticExpression: { 394 | // No installation needed beyond this node 395 | node->SCOPED_TABLE = st; 396 | break; 397 | } 398 | case astBooleanExpression: { 399 | // No installaton needed beyond this node 400 | node->SCOPED_TABLE = st; 401 | break; 402 | } 403 | case astId: { 404 | 405 | // Redundant as we will be handling input and output params at their respective places 406 | 407 | // In case the ID is part of input 
parameters or output parameters, it needs to be installed in the symbol table 408 | // if(node->parent->LABEL == astInputParams || node->parent->LABEL == astOutputParams) { 409 | // // Input and output parameters must be established in this scope 410 | // Token* dataType = node->AST_NODE_TYPE.AST_ID.DATA_TYPE; 411 | // Token* variable = node->AST_NODE_TYPE.AST_ID.ID; 412 | // // Case when the data type is a primitive type 413 | // SymbolEntry* entry = createSymbolEntry(variable,symbolParameter); 414 | // // Set data type 415 | // entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE = dataType; 416 | 417 | // // Set input parameter or output parameter 418 | // entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.IS_INPUT = ((node->parent->LABEL == astInputParams) ? 1 : 0); 419 | // addSymbolEntry(st,entry,els); 420 | 421 | 422 | // } 423 | 424 | // Adjust scope for this node 425 | node->SCOPED_TABLE = st; 426 | 427 | break; 428 | } 429 | case astNum: { 430 | node->SCOPED_TABLE = st; 431 | break; 432 | } 433 | case astRnum: { 434 | node->SCOPED_TABLE = st; 435 | break; 436 | } 437 | } 438 | 439 | // Note for group members: this code is reachable only if the switch case above does not return 440 | // Traverse children 441 | ASTNode* trav = node->children; 442 | while(trav != NULL) { 443 | constructSymbolTableHelper(trav,st,els); 444 | trav = trav->next; 445 | } 446 | 447 | 448 | } 449 | 450 | 451 | // Creates the symbol table by using the declarations and function definitions to populate slots in the appropriate table 452 | // Also populates the scope field of the AST nodes, so that a node's scoped table can directly be referred to in the next step 453 | SymbolTable* constructSymbolTable(AST* ast,ErrorList* els) { 454 | 455 | // Initialize symbol table with number of slots and scope as global 456 | SymbolTable* st = initializeSymbolTable(SYMBOL_TABLE_SLOTS,"global"); 457 | 458 | constructSymbolTableHelper(ast->root,st,els); 459 | return st; 460 | 461 
| } 462 | -------------------------------------------------------------------------------- /Compiler/symbol_table.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "symbol_tableDef.h" 8 | #include "error_handlerDef.h" 9 | #include "astDef.h" 10 | 11 | SymbolTable* initializeSymbolTable(int numberSlots,Scope s); 12 | int symbolHashFunction(SymbolTable* st ,char* str); 13 | void addSymbolEntry(SymbolTable* st, SymbolEntry* entry,ErrorList* els); 14 | SymbolEntry* addEntryToList(SymbolEntry* list, SymbolEntry* s); 15 | SymbolEntry* createSymbolEntry(Token* symbolToken,SymbolLabel symbolLabel); 16 | SymbolEntry* searchSymbolEntry(SymbolEntry* ls, Token* token); 17 | SymbolEntry* lookupSymbolEntry(SymbolTable* st, Token* token); 18 | 19 | void constructSymbolTableHelper(ASTNode* node, SymbolTable* st,ErrorList* els); 20 | SymbolTable* constructSymbolTable(AST* ast,ErrorList* els); 21 | -------------------------------------------------------------------------------- /Compiler/symbol_tableDef.h: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #ifndef SYMBOL_TABLE_DEF_ 8 | #define SYMBOL_TABLE_DEF_ 9 | 10 | #include "lexerDef.h" 11 | 12 | #define SYMBOL_TABLE_SLOTS 20 // This is a number which we use as a default, a function is present to initialize number of slots as per our choice 13 | 14 | typedef char* Scope; // We shall use a string for referring to a scope 15 | 16 | typedef struct SymbolEntry SymbolEntry; 17 | typedef struct SymbolTable SymbolTable; 18 | 19 | /** 20 | * Plan 21 | * The starting symbol table will be the global symbol table, responsible for 
storing the global variable, type definitions and function names 22 | * In case of a function, the table entry will be scoped into another table 23 | * This scoped table of the function will have input/output parameters, the variables defined inside the function and so on 24 | * 25 | */ 26 | 27 | typedef enum SymbolLabel { 28 | symbolFunction, 29 | symbolVariable, 30 | symbolRecord, 31 | symbolParameter 32 | } SymbolLabel; 33 | 34 | typedef struct SymbolTable { 35 | SymbolEntry** SYMBOL_SLOTS; // A dynamic array of Symbol Slots, each slot stores the head of the linked list of symbols hashed there 36 | Scope SCOPE; // Scope of the current table 37 | int NUMBER_SLOTS; // Number of slots 38 | int CURRENT_OFFSET; // The next entry will be allocated this number, which will then be incremented. (Is also the count of elements in the table) 39 | SymbolTable* parent; // Parent table of this symbol table (Needed for functions to find if an entry is available in the parent table eg- global entries and type definition) 40 | } SymbolTable; 41 | 42 | typedef struct FunctionEntry { 43 | 44 | // This points to a symbol table for that particular function 45 | // Note that input params being taken to this function must be installed in this symbol table 46 | 47 | // If a parameter is of record type defined in the caller, it must be valid in this table 48 | // The function's scoped symbol table can access the typeDefinition in the original table (Implement via pointers) 49 | struct SymbolTable* SCOPED_TABLE; 50 | Token** INPUT_TYPES; // Stores the data types of input params 51 | int NUMBER_INPUT_PARAMS; // Stores the number of input params 52 | Token** OUTPUT_TYPES; // Stores the data types of output params 53 | int NUMBER_OUTPUT_PARAMS; // Stores the number of output params 54 | } FunctionEntry; 55 | 56 | // Note that all global variables should be in the global symbol table 57 | typedef struct VariableEntry { 58 | Token* DATA_TYPE; 59 | } VariableEntry; 60 | 61 | typedef struct 
RecordEntry { 62 | Token** DATA_TYPE; 63 | Token** FIELDS; // The data type in a cartesian format, size based on the number of fields 64 | int TOTAL_OFFSET; 65 | int NUMBER_FIELDS; // The number of fields in the data type 66 | } RecordEntry; 67 | 68 | typedef struct ParameterEntry { 69 | Token* DATA_TYPE; 70 | int IS_INPUT; // Stores whether the parameter is input (1) or output (0) 71 | } ParameterEntry; 72 | 73 | typedef union SymbolEntryType { 74 | FunctionEntry FUNCTION_ENTRY; 75 | VariableEntry VARIABLE_ENTRY; 76 | RecordEntry RECORD_ENTRY; 77 | ParameterEntry PARAMETER_ENTRY; 78 | } SymbolEntryType; 79 | 80 | 81 | typedef struct SymbolEntry { 82 | 83 | Token* SYMBOL_TOKEN; // Field to store the symbol 84 | int SYMBOL_OFFSET; // A unique identifier to denote each entry in a symbol table 85 | SymbolLabel SYMBOL_LABEL; // A label to indicate type of entry in symbol table 86 | SymbolEntryType SYMBOL_ENTRY_TYPE; // Stores the type of entry in the table 87 | struct SymbolEntry* next; // Represents a pointer to an entry hashed to the same position 88 | } SymbolEntry; 89 | 90 | 91 | #endif 92 | -------------------------------------------------------------------------------- /Compiler/testcase3.txt: -------------------------------------------------------------------------------- 1 | 2 | %Test Case 3 3 | %Following program computes an arithmetic expression 4 | 5 | % The following function computes the function value for the given inputs 6 | _computeFunctionValue input parameter list[int c3, int c4, int c5] 7 | output parameter list [real c6]; 8 | type real : d4cbcd5677; 9 | type real : c4bbb; 10 | c6 <--- 5000.79; 11 | d4cbcd5677<--- ((c3 + 2*c4)-(c5-5))/ 4; 12 | c4bbb <--- ((d4cbcd5677- 2.35)*(2345-234*8))+5*c3; 13 | if((~(c4bbb == 0)) &&& (c4bbb > 78.56)) 14 | then 15 | c6<--- d4cbcd5677/c4bbb; 16 | else 17 | write(c4bbb); 18 | endif 19 | 20 | return [c6]; 21 | end 22 | 23 | 24 | %The following program computes the function value for the user defined input 25 | _main 
26 | type int : b5; 27 | type int : d5cb34567; 28 | type int : b3b444 : global; 29 | type real: c3; 30 | b5 <--- 1; 31 | read(d5cb34567); 32 | read(b3b444); 33 | 34 | [c3] <--- call _computeFunctionValue with parameters [b5, d5cb34567, b3b444]; 35 | write(c3); 36 | return; 37 | end 38 | -------------------------------------------------------------------------------- /Compiler/testcases/main-1.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type int :b3; 3 | type int : b2; 4 | type int : c2; 5 | type int : d2; 6 | type int : b2c; 7 | read(b2); 8 | read(d2); 9 | 10 | if(d2 > b2) 11 | then 12 | if(d2 > 5) 13 | then 14 | write(d2); 15 | else 16 | write(5); 17 | endif 18 | 19 | else 20 | write(b2); 21 | endif 22 | 23 | return; 24 | end 25 | -------------------------------------------------------------------------------- /Compiler/testcases/main-2.txt: -------------------------------------------------------------------------------- 1 | _main 2 | 3 | record #marks 4 | type int : maths; 5 | type int: physics; 6 | type int: chemistry; 7 | endrecord; 8 | 9 | type int :b3; 10 | type int : b2; 11 | type int : c2; 12 | type int : d2; 13 | type int : b2c; 14 | type record #marks : b3c2; 15 | 16 | b2 <--- 1; 17 | 18 | while(b2 >= 0) 19 | read(b2); 20 | read(c2); 21 | c2 <--- b2 + c2; 22 | write(c2); 23 | endwhile 24 | 25 | return; 26 | end 27 | -------------------------------------------------------------------------------- /Compiler/testcases/main-3.txt: -------------------------------------------------------------------------------- 1 | _main 2 | 3 | record #marks 4 | type int : maths; 5 | type int: physics; 6 | type int: chemistry; 7 | endrecord; 8 | 9 | type record #marks : b3c2; 10 | 11 | read(b3c2.maths); 12 | read(b3c2.chemistry); 13 | 14 | b3c2.physics <--- b3c2.maths + b3c2.chemistry; 15 | 16 | 17 | write(b3c2.maths); 18 | write(b3c2.physics); 19 | write(b3c2.chemistry); 20 | 21 | return; 22 | end 23 | 
-------------------------------------------------------------------------------- /Compiler/testcases/main0.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type int :b3; 3 | type int : b2; 4 | type int : c2; 5 | type int : d2; 6 | read(b2); 7 | 8 | write(b2); 9 | return; 10 | end 11 | -------------------------------------------------------------------------------- /Compiler/testcases/main1.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type int :b3; 3 | type int : b2; 4 | type int : c2; 5 | type int : d2; 6 | read(b2); 7 | c2<--- 20; 8 | read(d2); 9 | 10 | b3 <--- b2 + c2 + d2; 11 | write(b3); 12 | return; 13 | end 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /Compiler/testcases/main2.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type int :b3c45; 3 | type int : b2d6; 4 | type int : d6:global; 5 | read(b3c45); 6 | read(b2d6); 7 | d6<--- 100; 8 | if(b3c45 <= b2d6) 9 | then 10 | d6<---d6+100; 11 | else 12 | d6<---d6-200; 13 | endif 14 | write(d6); 15 | return; 16 | end 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /Compiler/testcases/main3.txt: -------------------------------------------------------------------------------- 1 | 2 | _main 3 | type int : b5b567; 4 | type int : c3bd; 5 | type int : d3; 6 | b5b567 <--- 1; 7 | d3 <--- 0; 8 | while ( b5b567 <= 7) 9 | read( c3bd); 10 | d3 <--- d3 + c3bd; 11 | b5b567 <--- b5b567 + 1; 12 | endwhile 13 | write(d3); 14 | return; 15 | end 16 | 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /Compiler/testcases/main4.txt: -------------------------------------------------------------------------------- 1 | 2 | _main 3 | record #marks 4 | type int : maths; 5 | type int: physics; 6 | type int: chemistry; 7 | 
endrecord; 8 | 9 | type record #marks : d4; 10 | type int : b5; 11 | type int : d5cb34567; 12 | type record #marks : b3c2; 13 | 14 | b5 <--- 1; 15 | read(d5cb34567); 16 | d4.maths <--- 0; 17 | d4.physics <--- 0; 18 | d4.chemistry <---0; 19 | while(b5<=d5cb34567) 20 | read(b3c2.maths); 21 | read(b3c2.physics); 22 | read(b3c2.chemistry); 23 | d4 <--- b3c2 + d4; 24 | b5 <--- b5 +1; 25 | write(b5); 26 | write(d5cb34567); 27 | endwhile 28 | write(d4); 29 | return; 30 | end 31 | -------------------------------------------------------------------------------- /Compiler/testcases/main5.txt: -------------------------------------------------------------------------------- 1 | _main 2 | record #marks 3 | type int : maths; 4 | type int: physics; 5 | type int: chemistry; 6 | endrecord; 7 | 8 | type record #marks: d4; 9 | type record #marks: d5; 10 | type record #marks: d6; 11 | 12 | read(d4.maths); 13 | read(d4.physics); 14 | read(d4.chemistry); 15 | 16 | read(d5.maths); 17 | read(d5.physics); 18 | read(d5.chemistry); 19 | 20 | d6 <--- d4 + d5; 21 | 22 | write(d6.maths); 23 | write(d6.physics); 24 | write(d6.chemistry); 25 | 26 | return; 27 | end 28 | -------------------------------------------------------------------------------- /Compiler/testcases/stestcase1.txt: -------------------------------------------------------------------------------- 1 | %Test Case 1 2 | %Following function computes the sum of user defined real numbers 3 | %The variable d3 maintains the sum of values 4 | _sumN input parameter list [int d5cc34] 5 | output parameter list[real d3]; 6 | type int : b5b567: global; 7 | type int : b3; 8 | type real : c3bd; 9 | b5b567 <--- 1; 10 | d3 <--- 0.00; 11 | while ( b5b567 <= d5cc34) 12 | read( c3bd); 13 | d3 <--- d3 + c3bd; 14 | while ( d3 <= 4) 15 | read( c3bd); 16 | d3 <--- 1; 17 | d3 <--- b3 + c3bd; 18 | endwhile 19 | endwhile 20 | return [b5b567]; 21 | end 22 | 23 | _main 24 | type real :c4bd56; 25 | type int :c2; 26 | type int : b5b567:global; 27 | b3 <--- 7; 
28 | read( c2); 29 | [c4bd56]<--- call _sumN with parameters [c2]; 30 | write(c4bd56); 31 | return; 32 | end 33 | -------------------------------------------------------------------------------- /Compiler/testcases/stestcase2.txt: -------------------------------------------------------------------------------- 1 | 2 | %Test Case 2 3 | %Following program computes the average marks obtained by students in three subjects 4 | 5 | %Following function reads marks of a student in all subjects and returns as a record variable 6 | % Note that the variable b7 is not used anywhere but it is the syntactic requirement to have 7 | % atleast one input parameter 8 | 9 | _addRecords input parameter list[int b2, record #marks c3b5, record #marks c3b6] 10 | output parameter list [ record #marks b3c45]; 11 | [c3b5] <--- call _readMarks with parameters [b2]; 12 | [c3b6]<--- call _readMarks with parameters [b2]; 13 | b3c45 <--- c3b5 + c3b6; 14 | return [b3c45]; 15 | end 16 | 17 | _readMarks input parameter list[int b7] 18 | output parameter list [ record #marks b3c2]; 19 | read(b3c2.maths); 20 | read(b3c2.physics); 21 | read(b3c2.chemistry); 22 | return [b3c2]; 23 | end 24 | 25 | 26 | %The following program computes the average of marks of total d5cb34567 students 27 | _main 28 | record #marks 29 | type real : maths; 30 | type real: physics; 31 | type real: chemistry; 32 | endrecord; 33 | % each field above represents the marks obtained in corresponding subject 34 | 35 | type record #marks : d4; 36 | % The variable d4 stores the marks of one student 37 | 38 | type int : b5; 39 | type int : d5cb34567; 40 | type real : b5; 41 | type record #marks : b5c6; 42 | %The identifier b5c6 stores the sum of marks 43 | 44 | b5 <--- 1; 45 | read(d5cb34567); 46 | b5c6.maths <--- 0.00; 47 | b5c6.physics <--- 0.00; 48 | b5c6.chemistry <---0.00; 49 | while(b5<=d5cb34567) 50 | [d4] <--- call _readMarks with parameters [b5]; 51 | [b5c6] <--- call _addRecords with parameters [b5c6]; 52 | % above displays 
the sum of records 53 | b5 <--- b5 +1; 54 | endwhile 55 | d4 <--- b5c6 / d5cb34567; 56 | write(d4.maths); 57 | write(d4.physics); 58 | write(d4.chemistry); 59 | return; 60 | end 61 | -------------------------------------------------------------------------------- /Compiler/testcases/stestcase3.txt: -------------------------------------------------------------------------------- 1 | _swapints 2 | input parameter list [int b2b, int b2c] 3 | output parameter list[int c2b, int c2c]; 4 | c2b<---b2c; 5 | c2c<---b2b; 6 | return [c2b, c2c]; 7 | end 8 | 9 | _swaprecs 10 | input parameter list [record #rec d5b, record #rec d2c] 11 | output parameter list[record #rec d5c, record #rec d2b]; 12 | type int : b3; 13 | [b3,b3]<--- call _swapints with parameters [b3,b3]; 14 | [d5d,d2b]<--- call _swapints with parameters [b3, d5c]; 15 | return [d2d, d2b]; 16 | end 17 | 18 | _main 19 | record #rec 20 | type int : len; 21 | type int : height; 22 | endrecord; 23 | type record #rec : b2; 24 | type record #rec : d5b: global; 25 | b2.len<---5; 26 | b2.height <---7; 27 | d5b.len <---10.56; 28 | d5b.height <---20; 29 | [b2, c2]<--- call _swaprecs with parameters [b2,d5b]; 30 | write(b2); 31 | write(c2); 32 | return; 33 | end 34 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-1.txt: -------------------------------------------------------------------------------- 1 | _readMarks input parameter list[int b7] 2 | output parameter list [ record #marks b3c45]; 3 | read(b3c45.maths); 4 | read(b3c45.physics); 5 | read(b3c45.chemistry); 6 | return [b3c45]; 7 | end 8 | 9 | _main 10 | type int :b3 : global; 11 | type real :c4bd56; 12 | type real :c4bd23; 13 | b3 <--- 7; 14 | [c4bd56]<--- call _sumN with parameters [b3]; 15 | write(c4bd56); 16 | return; 17 | end 18 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-10.txt: 
-------------------------------------------------------------------------------- 1 | 2 | % The program is successfully correct. 3 | 4 | 5 | _recordDemo1 input parameter list [record #book d5cc34, record #book d2cd] 6 | output parameter list[record #book d3]; 7 | record #new 8 | type int : value; 9 | type int : cost; 10 | endrecord; 11 | type record #new: b7bc34 : global; 12 | 13 | d3 <--- d5cc34 + d2cd; 14 | return [d3]; 15 | end 16 | 17 | _main 18 | 19 | record #book 20 | type int : edition; 21 | type real : price; 22 | endrecord; 23 | type record #book: b2 : global; 24 | type record #book: c2 : global; 25 | type record #book: d2 : global; 26 | b2.edition <--- 3; 27 | b2.price <--- 24.95; 28 | c2.edition <--- 2; 29 | c2 <--- 98.80; 30 | % COMMENT LINE 31 | d2 <--- b2 * c2; 32 | % COMMENT 33 | % COMMENT 34 | [d2] <--- call _function1 with parameters[b2,c2]; 35 | write(d2); 36 | return; 37 | end 38 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-11.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type int :b3 : global; 3 | type real :c4bd56; 4 | type real :c4bd23; 5 | b3 <--- (7+2)*2; 6 | write(c4bd56); 7 | return; 8 | end 9 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-2.txt: -------------------------------------------------------------------------------- 1 | _readMarks input parameter list[int b7] 2 | output parameter list [ record #marks b3c45]; 3 | record #new 4 | type int : value; 5 | type real : cost; 6 | endrecord; 7 | 8 | read(b3c45.maths); 9 | read(b3c45.physics); 10 | read(b3c45.chemistry); 11 | return [b3c45]; 12 | end 13 | 14 | _main 15 | 16 | record #ney 17 | type int : value; 18 | type real : cost; 19 | endrecord; 20 | 21 | type int :b3 : global; 22 | type real :c4bd56; 23 | type real :c4bd23; 24 | b3 <--- 7; 25 | [c4bd56]<--- call _sumN with parameters [b3]; 26 | 
write(c4bd56); 27 | return; 28 | end 29 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-3.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type real :b3 : global; 3 | type real :c4bd56; 4 | type real :c4bd23; 5 | b3 <--- 7; 6 | [c4bd56]<--- call _sumN with parameters [b3]; 7 | write(c4bd56); 8 | return; 9 | end 10 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-4.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type real :b3 : global; 3 | type real :c4bd56 : global; 4 | type real :c4bd23; 5 | b3 <--- 7; 6 | [c4bd56]<--- call _sumN with parameters [b3]; 7 | write(c4bd56); 8 | return; 9 | end 10 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-5.txt: -------------------------------------------------------------------------------- 1 | _readMarks input parameter list[int b7] 2 | output parameter list [ record #m b3c45]; 3 | record #new 4 | type int : value; 5 | type real : cost; 6 | endrecord; 7 | 8 | read(b3c45.maths); 9 | read(b3c45.physics); 10 | read(b3c45.chemistry); 11 | return [b3c45]; 12 | end 13 | 14 | _main 15 | 16 | record #ney 17 | type int : value; 18 | type real : cost; 19 | endrecord; 20 | 21 | type int :b3 : global; 22 | type real :c4bd56; 23 | type real :c4bd23; 24 | b3 <--- 7; 25 | [c4bd56]<--- call _sumN with parameters [b3]; 26 | write(c4bd56); 27 | return; 28 | end 29 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-6.txt: -------------------------------------------------------------------------------- 1 | %Test Case 3 2 | %Following program computes an arithmetic expression 3 | 4 | % The following function computes the function value for the given inputs 5 | _computeFunctionValue input parameter 
list[int c3, int c4, int c5] 6 | output parameter list [real c6]; 7 | type real : d4cbcd5677; 8 | type real : c4bbb; 9 | c6 <--- 5000.79; 10 | d4cbcd5677<--- ((c3 + 2*c4)-(c5-5))/ 4; 11 | c4bbb <--- ((d4cbcd5677- 2.35)*(2345-234*8))+5*c3; 12 | if((~(c4bbb == 0)) &&& (c4bbb > 78.56)) 13 | then 14 | c6<--- d4cbcd5677/c4bbb; 15 | else 16 | write(c4bbb); 17 | endif 18 | 19 | return [c6]; 20 | end 21 | 22 | 23 | _main 24 | type real :b3 : global; 25 | type real :c4bd56; 26 | type real :c4bd23; 27 | b3 <--- 7; 28 | [c4bd56]<--- call _sumN with parameters [b3]; 29 | write(c4bd56); 30 | return; 31 | end 32 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-7.txt: -------------------------------------------------------------------------------- 1 | _readMarks input parameter list[int b7] 2 | output parameter list [ record #m b3c45]; 3 | record #new 4 | type int : value; 5 | type real : cost; 6 | endrecord; 7 | 8 | read(b3c45.maths); 9 | read(b3c45.physics); 10 | read(b3c45.chemistry); 11 | return [b3c45]; 12 | end 13 | 14 | _main 15 | 16 | record #ney 17 | type int : value; 18 | type real : cost; 19 | endrecord; 20 | 21 | type int :b3 : global; 22 | type real :c4bd56; 23 | type real :c4bd23; 24 | b3 <--- 7; 25 | [c4bd56]<--- call _sumN with parameters [b3]; 26 | write(c4bd56); 27 | return; 28 | end 29 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-8.txt: -------------------------------------------------------------------------------- 1 | _recordDemo1 input parameter list [record #book d5cc34, record #book d2cd] 2 | output parameter list[record #book d3]; 3 | 4 | record #new 5 | type int : value; 6 | type real: cost; 7 | endrecord; 8 | 9 | d3<--- d5cc34 + d2cd; 10 | return [d3]; 11 | end 12 | 13 | 14 | _main 15 | 16 | record #book 17 | type int : edition; 18 | type real: price; 19 | endrecord; 20 | 21 | type record #book: b2; 22 | type record #book: 
c2; 23 | type record #book: d2; 24 | type record #new: b7bc34; 25 | 26 | b2.edition <--- 3; 27 | b2.price <--- 24.95; 28 | c2.edition <--- 2; 29 | c2.price <--- 98.80; 30 | d2<--- b2 + c2; 31 | 32 | [d2]<--- call _function1 with parameters[b2,c2]; 33 | write(d2); 34 | 35 | return; 36 | end 37 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase-9.txt: -------------------------------------------------------------------------------- 1 | %Test Case 1 2 | %Following function computes the sum of user defined real numbers 3 | %The variable d3 maintains the sum of values 4 | _sumN input parameter list [int d5cc34] 5 | output parameter list[real d3]; 6 | type int : b5b567; 7 | b5b567 <--- 1; 8 | d3 <--- 0.00; 9 | while ( b5b567 <= d5cc34) 10 | read( c3bd); 11 | d3 <--- d3 + c3bd; 12 | b5b567 <--- b5b567 + 1; 13 | endwhile 14 | 15 | [b5b567] <--- call _sumN with parameters [b5b567] 16 | return [d3]; 17 | end 18 | 19 | _main 20 | 21 | record #book 22 | type int : edition; 23 | type real: price; 24 | endrecord; 25 | 26 | type record #book: b2; 27 | type record #book: c2; 28 | type record #book: d2; 29 | type int :b3; 30 | type real :c4bd56; 31 | b3 <--- 7; 32 | [c4bd56]<--- call _sumN with parameters [b3]; 33 | write(c4bd56); 34 | if(b2 <= c2) 35 | then 36 | d6<---d6+100; 37 | else 38 | d6<---d6-200; 39 | endif return; 40 | end 41 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase0.txt: -------------------------------------------------------------------------------- 1 | _main 2 | type int :b3 : global; 3 | type real :c4bd56; 4 | type real :c4bd23; 5 | b3 <--- 7; 6 | [c4bd56]<--- call _sumN with parameters [b3]; 7 | write(c4bd56); 8 | return; 9 | end 10 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase1.txt: -------------------------------------------------------------------------------- 1 | %Test 
Case 1 2 | %Following function computes the sum of user defined real numbers 3 | %The variable d3 maintains the sum of values 4 | _sumN input parameter list [int d5cc34] 5 | output parameter list[real d3]; 6 | type int : b5b567; 7 | b5b567 <--- 1; 8 | d3 <--- 0.00; 9 | while ( b5b567 <= d5cc34) 10 | read( c3bd); 11 | d3 <--- d3 + c3bd; 12 | b5b567 <--- b5b567 + 1; 13 | endwhile 14 | return [d3]; 15 | end 16 | 17 | _main 18 | type int :b3; 19 | type real :c4bd56; 20 | b3 <--- 7; 21 | [c4bd56]<--- call _sumN with parameters [b3]; 22 | write(c4bd56); 23 | return; 24 | end 25 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase10.txt: -------------------------------------------------------------------------------- 1 | 2 | % The program is successfully correct. 3 | 4 | 5 | _recordDemo1 input parameter list [record #book d5cc34, record #book d2cd] 6 | output parameter list[record #book d3]; 7 | record #new 8 | type int : value; 9 | type real : cost; 10 | endrecord; 11 | 12 | d3 <--- d5cc34 + d2cd; 13 | return [d3]; 14 | end 15 | 16 | _main 17 | 18 | record #book 19 | type int : edition; 20 | type real : price; 21 | endrecord; 22 | type record #book: b2; 23 | type record #book: c2; 24 | type record #book: d2; 25 | type record #new: b7bc34; 26 | b2.edition <--- 3; 27 | b2.price <--- 24.95; 28 | c2.edition <--- 2; 29 | %c2 <--- 98.80; 30 | % COMMENT LINE 31 | d2 <--- b2 * c2; 32 | % COMMENT 33 | % COMMENT 34 | %[d2] <--- call _function1 with parameters[b2,c2]; 35 | write(d2); 36 | return; 37 | end 38 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase11.txt: -------------------------------------------------------------------------------- 1 | %Test Case 1 2 | %Following function computes the sum of user defined real numbers 3 | %The variable d3 maintains the sum of values 4 | _sumN input parameter list [int d5cc34] 5 | output parameter list[real d3]; 6 | 
type int : b5b567; 7 | b5b567 <--- 1; 8 | d3 <--- 0.00; 9 | while ( b5b567 <= d5cc34) 10 | read( c3bd); 11 | d3 <--- d3 + c3bd; 12 | b5b567 <--- b5b567 + 1; 13 | endwhile 14 | return [d3]; 15 | end 16 | 17 | _main 18 | type int :b3; 19 | type real :c4bd56; 20 | b3 <--- 7; 21 | [c4bd56]<--- call _sumN with parameters [b3]; 22 | write(c4bd56); 23 | return; 24 | end 25 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase2.txt: -------------------------------------------------------------------------------- 1 | 2 | %Test Case 2 3 | %Following program computes the average marks obtained by students in three subjects 4 | 5 | %Following function reads marks of a student in all subjects and returns as a record variable 6 | % Note that the variable b7 is not used anywhere but it is the syntactic requirement to have 7 | % atleast one input parameter 8 | _readMarks input parameter list[int b7] 9 | output parameter list [ record #marks b3c45]; 10 | read(b3c45.maths); 11 | read(b3c45.physics); 12 | read(b3c45.chemistry); 13 | return [b3c45]; 14 | end 15 | % Notice here that your compiler recognizes the type definition of marks even before it is 16 | % declared. This will be handled at the semantic analyzer phase. 
17 | 18 | 19 | %The following program computes the average of marks of total d5cb34567 students 20 | _main 21 | record #marks 22 | type real : maths; 23 | type real: physics; 24 | type real: chemistry; 25 | endrecord; 26 | % each field above represents the marks obtained in corresponding subject 27 | 28 | type record #marks : d4; 29 | % The variable d4 stores the marks of one student 30 | 31 | type int : b5; 32 | type int : d5cb34567; 33 | type record #marks : b5c6; 34 | %The identifier b5c6 stores the sum of marks 35 | 36 | b5 <--- 1; 37 | read(d5cb34567); 38 | b5c6.maths <--- 0.00; 39 | b5c6.physics <--- 0.00; 40 | b5c6.chemistry <---0.00; 41 | while(b5<=d5cb34567) 42 | [d4] <--- call _readMarks with parameters [b5]; 43 | b5c6 <--- b5c6 + d4; 44 | % above displays the sum of records 45 | b5 <--- b5 +1; 46 | endwhile 47 | d4 <--- b5c6 / d5cb34567; 48 | write(d4.maths); 49 | write(d4.physics); 50 | write(d4.chemistry); 51 | return; 52 | end 53 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase3.txt: -------------------------------------------------------------------------------- 1 | 2 | %Test Case 3 3 | %Following program computes an arithmetic expression 4 | 5 | % The following function computes the function value for the given inputs 6 | _computeFunctionValue input parameter list[int c3, int c4, int c5] 7 | output parameter list [real c6]; 8 | type real : d4cbcd5677; 9 | type real : c4bbb; 10 | c6 <--- 5000.79; 11 | d4cbcd5677<--- ((c3 + 2*c4)-(c5-5))/ 4; 12 | c4bbb <--- ((d4cbcd5677- 2.35)*(2345-234*8))+5*c3; 13 | if((~(c4bbb == 0)) &&& (c4bbb > 78.56)) 14 | then 15 | c6<--- d4cbcd5677/c4bbb; 16 | else 17 | write(c4bbb); 18 | endif 19 | 20 | return [c6]; 21 | end 22 | 23 | 24 | %The following program computes the function value for the user defined input 25 | _main 26 | type int : b5; 27 | type int : d5cb34567; 28 | type int : b3b444 : global; 29 | type real: c3; 30 | b5 <--- 1; 31 | read(d5cb34567); 
32 | read(b3b444); 33 | 34 | [c3] <--- call _computeFunctionValue with parameters [b5, d5cb34567, b3b444]; 35 | write(c3); 36 | return; 37 | end 38 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase4.txt: -------------------------------------------------------------------------------- 1 | 2 | %Test Case4 WITH ERRORS 3 | %Following program computes an arithmetic expression 4 | 5 | % The following function computes the function value for the given inputs 6 | _computeFunctionValue input parameter list[int c3, int c4, int c5] 7 | output parameter [real c6]; 8 | type real : d4cbcccccccccccccbbbbbbbbdddd5222222222222633333333333377; 9 | type real : c4bbb; 10 | c6 <--- 5000.7; 11 | d4cbcd5677<--- ((c3 + 2*c4-(c5-5))/ 4; 12 | c4bbb <--- ((d4cbcd5677- 2.35)*(2345-234*8))+5*c3; 13 | if((~(c4bbb == 0)) && (c4bbb > 78.56)) 14 | then 15 | c6<--- d4cbcd5677/c4bbb 16 | else 17 | write(c4bbb); 18 | endif 19 | 20 | end 21 | 22 | 23 | %The following program computes the function value for the user defined input 24 | _main 25 | type int b5; 26 | type int : d5cb34567; 27 | type int : b3b444 : global; 28 | type $real: c3; 29 | b5 <-- 1; 30 | read(d5cb34567); 31 | read(45); 32 | read(b3b444); 33 | 34 | [c3] <--- call _computeFunctionValue with parameters [b5, d5cb34567, b3b444]; 35 | write(c3); 36 | return; 37 | end 38 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase5.txt: -------------------------------------------------------------------------------- 1 | 2 | % The program is successfully correct. 
3 | 4 | 5 | _recordDemo1 input parameter list [record #book d5cc34, record #book d2cd] 6 | output parameter list[record #book d3]; 7 | record #new 8 | type int : value; 9 | type real : cost; 10 | endrecord; 11 | 12 | d3 <--- d5cc34 + d2cd; 13 | return [d3]; 14 | end 15 | 16 | _main 17 | 18 | record #book 19 | type int : edition; 20 | type real : price; 21 | endrecord; 22 | type record #book: b2; 23 | type record #book: c2; 24 | type record #book: d2; 25 | type record #new: b7bc34; 26 | b2.edition <--- 3; 27 | b2.price <--- 24.95; 28 | c2.edition <--- 2; 29 | c2 <--- 98.80; 30 | % COMMENT LINE 31 | d2 <--- b2 + c2; 32 | % COMMENT 33 | % COMMENT 34 | [d2] <--- call _function1 with parameters[b2,c2]; 35 | write(d2); 36 | return; 37 | end 38 | -------------------------------------------------------------------------------- /Compiler/testcases/testcase6.txt: -------------------------------------------------------------------------------- 1 | 2 | % The program is syntactically incorrect, errors described in comments. 
3 | 4 | 5 | _recordDemo1 input parameter list [record #book d5cc34, record #book d2cd] 6 | output parameter list[record #book d3]; 7 | record #new 8 | type int : value; 9 | type real : cost; 10 | endrecord; 11 | 12 | d3 <--- d5cc34 + d2caaaaaaaaaaaaaaaaaaad; 13 | return [d3]; 14 | end 15 | 16 | _main 17 | 18 | record #book 19 | type int : edition; 20 | type real : price; 21 | endrecord; 22 | 23 | % ERROR => LINE 23,25,26 are expected to close with semi-colons 24 | type record #book: b2 25 | type record #book: c2 26 | type record #book: d2 27 | type record #new: b7bc34; 28 | b2.edition <--- 3; 29 | % ERROR => 24.9 is not tokenizable 30 | b2.price <--- 24.9; 31 | c2.edition <--- 2; 32 | % ERROR => 98.808 is tokenized by the lexer as 98.80 as TK_RNUM and 8 as TK_NUM, but TK_SEM was expected and not TK_NUM 33 | c2 <--- 98.808; 34 | % COMMENT LINE 35 | d2 <--- b2 + c2; 36 | % COMMENT 37 | % COMMENT 38 | [d2] <--- call _function1 with parameters[b2,c2]; 39 | write(d2); 40 | return; 41 | end 42 | -------------------------------------------------------------------------------- /Compiler/type_checker.c: -------------------------------------------------------------------------------- 1 | /* Group 27 2 | Venkat Nalla Siddartha Reddy 2016A7PS0030P 3 | Arnav Sailesh 2016A7PS0054P 4 | Gunraj Singh 2016A7PS0085P 5 | Aashish Singh 2016A7PS0683P */ 6 | 7 | #include "interface.h" 8 | #include "symbol_table.h" 9 | #include "type_checker.h" 10 | #include "error_handler.h" 11 | #include 12 | 13 | 14 | // Function prints out the Type as a string 15 | // Takes TK_INT TK_NUM TK_REAL TK_RNUM TK_RECORDID 16 | char* getDataType(Token* t) { 17 | 18 | TokenName type = t->TOKEN_NAME; 19 | 20 | switch(type) { 21 | case TK_INT: 22 | case TK_NUM:{ 23 | return "int"; 24 | break; 25 | } 26 | case TK_REAL: 27 | case TK_RNUM: { 28 | return "real"; 29 | break; 30 | } 31 | case TK_RECORDID: { 32 | return t->LEXEME; 33 | } 34 | } 35 | } 36 | 37 | 38 | // This function extracts the data type of an ID 
entry from the symbol table 39 | // It takes the AST Node for the identifier and an errorList to report errors 40 | // CRUCIAL NOTE => This returns the data type head, so for a record it returns a token of type TK_RECORDID; the actual record type is token->LEXEME 41 | Token* extractDataTypeFromSymbolTable(ASTNode* astIdNode, ErrorList* els) { 42 | 43 | Token* type; 44 | SymbolEntry* entry = lookupSymbolEntry(astIdNode->SCOPED_TABLE,astIdNode->AST_NODE_TYPE.AST_ID.ID); 45 | 46 | if(entry == NULL) { 47 | // Throw a missing declaration error and return NULL 48 | throwMissingDeclarationError(astIdNode->AST_NODE_TYPE.AST_ID.ID,els); 49 | return NULL; 50 | } 51 | 52 | // Case when the entry is a variable 53 | if(entry->SYMBOL_LABEL == symbolVariable) { 54 | 55 | Token* fieldId = astIdNode->AST_NODE_TYPE.AST_ID.FIELD_ID; 56 | 57 | // If there is no fieldId, use the variable's declared type 58 | if(fieldId == NULL) 59 | type = entry->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 60 | // If a field is being accessed, set data type to the field type 61 | else { 62 | Token* recordId = entry->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 63 | type = extractFieldDataType(astIdNode->SCOPED_TABLE->parent,recordId,fieldId,els); 64 | } 65 | 66 | } 67 | else if(entry->SYMBOL_LABEL == symbolParameter) { 68 | 69 | Token* fieldId = astIdNode->AST_NODE_TYPE.AST_ID.FIELD_ID; 70 | // If there is no fieldId, use the parameter's declared type 71 | if(fieldId == NULL) 72 | type = entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE; 73 | // If a field is being accessed, set data type to the field type 74 | else { 75 | Token* recordId = entry->SYMBOL_ENTRY_TYPE.PARAMETER_ENTRY.DATA_TYPE; 76 | type = extractFieldDataType(astIdNode->SCOPED_TABLE->parent,recordId,fieldId,els); 77 | } 78 | 79 | } 80 | else { 81 | // Entry is neither a variable nor a parameter, so it is invalid; report it 82 | printf("Symbol table entry which is an ASTId has been mapped to neither a parameter nor a variable, not correct\n"); 83 | type
= NULL; 84 | } 85 | 86 | return type; 87 | } 88 | 89 | // Extracts the data type of a field in a record, if not found returns NULL 90 | // Note that st HAS TO be the global symbol table 91 | Token* extractFieldDataType(SymbolTable* st,Token* recordId,Token* fieldId,ErrorList* els) { 92 | 93 | SymbolEntry* recordEntry = lookupSymbolEntry(st,recordId); 94 | 95 | // No record entry exists 96 | if(recordEntry == NULL) { 97 | throwMissingRecordDefinitionError(recordId,els); 98 | return NULL; 99 | } 100 | 101 | int index = 0; 102 | int numberFields = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.NUMBER_FIELDS; 103 | Token** fields = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.FIELDS; 104 | 105 | for(index=0; index < numberFields; index++) { 106 | // Found index of entry 107 | if(strcmp(fields[index]->LEXEME,fieldId->LEXEME) == 0) 108 | break; 109 | } 110 | 111 | if(index == numberFields) 112 | return NULL; 113 | 114 | // Return the data type at the same index 115 | return recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.DATA_TYPE[index]; 116 | } 117 | 118 | // TODO => RENAME THIS FUNCTION BECAUSE IT PERFORMS MULTIPLE ACTIVITIES, BETTER CLARITY 119 | // Populate the offset 120 | // Node should be an astIdNode 121 | // This Function operates on all id nodes which are successors of astoutputParams astInputParams astDecls 122 | // Given the above point, this function also acts as a leverage to report missing record entries 123 | void populateOffset(ASTNode* astIdNode, SymbolEntry* idEntry,SymbolTable* scopedTable,ErrorList* els) { 124 | 125 | if(astIdNode == NULL) { 126 | printf("astDeclaration having a NULL child, detected in type checking not correct\n"); 127 | return; 128 | } 129 | 130 | if(astIdNode->LABEL != astId) { 131 | printf("Offset population being called for on a node which is not astId, not correct\n"); 132 | } 133 | 134 | // Removed as we have to store global offsets 135 | // // If it's a global entry, store -1 and return 136 | // if(astIdNode->SCOPED_TABLE->parent == 
NULL) { 137 | // idEntry->SYMBOL_OFFSET = -1; 138 | // return; 139 | // } 140 | 141 | 142 | Token* type; 143 | 144 | type = extractDataTypeFromSymbolTable(astIdNode,els); 145 | 146 | if(type == NULL) { 147 | printf("type is neither TK_INT TK_REAL OR TK_RECORDID, not correct\n"); 148 | return; 149 | } 150 | 151 | if(type->TOKEN_NAME == TK_INT) { 152 | idEntry->SYMBOL_OFFSET = scopedTable->CURRENT_OFFSET; 153 | scopedTable->CURRENT_OFFSET += 2; 154 | } 155 | else if(type->TOKEN_NAME == TK_REAL) { 156 | idEntry->SYMBOL_OFFSET = scopedTable->CURRENT_OFFSET; 157 | scopedTable->CURRENT_OFFSET += 4; 158 | } 159 | else if(type->TOKEN_NAME == TK_RECORDID) { 160 | SymbolEntry* recordEntry = lookupSymbolEntry(scopedTable,type); 161 | if(recordEntry == NULL) { 162 | // Same thing would (ideally, if correctly implemented) have already been detected above 163 | // A TODO Type could not be found, RECORD NOT PRESENT 164 | throwMissingRecordDefinitionError(type,els); 165 | return; 166 | } 167 | int recordOffset = recordEntry->SYMBOL_ENTRY_TYPE.RECORD_ENTRY.TOTAL_OFFSET; 168 | idEntry->SYMBOL_OFFSET = scopedTable->CURRENT_OFFSET; 169 | scopedTable->CURRENT_OFFSET += recordOffset; 170 | } 171 | else { 172 | printf("Unknown type, encountered while populating offset, not correct\n"); 173 | } 174 | 175 | } 176 | 177 | // Function which compares two data types and an operator, returns the token which the result has if compatible, else returns NULL 178 | Token* compatibleDataTypes(Token* t1, Token* t2,Token* operator) { 179 | 180 | // Case when both are not TK_RECORDID 181 | if(t1->TOKEN_NAME != TK_RECORDID && t2->TOKEN_NAME != TK_RECORDID) { 182 | // Compare TokenNames 183 | 184 | // Case when names match exactly 185 | if(t1->TOKEN_NAME == t2->TOKEN_NAME) 186 | return t1; 187 | // TK_INT data type represents TK_NUM as well 188 | else if(t1->TOKEN_NAME == TK_INT && (t2->TOKEN_NAME == TK_NUM || t2->TOKEN_NAME == TK_RNUM)) 189 | return t1; 190 | else if(t2->TOKEN_NAME == TK_INT && 
(t1->TOKEN_NAME == TK_NUM || t1->TOKEN_NAME == TK_RNUM)) 191 | return t2; 192 | // TK_REAL data type represents TK_RNUM as well 193 | else if(t1->TOKEN_NAME == TK_REAL && (t2->TOKEN_NAME == TK_RNUM || t2->TOKEN_NAME == TK_NUM)) 194 | return t1; 195 | 196 | else if(t2->TOKEN_NAME == TK_REAL && (t1->TOKEN_NAME == TK_RNUM || t1->TOKEN_NAME == TK_NUM)) 197 | return t2; 198 | // No match found 199 | else 200 | return NULL; 201 | } 202 | 203 | // Case when both are TK_RECORDID 204 | else if(t1->TOKEN_NAME == TK_RECORDID && t2->TOKEN_NAME == TK_RECORDID) { 205 | // Same record type, and only addition or subtraction is allowed 206 | if(strcmp(t1->LEXEME,t2->LEXEME) == 0 && (operator->TOKEN_NAME == TK_PLUS || operator->TOKEN_NAME == TK_MINUS) ) 207 | return t1; 208 | else { 209 | return NULL; 210 | } 211 | } 212 | 213 | // Case when exactly one is TK_RECORDID: valid only when a record is scaled by a scalar 214 | else { 215 | // Record with scalar in multiplication and division is valid 216 | // Record being the LHS 217 | // Note that both mul and div are valid as *4 and /4 are both correct 218 | if((operator->TOKEN_NAME == TK_MUL || operator->TOKEN_NAME == TK_DIV) && t1->TOKEN_NAME == TK_RECORDID && (t2->TOKEN_NAME == TK_INT || t2->TOKEN_NAME == TK_REAL || t2->TOKEN_NAME == TK_NUM || t2->TOKEN_NAME == TK_RNUM)) 219 | return t1; 220 | // Record with scalar in multiplication is valid 221 | // Record being the RHS 222 | // Note that only multiplication is valid as 4/ is invalid (Ask ma'am) 223 | else if((operator->TOKEN_NAME == TK_MUL) && t2->TOKEN_NAME == TK_RECORDID && (t1->TOKEN_NAME == TK_INT || t1->TOKEN_NAME == TK_REAL || t1->TOKEN_NAME == TK_NUM || t1->TOKEN_NAME == TK_RNUM)) 224 | return t2; 225 | // Invalid 226 | else 227 | return NULL; 228 | } 229 | } 230 | 231 | // Checks if a value of type t2 can be assigned to a target of type t1 (t1 is a declared type, t2 may also be a literal type) 232 | // Valid inputs have token names TK_INT, TK_REAL, TK_RECORDID, TK_NUM or TK_RNUM 233 | int assignableDataTypes(Token* t1,Token* t2) { 234 | 235 | // Case when both are equal 236 |
if(t1->TOKEN_NAME == t2->TOKEN_NAME) { 237 | 238 | // Case when both are TK_RECORDID 239 | if(t1->TOKEN_NAME == TK_RECORDID) { 240 | // If both represent the same record 241 | if(strcmp(t1->LEXEME,t2->LEXEME) == 0) 242 | return 1; 243 | else 244 | return 0; 245 | } 246 | else { 247 | return 1; 248 | } 249 | } 250 | // Case when both do not match 251 | else { 252 | // Case when exactly one of them is a record 253 | if(t1->TOKEN_NAME == TK_RECORDID || t2->TOKEN_NAME == TK_RECORDID) 254 | return 0; 255 | else if(t1->TOKEN_NAME == TK_INT && t2->TOKEN_NAME == TK_NUM) 256 | return 1; 257 | else if(t1->TOKEN_NAME == TK_REAL && t2->TOKEN_NAME == TK_RNUM) 258 | return 1; 259 | else 260 | return 0; 261 | } 262 | 263 | 264 | } 265 | 266 | // Function evaluates the type returned by an arithmetic expression node 267 | // Input should only be an ASTNode with label astArithmeticExpression 268 | Token* getArithmeticExpressionType(ASTNode* astArithmeticExpressionNode, ErrorList* els) { 269 | 270 | Token* operator = astArithmeticExpressionNode->AST_NODE_TYPE.AST_ARITHMETIC_EXPRESSION.OPERATOR; 271 | ASTNode* lhsNode = astArithmeticExpressionNode->children; 272 | ASTNode* rhsNode = astArithmeticExpressionNode->children->next; 273 | 274 | Token* lhsType = NULL; // stays NULL if the operand label is not recognised 275 | Token* rhsType = NULL; // stays NULL if the operand label is not recognised 276 | 277 | // Literal operand: the TK_NUM token itself serves as the type 278 | if(lhsNode->LABEL == astNum) 279 | lhsType = lhsNode->AST_NODE_TYPE.AST_NUM.VALUE; 280 | 281 | // Literal operand: the TK_RNUM token itself serves as the type 282 | else if(lhsNode->LABEL == astRnum) 283 | lhsType = lhsNode->AST_NODE_TYPE.AST_RNUM.VALUE; 284 | 285 | else if(lhsNode->LABEL == astId) { 286 | 287 | SymbolEntry* s = lookupSymbolEntry(lhsNode->SCOPED_TABLE,lhsNode->AST_NODE_TYPE.AST_ID.ID); 288 | 289 | // If s is not found in the symbol table throw a missing declaration error and return 290 | if(s == NULL) { 291 | throwMissingDeclarationError(lhsNode->AST_NODE_TYPE.AST_ID.ID,els); 292 | return NULL; 293 | } 294 | 295 | // Check if the ID has a fieldID 296 | Token* fieldId =
lhsNode->AST_NODE_TYPE.AST_ID.FIELD_ID; 297 | 298 | // If there is no fieldId set datatype to the record type 299 | if(fieldId == NULL) 300 | lhsType = s->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 301 | // If a field is being accessed, set data type to the field type 302 | else { 303 | Token* recordId = s->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 304 | lhsType = extractFieldDataType(lhsNode->SCOPED_TABLE->parent,recordId,fieldId,els); 305 | } 306 | 307 | 308 | 309 | } 310 | 311 | else if(lhsNode->LABEL == astArithmeticExpression) { 312 | 313 | lhsType = getArithmeticExpressionType(lhsNode,els); 314 | 315 | } 316 | 317 | else 318 | printf("Type of arithmetic expression is being calculated for a node which is not part of the paradigm, not correct!\n"); 319 | 320 | 321 | if(rhsNode->LABEL == astNum) 322 | rhsType = rhsNode->AST_NODE_TYPE.AST_NUM.VALUE; 323 | 324 | else if(rhsNode->LABEL == astRnum) 325 | rhsType = rhsNode->AST_NODE_TYPE.AST_RNUM.VALUE; 326 | 327 | else if(rhsNode->LABEL == astId) { 328 | SymbolEntry* s = lookupSymbolEntry(rhsNode->SCOPED_TABLE,rhsNode->AST_NODE_TYPE.AST_ID.ID); 329 | 330 | // If s is not found in the symbol table throw a missing declaration error and return 331 | if(s == NULL) { 332 | throwMissingDeclarationError(rhsNode->AST_NODE_TYPE.AST_ID.ID,els); 333 | return NULL; 334 | } 335 | 336 | // Check if the ID has a fieldID 337 | Token* fieldId = rhsNode->AST_NODE_TYPE.AST_ID.FIELD_ID; 338 | 339 | // If there is no fieldId set datatype to the record type 340 | if(fieldId == NULL) 341 | rhsType = s->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 342 | // If a field is being accessed, set data type to the field type 343 | else { 344 | Token* recordId = s->SYMBOL_ENTRY_TYPE.VARIABLE_ENTRY.DATA_TYPE; 345 | rhsType = extractFieldDataType(rhsNode->SCOPED_TABLE->parent,recordId,fieldId,els); 346 | } 347 | 348 | } 349 | 350 | else if(rhsNode->LABEL == astArithmeticExpression) { 351 | rhsType = getArithmeticExpressionType(rhsNode,els); 352 | } 

    else {
        printf("Type of arithmetic expression is being calculated for a node which is not part of the paradigm, not correct!\n");
        // Treat an unexpected node as an error so the NULL check below catches it
        rhsType = NULL;
    }


    // Case when either one of them is an ERROR
    if(lhsType == NULL || rhsType == NULL) {
        return NULL;
    }
    Token* resultType = compatibleDataTypes(lhsType,rhsType,operator);
    // Case when both are compatible and neither of them is an error
    if(resultType != NULL)
        return resultType;
    // Case when the two types are incompatible: report a type mismatch
    throwTypeMismatchError(lhsType,rhsType,els,operator->LINE_NO);
    return NULL;
}


// Checks the types of the operands of a boolean expression node
// Returns 1 if the expression is well-typed (it evaluates to TRUE or FALSE), -1 if it is an ERROR
int checkBooleanExpressionType(ASTNode* astBooleanExpressionNode, SymbolTable* st, ErrorList* els) {

    Token* operator = astBooleanExpressionNode->AST_NODE_TYPE.AST_BOOLEAN_EXPRESSION.OPERATOR;
    ASTNode* lhsNode = astBooleanExpressionNode->children;
    ASTNode* rhsNode = astBooleanExpressionNode->children->next;

    Token* lhsType;
    Token* rhsType;

    // Case when lhs is a boolean and rhs is NULL (bool -> TK_NOT bool)
    if(lhsNode->LABEL == astBooleanExpression && rhsNode == NULL) {
        return checkBooleanExpressionType(lhsNode,st,els);
    }
    // Every remaining case needs both operands, so a missing rhs is an error
    else if(rhsNode == NULL) {
        printf("Incorrect case hit in boolean type checking!\n");
        return -1;
    }
    // Base case when lhs is an id and rhs is also an id
    else if(lhsNode->LABEL == astId && rhsNode->LABEL == astId) {

        lhsType = extractDataTypeFromSymbolTable(lhsNode,els);
        rhsType = extractDataTypeFromSymbolTable(rhsNode,els);

        if(lhsType == NULL || rhsType == NULL) {
            // Entry not found in symbol table, missing declaration error already thrown in function
            return -1;
        }

        // The data type heads match
        if(lhsType->TOKEN_NAME == rhsType->TOKEN_NAME) {
            // Check whether they are records, records are not allowed
            if(lhsType->TOKEN_NAME == TK_RECORDID) {
                throwTypeMismatchError(lhsType,rhsType,els,lhsNode->AST_NODE_TYPE.AST_ID.ID->LINE_NO);
                return -1;
            }
            else
                return 1;

        }
        // No match return error
        else {
            throwTypeMismatchError(lhsType,rhsType,els,lhsNode->AST_NODE_TYPE.AST_ID.ID->LINE_NO);
            return -1;
        }
    }
    // Case when lhs is a boolean and rhs is also a boolean
    else if(lhsNode->LABEL == astBooleanExpression && rhsNode->LABEL == astBooleanExpression) {
        int isBooleanLhs = checkBooleanExpressionType(lhsNode,st,els);
        int isBooleanRhs = checkBooleanExpressionType(rhsNode,st,els);

        if(isBooleanLhs == -1 || isBooleanRhs == -1)
            return -1;
        else
            return 1;
    }
    // Two literals: the types must be identical
    else if(lhsNode->LABEL == astNum && rhsNode->LABEL == astRnum)
        return -1;
    else if(lhsNode->LABEL == astRnum && rhsNode->LABEL == astNum)
        return -1;
    else if(lhsNode->LABEL == astNum && rhsNode->LABEL == astNum)
        return 1;
    else if(lhsNode->LABEL == astRnum && rhsNode->LABEL == astRnum)
        return 1;
    // Case when lhs is an id and rhs is a literal
    else if(lhsNode->LABEL == astId) {
        Token* tk = extractDataTypeFromSymbolTable(lhsNode,els);
        // Entry not found in symbol table, missing declaration error already thrown in function
        if(tk == NULL)
            return -1;
        // If record, it is false
        if(tk->TOKEN_NAME == TK_RECORDID)
            return -1;
        else if(tk->TOKEN_NAME == TK_INT && rhsNode->LABEL == astNum)
            return 1;
        else if(tk->TOKEN_NAME == TK_INT && rhsNode->LABEL == astRnum)
            return -1;
        else if(tk->TOKEN_NAME == TK_REAL && rhsNode->LABEL == astNum)
            return -1;
        else if(tk->TOKEN_NAME == TK_REAL && rhsNode->LABEL == astRnum)
            return 1;

    }
    // Case when rhs is an id and lhs is a literal
    else if(rhsNode->LABEL == astId) {
        Token* tk = extractDataTypeFromSymbolTable(rhsNode,els);
        // Entry not found in symbol table, missing declaration error already thrown in function
        if(tk == NULL)
            return -1;
        // If record, it is false
        if(tk->TOKEN_NAME == TK_RECORDID)
            return -1;
        else if(tk->TOKEN_NAME == TK_INT && lhsNode->LABEL == astNum)
            return 1;
        else if(tk->TOKEN_NAME == TK_INT && lhsNode->LABEL == astRnum)
            return -1;
        else if(tk->TOKEN_NAME == TK_REAL && lhsNode->LABEL == astNum)
            return -1;
        else if(tk->TOKEN_NAME == TK_REAL && lhsNode->LABEL == astRnum)
            return 1;
    }

    // No recognised operand combination matched, so every path returns a value
    printf("Incorrect case hit in boolean type checking!\n");
    return -1;

}

--------------------------------------------------------------------------------
/Compiler/type_checker.h:
--------------------------------------------------------------------------------
/* Group 27
Venkat Nalla Siddartha Reddy 2016A7PS0030P
Arnav Sailesh 2016A7PS0054P
Gunraj Singh 2016A7PS0085P
Aashish Singh 2016A7PS0683P */

#include "type_checkerDef.h"

void populateOffset(ASTNode* astIdNode, SymbolEntry* idEntry,SymbolTable* scopedTable,ErrorList* els);
int checkForIterationUpdate(ASTNode* astIterativeStmtNode);
Token* getArithmeticExpressionType(ASTNode* astArithmeticExpressionNode, ErrorList* els);
int checkBooleanExpressionType(ASTNode* astBooleanExpressionNode, SymbolTable* st, ErrorList* els);
Token* extractDataTypeFromSymbolTable(ASTNode* astIdNode, ErrorList* els);

Token* compatibleDataTypes(Token* t1, Token* t2,Token* operator);
int assignableDataTypes(Token* t1,Token* t2);
Token* extractFieldDataType(SymbolTable* st,Token* recordId,Token* fieldId,ErrorList* els);
char* getDataType(Token* t);

--------------------------------------------------------------------------------
/Compiler/type_checkerDef.h:
--------------------------------------------------------------------------------
/* Group 27
Venkat Nalla Siddartha Reddy 2016A7PS0030P
Arnav Sailesh 2016A7PS0054P
Gunraj Singh 2016A7PS0085P
Aashish Singh 2016A7PS0683P */

#ifndef TYPE_CHECKER_
#define TYPE_CHECKER_

#include "astDef.h"
#include "symbol_tableDef.h"
#include "error_handlerDef.h"

#endif

--------------------------------------------------------------------------------
/Conventions.txt:
--------------------------------------------------------------------------------
Variable and function names are in camel case, starting with a lowercase letter => dotMark
Structs, unions and enums are in camel case, starting with a capital letter => TokenName
Field names are in caps and separated by '_' (except generic ones like next, children, parent etc)

--------------------------------------------------------------------------------
/Group_27.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Aashish683/Compiler-Project/ff87e1005f4f5e890b7121c3aeb813ba0549b580/Group_27.zip

--------------------------------------------------------------------------------
/Language specifications.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Aashish683/Compiler-Project/ff87e1005f4f5e890b7121c3aeb813ba0549b580/Language specifications.pdf

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Compiler-Project
This repository contains a compiler written in C that follows the language specifications provided in _Language specifications.pdf_, as part of the course Compiler Construction at BITS Pilani.

The compiler parses input files according to the language specifications and, if the input is syntactically and semantically correct, generates the corresponding assembly-level code. Some testcases have been provided in _Compiler/testcases_.

As required for the project, the code is compatible with GCC version 5.4.0 and has been tested on Ubuntu 16.04.
To run it on Windows, you will have to use [cygwin](https://www.cygwin.com/).

### To create the compiler
1. `cd Compiler`
2. `make`
3. This will create an executable file named _compiler_

### To test the compiler on a testcase
1. Ensure you are in the directory containing the _compiler_ executable
2. Run `./compiler `
3. Choose among the 11 options to test the feature you want.
4. To create assembly-level code, choose option 10.
5. Choose option 0 to exit. (Exiting with Ctrl + C leaves the ASM file without any output.)
6. To run the asm file: `nasm -felf64 code.asm && gcc code.o && ./a.out` (using code.asm as the name of the asm file in this example)

### Stages of compiling a testcase (Option 10)

| Stage | Details |
| :------------- | :----------: |
| Lexical Analysis | Categorizes the contents of the input file as tokens (as per the language specifications). |
| Syntax Analysis | Creates a Parse Tree for the tokens being returned by the lexer. If an error is encountered in either of these first two stages, the compiler does not proceed to semantic analysis. |
| Semantic Analysis | Creates an Abstract Syntax Tree for the corresponding Parse Tree and populates the Symbol Table. If an error is encountered during Semantic Analysis, the compiler does not proceed to code generation. |
| Code generation | Generates corresponding assembly-level code. As per the project requirements, code generation is restricted to only those testcases which have a single function (main) and handle only integers. |

## Contributors
The team members of this project were:
* [Aashish Singh](https://github.com/Aashish683)
* Gunraj Singh
* Venkat Reddy
* [Arnav Sailesh](https://github.com/ArnavS11)

--------------------------------------------------------------------------------
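The stage gating described in the README table (a later stage runs only if every earlier stage reported no errors) can be sketched roughly as follows. This is an illustrative sketch and not the repository's driver: `Stage`, `stageNames` and `runPipeline` are hypothetical names, and the per-stage error counts are passed in as stand-ins for running the real lexer, parser, semantic analyzer and code generator. It also treats lexical and syntax analysis as separate steps, which is a simplification, since the parser consumes tokens as the lexer returns them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stage identifiers; they mirror the README table, not the repository's code. */
typedef enum {
    STAGE_LEXICAL,
    STAGE_SYNTAX,
    STAGE_SEMANTIC,
    STAGE_CODEGEN,
    STAGE_COUNT
} Stage;

static const char* stageNames[STAGE_COUNT] = {
    "Lexical Analysis",
    "Syntax Analysis",
    "Semantic Analysis",
    "Code generation"
};

/* Walks the stages in order and stops at the first one that reports errors.
   errorCounts[i] is the number of errors stage i would report (a stand-in for
   actually running that stage). Returns the number of stages that completed
   without errors. */
int runPipeline(const int* errorCounts) {
    assert(errorCounts != NULL);

    int completed = 0;
    for (int i = 0; i < STAGE_COUNT; i++) {
        if (errorCounts[i] > 0) {
            printf("%s reported %d error(s); later stages are skipped\n",
                   stageNames[i], errorCounts[i]);
            break;
        }
        printf("%s completed\n", stageNames[i]);
        completed++;
    }
    return completed;
}
```

With error counts `{0, 0, 1, 0}` the function returns 2: lexical and syntax analysis complete, semantic analysis reports an error, and code generation is never attempted.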