├── .gitignore ├── .travis.yml ├── chap-0-intro.md ├── chap-1-compiler-basics.md ├── chap-2-llvm-basics.md ├── chap-3-lexer.md ├── chap-4-parser.md ├── chap-5-code-generator.md ├── chap-6-summary.md ├── chap-7-if-else.md ├── chap-8-function-declarations.md ├── chap-9-while-loops.md ├── demo_file.cr ├── diagrams ├── img │ ├── BDMAS_1.png │ ├── BDMAS_2.png │ ├── BDMAS_3.png │ ├── BDMAS_4.png │ ├── BDMAS_5.png │ ├── BDMAS_6.png │ ├── BDMAS_7.png │ ├── BDMAS_8.png │ ├── Emerald_Architecture.png │ ├── LLVM_Architecture.png │ ├── demo_output.png │ ├── if_else_to_ir.png │ ├── lexer_basic.png │ ├── parser_basic.png │ └── while_to_ir.png └── xml_archive │ ├── Emerald_Architecture.xml │ ├── LLVM_Architecture.xml │ ├── Lexer.xml │ ├── Parser.xml │ └── Parsing BDMAS Walkthrough.xml ├── emeraldc.cr ├── example_clang ├── main.c ├── main2.c ├── main2.ll └── readme.md ├── example_ir ├── example_1.ll ├── example_2.ll ├── example_3.ll ├── example_4.cr ├── example_4.ll └── readme.md ├── license ├── readme.md ├── spec ├── errors_spec.cr ├── floats_spec.cr ├── full_integration_spec.cr ├── functions_2_spec.cr ├── functions_spec.cr ├── generator_spec.cr ├── if_statements_spec.cr ├── int64_spec.cr ├── lexer_spec.cr ├── parser_more_examples_spec.cr ├── parser_order_of_op_spec.cr ├── parser_spec.cr ├── strings_spec.cr ├── value_resolution_spec.cr ├── variables_and_literals_spec.cr └── while_spec.cr ├── src └── emerald │ ├── close_statements.cr │ ├── emerald.cr │ ├── error.cr │ ├── lexer.cr │ ├── nodes │ ├── basic_block_node.cr │ ├── binary_operator_node.cr │ ├── call_expression_node.cr │ ├── declaration_reference_node.cr │ ├── expression_node.cr │ ├── function_declaration_node.cr │ ├── if_expression_node.cr │ ├── literal_nodes.cr │ ├── node.cr │ ├── return_node.cr │ ├── root_node.cr │ ├── variable_declaration_node.cr │ └── while_expression_node.cr │ ├── parser.cr │ ├── state.cr │ ├── token.cr │ ├── types.cr │ └── verifier.cr ├── std-lib-opt.ll ├── std-lib.ll ├── test_inputs ├── floats.cr 
├── full_integration.cr ├── full_integration_2.cr ├── full_integration_3.cr ├── functions.cr ├── functions_2.cr ├── if_statements_1.cr ├── if_statements_2.cr ├── int64.cr ├── strings.cr ├── value_resolution_1.cr ├── value_resolution_2.cr ├── variables_and_literals.cr └── while.cr └── test_outputs ├── floats ├── full_integration ├── full_integration_2 ├── full_integration_3 ├── functions ├── functions_2 ├── if_statements_1 ├── if_statements_2 ├── int64 ├── strings ├── value_resolution_1 ├── value_resolution_2 ├── variables_and_literals └── while /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | note.cr 3 | output.ll 4 | output.s 5 | output 6 | emerald/output.ll 7 | emerald/output.s 8 | emerald/output 9 | spec/output.ll 10 | spec/output.s 11 | spec/output 12 | emeraldc 13 | emeraldc.dwarf 14 | std-lib.s 15 | std-lib-opt.s 16 | diagrams/psd/* -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: crystal 2 | 3 | sudo: false 4 | 5 | os: 6 | - osx 7 | 8 | before_install: | 9 | export LLVM_CONFIG=/usr/local/opt/llvm@6/bin/llvm-config 10 | export PATH="/usr/local/opt/llvm@6/bin:$PATH" 11 | before_script: | 12 | ls -l /usr/bin | grep llvm-config 13 | crystal build emeraldc.cr -------------------------------------------------------------------------------- /chap-0-intro.md: -------------------------------------------------------------------------------- 1 | # Chapter 0 Introduction 2 | 3 | In this tutorial we will work together to write a compiler for a simple toy programming language. Disclaimer, we make no claims of performance, safety, or functionality. The main objective will be to better understand how the LLVM api works when building a front-end and to better understand how compilers work in general. 
Maybe you just want to satisfy your curiosity, or maybe you genuinely want to build the next great programming language of the future. In either case, I hope this tutorial will be of great value to you. 4 | 5 | We are going to write the compiler using Crystal, and there are two major reasons for this decision. Primarily, Crystal offers very clean syntax and comes with excellent LLVM bindings by default. I want all functionality in our toy compiler to be explicit, and easy to understand, debug, test, and monitor. With Crystal, I can easily ensure all the code is clear and concise and the bindings will stay out of our way. Secondarily, because Crystal is itself an LLVM front-end, there is a ton of information and examples of using the LLVM bindings directly in the Crystal source code. I highly recommend you spend some time reading the Crystal compiler source code before/during/after reading this tutorial. 6 | 7 | The language will be imperative, statically typed, and able to compile to object code callable from C. It will discourage punctuation usage and strive to be explicit while terse. We will start by parsing everything at the top level, and gradually incorporate control flow and nested expressions, expanding the initially sparse syntax. 8 | 9 | We will call our toy language Emerald to honor both Crystal and Ruby. Further, the syntax will also be a major nod to both languages. Here is a snippet of our initial goal, showing some of the basic syntax elements. 10 | ```ruby 11 | # I am a comment! 12 | four = 2 + 2 13 | puts four 14 | puts 10 < 6 15 | puts 11 != 10 16 | ``` 17 | 18 | While the above example may look simple, it is going to require us to cover some serious ground in our understanding of the LLVM API. Already our simple syntax is going to require variables, a "built-in" puts command, and binary operators. We will need to be able to parse the structure of input files, and understand the order of operations and expression context.
But do not be discouraged. We are going to tackle this in easily digestible pieces. Once we have a solid foundation, we can gradually extend our language with more powerful features. 19 | 20 | ### Lookahead 21 | 22 | Information 23 | 24 | [Chapter 1 - Compiler Basics](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-1-compiler-basics.md) -- Partial 25 | 26 | [Chapter 2 - LLVM Basics](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-2-llvm-basics.md) -- Partial 27 | 28 | Basic Architecture 29 | 30 | [Chapter 3 - Lexer](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-3-lexer.md) -- Partial 31 | 32 | [Chapter 4 - Parser](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-4-parser.md) -- Partial 33 | 34 | [Chapter 5 - Code Generator](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-5-code-generator.md) -- Incomplete 35 | 36 | [Chapter 6 - Summary](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-6-summary.md) -- Incomplete 37 | 38 | Advanced Architecture 39 | 40 | [Chapter 7 - Implementing If/Else](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-7-if-else.md) -- Incomplete 41 | 42 | [Chapter 8 - Implementing Function Declarations](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-8-function-declarations.md) -- Incomplete 43 | 44 | [Chapter 9 - Implementing Loops](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-9-while-loops.md) 45 | -- Incomplete 46 | 47 | ### Diagrams 48 | 49 | Emerald Architecture 50 | 51 | ![Emerald Architecture](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/Emerald_Architecture.png) 52 | 53 | LLVM Architecture 54 | 55 | ![LLVM Architecture](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/LLVM_Architecture.png) 56 |
-------------------------------------------------------------------------------- /chap-1-compiler-basics.md: -------------------------------------------------------------------------------- 1 | # Chapter 1 Compiler Basics 2 | 3 | This chapter serves as a very basic crash course in compilers. This is going to be very explicit and to the point. Please feel free to fill this information in with lots of other sources when you get the chance! 4 | 5 | Here are the basics of what you need to know. 6 | 7 | ### Steps of Standard Compiler Usage 8 | 9 | 1. User has a file with some source code the compiler understands. 10 | 2. User runs the compiler executable on the source code file. 11 | 3. The compiler "lexes" the file into a token array. 12 | 4. The compiler "parses" the token array into an AST. 13 | 5. The compiler "code generates" the AST into machine code or intermediate language. 14 | 6. The intermediate language is converted to machine code if necessary and the user can then run the native machine code. 15 | 16 | This is a gross simplification; however, it may introduce concepts that are new to you. We will cover the terminology and details here. 17 | 18 | **Lexing/Lexer**: A lexer is code designed to perform lexing. Lexing is the process of reading source code as an array of characters, and producing an array of tokens. The lexer decides where one keyword, identifier, variable, or other syntax component ends and the next one starts. Every language has the ability to decide exactly how characters are grouped and delineated to form tokens. Some languages use symbols like ';' to delineate the end of lines, while some languages use the newline escape sequence, "\n". Most languages use spaces to delineate tokens, but the lexer can be programmed to recognize any grouping and order of characters to generate the final array of tokens.
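The character-grouping idea above can be sketched in a few lines of Crystal. This is a hypothetical toy scanner, not Emerald's actual lexer: it splits on spaces and emits an explicit newline token, but it produces bare strings rather than real Token objects.

```crystal
# Hypothetical sketch: split a line of source characters into raw token
# strings. Spaces end the current token; newlines additionally emit an
# explicit "\n" token so later stages can see line boundaries.
def scan(source : String) : Array(String)
  tokens = [] of String
  current = ""
  source.each_char do |char|
    case char
    when ' '
      tokens << current unless current.empty?
      current = ""
    when '\n'
      tokens << current unless current.empty?
      current = ""
      tokens << "\n"
    else
      current += char
    end
  end
  tokens << current unless current.empty?
  tokens
end

scan("four = 2 + 2\nputs four")
# => ["four", "=", "2", "+", "2", "\n", "puts", "four"]
```

A real lexer would additionally classify each string (keyword, identifier, literal, operator) and record line/column positions, as described above.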
19 | 20 | **Tokens**: A token is simply a symbolic representation of exactly one keyword, identifier, variable, literal, or other component in the language. In its most basic form, a token will hold the value of the given token as well as its semantic function in the language. It is useful for tokens to also carry their positional location in the file for later reference by the compiler. Tokens are the output of the lexer and the input of the parser. 21 | 22 | **Parsing/Parser**: The parser is responsible for taking the array of tokens produced by the lexer and generating what is known as an AST (Abstract Syntax Tree). The parser's job is actually twofold. First, it must make sense of the sequence of tokens it receives to produce valid expressions in the form of AST nodes. Because of this functionality, the parser is going to find mistakes if they exist in the source file. Therefore, the parser's secondary function is to identify syntax errors in the source code of the input file and notify the user. 23 | 24 | **AST (Abstract Syntax Tree)**: The AST is a tree-like representation of the program code, structured in such a way that the code generator can walk through the nodes to eventually produce machine code that is directly executable. Walking an AST is simply the process of moving along the nodes of the tree from top to bottom, parent nodes to child nodes and back. The AST nodes hold references to other nodes, making it possible to 'walk' the syntax of the language. These nodes typically also carry location information to make error messages useful to developers in the event of syntax errors. 25 | 26 | **Code Generator**: The code generator takes the AST as input and produces intermediate or machine code. The code generator "walks" the nodes of the AST, using the references it has to other nodes to generate the necessary instructions in the output code.
Depending on the compiler architecture, you may need to assemble your output intermediate representation to machine code prior to the final execution. 27 | 28 | ### Advanced Compiler Stages 29 | 30 | If we want to get more advanced we can add two more stages to this process. The first advanced stage would be an AST-simplifying step. This would occur between parsing and code generation. The job of the AST simplification stage is to walk the nodes of the AST and look for expressions that can be evaluated at compile time to single nodes. The more nodes that can be collapsed during this stage, the less work the code generator has to perform and the fewer calculations required at run time. This can definitely be viewed as a code optimization. 31 | 32 | The second advanced stage is an actual optimization step. Principally, these optimizations are run during or after code generation and are also sometimes performed at link time. The goal of this step is to look for patterns in the generated machine or assembly code that will allow simplifications to the code without affecting the final result. Depending on your needs, this step can be tuned between compile-time and run-time speed and between performance and safety. 33 | 34 | In our toy example we will be using some AST simplification techniques as required to make use of the builder API, but we will not be spending any time with explicit optimizations. Feel free to use the toy example as a means to experiment with LLVM's optimizations and better understand how they manipulate the code to improve performance. If you compare source code to generated LLVM IR with optimizations enabled, you will notice that the optimizations can be quite effective at turning function bodies into inline values and removing unnecessary calculations from statements.
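To make the AST-simplification idea concrete, here is a hedged sketch in Crystal of folding a constant addition into a single literal node before code generation. The node class names here are invented for illustration; they are not Emerald's actual node types.

```crystal
# Hypothetical node types, for illustration only.
abstract class Node
end

class IntLiteral < Node
  getter value : Int32

  def initialize(@value); end
end

class BinaryOp < Node
  getter op : String
  getter lhs : Node
  getter rhs : Node

  def initialize(@op, @lhs, @rhs); end
end

# Recursively collapse "literal + literal" subtrees into a single
# literal node at compile time.
def simplify(node : Node) : Node
  return node unless node.is_a?(BinaryOp)
  lhs = simplify(node.lhs)
  rhs = simplify(node.rhs)
  if lhs.is_a?(IntLiteral) && rhs.is_a?(IntLiteral) && node.op == "+"
    IntLiteral.new(lhs.value + rhs.value)
  else
    BinaryOp.new(node.op, lhs, rhs)
  end
end

tree = BinaryOp.new("+", IntLiteral.new(2), IntLiteral.new(2))
simplify(tree) # collapses the whole tree to a single IntLiteral of 4
```

The same walk-and-replace pattern extends naturally to other operators and to nested expressions, since inner subtrees are simplified before their parents are inspected.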
35 | 36 | #### Next 37 | [Chapter 2 - LLVM Basics](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-2-llvm-basics.md) -------------------------------------------------------------------------------- /chap-2-llvm-basics.md: -------------------------------------------------------------------------------- 1 | # Chapter 2 LLVM Basics 2 | 3 | This chapter is a crash course in LLVM's most basic concepts and terminologies. This is by no means complete, but it will be enough to give you a decent idea of how LLVM works and how you will be using its API. 4 | 5 | First and foremost, we will be putting some blinders on and using LLVM in a very simplistic way. Despite the fact that LLVM exposes a very detailed modular API with lots of control at all stages of code compilation, we are going to take a very lazy, and in some cases perhaps even naive, approach to working with it. The benefits of this approach are: A) it is more rewarding to get to a working stage quickly, and B) LLVM has lots of tooling that will make our naive code run plenty efficiently for the time being. Once we understand the basics, then we can begin to add more advanced techniques to our approach. 6 | 7 | So what do I mean by us taking a lazy approach? What I mean is that we are going to let the LLVM IR builder API do the heavy lifting for us, and we are not going to spend much, if any, time tinkering with the generated IR other than applying the standard optimizations LLVM offers. This means if our lexer and parser can generate an AST of nodes that the LLVM IR builder API understands, we are essentially done. The only remaining work will be to call the builder API with the correct references to each of our nodes. 8 | 9 | So what do I mean by us taking a naive approach? Full disclaimer: I am very much still a student of compilers and LLVM, and I am probably doing many things that a compiler/LLVM expert would consider naive or ignorant.
Also, in the interest of enlightening people without adding unnecessary confusion, I will try to keep things extremely simplistic. This means I will try to avoid using indirection, complex abstractions, inheritance, and implicit behaviour in the compiler, even if the end result is more verbose code. The code may not follow all the best practices, but it will be easy to read and understand. 10 | 11 | ### General LLVM Information 12 | 13 | Here are some gross simplifications that should help you get started with LLVM. Fill your knowledge in with more details as it becomes necessary. 14 | 15 | 1. The main unit of grouping in LLVM is the module. You can have several modules in a program, and each module will contain functions, global variables and an externalized interface. In our simplistic, naive approach we will never use more than one module, but note that it is possible. 16 | 2. Everything inside a module is an LLVM Value descendant. These include but are not limited to functions, blocks, expressions, instructions, etc. A nice way to visually think of this is: **Module** can have **Functions** which have one or more **BasicBlock** which consist of **Instructions**. Values are a way for each component to reference the others and are the basis for compile-time mathematic calculations. 17 | 3. BasicBlocks are an important concept. BasicBlocks are the cornerstone of the IR builder API and for good reason. A BasicBlock is simply a list of instructions that can only be executed from first to last in order, with no control flow. Think of a function body with no logic statements or jumps and you are likely looking at a BasicBlock. 18 | 4. A function is basically a block of code that accepts a given list of typed parameters and returns a typed value. LLVM views it the same way. We will be initially running all our code inside a main function that we will initialize by giving it a C main interface.
A simplified C main function takes no parameters and returns an integer. Therefore in LLVM we say that it will take an empty array of LLVM type values, and return an LLVM 32-bit integer. 19 | 5. LLVM type values are exactly what they sound like: LLVM's internal representation of type as related to an LLVM Value object. This is the system through which your code can be statically typed and compiled to object code callable from C. By giving our main function a C interface using LLVM types, and by flagging our main function with LLVM::Linkage::External, we have allowed linkers to identify our object code's main function as if it were compiled from C. 20 | 6. The LLVM builder API has the notion of position. A given builder has to be directed where it will be appending new instructions. In the case of our initial simplified approach we will be appending all our instructions to the BasicBlock of the main function. As you can imagine, this sets up the primary means of constructing function bodies across multiple functions in a module. To add control flow, loops, and more functions, we need to track the basic blocks of our module and append the instructions into the correct blocks. 21 | 7. Finally, once you are finished compiling the instructions into your module, you will want to dump the output to an LLVM IR .ll file. The resultant [name_of_file].ll can now be treated like any LLVM IR, as if it were just compiled straight from C. This includes all the optimizations and plug-ins available in the LLVM architecture. It is also ready to be compiled to object code and linked with any other object code compiled from other sources. The .ll file can theoretically be compiled to any target architecture so long as the LLVM IR is not doing anything machine specific. 22 | 8.
Because our example is compiling instructions into a main function, if we execute the compiled and linked version of our output (aka the machine binary), it should immediately invoke the main function and we should see the results of our instructions. 23 | 9. LLVM is a low-level machine, therefore it doesn't have high-level types by default. However, there is nothing stopping you from adding your own types as aliases or structured combinations of low-level built-in types. In our toy example we will not make much use of this power, but know that it exists and can be used to create more powerful containers for values, such as Crystal's or C++'s string types. In our example we will simply treat strings like C strings, terminated by a null byte, which LLVM treats as an i8* (8-bit integer pointer). This means that all string values in our program will be global string values, passed by pointer. 24 | 25 | 26 | ### LLVM IR instructions 27 | 28 | To get you started quickly, here is a quick glossary of some LLVM IR instructions and what they do: 29 | 30 | ```llvm 31 | ;alloca - reserve space in memory for typed variable 32 | 33 | %fourp = alloca i32 34 | 35 | ;store - put value into allocated memory 36 | 37 | store i32 4, i32* %fourp 38 | 39 | ;load - get value stored in allocated memory 40 | 41 | %value = load i32, i32* %fourp 42 | 43 | ;getelementptr - get a pointer to a subelement, 44 | ;- useful for converting a char buffer into a const char* 45 | 46 | %buffer = alloca [79 x i8] 47 | %bpointer = getelementptr [79 x i8], [79 x i8]* %buffer, i32 0, i32 0 48 | 49 | ;call - call a named function with return type and params 50 | 51 | call i32 @puts( i8* %bpointer ) 52 | 53 | ;br - jumps to a code block based on the provided value or unconditionally jumps 54 | 55 | br i1 false, label %if_block, label %else_block ;conditional jump 56 | br label %code_block ;unconditional jump 57 | 58 | ;icmp - compare two integer values with a given operator 59 | ;- eq, ne, ult, ugt, uge,
ule, slt, sgt, sge, sle 60 | ;- equals, not equal, signed and unsigned less than, greater than, less than or equal, greater than or equal 61 | ;- returns i1 62 | 63 | %comparison = icmp eq i32 2, 0 64 | 65 | ;ret - return a value from the active function 66 | 67 | ret i32 0 68 | 69 | ;sext & zext - extend an integer to a larger bit size 70 | ;- sext is sign extended, zext is zero extended 71 | 72 | %result = zext i1 %value to i32 73 | %result = sext i1 %value to i32 74 | 75 | ;trunc - reduce an integer size to a smaller bit size 76 | 77 | %result = trunc i32 %value to i1 78 | ``` 79 | 80 | ### Crystal LLVM Builder API Bindings 81 | 82 | We are using Crystal's builder API bindings to LLVM and as such we also need to have an idea of how to use the builder API to assemble modules. Below is a generic program class that demonstrates how to use the builder API in a simplistic way. Once you grasp this, you should be able to see how you can direct the builder API into different blocks and functions throughout your module as needed. 83 | 84 | ```crystal 85 | require "llvm" 86 | 87 | class Program 88 | getter main : LLVM::BasicBlock, mod : LLVM::Module, builder : LLVM::Builder 89 | getter! func : LLVM::Function 90 | 91 | def initialize 92 | # Create the context 93 | context = LLVM::Context.new 94 | 95 | # Create a module 96 | @mod = context.new_module("name") 97 | 98 | # Add a global number variable "number" = 10 99 | mod.globals.add context.int32, "number" 100 | mod.globals["number"].initializer = context.int32.const_int(10) 101 | 102 | # Create a main function 103 | @func = mod.functions.add "main", ([] of LLVM::Type), context.int32 104 | 105 | # Create body for main function - builder appends to basic blocks. 
106 | @main = func.basic_blocks.append "main_body" 107 | 108 | # Make main function externally linkable 109 | func.linkage = LLVM::Linkage::External 110 | 111 | # Declare external function puts 112 | mod.functions.add "puts", [context.void_pointer], context.int32 113 | 114 | # Initialize Crystal's builder API 115 | @builder = context.new_builder 116 | end 117 | 118 | def code_generate 119 | # Before calling builder, you must position it into the active basic block of your program 120 | builder.position_at_end main 121 | 122 | # While walking the AST nodes you can call builder API to generate instructions into the basic block... 123 | str_ptr = builder.global_string_pointer "Johnny", "str" 124 | builder.call mod.functions["puts"], str_ptr, "str_call" 125 | num_val = builder.load mod.globals["number"] 126 | builder.ret num_val 127 | 128 | File.open("output.ll", "w") do |file| 129 | mod.to_s(file) 130 | end 131 | end 132 | end 133 | 134 | program = Program.new 135 | program.code_generate 136 | ``` 137 | 138 | It is through the relationship between your AST nodes and your code-generation functions that the final module gets built. Therefore you should spend time walking the nodes of your AST and thinking about what builder API calls you will need to accomplish the functionality you desire in LLVM IR. 139 | 140 | ### Builder API Usage 141 | 142 | Below is a list of builder methods with short descriptions. A few of the ones you'll find especially useful have demonstration usages provided.
143 | ```crystal 144 | #add(lhs, rhs, name = "") add two values together and return sum value 145 | 146 | value = builder.add(four_val, five_val, "4_plus_5") 147 | 148 | #alloca(type, name = "") allocate space for given variable type 149 | 150 | value = builder.alloca(LLVM::Int32, "number") 151 | 152 | #and(lhs, rhs, name = "") perform bitwise and operation 153 | #array_malloc(type, value, name = "") 154 | #ashr(lhs, rhs, name = "") perform arithmetic (sign-extending) right shift 155 | #atomicrmw(op, ptr, val, ordering, singlethread) atomically modify memory 156 | #bit_cast(value, type, name = "") convert value to the given type without changing bits 157 | #br(block) unconditional branch to block 158 | 159 | builder.br(block_ref) 160 | 161 | #call(func, args : Array(LLVM::Value), name : String = "") call a multi parameter function 162 | #call(func, arg : LLVM::Value, name : String = "") call a single param function 163 | 164 | ret_value = builder.call mod.functions["puts"], str_ptr, "puts_call" 165 | 166 | #call(func, name : String = "") call a no parameter function 167 | #cmpxchg(pointer, cmp, new, success_ordering, failure_ordering) atomically modify memory based on comparison 168 | #cond(cond, then_block, else_block) conditional branch to block 169 | 170 | builder.cond if_value, then_block_ref, else_block_ref 171 | 172 | #exact_sdiv(lhs, rhs, name = "") performs division using exact keyword, result is poison value if rounding would occur 173 | #extract_value(value, index, name = "") extracts value from aggregate object 174 | #fadd(lhs, rhs, name = "") floating point and vector addition 175 | #fcmp(op, lhs, rhs, name = "") floating point comparison 176 | #fdiv(lhs, rhs, name = "") floating point division 177 | #fence(ordering, singlethread, name = "") introduces memory-ordering constraints between operations 178 | #fmul(lhs, rhs, name = "") floating point multiplication 179 | #fp2si(value, type, name = "") floating point to signed int 180 | #fp2ui(value, type, name = "") floating point to unsigned int 181 |
#fpext(value, type, name = "") floating point extension 182 | #fptrunc(value, type, name = "") floating point truncation 183 | #fsub(lhs, rhs, name = "") floating point subtraction 184 | #gep(value, index1 : LLVM::Value, index2 : LLVM::Value, name = "") get element pointer returns a subelement of a container using two indices 185 | #gep(value, index : LLVM::Value, name = "") get element pointer returns a subelement of a container using a single index 186 | #gep(value, indices : Array(LLVM::ValueRef), name = "") get element pointer returns a subelement using an indices array 187 | #global_string_pointer(string, name = "") generate global string constant pointer 188 | 189 | string_ptr = builder.global_string_pointer("Hello World", "example") 190 | 191 | #icmp(op, lhs, rhs, name = "") integer comparison operation 192 | 193 | result = builder.icmp(LLVM::IntPredicate::ULT, ten_val, nine_val, "comparison") 194 | 195 | #inbounds_gep(value, indices : Array(LLVM::ValueRef), name = "") gep with inbounds keyword 196 | #inbounds_gep(value, index1 : LLVM::Value, index2 : LLVM::Value, name = "") gep with inbounds keyword 197 | #inbounds_gep(value, index : LLVM::Value, name = "") gep with inbounds keyword 198 | #int2ptr(value, type, name = "") convert integer to pointer type 199 | #invoke(fn, args : Array(LLVM::Value), a_then, a_catch, name = "") allows exception handling: control returns to the then block unless an exception is detected, in which case it returns to the catch block 200 | #landing_pad(type, personality, clauses, name = "") designates a basic block as the place where an exception is handled inside a catch routine 201 | #load(ptr, name = "") get the value stored in a pointer 202 | 203 | value = builder.load(ptr_to_value, "value_in_ptr") 204 | 205 | #lshr(lhs, rhs, name = "") performs a logical (zero-filling) right shift operation 206 | #mul(lhs, rhs, name = "") perform multiplication 207 | #not(value, name = "") performs bitwise not operation 208 | #or(lhs, rhs, name = "") performs bitwise or operation 209 | 210 | #phi(type,
table : LLVM::PhiTable, name = "") setup phi node based on preexisting table data 211 | 212 | #NOTE a phi node is simply a variable that takes on a value based on the preceding block that passed control to the phi node. 213 | 214 | #phi(type, incoming_blocks : Array(LLVM::BasicBlock), incoming_values : Array(LLVM::Value), name = "") setup phi node based on an array of basic blocks and an array of the values it should take in each case 215 | 216 | #position_at_end(block) position builder at end of a given block 217 | #ptr2int(value, type, name = "") convert pointer to integer type 218 | #ret(value) return a specified value 219 | 220 | builder.ret LLVM.int(LLVM::Int32, 0) 221 | 222 | #ret return void 223 | #sdiv(lhs, rhs, name = "") signed integer division 224 | #select(cond, a_then, a_else, name = "") select a value based on a condition 225 | #sext(value, type, name = "") signed extension 226 | #shl(lhs, rhs, name = "") shift left expression 227 | #si2fp(value, type, name = "") cast signed integer to floating point 228 | #srem(lhs, rhs, name = "") return remainder of signed integer division 229 | #store(value, ptr) store value into pointer 230 | 231 | builder.store four_val, number_ptr 232 | 233 | #sub(lhs, rhs, name = "") integer subtraction 234 | #switch(value, otherwise, cases) allows branching to one of several branches based on value 235 | #trunc(value, type, name = "") truncate integer to a smaller bit size 236 | #udiv(lhs, rhs, name = "") unsigned division 237 | #ui2fp(value, type, name = "") unsigned integer to floating point 238 | #urem(lhs, rhs, name = "") return remainder of unsigned division 239 | #xor(lhs, rhs, name = "") bitwise logical xor operation 240 | #zext(value, type, name = "") zero extension 241 | ``` 242 | 243 | Further Reading and References: 244 | 245 | 1. [LLVM for Grad Students by Adrian Sampson](https://www.cs.cornell.edu/~asampson/blog/llvm.html) 246 | 247 | 2.
[How to get started with the LLVM C API by Paul Smith](https://pauladamsmith.com/blog/2015/01/how-to-get-started-with-llvm-c-api.html) 248 | 249 | 3. [Create a working compiler with the LLVM framework, Part 1 by Arpen Sen](https://www.ibm.com/developerworks/library/os-createcompilerllvm1/) 250 | 251 | 4. [My First LLVM Compiler by Wilfred Hughes](http://www.wilfred.me.uk/blog/2015/02/21/my-first-llvm-compiler/) 252 | 253 | #### Next 254 | [Chapter 3 - Lexer](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-3-lexer.md) 255 | -------------------------------------------------------------------------------- /chap-3-lexer.md: -------------------------------------------------------------------------------- 1 | # Chapter 3 Lexer 2 | 3 | It is time to begin the actual design and implementation of our toy compiler. If you made it this far, then you know our first step in building a compiler is to create something known as a Lexer. The Lexer will be a component of our compiler. Its job is to take an array of character data and produce an array of tokens. 4 | 5 | The Lexer will begin by taking the array of characters and looping through them one at a time. As it does so, it will keep track of important information such as the line and column number for positioning, the current character, and the current token being parsed. When the Lexer either reaches a character indicating the end of a token, or the current token being parsed equals a specific keyword or symbol, it knows that it can finalize the current token. When this condition is reached, the Lexer adds a new Token to its Token array reflecting the parsed token information, and then restarts the token-processing algorithm where it left off. 6 | 7 | Our Lexer will also have some additional properties to help it when parsing our language. One thing our Lexer will have is a context property. This allows the Lexer to understand more complicated groupings of characters.
For example, we want our Lexer to recognize that the grouping of characters "Hello World!" is in fact one String token rather than two separate tokens. An easy way to accomplish this is to use the context property to indicate when the Lexer has entered a String section, so it knows to continue adding characters until the second quotation symbol is encountered. This context property can also be used to collate comments into a single Token. 8 | 9 | The only remaining requirement for our Lexer is to inject some whitespace tokens during its work to aid the Parser. In order for the Parser to clearly know when a given expression ends and a new one begins, our Lexer should append a new line token on each new line escape sequence, and to be explicit we will also append an end of file token once Lexing is complete. This way our array of tokens will clearly indicate the linear order of all the semantics of our programming language, including the effects of whitespace on expressions. 10 | 11 | Here is a high level view of what the lexer is doing to generate the token array from a given array of characters. 12 | 13 | ![Lexer Basic](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/lexer_basic.png) 14 | 15 | #### Next 16 | [Chapter 4 - Parser](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-4-parser.md) -------------------------------------------------------------------------------- /chap-4-parser.md: -------------------------------------------------------------------------------- 1 | # Chapter 4 Parser 2 | 3 | This step is going to be the most complicated, but if you can get through and understand this, then the rest is going to be a piece of cake. We are currently able to use our Lexer to get an array of Tokens. We now wish to parse the tokens into an AST (Abstract Syntax Tree). The AST gets this name from the tree-like structure that results once parsing is complete.
Every literal, expression, block, return, if, while, and def (in short, every syntactic component of the language) is going to have its own node. Each of these nodes will reference other nodes to indicate how they are related to one another. 4 | 5 | For instance, the binary operation 2 + 2 could be looked at as a binary expression node, with an operator node represented by the plus symbol, and left-hand side and right-hand side expressions that are each, in this case, simply a number literal of 2. In this example the binary operator expression node would be considered the root, while the remaining nodes are its children. This tree-like structure is very important, as it makes our code-generation stage much simpler. In order to translate the AST into LLVM IR we will simply walk the AST, inspecting nodes as we go, and calling the LLVM IR Builder API with references to the respective child nodes. 6 | 7 | ``` 8 | BinaryExpressionNode -> 2 + 2 9 | Operator -> + 10 | LHS -> 2 11 | RHS -> 2 12 | ``` 13 | 14 | In order to create our AST we are going to need a class for each node type, so that each node can have instance variables reflecting the required child nodes for each node type that LLVM understands and that we wish to port into our language. Initially we only need a few node types, and all our code will be treated as though it's being called from the main function and therefore appended to the end of the main function's BasicBlock as we go. Once we are ready to add control flow, loops, and functions we will need to keep track of the blocks in our program and append to the correct one during code generation. Finally we will implement the puts command as a language built-in, which will require some special parsing and code generation logic to accommodate its features.
A language built-in means that rather than forcing a user to implement a function and compile it each time with their own code, the language has the function signature and implementation pre-compiled in the standard library, so that every program linked against the standard library gets the same implementation of said function. 15 | 16 | Our parser will work by inspecting the current token in the array. The parser will be aware of each node type and how it relates to other node types in sequence. Each line will be treated as an expression, which may itself consist of multiple other expressions. The parser will determine which tokens should be expected following a given token; if those tokens are not found, an error will be generated to help the user determine where a syntax error is occurring. Otherwise the parser will continue to take the tokens and generate the required node structure to form the final AST. We should be able to easily inspect our AST at the end of this stage to visually debug and ensure our code-generation calls are getting the correct information. 17 | 18 | Let's take our initial example code from chapter 0: 19 | 20 | ```ruby 21 | # I am a comment! 22 | four = 2 + 2 23 | puts four 24 | puts 10 < 6 25 | puts 11 != 10 26 | ``` 27 | 28 | We will be parsing this into an AST that will look something like this.
29 | ``` 30 | Expressions Node : [four = 2 + 2, puts four, puts 10 < 6, puts 11 != 10] 31 | [0] 32 | |-> Variable Declaration Node : four = 2 + 2 33 | |---> Binary Expression Node : 2 + 2 34 | |-----> Number Literal : 2 35 | |-----> Number Literal : 2 36 | [1] 37 | |-> Call Expression Node : puts four 38 | |---> Declaration Reference Expression : four 39 | [2] 40 | |-> Call Expression Node : puts 10 < 6 41 | |---> Binary Expression Node : 10 < 6 42 | |-----> Number Literal : 10 43 | |-----> Number Literal : 6 44 | [3] 45 | |-> Call Expression Node : puts 11 != 10 46 | |---> Binary Expression Node : 11 != 10 47 | |-----> Number Literal : 11 48 | |-----> Number Literal : 10 49 | ``` 50 | 51 | Something that may be of use to you for experimental purposes is to see Clang's AST representation for simple C code. This is useful because Clang's own code generation walks this AST to produce LLVM IR, so it should be informative to browse. Below is a simple C code example. 52 | 53 | ```c 54 | // example_clang/main.c 55 | 56 | int addFour(int x) { 57 | return x + 4; 58 | } 59 | 60 | 61 | int main(){ 62 | int four = addFour(0); 63 | } 64 | 65 | ``` 66 | 67 | ```bash 68 | clang -cc1 -ast-dump name_of_file.c 69 | ``` 70 | 71 | If we scrape out some of the extraneous information we can see Clang's AST for this code: 72 | ``` 73 | |-FunctionDecl 0x7f9247882400 line:1:5 used addFour 'int (int)' 74 | | |-ParmVarDecl 0x7f92478316d8 col:17 used x 'int' 75 | | `-CompoundStmt 0x7f9247882590 76 | | `-ReturnStmt 0x7f9247882578 77 | | `-BinaryOperator 0x7f9247882550 'int' '+' 78 | | |-ImplicitCastExpr 0x7f9247882538 'int' 79 | | | `-DeclRefExpr 0x7f92478824f0 'int' lvalue ParmVar 0x7f92478316d8 'x' 'int' 80 | | `-IntegerLiteral 0x7f9247882518 'int' 4 81 | `-FunctionDecl 0x7f92478825f8 line:6:5 main 'int ()' 82 | `-CompoundStmt 0x7f92478827e8 83 | `-DeclStmt 0x7f92478827d0 84 | `-VarDecl 0x7f92478826b0 col:6 four 'int' cinit 85 | `-CallExpr 0x7f92478827a0 'int' 86 | |-ImplicitCastExpr 0x7f9247882788 'int
(*)(int)' 87 | | `-DeclRefExpr 0x7f9247882710 'int (int)' Function 0x7f9247882400 'addFour' 'int (int)' 88 | `-IntegerLiteral 0x7f9247882738 'int' 0 89 | 90 | ``` 91 | 92 | Our parser is going to parse binary operations by taking a somewhat novel yet simple approach. In basic terms, when our parser reaches an expression, it will append each number literal node and make it the active node in anticipation of an impending binary node. If a binary node is reached, it then seeks the suitable insertion point in the AST and promotes itself to that position, adopting that node's children and itself becoming the new active node in the parse tree. 93 | 94 | If you are like me, the above words might sound pretty confusing, so a picture is worth a thousand words: 95 | 96 | ``` 97 | watch as it parses the expression 2 * 5 + 3 in sequence 98 | 99 | step 1 Expression node is added with first token value - literal value 2 active 100 | Root Node 101 | Expression Node 102 | Literal Node 2 (Active) 103 | 104 | step 2 A binary operator is reached, promoted, inherits literal as its child, and is now the active node 105 | Root Node 106 | Expression Node 107 | Operator Node * (Active) 108 | Literal Node 2 109 | 110 | step 3 Next literal is appended to active node and then itself becomes active 111 | Root Node 112 | Expression Node 113 | Operator Node * 114 | Literal Node 2 115 | Literal Node 5 (Active) 116 | 117 | step 4 New operator is reached, is lower precedence than * operator so is therefore promoted twice, and is new active node 118 | Root Node 119 | Expression Node 120 | Operator Node + (Active) 121 | Operator Node * 122 | Literal Node 2 123 | Literal Node 5 124 | 125 | step 5 Final token is reached and appended to active node, resolving expression AST 126 | Root Node 127 | Expression Node 128 | Operator Node + 129 | Literal Node 3 130 | Operator Node * 131 | Literal Node 2 132 | Literal Node 5 133 | ``` 134 | 135 | ``` 136 | parsing : 2 * 3 + (4 * (5 + 6) * 7) + 8 * 9 137 | Root 138 |
Expression Node 139 | Operator Node + 140 | Operator Node + 141 | Expression Node (4 * (5 + 6) * 7) 142 | Operator Node * 143 | Operator Node * 144 | Expression Node (5 + 6) 145 | Operator Node + 146 | Literal Node 5 147 | Literal Node 6 148 | Literal Node 7 149 | Literal Node 4 150 | Operator Node * 151 | Literal Node 8 152 | Literal Node 9 153 | Operator Node * 154 | Literal Node 2 155 | Literal Node 3 156 | 157 | 158 | ``` 159 | 160 | Here are some sketches of this process to help you visualize. 161 | 162 | Blue means this node is new this step; red means it is both new and currently active. 163 | 164 | In this example we are using the expression: 165 | 166 | ```crystal 167 | 2 * 3 + (4 * (5 + 6 * 7) + 8) * 9 - 1 168 | ``` 169 | 170 | Step 1 - Root node and main expression node generated. 171 | 172 | ![BDMAS Parsing Stage 1](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_1.png) 173 | 174 | Step 2 - Begin parsing main expression, append 2 literal node. 175 | 176 | ![BDMAS Parsing Stage 2](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_2.png) 177 | 178 | Step 3 - Multiplication operator is promoted, and literal 3 appended to it. 179 | 180 | ![BDMAS Parsing Stage 3](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_3.png) 181 | 182 | Step 4 - Addition operator is promoted to top, the multiply operator becomes its child, and the parenthesis expression is also appended as its child. 183 | 184 | ![BDMAS Parsing Stage 4](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_4.png) 185 | 186 | Step 5 - First 4 literal is appended to expression node, then multiplication operator is promoted, and then expression node is appended to multiplication node as active node.
187 | 188 | ![BDMAS Parsing Stage 5](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_5.png) 189 | 190 | Step 6 - Inner parenthesis expression is parsed, 5 literal to expression node, addition is promoted, 6 literal to addition node, multiplication is promoted, 7 literal to multiplication node. 191 | 192 | ![BDMAS Parsing Stage 6](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_6.png) 193 | 194 | Step 7 - First parenthesis closes so active node jumps to closest expression node parent and then immediately promotes addition binary and appends 8 literal as its child. 195 | 196 | ![BDMAS Parsing Stage 7](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_7.png) 197 | 198 | Step 8 - Second parenthesis closes so active node jumps to closest expression node parent and then immediately promotes multiplication, appends literal 9 to multiplication, double promotes subtraction, and then finally appends 1 literal to subtraction node. 199 | 200 | ![BDMAS Parsing Stage 8](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/BDMAS_8.png) 201 | 202 | Visualize how the active node is changing as it parses the expressions and inner expressions. 203 | 204 | The above approach allows us to parse each token in sequence and handle parenthesis scope because the following is true: 205 | 206 | 1. Expression nodes act as gatekeepers, preventing operators from promotion beyond their borders. 207 | 2. Expression nodes act as beacons, allowing the closing parenthesis to correctly activate the next required node in the parsing process. 208 | 3. Promote and add_child operations relative to the active node will always resolve to the correct place if there are no syntax errors. 209 | 4. We can likely test for these syntax errors and provide user-friendly messages if this situation is detected.
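The promotion mechanics described above can be sketched in a few dozen lines of code. The following illustration is in Ruby rather than Crystal, and its `Node` class, `parse_expression`, and `eval_node` are hypothetical names for this sketch only, not the ones used in src/emerald/parser.cr. It parses a flat token stream using nothing but an active node, promote, and add_child:

```ruby
# Minimal sketch of active-node parsing with operator promotion.
# These names are illustrative only; Emerald's real parser differs.

PRECEDENCE = { "+" => 1, "-" => 1, "*" => 2, "/" => 2 }

class Node
  attr_accessor :value, :children, :parent

  def initialize(value)
    @value = value
    @children = []
    @parent = nil
  end

  def add_child(node)
    node.parent = self
    @children << node
  end

  # Replace this node in the tree with an operator node,
  # adopting self as the operator's child.
  def promote(operator_node)
    @parent.children.delete(self)
    @parent.add_child(operator_node)
    operator_node.add_child(self)
  end
end

def parse_expression(tokens)
  root = Node.new(:expression)
  active = root
  tokens.each do |tok|
    if PRECEDENCE.key?(tok)
      # Climb while the parent is an operator of equal or higher
      # precedence, so a low-precedence operator is promoted past it.
      target = active
      while PRECEDENCE.key?(target.parent.value.to_s) &&
            PRECEDENCE[target.parent.value.to_s] >= PRECEDENCE[tok]
        target = target.parent
      end
      op = Node.new(tok)
      target.promote(op)
      active = op
    else
      literal = Node.new(tok.to_i)
      active.add_child(literal)
      active = literal
    end
  end
  root
end

# Tiny evaluator to show the tree resolves with correct precedence.
def eval_node(node)
  case node.value
  when :expression then eval_node(node.children[0])
  when "+" then eval_node(node.children[0]) + eval_node(node.children[1])
  when "-" then eval_node(node.children[0]) - eval_node(node.children[1])
  when "*" then eval_node(node.children[0]) * eval_node(node.children[1])
  when "/" then eval_node(node.children[0]) / eval_node(node.children[1])
  else node.value
  end
end

p eval_node(parse_expression(%w[2 * 5 + 3]))     # => 13
p eval_node(parse_expression(%w[2 * 3 + 4 * 5])) # => 26
```

Note how the precedence check decides how far an operator climbs before promotion; that climb is exactly the "promoted twice" behaviour shown in the step-by-step walkthrough above.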
210 | 211 | The algorithm can be simply stated as follows: 212 | 213 | ``` 214 | Whenever an opening parenthesis is encountered, 215 | the active node appends an expression node 216 | which then becomes the new active node 217 | 218 | Whenever a closing parenthesis is encountered, 219 | the active node is recursively changed to its own parent node 220 | until the active node is an expression node. 221 | 222 | This process provides the boundary for the operator nodes 223 | and the designation for the active node once resolved. 224 | ``` 225 | 226 | Here is a high level view of what the parser is doing to generate the AST from a given array of tokens. 227 | 228 | ![Parser Basic](https://raw.githubusercontent.com/Virtual-Machine/llvm-tutorial-book/master/diagrams/img/parser_basic.png) 229 | 230 | #### Next 231 | [Chapter 5 - Code Generator](https://github.com/Virtual-Machine/llvm-tutorial-book/blob/master/chap-5-code-generator.md) -------------------------------------------------------------------------------- /chap-5-code-generator.md: -------------------------------------------------------------------------------- 1 | # Chapter 5 Code Generator 2 | 3 | While the parser's job was to convert an array of tokens into a structured AST, the code generator's job is to make the transition from AST to either intermediate or binary code. Code generation is completed by walking along the nodes of the AST and making sense of the structure. This is a recursive process, as the code generator must walk to the terminal nodes before resolving an expression. The final result will be a module containing all the parsed expressions, which will then be dumped as LLVM IR. 4 | 5 | It can be helpful to have an idea of how a given AST will translate into IR code, even if you do not plan to write that IR yourself. The structure of that IR alone will be informative on how to walk the AST and generate the required builder calls via LLVM's IR Builder API.
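As a sketch of that recursive walk, here is a toy Ruby emitter that prints pseudo-IR strings rather than calling the real builder API. The `Num` and `BinOp` node shapes are assumptions made for this illustration and do not match Emerald's actual node classes:

```ruby
# Toy post-order walk of an AST that emits pseudo-IR strings.
# Node shapes and emitted text are illustrative assumptions;
# the real generator calls LLVM's IR builder instead.

Num   = Struct.new(:value)
BinOp = Struct.new(:op, :lhs, :rhs)

class Emitter
  def initialize
    @counter = 0
    @lines = []
  end

  # Returns the name of the value holding this node's result,
  # emitting instructions for the children first (post-order).
  def walk(node)
    case node
    when Num
      node.value.to_s
    when BinOp
      lhs = walk(node.lhs) # resolve terminal nodes first
      rhs = walk(node.rhs)
      reg = "%#{@counter += 1}"
      opcode = { "+" => "add", "*" => "mul" }.fetch(node.op)
      @lines << "#{reg} = #{opcode} i32 #{lhs}, #{rhs}"
      reg
    end
  end

  def emit(node)
    result = walk(node)
    (@lines + ["ret i32 #{result}"]).join("\n")
  end
end

# two + three * four, with the literals 2, 3, and 4
ast = BinOp.new("+", Num.new(2), BinOp.new("*", Num.new(3), Num.new(4)))
puts Emitter.new.emit(ast)
# %1 = mul i32 3, 4
# %2 = add i32 2, %1
# ret i32 %2
```

The key property is that walk resolves both children before emitting the instruction that combines them, so the deepest sub-expressions appear first in the output, just as in real LLVM IR.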
Just as in the last chapter, we can use clang to emit the AST and the LLVM IR for simple C examples to better understand how an AST translates to LLVM IR. Forgive this ugly C example; I am using variables to prevent Clang from outputting optimized LLVM IR that is non-informative. If you know how to output raw LLVM IR without collapsing number literals please let me know. 6 | 7 | ```c 8 | // example_clang/main2.c 9 | 10 | int main(){ 11 | int number; 12 | int two = 2, three = 3, four = 4; 13 | if (two + three * four < three) { 14 | number = two; 15 | } else { 16 | number = 5; 17 | } 18 | return number; 19 | } 20 | ``` 21 | 22 | ```bash 23 | clang -cc1 -ast-dump name_of_file.c 24 | clang -cc1 -emit-llvm name_of_file.c 25 | ``` 26 | 27 | Simplified AST: 28 | ``` 29 | `-FunctionDecl 0x7fc33a831718 line:1:5 main 'int ()' 30 | `-CompoundStmt 0x7fc33a8829b8 31 | |-DeclStmt 0x7fc33a882470 32 | | `-VarDecl 0x7fc33a882410 col:9 used number 'int' 33 | |-DeclStmt 0x7fc33a882658 34 | | |-VarDecl 0x7fc33a882498 col:9 used two 'int' cinit 35 | | | `-IntegerLiteral 0x7fc33a8824f8 'int' 2 36 | | |-VarDecl 0x7fc33a882528 col:18 used three 'int' cinit 37 | | | `-IntegerLiteral 0x7fc33a882588 'int' 3 38 | | `-VarDecl 0x7fc33a8825b8 col:29 used four 'int' cinit 39 | | `-IntegerLiteral 0x7fc33a882618 'int' 4 40 | |-IfStmt 0x7fc33a882928 41 | | |-<<<NULL>>> 42 | | |-<<<NULL>>> 43 | | |-BinaryOperator 0x7fc33a8827c0 'int' '<' 44 | | | |-BinaryOperator 0x7fc33a882758 'int' '+' 45 | | | | |-ImplicitCastExpr 0x7fc33a882740 'int' 46 | | | | | `-DeclRefExpr 0x7fc33a882670 'int' lvalue Var 0x7fc33a882498 'two' 'int' 47 | | | | `-BinaryOperator 0x7fc33a882718 'int' '*' 48 | | | | |-ImplicitCastExpr 0x7fc33a8826e8 'int' 49 | | | | | `-DeclRefExpr 0x7fc33a882698 'int' lvalue Var 0x7fc33a882528 'three' 'int' 50 | | | | `-ImplicitCastExpr 0x7fc33a882700 'int' 51 | | | | `-DeclRefExpr 0x7fc33a8826c0 'int' lvalue Var 0x7fc33a8825b8 'four' 'int' 52 | | | `-ImplicitCastExpr 0x7fc33a8827a8 'int' 53 | | |
`-DeclRefExpr 0x7fc33a882780 'int' lvalue Var 0x7fc33a882528 'three' 'int' 54 | | |-CompoundStmt 0x7fc33a882878 55 | | | `-BinaryOperator 0x7fc33a882850 'int' '=' 56 | | | |-DeclRefExpr 0x7fc33a8827e8 'int' lvalue Var 0x7fc33a882410 'number' 'int' 57 | | | `-ImplicitCastExpr 0x7fc33a882838 'int' 58 | | | `-DeclRefExpr 0x7fc33a882810 'int' lvalue Var 0x7fc33a882498 'two' 'int' 59 | | `-CompoundStmt 0x7fc33a882908 60 | | `-BinaryOperator 0x7fc33a8828e0 'int' '=' 61 | | |-DeclRefExpr 0x7fc33a882898 'int' lvalue Var 0x7fc33a882410 'number' 'int' 62 | | `-IntegerLiteral 0x7fc33a8828c0 'int' 5 63 | `-ReturnStmt 0x7fc33a8829a0 64 | `-ImplicitCastExpr 0x7fc33a882988 'int' 65 | `-DeclRefExpr 0x7fc33a882960 'int' lvalue Var 0x7fc33a882410 'number' 'int' 66 | 67 | ``` 68 | 69 | Simplified LLVM IR: 70 | ``` 71 | ; Function Attrs: nounwind ssp uwtable 72 | define i32 @main() #0 { 73 | %1 = alloca i32, align 4 74 | %2 = alloca i32, align 4 75 | %3 = alloca i32, align 4 76 | %4 = alloca i32, align 4 77 | %5 = alloca i32, align 4 78 | store i32 0, i32* %1, align 4 79 | store i32 2, i32* %3, align 4 80 | store i32 3, i32* %4, align 4 81 | store i32 4, i32* %5, align 4 82 | %6 = load i32, i32* %3, align 4 83 | %7 = load i32, i32* %4, align 4 84 | %8 = load i32, i32* %5, align 4 85 | %9 = mul nsw i32 %7, %8 86 | %10 = add nsw i32 %6, %9 87 | %11 = load i32, i32* %4, align 4 88 | %12 = icmp slt i32 %10, %11 89 | br i1 %12, label %13, label %15 90 | 91 | ;