# Lego: Kernel lang proposal

Influences: Ioke, Lisp, Ruby, Smalltalk

Authors: José Valim, Yehuda Katz

This proposal outlines the grammar for the Lego kernel language. The Lego kernel language is a minimal specification that provides a simple but elegant syntax meant to serve as building blocks for other languages. It provides operator and container tables so that different languages built on top of Lego can offer significantly different features and syntax.

# Syntax

The Lego language is made of three elements: macros, functions and key-value args (besides a few literals). Here is how we can define a function that sums two variables:

    def(sum(a, b), do: +(a, b))

In this example, we are calling the macro `def` with two arguments: a function call expression (`sum` with values `a` and `b`) and a key-value argument `do` with an expression. We may expand the `do` into a series of expressions. For that, we use additional parentheses:

    def(math(a, b), do: (
      =(c, +(a, b))
      *(c, *(a, b))
    ))

For instance, `if/else` clauses could be implemented as:

    if(some_variable, do: (
      invoke_some_function()
    ), else: (
      done()
    ))

Similarly, a function that sums two numbers could be defined with `fn` as:

    fn(a, b, do: (
      +(a, b)
    ))

Finally, Lego also allows `;` to separate several expressions on the same line:

    def(math(a, b), do: (=(c, +(a, b)); *(c, *(a, b))))

This is the basic specification of the language. So far, there are just five characters reserved by the language: `,` `(` `)` `:` `;`

## Getting rid of the parentheses

The Lego kernel language provides several conveniences to get rid of parentheses:

* Operators;
* Optional parentheses;
* Key-value blocks.
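
Before any of these conveniences is applied, the fully parenthesized kernel form is regular enough to be read by a very small parser. The following is a minimal Python sketch of that idea, not part of the proposal. The tuple shapes (`"call"`, `"kv"`, `"block"`), the function names and the token pattern are all assumptions made for illustration, and literals such as `:atom` are deliberately left out.

```python
import re

# Assumption: a token is one of the five reserved characters, or any run of
# other non-whitespace characters (names, numbers, operator names like `+`).
TOKEN = re.compile(r"[,():;]|[^\s,():;]+")
RESERVED = set(",():;")


def parse(source):
    """Parse one kernel-form expression into nested tuples."""
    tokens = TOKEN.findall(source)
    node, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return node


def parse_expr(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        # Grouping: a series of expressions, optionally separated by `;`.
        body, pos = [], pos + 1
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == ";":
                pos += 1
                continue
            expr, pos = parse_expr(tokens, pos)
            body.append(expr)
        if pos >= len(tokens):
            raise SyntaxError("missing closing parenthesis")
        node, pos = ("block", body), pos + 1
    elif tok in RESERVED:
        raise SyntaxError(f"unexpected token {tok!r}")
    else:
        node, pos = tok, pos + 1
    # Parentheses right after an expression apply to it as a call.
    while pos < len(tokens) and tokens[pos] == "(":
        args, pos = parse_args(tokens, pos + 1)
        node = ("call", node, args)
    return node, pos


def parse_args(tokens, pos):
    """Parse call arguments up to the closing `)`; commas are skipped leniently."""
    args = []
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == ",":
            pos += 1
            continue
        is_key = (
            tokens[pos] not in RESERVED
            and pos + 1 < len(tokens)
            and tokens[pos + 1] == ":"
        )
        if is_key:
            # A name followed by `:` starts a key-value argument.
            key = tokens[pos]
            value, pos = parse_expr(tokens, pos + 2)
            args.append(("kv", key, value))
        else:
            expr, pos = parse_expr(tokens, pos)
            args.append(expr)
    if pos >= len(tokens):
        raise SyntaxError("missing closing parenthesis")
    return args, pos + 1
```

Calling `parse("def(sum(a, b), do: +(a, b))")` yields a `"call"` node for `def` whose arguments are the `sum(a, b)` call and a `"kv"` node for `do`. The sketch discards whitespace, so newline-separated expressions inside a group are told apart only because every kernel expression starts with a name. An expression followed by a parenthesized group on the next line would be read as a call, which a real implementation would have to address.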

### Operators

Lego provides operators and an operator table. For instance, the following example (already shown above):

    def(math(a, b), do: (
      =(c, +(a, b))
      *(c, *(a, b))
    ))

Could be rewritten as:

    def(math(a, b), do: (
      c = a + b
      c * a * b
    ))

The Lego operator table will be able to handle unary, binary and ternary operators. Every entry in the operator table has a precedence associated with it. A language built on top of Lego may optionally expose the operator table to developers so they can add their own operators at runtime (similar to the Io language).

Notice that an operator is not limited to symbols. If a language wishes, `div` could be defined as a binary operator in the operator table. Guards are a practical example, which could be implemented as:

    def(math(a, b) when is_number(a) and is_number(b), do: (
      c = a + b
      c * a * b
    ))

The example above could be implemented by defining `when` as a binary operator and assigning a low precedence to it.

### Optional parentheses on macro/function calls

The second convenience provided by Lego is optional parentheses on macro/function calls. Our `math` example could now be refactored to:

    def math(a, b), do: (
      c = a + b
      c * a * b
    )

#### Solving optional parentheses and operators ambiguity

If an operator can be used in both unary and binary forms and the language also allows optional parentheses, the language will have some ambiguity.
Here are some non-ambiguous examples:

    some_function() + 1
    some_function(+1)

If we remove the parentheses, the parser no longer knows how to handle both expressions without adding a special case:

    some_function + 1
    some_function +1
    some_function+1

To solve this problem, this specification defines that the three forms above translate respectively to:

    some_function() + 1
    some_function(+1)
    some_function() + 1

In general, whitespace should be ignored by Lego implementations, except in the scenario above, where whitespace must be used to remove ambiguity. This follows the same rules as the Ruby programming language parser.

Finally, it is important to notice that, once an operator is added to the operator table, calling it in function form requires explicit parentheses. For instance, both of these expressions are valid:

    +(1, 2)
    1 + 2

But `+ 1, 2` isn't.

### Key-value blocks

The third convenience provided by Lego is key-value blocks. Key-value blocks encapsulate the common patterns in the language and allow us to provide key-value args with expressions without the need for parentheses. With this feature, our `math` example could be rewritten as:

    def math(a, b) do
      c = a + b
      c * a * b
    end

Everything delimited by the newly introduced `do`/`end` keywords is passed as the value of the `do:` key-value argument. Key-value blocks become more useful when we add other keywords to the block. For example, our `if`/`else` example already shown above:

    if(some_variable, do: (
      invoke_some_function()
    ), else: (
      done()
    ))

Could now be rewritten as:

    if some_variable do
      invoke_some_function
    else:
      done
    end

This is similar to Ruby blocks.
In fact, if we want to insert parentheses, they would be inserted as follows:

    if(some_variable) do
      invoke_some_function
    else:
      done
    end

Key-value blocks work the same as key-value args, with one important difference: key-value blocks allow multiple values for the same key. This is convenient for implementing `case`/`when` (also known as `switch`/`case` in some languages):

    case some_var do
    when: 0
    when: 1
      puts "is zero or one"
    when: 2
      puts "is two"
    when:
      puts "none of above"
    end

Finally, key-value blocks also have an alternative syntax, `->`/`end`. This alternative syntax is important because, while `do`/`end` binds to the farthest function call, `->`/`end` binds to the closest. For instance:

    foo bar do
      some_call
    end

Is the same as:

    foo(bar) do
      some_call
    end

However:

    foo bar ->
      some_call
    end

Is the same as:

    foo(bar ->
      some_call
    end)

This difference is crucial when working with functions. For example:

    System.add_callback fn(state) ->
      print state
    end

Using `do` instead of `->` would likely cause an error, because the block would bind to `System.add_callback` rather than to `fn`, leaving the function without an implementation.

## Data types

This section specifies a syntax mechanism for the implementation of custom data types besides literals.

### Literals

The following are literals in Lego:

    :atom
    1
    2.0
    100_000

### Containers

In order to support the definition of other data types, Lego provides the concept of containers. A container uses two delimiters to mark the contained elements.
For instance, here is how a list or an array could be defined (as in the examples above):

    [1,2,3]

Which internally is identified and translated to:

    [](1, 2, 3)

Another data structure could similarly be defined as:

    { 1, 2, 3 }

Which then translates to:

    {}(1, 2, 3)

And so forth. Since those are simply macro/function calls, we could implement Ruby-like hashes using keyword args:

    { a: foo, b: bar }
    {}(a: foo, b: bar)

## Parenthesis applicability

In Lego, parentheses may apply to any expression, although their behavior may or may not be supported by the language. For instance, a language could treat all expressions as functions, so all of the following should be syntactically valid:

    1(2)
    [1,2,3](0)

A language may also allow parentheses to be applied to a special operator. For instance, imagine an implementation where `.` is a binary operator:

    foo.bar(1, 2)

This example would translate to the form below, which in a language like Ruby would mean method dispatching:

    .(foo, bar)(1, 2)

Lego also supports applying parentheses as in the expression below:

    foo.(1,2)

Such an example would translate to:

    .(foo)(1,2)

This translation is similar to that of a unary operator.

## Wrapping up

So far, we have detailed the syntax of the language and introduced conveniences. With the macro mechanism, we were able to avoid defining several keywords, and with a few syntax additions, the language looks pleasant and flexible to work with.

The keywords are limited to: `,` `(` `)` `:` `;` `do` `end` `{` `}`

# BNF grammar sample

TODO

## Invalid syntax examples

In this section, we describe some examples that are invalid code according to this specification.

### Wrong usage of parenthesis

In Lego, parentheses are used for grouping expressions or making calls (see the Parenthesis applicability section above). Any other usage of parentheses is invalid. For example, this common syntax for functions is invalid:

    (a, b) -> (a + b)

The reason is that `(a, b)` is supposed to apply to an expression, but it does not apply to anything, and it is not a valid grouping either.