├── .gitignore
├── LICENSE
├── Makefile
├── README.md
├── aocla.c
└── examples
    ├── cat.aocla
    ├── fib.aocla
    ├── firstrest.aocla
    ├── foreach.aocla
    ├── map.aocla
    └── rec-for.aocla


/.gitignore:
--------------------------------------------------------------------------------
1 | aocla
2 | *.dSYM
3 | DESIGN.md
4 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | Copyright (c) 2022, Salvatore Sanfilippo <antirez@gmail.com>
 2 | All rights reserved.
 3 | 
 4 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
 5 | 
 6 |     * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
 7 |     * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
 8 | 
 9 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
10 | 


--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
 1 | all: aocla
 2 | 
 3 | SANITIZE=-fsanitize=address
 4 | 
 5 | aocla: aocla.c
 6 | 	$(CC) -g -ggdb aocla.c -Wall -W -pedantic -O2 \
 7 | 	      $(SANITIZE) -o aocla
 8 | 
 9 | clean:
10 | 	rm -rf aocla *.dSYM
11 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | ## Aocla: the Advent of Code toy language
  2 | 
  3 | Aocla (Advent of Code inspired Language) is a toy stack-based programming
  4 | language written as an extension of [day 13 Advent of Code 2022 puzzle](https://adventofcode.com/2022/day/13).
  5 | 
  6 | This story starts with me doing Advent of Code for the first time in my life. I hadn't written a single line of code for two years, busy, as I was, writing my [sci-fi novel](https://www.amazon.com/Wohpe-English-Rimmel-Salvatore-Sanfilippo-ebook/dp/B0BQ3HRDPF/). I wanted to start coding again, but without a project in my hands, what to do? The AoC puzzles helped quite a lot, at first, but they become repetitive and a bit futile quite soon. After completing day 13, a puzzle about comparing nested lists, I saw many other solutions resorting to `eval`. They are missing the point, I thought. To me, the puzzle seemed a hint at writing parsers for nested objects.
  7 | 
  8 | Now, a nice fact about parsers of lists with integers and nested
  9 | lists is that they are dangerously near to become interpreters of Lisp-alike or FORTH-alike programming languages.
 10 | 
 11 | The gentle reader should be aware that I've a soft spot for [little languages](http://oldblog.antirez.com/page/picol.html). However, Picol was too much of a toy, while [Jim](http://jim.tcl.tk/index.html/doc/www/www/index.html) was too big as a coding example. Other than interpreters, I like writing small programs that serve as [examples](https://github.com/antirez/kilo) of how you could design bigger programs, while retaining a manageable size. Don't take me wrong: it's not like I believe my code should be taken as an example, it's just that I learned a lot from such small programs, so, from time to time, I like writing new ones, and while I'm at it I share them in the hope somebody could be interested. This time I wanted to obtain something of roughly the size of the Kilo editor, that is around ~1000 lines of code, showing the real world challenges arising when writing an actual interpreter for a programming language more complex than Picol. That's the result, and as a side effect I really started programming again: after Aocla I wrote more and more code, and now [I've a new project, too](https://github.com/antirez/protoview).
 12 | 
 13 | ## Let's start
 14 | 
 15 | This README will first explain the language briefly. Later we will talk extensively about the implementation and its design. Without counting comments, the Aocla implementation is shorter than 1000 lines of code, and the core itself is around 500 lines (the rest of the code is the library implementation, the REPL, and other accessory parts). I hope you will find the code easy to follow, even if you are not used to C and to writing interpreters. I tried to keep all simple, as I always do when I write code, for myself and the others having the misfortune of reading or modifying it in the future.
 16 | 
 17 | Not every feature I desired to have is implemented, and certain data types, like the string type, lack any useful procedure to work with them. This choice was made in order to avoid making the source code more complex than needed, and also, on my side, to avoid writing too much useless code, given that this language will never be used in the real world. Besides, implementing some of the missing parts is a good exercise for the willing reader, assuming she or he are new to this kind of stuff. Even with all this limitations, it is possible to write small working programs with Aocla, and that's all we need for our goals.
 18 | 
 19 | ## Aocla overview
 20 | 
 21 | Aocla is a very simple language, more similar to Joy than to FORTH (higher level). It has a total of six datatypes:
 22 | 
 23 | * Lists: `[1 2 3 "foo"]`
 24 | * Symbols: `mysymbol`, `==` or `$x`
 25 | * Integers: `500`
 26 | * Booleans: `#t` or `#f`
 27 | * Tuples: `(x y z)`
 28 | * Strings: `"Hello World!\n"`
 29 | 
 30 | Floating point numbers are not provided for simplicity (writing an implementation should not be too hard, and is a good exercise). Aocla programs are valid Aocla lists, so the language is [homoiconic](https://en.wikipedia.org/wiki/Homoiconicity). While Aocla is a stack-based language, like FORTH, Joy and Factor, it introduces the idea of *local variables capturing*. Because of this construct, Aocla programs look a bit different (and simpler to write and understand in my opinion) compared to other stack-based languages. Locals capturing is optional: any program using locals can be rewritten to avoid using them, yet the existence of this feature deeply affects the language in many ways.
 31 | 
 32 | ## Our first program
 33 | 
 34 | The following is a valid Aocla program, taking 5 and squaring it, to obtain 25.
 35 | 
 36 |     [5 dup *]
 37 | 
 38 | Since all the programs must be valid lists, and thus are enclosed between `[` and `]`, both the Aocla CLI (Command Line Interface) and the execution of programs from files are designed to avoid needing the brackets. Aocla will put the program inside `[]` for you, so the above program should be written like that:
 39 | 
 40 |     5 dup *
 41 | 
 42 | Programs are executed from left to right, *word by word*. If a word is not a symbol nor a tuple, its execution results into pushing its value on the stack. Symbols will produce a procedure call: the symbol name will be looked up in the table of procedures, and if a procedure with a matching name is found, it gets called. So the above program will perform the following steps:
 43 | 
 44 | * `5`: the value 5 is pushed on the stack. The stack will contain `(5)`.
 45 | * `dup`: is a symbol. A procedure called `dup` is looked up and executed. What `dup` does is to take the top value on the stack and duplicate it, so now the stack will contain `(5 5)`.
 46 | * `*`: is another symbol. The procedure is called. It will take the last two elements on the stack, check if they are integers, multiply them together and push the result on the stack. Now the stack will contain `(25)`.
 47 | 
 48 | If an Aocla word is a tuple, like `(x y)`, its execution has the effect of removing a corresponding number of elements from the stack and binding them to the local variables having the specified names:
 49 | 
 50 |     10 20 (x y)
 51 | 
 52 | After the above program is executed, the stack will be empty and the local variables x and y will contain 10 and 20.
 53 | 
 54 | Finally, if an Aocla word is a symbol starting with the `$` character and a single additional character, the object stored at the specified variable is pushed on the stack. So the program to square 5 we wrote earlier can be rewritten as:
 55 | 
 56 |     5 (x) $x $x *
 57 | 
 58 | The ability to capture stack values into locals allow to make complex stack manipulations in a simple way, and makes programs more explicit to read and easier to write. Still locals have the remarkable quality of not making the language semantically more complex (if not for a small thing we will cover later -- search `upeval` inside this document if you want to know ASAP, but if you know the Tcl programming language, you already understood from the name, that is similar to Tcl's `uplevel`). In general, while locals help the handling of the stack in the local context of the procedure, words communicate via the stack, so the main advantages of stack-based languages are untouched.
 59 | 
 60 | *Note: why locals must have just single letter names? The only reason is to make the implementation of the Aocla interpreter simpler to understand. This way, we don't need to make use of any dictionary data structure. If I would design Aocla to be a real language, I would remove this limitation.*
 61 | 
 62 | We said that symbols normally trigger a procedure call. But symbols can also be pushed on the stack like any other value. To do so, symbols must be quoted, with the `'` character at the start.
 63 | 
 64 |     'Hello printnl
 65 | 
 66 | The `printnl` procedure prints the last element on the stack and also prints a newline character, so the above program will just print `Hello` on the screen. You may wonder what's the point of quoting symbols. After all, you could just use strings, but later we'll see how this is important in order to write Aocla programs that write Aocla programs.
 67 | 
 68 | Quoting also works with tuples, so if you want to push the tuple `(a b c)` on the stack, instead of capturing the variables a, b and c, you can write:
 69 | 
 70 |     '(a b c) printnl
 71 | 
 72 | ## Inspecting the stack content
 73 | 
 74 | When you start the Aocla interpreter without a file name, it gets executed
 75 | in REPL mode (Read Eval Print Loop). You write a code fragment, press enter, the code gets executed and the current state of the stack is shown:
 76 | 
 77 |     aocla> 1
 78 |     1
 79 |     aocla> 2
 80 |     1 2
 81 |     aocla> [a b "foo"]
 82 |     1 2 [a b "foo"]
 83 | 
 84 | This way you always know the stack content.
 85 | When you execute programs from files, in order to debug their executions you can print the stack content using the `showstack` procedure.
 86 | 
 87 | ## User defined procedures
 88 | 
 89 | Aocla programs are just lists, and Aocla functions are lists bound to a
 90 | name. The name is given as a symbol, and the way to bind a list with a
 91 | symbol is an Aocla procedure itself. No special syntax is required.
 92 | 
 93 |     [dup *] 'square def
 94 | 
 95 | The `def` procedure will bind the list `[dup *]` to the `square` symbol,
 96 | so later we can use the `square` symbol and it will call our procedure:
 97 | 
 98 |     aocla> 5 square
 99 |     25
100 | 
101 | Calling a symbol that is not bound to any list will produce an error:
102 | 
103 |     aocla> foobar
104 |     Symbol not bound to procedure: 'foobar' in unknown:0
105 | 
106 | ## Working with lists
107 | 
108 | Lists are the central data structure of the language: they are used to represent programs and are useful as a general purpose data structure to represent data. So most of the very few built-in procedures that Aocla offers are lists manipulation procedures.
109 | 
110 | The more direct way to show how to write Aocla programs is probably showing examples via its REPL, so I'll procede in this way.
111 | 
112 | To push an empty list on the stack, you can use:
113 | 
114 |     aocla> []
115 |     []
116 | 
117 | Then it is possible to add elements to the tail or the head of the list using the `<-` and `->` procedures:
118 | 
119 |     aocla> 1 swap ->
120 |     [1] 
121 |     aocla> 2 swap ->
122 |     [1 2] 
123 | 
124 | Note that these procedures are designed to insert the penultimate element on the
125 | stack into the list that is the last element on the stack, so,
126 | in this specific case, we have to swap the order of the last two elements
127 | on the stack before calling `->`. It is possible to design these procedures
128 | in a different way, that is: to the expect `list, element` on the stack instead
129 | of `element, list`. There is no clear winner: one or the other approach is
130 | better or worse depending on the use case (but I believe I didn't write enough Aocla code to really pick the best way). In Aocla, local variables make
131 | all this less important compared to other stack based languages. It is always
132 | possible to make things more explicit, like in the following example:
133 | 
134 |     aocla> [1 2 3] 
135 |     [1 2 3] 
136 |     aocla> (l) 4 $l ->
137 |     [1 2 3 4] 
138 |     aocla> (l) 5 $l ->
139 |     [1 2 3 4 5] 
140 | 
141 | Then, to know how many elements there are in the list, we can use the
142 | `len` procedure, that also works for other data types:
143 | 
144 |     aocla> ['a 'b 1 2] 
145 |     [a b 1 2] 
146 |     aocla> len
147 |     4 
148 |     aocla> "foo"
149 |     4 "foo" 
150 |     aocla> len
151 |     4 3 
152 | 
153 | Other useful list operations are the following:
154 | 
155 |     aocla> [1 2 3] [4 5 6] cat
156 |     [1 2 3 4 5 6]
157 |     aocla> [1 2 3] first
158 |     1
159 |     aocla> [1 2 3] rest
160 |     [2 3]
161 | 
162 | *Note: cat also works with strings, tuples, symbols.*
163 | 
164 | There is, of course, map:
165 | 
166 |     aocla> [1 2 3] [dup *] map
167 |     [1 4 9]
168 | 
169 | If you want to do something with list elements, in an imperative way, you can use foreach:
170 | 
171 |     aocla> [1 2 3] [printnl] foreach
172 |     1
173 |     2
174 |     3
175 | 
176 | There are a few more list procedures. There is `get@` to get a specific element in a given position, `sort`, to sort a list, and if I remember correctly nothing
177 | more about lists. Many of the above procedures are implemented inside the
178 | C source code of Aocla, in Aocla language itself. Others are implemented
179 | in C because of performance concerns or because it was simpler to do so.
180 | For instance, this is the implementation of `foreach`:
181 | 
182 |     [(l f) // list and function to call with each element.
183 |         $l len (e)  // Get list len in "e"
184 |         0 (j)       // j is our current index
185 |         [$j $e <] [
186 |             $l $j get@  // Get list[j]
187 |             $f upeval   // We want to evaluate in the context of the caller
188 |             $j 1 + (j)  // Go to the next index
189 |         ] while
190 |     ] 'foreach def
191 | 
192 | As you can see from the above code, Aocla syntax also supports comments.
193 | Anything starting from `//` to the end of the line is ignored.
194 | 
195 | ## Conditionals
196 | 
197 | Aocla conditionals are just `if` and `ifelse`. There is also a
198 | quite imperative looping construct, that is `while`. You could loop
199 | in the Scheme way, using recursion, but I wanted to give the language
200 | a Common Lisp vibe, where you can write imperative code, too.
201 | 
202 | The words `if` and `ifelse` do what you could imagine:
203 | 
204 |     aocla> 5 (a)
205 |     5
206 |     aocla> [$a 2 >] ["a is > 2" printnl] if
207 |     a is > 2
208 | 
209 | So `if` takes two programs (two lists), one is evaluated to see if it is
210 | true or false. The other is executed only if the first program is true.
211 | 
212 | The `ifelse` procedure works similarly, but it takes three programs: condition, true-program, false-program:
213 | 
214 |     aocla> 9 (a)
215 |     aocla> [$a 11 ==] ["11 reached" printnl] [$a 1 + (a)] ifelse
216 |     aocla> [$a 11 ==] ["11 reached" printnl] [$a 1 + (a)] ifelse
217 |     aocla> [$a 11 ==] ["11 reached" printnl] [$a 1 + (a)] ifelse
218 |     11 reached
219 | 
220 | And finally, an example of while:
221 | 
222 |     aocla> 10 [dup 0 >] [dup printnl 1 -] while
223 |     10
224 |     9
225 |     8
226 |     7
227 |     6
228 |     5
229 |     4
230 |     3
231 |     2
232 |     1
233 | 
234 | Or, for a longer but more recognizable program making use of Aocla locals:
235 | 
236 |     aocla> 10 (x) [$x 0 >] [$x printnl $x 1 - (x)] while
237 |     10
238 |     9
239 |     8
240 |     7
241 |     6
242 |     5
243 |     4
244 |     3
245 |     2
246 |     1
247 | 
248 | In some way, two programming styles are possible: one that uses the stack
249 | mainly in order to pass state from different procedures, and otherwise
250 | uses locals a lot for local state, and another one where almost everything
251 | will use the stack, like in FORTH. Even in the second case, locals can be
252 | used from time to time when stack manipulation is more clear using them.
253 | For instance Imagine I've three values on the stack:
254 | 
255 |     aocla> 1 2 3
256 |     1 2 3
257 | 
258 | If I want to sum the first and the third, and leave the second one
259 | on the stack, even in a programming style where the code mainly uses
260 | the stack to hold state, one could write:
261 | 
262 |     aocla> (a _ b) $_ $a $b +
263 |     2 4 
264 | 
265 | ## Evaluating lists
266 | 
267 | Words like `map` or `foreach` are written in Aocla itself. They are not 
268 | implemented in C, even if they could and probably should for performance
269 | reasons (and this is why `while` is implemented in C).
270 | 
271 | In order to implement procedures that execute code, Aocla provides the
272 | `eval` built-in word. It just consumes the list at the top of the
273 | stack and evaluates it.
274 | 
275 |     aocla> 5 [dup dup dup] eval
276 |     5 5 5 5
277 | 
278 | In the above example, we executed the list containing the program that calls
279 | `dup` three times. Let's write a better example, a procedure that executes
280 | the same code a specified number of times:
281 | 
282 |     [(n l)
283 |         [$n 0 >]
284 |         [$l eval $n 1 - (n)]
285 |         while
286 |     ] 'repeat def
287 | 
288 | Example usage:
289 | 
290 |     aocla> 3 ["Hello!" printnl] repeat
291 |     Hello!
292 |     Hello!
293 |     Hello!
294 | 
295 | ## Eval and local variables
296 | 
297 | There is a problem with the above implementation of `repeat`, it does
298 | not mix well with local variables. The following program will not have the expected behavior:
299 | 
300 |     aocla> 10 (x) 3 [$x printnl] repeat
301 |     Unbound local var: '$x' in eval:0  in unknown:0
302 | 
303 | Here the problem is that once we call a new procedure, that is `repeat`,
304 | the local variable `x` no longer exists in the context of the called
305 | procedure. It belongs to the previous procedure, that is, in this specific
306 | case, the *top level* execution stack frame. So when `repeat` evaluates our
307 | program we get an error.
308 | 
309 | This is the only case where Aocla local variables make the semantics of
310 | Aocla more complex than other stack based languages without this feature.
311 | In order to solve the problem above, Aocla has a special form of
312 | `eval` called `upeval`: it executes a program in the context
313 | (again, stack frame, in low level terms) of the caller. Let's rewrite
314 | the `repeat` procedure using `upeval`:
315 | 
316 |     [(n l)
317 |         [$n 0 >]
318 |         [$l upeval $n 1 - (n)]
319 |         while
320 |     ] 'repeat def
321 | 
322 | After the change, it works as expected:
323 | 
324 |     aocla> 10 (x) 3 [$x printnl] repeat
325 |     10
326 |     10
327 |     10
328 | 
329 | Now, out of the blue, without even knowing how Aocla is implemented,
330 | let's check the C implementation of `uplevel`:
331 | 
332 |     /* Like eval, but the code is evaluated in the stack frame of the calling
333 |      * procedure, if any. */
334 |     int procUpeval(aoclactx *ctx) {
335 |         if (checkStackType(ctx,1,OBJ_TYPE_LIST)) return 1;
336 |         obj *l = stackPop(ctx);
337 |         stackframe *saved = NULL;
338 |         if (ctx->frame->prev) {
339 |             saved = ctx->frame;
340 |             ctx->frame = ctx->frame->prev;
341 |         }
342 |         int retval = eval(ctx,l);
343 |         if (saved) ctx->frame = saved;
344 |         release(l);
345 |         return retval;
346 |     }
347 | 
348 | What happens here is quite clear: we check to see if the stack contains
349 | a list, as top level element. If so, we capture that value in the variable
350 | `l`, then save the current stack frame, that contains our local variables
351 | for the current procedure, and substitute it with the *previous procedure*
352 | stack frame. Now we can call `eval()` and finally restore the original
353 | stack frame.
354 | 
355 | ## Creating programs at runtime
356 | 
357 | Aocla is homoiconic, as we already said earlier. Programs are
358 | represented with the same data structures that Aocla code can manipulate.
359 | Because of that, we can write programs writing programs. For instance let's
360 | create a program that creates a procedure incrementing a variable of
361 | the specified name.
362 | 
363 | The procedure expects two objects on the stack: the name of the procedure
364 | we want to create, and the variable name that the procedure will increment. Two symbols, basically:
365 | 
366 |     proc-name, var-name
367 | 
368 | This is the listing of the procedure. Even if each line is commented, being
369 | written in a language that you didn't know until ten minutes ago, and even
370 | a strange enough language, you may want to carefully read each word it is
371 | composed of.
372 | 
373 |     [ (p v) // Procedure, var.
374 |         []                      // Accumulate our program into an empty list
375 |         '$ $v cat swap ->       // Push $<varname> into the stack
376 |         1 swap ->               // Push 1
377 |         '+ swap ->              // Call +
378 |         $v [] -> make-tuple swap -> // Capture back value into <varname>
379 |         [] ->                       // Put all into a nested list
380 |         'upeval swap ->             // Call upeval against the program
381 |         $p def // Create the procedure  // Bind to the specified proc name
382 |     ] 'create-incrementing-proc def
383 | 
384 | Basically calling `create-incrementing-proc` will end generating
385 | a list like that (you can check the intermediate results by adding
386 | `showstack` calls in your programs):
387 | 
388 |     [[$x 1 + (x)] upeval]
389 | 
390 | And finally the list is bound to the specified symbol using `def`.
391 | 
392 | *Note: programs like the above show that, after all, maybe the `->` and `<-` operators should expect the arguments in reverse order. Maybe I'll change my mind.*
393 | 
394 | Certain times, programs that write programs can be quite useful. They are a
395 | central feature in many Lisp dialects. However in the specific case of
396 | Aocla, different procedures can be composed via the stack, and we also
397 | have `uplevel`, so I feel their usefulness is greatly reduced. Also note
398 | that if Aocla was a serious language, it would have a lot more constructs
399 | to make writing programs that write programs a lot simpler than the above. Anyway, as you saw earlier, when we implemented the `repeat` procedure, in Aocla
400 | it is possible to do interesting stuff without using this programming paradigm.
401 | 
402 | Ok, I think that's enough. We saw the basic of stack languages, the specific
403 | stuff Aocla adds and how the language feels like. This isn't a course
404 | on stack languages, nor I would be the best person to talk about the
405 | argument. This is a course on how to write a small interpreter in C, so
406 | let's dive into the Aocla interpreter internals.
407 | 
408 | # From puzzle 13 to Aocla
409 | 
410 | At the start of this README I told you Aocla started from an Advent of
411 | Code puzzle. The Puzzle could be solved by parsing the literal representation
412 | of lists like the one below, and then writing a comparison function for
413 | the the list internal representation (well, actually this is how I solved it,
414 | but one could even take the approach of comparing *while* parsing,
415 | probably). This is an example of such lists:
416 | 
417 |         [1,[2,[3,[4,[5,6,7]]]],8,9]
418 | 
419 | Parsing flat lists is not particularly hard, however this is
420 | not a single-level object. It has elements that are sub-lists. So
421 | a recursive parser was the most obvious solution. This is what I wrote
422 | back then, the 13th of December:
423 | 
424 |     /* This describes our elf object type. It can be used to represent
425 |      * nested lists of lists and/or integers. */
426 |     #define ELFOBJ_TYPE_INT  0
427 |     #define ELFOBJ_TYPE_LIST 1
428 |     typedef struct elfobj {
429 |         int type;       /* ELFOBJ_TYPE_... */
430 |         union {
431 |             int i;      /* Integer value. */
432 |             struct {    /* List value. */
433 |                 struct elfobj **ele;
434 |                 size_t len;
435 |             } l;
436 |         } val;
437 |     } elfobj;
438 | 
439 | Why `elfobj`? Well, because it was Christmas and AoC is about elves.
440 | The structure above is quite trivial, just two types and a union in order
441 | to represent both types.
442 | 
443 | Let's see the parser, that is surely more interesting.
444 | 
445 |     /* Given the string 's' return the elfobj representing the list or
446 |      * NULL on syntax error. '*next' is set to the next byte to parse, after
447 |      * the current value was completely parsed. */
448 |     elfobj *parseList(const char *s, const char **next) {
449 |         elfobj *obj = elfalloc(sizeof(*obj));
450 |         while(isspace(s[0])) s++;
451 |         if (s[0] == '-' || isdigit(s[0])) {
452 |             char buf[64];
453 |             size_t len = 0;
454 |             while((*s == '-' || isdigit(*s)) && len < sizeof(buf)-1)
455 |                 buf[len++] = *s++;
456 |             buf[len] = 0;
457 |             obj->type = ELFOBJ_TYPE_INT;
458 |             obj->val.i = atoi(buf);
459 |             if (next) *next = s;
460 |             return obj;
461 |         } else if (s[0] == '[') {
462 |             obj->type = ELFOBJ_TYPE_LIST;
463 |             obj->val.l.len = 0;
464 |             obj->val.l.ele = NULL;
465 |             s++;
466 |             /* Parse comma separated elements. */
467 |             while(1) {
468 |                 /* The list may be empty, so we need to parse for "]"
469 |                  * ASAP. */
470 |                 while(isspace(s[0])) s++;
471 |                 if (s[0] == ']') {
472 |                     if (next) *next = s+1;
473 |                     return obj;
474 |                 }
475 | 
476 |                 /* Parse the current sub-element recursively. */
477 |                 const char *nextptr;
478 |                 elfobj *element = parseList(s,&nextptr);
479 |                 if (element == NULL) {
480 |                     freeElfObj(obj);
481 |                     return NULL;
482 |                 }
483 |                 obj->val.l.ele = elfrealloc(obj->val.l.ele,
484 |                                             sizeof(elfobj*)*(obj->val.l.len+1));
485 |                 obj->val.l.ele[obj->val.l.len++] = element;
486 |                 s = nextptr; /* Continue from first byte not parsed. */
487 | 
488 |                 while(isspace(s[0])) s++;
489 |                 if (s[0] == ']') continue; /* Will be handled by the loop. */
490 |                 if (s[0] == ',') {
491 |                     s++;
492 |                     continue; /* Parse next element. */
493 |                 }
494 | 
495 |                 /* Syntax error. */
496 |                 freeElfObj(obj);
497 |                 return NULL;
498 |             }
499 |             /* Syntax error (list not closed). */
500 |             freeElfObj(obj);
501 |             return NULL;
502 |         } else {
503 |             /* In a serious program you don't printf() in the middle of
504 |              * a function. Just return NULL. */
505 |             fprintf(stderr,"Syntax error parsing '%s'\n", s);
506 |             return NULL;
507 |         }
508 |         return obj;
509 |     }
510 | 
511 | OK, what are the important parts of the above code? First: the parser is,
512 | as I already said, recursive. To parse each element of the list we call
513 | the same function again and again. This will make the magic of handling
514 | any complex nested list without having to do anything special. I know, I know.
515 | This is quite obvious for experienced enough programmers, but I claim it
516 | is still kinda of a revelation, like a Mandelbrot set, like standing with a
517 | mirror in front of another mirror admiring the infinite repeating images one
518 | inside the other. Recursion remains magic even after it is understood.
519 | 
520 | The second point to note: the function gets a pointer to a string, and returns
521 | the object parsed and also, by referene, the pointer to the start of the *next*
522 | object to parse, that is just at some offset inside the same string.
523 | This is a very comfortable way to write such a parser: we can call the same
524 | function again to get the next object in a loop to parse all the tokens and
525 | sub-tokens. And I'm saying tokens for a reason, because the same exact
526 | structure can be used also when writing tokenizers that just return tokens
527 | one after the other, without any conversion to object.
528 | 
529 | Now, what I did was to take this program and make it the programming language
530 | you just learned about in the first part of this README. How? Well, to
531 | start I upgraded the object structure for more complex object types:
532 | 
533 |     /* Type are defined so that each type ID is a different set bit, this way
534 |      * in checkStackType() we may ask the function to check if some argument
535 |      * is one among a list of types just bitwise-oring the type IDs together. */
536 |     #define OBJ_TYPE_INT    (1<<0)
537 |     #define OBJ_TYPE_LIST   (1<<1)
538 |     #define OBJ_TYPE_TUPLE  (1<<2)
539 |     #define OBJ_TYPE_STRING (1<<3)
540 |     #define OBJ_TYPE_SYMBOL (1<<4)
541 |     #define OBJ_TYPE_BOOL   (1<<5)
542 |     #define OBJ_TYPE_ANY    INT_MAX /* All bits set. For checkStackType(). */
543 |     typedef struct obj {
544 |         int type;       /* OBJ_TYPE_... */
545 |         int refcount;   /* Reference count. */
546 |         int line;       /* Source code line number where this was defined, or 0. */
547 |         union {
548 |             int i;      /* Integer. Literal: 1234 */
549 |             int istrue; /* Boolean. Literal: #t or #f */
550 |             struct {    /* List or Tuple: Literal: [1 2 3 4] or (a b c) */
551 |                 struct obj **ele;
552 |                 size_t len;
553 |                 int quoted; /* Used for quoted tuples. Don't capture vars if true.
554 |                                Just push the tuple on stack. */
555 |             } l;
556 |             struct {    /* Mutable string & unmutable symbol. */
557 |                 char *ptr;
558 |                 size_t len;
559 |                 int quoted; /* Used for quoted symbols: when quoted they are
560 |                                not executed, but just pushed on the stack by
561 |                                eval(). */
562 |             } str;
563 |         };
564 |     } obj;
565 | 
566 | A few important things to note, since this may look like just a trivial extension of the original puzzle structure, but it's not:
567 | 
568 | 1. We now use reference counting. When the object is allocated, it gets a *refcount* of 1. Then the functions `retain()` and `release()` are used in order to increment the reference count when we store the same object elsewhere, or when we want to remove a reference. Finally, when the references drop to zero, the object gets freed.
569 | 2. The object types now are all powers of two: single bits, in binary representation. This means we can store or pass multiple types at once in a single integer, just performing the *bitwise or*. It is useful in practice. No need for functions with a variable number of arguments just to pass many times at once.
570 | 3. There is information about the line number where a given object was defined in the source code. Aocla can be a toy, but a toy that will try to give you some stack trace if there is a runtime error.
571 | 
572 | This is the release() function.
573 | 
574 |     /* Recursively free an Aocla object, if the refcount just dropped to zero. */
575 |     void release(obj *o) {
576 |         if (o == NULL) return;
577 |         assert(o->refcount >= 0);
578 |         if (--o->refcount == 0) {
579 |             switch(o->type) {
580 |             case OBJ_TYPE_LIST:
581 |             case OBJ_TYPE_TUPLE:
582 |                 for (size_t j = 0; j < o->l.len; j++)
583 |                     release(o->l.ele[j]);
584 |                 free(o->l.ele);
585 |                 break;
586 |             case OBJ_TYPE_SYMBOL:
587 |             case OBJ_TYPE_STRING:
588 |                 free(o->str.ptr);
589 |                 break;
590 |             default:
591 |                 break;
592 |                 /* Nothing special to free. */
593 |             }
594 |             free(o);
595 |         }
596 |     }
597 | 
598 | Note that in this implementation, deeply nested data structures will produce many recursive calls. This can be avoided using *lazy freeing*, but that's not needed for something like Aocla. However some reader may want to search *lazy freeing* on the web.
599 | 
600 | Thanks to our parser (that is just a more complex version of the initial day 13 puzzle parser, and is not worth showing here), we can take an Aocla program, in the form of a string, parse it and get an Aocla object (`obj*` type) back. Now, in order to run an Aocla program, we have to *execute* this object. Stack based languages are particularly simple to execute: we just go form left to right, and depending on the object type, we do different actions:
601 | 
602 | * If the object is a symbol (and is not quoted, see the `quoted` field in the object structure), we try to lookup a procedure with that name, and if it exists we execute the procedure. How? By recursively executing the list bound to the symbol.
603 | * If the object is a tuple with single characters elements, we capture the variables on the stack.
604 | * If it's a symbol starting with `$` we push the variable on the stack, and if the variable is not bound, we raise an error.
605 | * For any other type of object, we just push it on the stack.
606 | 
607 | The function responsible to execute the program is called `eval()`, and is so short we can put it fully here, but I'll present the function split in different parts, to explain each one carefully. I will start showing just the first three lines, as they already tell us something.
608 | 
609 |     int eval(aoclactx *ctx, obj *l) {
610 |         assert (l->type == OBJ_TYPE_LIST);
611 | 
612 |         for (size_t j = 0; j < l->l.len; j++) {
613 | 
614 | Here there are three things going on. Eval() takes a context and a list. The list is our program, and it is scanned left-to-right, as Aocla programs are executed left to right, word by word. All should be clear but the context. What is an execution context for our program?
615 | 
616 |     /* Interpreter state. */
617 |     #define ERRSTR_LEN 256
618 |     typedef struct aoclactx {
619 |         size_t stacklen;        /* Stack current len. */
620 |         obj **stack;
621 |         aproc *proc;            /* Defined procedures. */
622 |         stackframe *frame;      /* Stack frame with locals. */
623 |         /* Syntax error context. */
624 |         char errstr[ERRSTR_LEN]; /* Syntax error or execution error string. */
625 |     } aoclactx;
626 | 
627 | It contains the following elements:
628 | 1. The stack. Aocla is a stack based language, so we need a stack where to push and pop Aocla objects.
629 | 2. A list of procedures: lists bound to symbols, via the `def` word.
630 | 3. A stack frame, that is just what contains our local variables:
631 | 
632 | ```
633 |     /* We have local vars, so we need a stack frame. We start with a top level
634 |      * stack frame. Each time a procedure is called, we create a new stack frame
635 |      * and free it once the procedure returns. */
636 |     #define AOCLA_NUMVARS 256
637 |     typedef struct stackframe {
638 |         obj *locals[AOCLA_NUMVARS];/* Local var names are limited to a,b,c,...,z. */
639 |         aproc *curproc;            /* Current procedure executing or NULL.  */
640 |         int curline;               /* Current line number during execution. */
641 |         struct stackframe *prev;   /* Upper level stack frame or NULL. */
642 |     } stackframe;
643 | ```
644 | 
645 | The stack frame has a pointer to the previous stack frame. This is useful both in order to implement `upeval` and to show a stack trace when an exception happens and the program is halted.
646 | 
647 | We can continue looking at the remaining parts of eval() now. We stopped at the `for` loop, so now we are inside the iteration doing something with each element of the list:
648 | 
649 |     obj *o = l->l.ele[j];
650 |     aproc *proc;
651 |     ctx->frame->curline = o->line;
652 | 
653 |     switch(o->type) {
654 |     case OBJ_TYPE_TUPLE:                /* Capture variables. */
655 |         /* Quoted tuples just get pushed on the stack, losing
656 |          * their quoted status. */
657 |         if (o->l.quoted) {
658 |             obj *notq = deepCopy(o);
659 |             notq->l.quoted = 0;
660 |             stackPush(ctx,notq);
661 |             break;
662 |         }
663 | 
664 |         if (ctx->stacklen < o->l.len) {
665 |             setError(ctx,o->l.ele[ctx->stacklen]->str.ptr,
666 |                 "Out of stack while capturing local");
667 |             return 1;
668 |         }
669 | 
670 |         /* Bind each variable to the corresponding locals array,
671 |          * removing it from the stack. */
672 |         ctx->stacklen -= o->l.len;
673 |         for (size_t i = 0; i < o->l.len; i++) {
674 |             int idx = o->l.ele[i]->str.ptr[0];
675 |             release(ctx->frame->locals[idx]);
676 |             ctx->frame->locals[idx] =
677 |                 ctx->stack[ctx->stacklen+i];
678 |         }
679 |         break;
680 | 
681 | The essence of the loop is a `switch` statement doing something different depending on the object type. The object is just the current element of the list. The first case is the tuple. Tuples capture local variables, unless they are quoted like this:
682 | 
683 |     (a b c)  // Normal tuple -- This will capture variables
684 |     `(a b c) // Quoted tuple -- This will be pushed on the stack
685 | 
686 | So if the tuple is not quoted, we check if there are enough stack elements
687 | according to the tuple length. Then, element after element, we move objects
688 | from the Aocla stack to the stack frame, into the array representing the locals. Note that there could be already an object bound to a given local, so we `release()` it before the new assignment.
689 | 
690 |     case OBJ_TYPE_SYMBOL:
691 |         /* Quoted symbols don't generate a procedure call, but like
692 |          * any other object they get pushed on the stack. */
693 |         if (o->str.quoted) {
694 |             obj *notq = deepCopy(o);
695 |             notq->str.quoted = 0;
696 |             stackPush(ctx,notq);
697 |             break;
698 |         }
699 | 
700 |         /* Not quoted symbols get looked up and executed if they
701 |          * don't start with "$". Otherwise are handled as locals
702 |          * push on the stack. */
703 |         if (o->str.ptr[0] == '$') {     /* Push local var. */
704 |             int idx = o->str.ptr[1];
705 |             if (ctx->frame->locals[idx] == NULL) {
706 |                 setError(ctx,o->str.ptr, "Unbound local var");
707 |                 return 1;
708 |             }
709 |             stackPush(ctx,ctx->frame->locals[idx]);
710 |             retain(ctx->frame->locals[idx]);
711 | 
712 | For symbols, as we did for tuples, we check if the symbol is quoted, an in such case we just push it on the stack. Otherwise, we handle two different cases. The above is the one where symbol names start with a `$`. It is, basically, the reverse operation of what we saw earlier in tuples capturing local vars. This time the local variable is transferred to the stack. However **we still take the reference** in the local variable array, as the program may want to push the same variable again and again, so, after pushing the object on the stack, we have to call `retain()` to increment the reference count of the object.
713 | 
714 | If the symbol does not start with `$`, then it's a procedure call:
715 | 
716 |     } else {                        /* Call procedure. */
717 |         proc = lookupProc(ctx,o->str.ptr);
718 |         if (proc == NULL) {
719 |             setError(ctx,o->str.ptr,
720 |                 "Symbol not bound to procedure");
721 |             return 1;
722 |         }
723 |         if (proc->cproc) {
724 |             /* Call a procedure implemented in C. */
725 |             aproc *prev = ctx->frame->curproc;
726 |             ctx->frame->curproc = proc;
727 |             int err = proc->cproc(ctx);
728 |             ctx->frame->curproc = prev;
729 |             if (err) return err;
730 |         } else {
731 |             /* Call a procedure implemented in Aocla. */
732 |             stackframe *oldsf = ctx->frame;
733 |             ctx->frame = newStackFrame(ctx);
734 |             ctx->frame->curproc = proc;
735 |             int err = eval(ctx,proc->proc);
736 |             freeStackFrame(ctx->frame);
737 |             ctx->frame = oldsf;
738 |             if (err) return err;
739 |         }
740 |     }
741 | 
742 | The `lookupProc()` function just scans a linked list of procedures
743 | and returns a list object or, if there is no such procedure defined, NULL.
744 | Now what happens immediately after is much more interesting. Aocla procedures
745 | are just list objects, but it is possible to implement Aocla procedures
746 | directly in C. If the `cproc` is not NULL, then it is a C function pointer
747 | implementing a procedure, otherwise the procedure is *user defined*, that menas it is written in Aocla, and we need to evaluate it. We do this with a nested `eval()` call. As you can see, recursion is crucial in writing interpreters.
748 | 
749 | *A little digression: if we would like to speedup procedure call, we could cache the procedure lookup directly inside the symbol object. However in Aocla procedures can be redefined, so the next time the same procedure name may be bound to a different procedure. To still cache lookedup procedures, a simple way is to use the concept of "epoch". The context has a 64 bit integer called epoch, that is incremented every time a procedure is redefined. So, when we cache the procedure lookup into the object, we also store the current value of the epoch. Then, before using the cached value, we check if the epoch maches. If there is no match, we perform the lookup again, and update the cached procedure and the epoch.*
750 | 
751 | Sorry, let's go back to our `eval` function. Another important thing that's worth noting is that each new Aocla procedure call has its own set of local variables. The scope of local variables, in Aocla, is the lifetime of the procedure call, like in many other languages. So, in the code above, before calling an Aocla procedure we allocate a new stack frame using `newStackFrame()`, then we can finally call `eval()`, free the stack frame and store the old stack frame back in the context structure. Procedures implemented in C don't need a stack frame, as they will not make any use of Aocla local variables. The following is the last part of the `eval()` function implementation:
752 | 
753 |     default:
754 |         stackPush(ctx,o);
755 |         retain(o);
756 |         break;
757 | 
758 | This is the default behavior for all the other object types. They get pushed on the stack, and that's it.
759 | 
760 | Let's see how Aocla C-coded procedures are implemented, by observing the
761 | C function implementing basic mathematical operations such as +, -, ...
762 | 
763 |     /* Implements +, -, *, %, ... */
764 |     int procBasicMath(aoclactx *ctx) {
765 |         if (checkStackType(ctx,2,OBJ_TYPE_INT,OBJ_TYPE_INT)) return 1;
766 |         obj *b = stackPop(ctx);
767 |         obj *a = stackPop(ctx);
768 | 
769 |         int res;
770 |         const char *fname = ctx->frame->curproc->name;
771 |         if (fname[0] == '+' && fname[1] == 0) res = a->i + b->i;
772 |         if (fname[0] == '-' && fname[1] == 0) res = a->i - b->i;
773 |         if (fname[0] == '*' && fname[1] == 0) res = a->i * b->i;
774 |         if (fname[0] == '/' && fname[1] == 0) res = a->i / b->i;
775 |         stackPush(ctx,newInt(res));
776 |         release(a);
777 |         release(b);
778 |         return 0;
779 |     }
780 | 
781 | Here I cheated: the code required to implement each math procedure separately would be almost the same. So we bind all the operators to the same C function, and check the name of the procedure called inside a single implementation (see the above function). Here is where we register many procedures to the same C function.
782 | 
783 |     void loadLibrary(aoclactx *ctx) {
784 |         addProc(ctx,"+",procBasicMath,NULL);
785 |         addProc(ctx,"-",procBasicMath,NULL);
786 |         addProc(ctx,"*",procBasicMath,NULL);
787 |         addProc(ctx,"/",procBasicMath,NULL);
788 |         ...
789 | 
790 | The `procBasicMath()` is quite self-documenting, I guess. The proof of that
791 | is that I didn't add any comment inside the function. When comments are needed, I add them automatically, I can't help myself. Anyway, what it does is
792 | the following: it checks the type of the top objects on the stack, as they
793 | must be integers. Get them with `stackPop()`, perform the math, push a new integer object, release the two old ones. That's it.
794 | 
795 | ## Deep copy of objects
796 | 
797 | Believe it or not, that's it: you already saw all the most important
798 | parts of the Aocla interpreter. But there are a few corner cases that
799 | are forth a few more paragraphs of this README.
800 | 
801 | Imagine the execution of the following Aocla program:
802 | 
803 |     [1 2 3] (x)	// The varialbe x contains the list now
804 |     4 $x ->		// Now the stack contains the list [1 2 3 4]
805 |     $x		// What will be x now? [1 2 3] or [1 2 3 4]?
806 | 
807 | Aocla is designed to be kinda of a *pure* language: words manipulate
808 | objects by taking them from the stack and pushing new objects to the
809 | stack, that result from certain operations. We don't want to expose the
810 | idea of references in such a language, I feel like that would be a mess,
811 | a design error, and a programming nightmare. So if the variable `x` is
812 | bound to the list `[1 2 3]`, pushing it to the stack and adding new
813 | elements to the list should **not produce changes** to the list stored
814 | at `x`.
815 | 
816 | At the same time, we don't want to write an inefficient crap where each
817 | value is copied again and again. When we push our variable content on
818 | the stack, we just push the pointer to the object and increment the reference
819 | count. In order to have the best of both worlds, we want to implement something
820 | called *copy on write*. So normally our objects can be shared, and thanks
821 | to the count of references we know if and object is shared or not.
822 | However, as soon as some operation is going to alter an object whose
823 | reference count is greater than one, it gets copied first, only later modified.
824 | 
825 | In the above program, the list reference count is 2, because the same list
826 | is stored in the array of local variables and on the stack. Let's
827 | give a look at the implementation of the `->` operator:
828 | 
829 |     /* Implements -> and <-, appending element x in list with stack
830 |      *
831 |      * (x [1 2 3]) => ([1 2 3 x]) | ([x 1 2 3])
832 |      *
833 |      * <- is very inefficient as it memmoves all N elements. */
834 |     int procListAppend(aoclactx *ctx) {
835 |         int tail = ctx->frame->curproc->name[0] == '-';     /* Append on tail? */
836 |         if (checkStackType(ctx,2,OBJ_TYPE_ANY,OBJ_TYPE_LIST)) return 1;
837 |         obj *l = getUnsharedObject(stackPop(ctx));
838 |         obj *ele = stackPop(ctx);
839 |         l->l.ele = myrealloc(l->l.ele,sizeof(obj*)*(l->l.len+1));
840 |         if (tail) {
841 |             l->l.ele[l->l.len] = ele;
842 |         } else {
843 |             memmove(l->l.ele+1,l->l.ele,sizeof(obj*)*l->l.len);
844 |             l->l.ele[0] = ele;
845 |         }
846 |         l->l.len++;
847 |         stackPush(ctx,l);
848 |         return 0;
849 |     }
850 | 
851 | The interesting line here is the following one:
852 | 
853 |     obj *l = getUnsharedObject(stackPop(ctx));
854 | 
855 | We want an object that is not shared, right? This function will abstract
856 | the work for us. Let's check, in turn, its implementation:
857 | 
858 |     obj *getUnsharedObject(obj *o) {
859 |         if (o->refcount > 1) {
860 |             release(o);
861 |             return deepCopy(o);
862 |         } else {
863 |             return o;
864 |         }
865 |     }
866 | 
867 | So if the object is already not shared (its *refcount* is one), just return it as it is. Otherwise create a copy and remove a reference from the original object. Why, on copy, we need to remove a reference from the passed object? This may look odd at a first glance, but think at it: the invariant here should be that the caller of this function is the only owner of the object. We want the caller to be able to abstract totally what happens inside the `getUnsharedObject()` function. If the object was shared and we returned the caller a copy, the reference the caller had for the old object should be gone. Let's look at the following example:
868 | 
869 |     obj *o = stackPop(ctx);
870 |     o = getUnsharedObject(o);
871 |     doSomethingThatChanges(o);
872 |     stackPush(ctx,o);
873 | 
874 | Stack pop and push functions don't change the reference count of the object,
875 | so if the object is not shared we get it with a single reference, change it,
876 | push it on the stack and the object has still a single reference.
877 | 
878 | Now imagine that, instead, the object is shared and also lives in a
879 | variable. In this case we pop an object that has two references, call
880 | `getUnsharedObject()` that will return us a copy with a *recount* of one. We
881 | change the object and push it to the stack. The new object will have a
882 | single reference on the stack, and has a reference count of one: all is
883 | fine. What about the old object stored in the local variable? It should
884 | have a reference count of one as well, but if we don't `release()` it
885 | in `getUnsharedObject()` it would have two, causing a memory leak.
886 | 
887 | I'll not show the `deepCopy()` function, it just allocates a new object of the specified type and copy the content. But guess what? It's a recursive function, too. That's why it is a *deep* copy.
888 | 
889 | # The end
890 | 
891 | That's it, and thanks for reading that far. To know more about interpreters you have only one thing to do: write your own, or radically modify Aocla in some crazy way. Get your hands dirty, it's super fun and rewarding. I can only promise that what you will learn will be worthwhile, even if you'll never write an interpreter again.
892 | 
893 | ## Appendix: Aocla locals and Fibonacci
894 | 
895 | I believe the Fibonacci implementation written in Aocla, versus the implementation written in other stack-based languages, is quite telling about the jump forward in readability and usability provided by this simple feature:
896 | 
897 |     [(n)
898 |         [$n 1 <=]
899 |         [
900 |             $n
901 |         ]
902 |         [
903 |             $n 1 - fib
904 |             $n 2 - fib
905 |             +
906 |         ] ifelse
907 |     ] 'fib def
908 | 
909 |     10 fib
910 |     printnl
911 | 
912 | So, while Aocla is a toy language, I believe this feature should be looked more carefully by actual stack-based language designers.
913 | 


--------------------------------------------------------------------------------
/aocla.c:
--------------------------------------------------------------------------------
   1 | #include <stdio.h>
   2 | #include <stdlib.h>
   3 | #include <string.h>
   4 | #include <assert.h>
   5 | #include <limits.h>
   6 | #include <ctype.h>
   7 | #include <stdarg.h>
   8 | 
   9 | #define NOTUSED(V) ((void) V)
  10 | 
  11 | /* =========================== Data structures ============================== */
  12 | 
  13 | /* This describes our Aocla object type. It can be used to represent
  14 |  * lists (and code: they are the same type in Aocla), integers, strings
  15 |  * and so forth.
  16 |  *
  17 |  * Type are defined so that each type ID is a different set bit, this way
  18 |  * in checkStackType() we may ask the function to check if some argument
  19 |  * is one among a list of types just bitwise-oring the type IDs together. */
  20 | #define OBJ_TYPE_INT    (1<<0)
  21 | #define OBJ_TYPE_LIST   (1<<1)
  22 | #define OBJ_TYPE_TUPLE  (1<<2)
  23 | #define OBJ_TYPE_STRING (1<<3)
  24 | #define OBJ_TYPE_SYMBOL (1<<4)
  25 | #define OBJ_TYPE_BOOL   (1<<5)
  26 | #define OBJ_TYPE_ANY    INT_MAX /* All bits set. For checkStackType(). */
  27 | typedef struct obj {
  28 |     int type;       /* OBJ_TYPE_... */
  29 |     int refcount;   /* Reference count. */
  30 |     int line;       /* Source code line number where this was defined, or 0. */
  31 |     union {
  32 |         int i;      /* Integer. Literal: 1234 */
  33 |         int istrue; /* Boolean. Literal: #t or #f */
  34 |         struct {    /* List or Tuple: Literal: [1 2 3 4] or (a b c) */
  35 |             struct obj **ele;
  36 |             size_t len;
  37 |             int quoted; /* Used for quoted tuples. Don't capture vars if true.
  38 |                            Just push the tuple on stack. */
  39 |         } l;
  40 |         struct {    /* Mutable string & unmutable symbol. */
  41 |             char *ptr;
  42 |             size_t len;
  43 |             int quoted; /* Used for quoted symbols: when quoted they are
  44 |                            not executed, but just pushed on the stack by
  45 |                            eval(). */
  46 |         } str;
  47 |     };
  48 | } obj;
  49 | 
  50 | /* Procedures. They are just lists with associated names. There are also
  51 |  * procedures implemented in C. In this case proc is NULL and cproc has
  52 |  * the value of the function pointer implementing the procedure. */
  53 | struct aoclactx;
  54 | typedef struct aproc {
  55 |     const char *name;
  56 |     obj *proc;      /* If not NULL it's an Aocla procedure (list object). */
  57 |     int (*cproc)(struct aoclactx *); /* C procedure. */
  58 |     struct aproc *next;
  59 | } aproc;
  60 | 
  61 | /* We have local vars, so we need a stack frame. We start with a top level
  62 |  * stack frame. Each time a procedure is called, we create a new stack frame
  63 |  * and free it once the procedure returns. */
  64 | #define AOCLA_NUMVARS 256
  65 | typedef struct stackframe {
  66 |     obj *locals[AOCLA_NUMVARS];/* Local var names are limited to a,b,c,...,z. */
  67 |     aproc *curproc;            /* Current procedure executing or NULL.  */
  68 |     int curline;               /* Current line number during execution. */
  69 |     struct stackframe *prev;   /* Upper level stack frame or NULL. */
  70 | } stackframe;
  71 | 
  72 | /* Interpreter state. */
  73 | #define ERRSTR_LEN 256
  74 | typedef struct aoclactx {
  75 |     size_t stacklen;        /* Stack current len. */
  76 |     obj **stack;
  77 |     aproc *proc;            /* Defined procedures. */
  78 |     stackframe *frame;      /* Stack frame with locals. */
  79 |     /* Syntax error context. */
  80 |     char errstr[ERRSTR_LEN]; /* Syntax error or execution error string. */
  81 | } aoclactx;
  82 | 
  83 | void setError(aoclactx *ctx, const char *ptr, const char *msg);
  84 | aproc *lookupProc(aoclactx *ctx, const char *name);
  85 | void loadLibrary(aoclactx *ctx);
  86 | 
  87 | /* ================================= Utils ================================== */
  88 | 
  89 | /* Life is too short to handle OOM. alloc() and realloc() that
  90 |  * abort on OOM. free() is the same, so no wrapper. */
  91 | void *myalloc(size_t size) {
  92 |     void *p = malloc(size);
  93 |     if (!p) {
  94 |         fprintf(stderr,"Out of memory allocating %zu bytes\n", size);
  95 |         exit(1);
  96 |     }
  97 |     return p;
  98 | }
  99 | 
 100 | void *myrealloc(void *ptr, size_t size) {
 101 |     void *p = realloc(ptr,size);
 102 |     if (!p) {
 103 |         fprintf(stderr,"Out of memory allocating %zu bytes\n", size);
 104 |         exit(1);
 105 |     }
 106 |     return p;
 107 | }
 108 | 
 109 | /* =============================== Objects ================================== */
 110 | 
 111 | /* Recursively free an Aocla object, if the refcount just dropped to zero. */
 112 | void release(obj *o) {
 113 |     if (o == NULL) return;
 114 |     assert(o->refcount >= 0);
 115 |     if (--o->refcount == 0) {
 116 |         switch(o->type) {
 117 |         case OBJ_TYPE_LIST:
 118 |         case OBJ_TYPE_TUPLE:
 119 |             for (size_t j = 0; j < o->l.len; j++)
 120 |                 release(o->l.ele[j]);
 121 |             free(o->l.ele);
 122 |             break;
 123 |         case OBJ_TYPE_SYMBOL:
 124 |         case OBJ_TYPE_STRING:
 125 |             free(o->str.ptr);
 126 |             break;
 127 |         default:
 128 |             break;
 129 |             /* Nothing special to free. */
 130 |         }
 131 |         free(o);
 132 |     }
 133 | }
 134 | 
 135 | /* Increment the object ref count. Use when a new reference is created. */
 136 | void retain(obj *o) {
 137 |     o->refcount++;
 138 | }
 139 | 
 140 | /* Allocate a new object of type 'type. */
 141 | obj *newObject(int type) {
 142 |     obj *o = myalloc(sizeof(*o));
 143 |     o->refcount = 1;
 144 |     o->type = type;
 145 |     o->line = 0;
 146 |     return o;
 147 | }
 148 | 
 149 | /* Return true if the character 'c' is within the Aocla symbols charset. */
 150 | int issymbol(int c) {
 151 |     if (isalpha(c)) return 1;
 152 |     switch(c) {
 153 |     case '@':
 154 |     case '$':
 155 |     case '+':
 156 |     case '-':
 157 |     case '*':
 158 |     case '/':
 159 |     case '=':
 160 |     case '?':
 161 |     case '%':
 162 |     case '>':
 163 |     case '<':
 164 |     case '_':
 165 |     case '\'':
 166 |         return 1;
 167 |     default:
 168 |         return 0;
 169 |     }
 170 | }
 171 | 
 172 | /* Utility function for parseObject(). It just consumes spaces and comments
 173 |  * and return the new pointer after the consumed part of the string. */
 174 | const char *parserConsumeSpace(const char *s, int *line) {
 175 |     while(1) {
 176 |         while(isspace(s[0])) {
 177 |             if (s[0] == '\n' && line) (*line)++;
 178 |             s++;
 179 |         }
 180 |         if (s[0] != '/' || s[1] != '/') break; /* // style comments. */
 181 |         while(s[0] && s[0] != '\n') s++; /* Seek newline after comment. */
 182 |     }
 183 |     return s;
 184 | }
 185 | 
 186 | /* Given the string 's' return the obj representing the list or
 187 |  * NULL on syntax error. '*next' is set to the next byte to parse, after
 188 |  * the current e was completely parsed.
 189 |  * 
 190 |  * The 'ctx' argument is only used to set an error in the context in case
 191 |  * of parse error, it is possible to pass NULL.
 192 |  *
 193 |  * Returned object has a ref count of 1. */
 194 | obj *parseObject(aoclactx *ctx, const char *s, const char **next, int *line) {
 195 |     obj *o = newObject(-1);
 196 | 
 197 |     /* Consume empty space and comments. */
 198 |     s = parserConsumeSpace(s,line);
 199 |     if (line)
 200 |         o->line = *line; /* Set line number where this object is defined. */
 201 | 
 202 |     if ((s[0] == '-' && isdigit(s[1])) || isdigit(s[0])) { /* Integer. */
 203 |         char buf[64];
 204 |         size_t len = 0;
 205 |         while((*s == '-' || isdigit(*s)) && len < sizeof(buf)-1)
 206 |             buf[len++] = *s++;
 207 |         buf[len] = 0;
 208 |         o->type = OBJ_TYPE_INT;
 209 |         o->i = atoi(buf);
 210 |         if (next) *next = s;
 211 |     } else if (s[0] == '[' || /* List, tuple or quoted tuple. */
 212 |                s[0] == '(' ||
 213 |                (s[0] == '\'' && s[1] == '('))
 214 |     {
 215 |         if (s[0] == '\'') {
 216 |             o->l.quoted = 1;
 217 |             s++;
 218 |         } else {
 219 |             o->l.quoted = 0;
 220 |         }
 221 |         o->type = s[0] == '[' ? OBJ_TYPE_LIST : OBJ_TYPE_TUPLE;
 222 |         o->l.len = 0;
 223 |         o->l.ele = NULL;
 224 |         s++;
 225 |         /* Parse comma separated elements. */
 226 |         while(1) {
 227 |             /* The list may be empty, so we need to parse for "]"
 228 |              * ASAP. */
 229 |             s = parserConsumeSpace(s,line);
 230 |             if ((o->type == OBJ_TYPE_LIST && s[0] == ']') ||
 231 |                 (o->type == OBJ_TYPE_TUPLE && s[0] == ')'))
 232 |             {
 233 |                 if (next) *next = s+1;
 234 |                 return o;
 235 |             }
 236 | 
 237 |             /* Parse the current sub-element recursively. */
 238 |             const char *nextptr;
 239 |             obj *element = parseObject(ctx,s,&nextptr,line);
 240 |             if (element == NULL) {
 241 |                 release(o);
 242 |                 return NULL;
 243 |             } else if (o->type == OBJ_TYPE_TUPLE &&
 244 |                        (element->type != OBJ_TYPE_SYMBOL ||
 245 |                         element->str.len != 1))
 246 |             {
 247 |                 /* Tuples can be only composed of one character symbols. */
 248 |                 release(element);
 249 |                 release(o);
 250 |                 setError(ctx,s,
 251 |                     "Tuples can only contain single character symbols");
 252 |                 return NULL;
 253 |             }
 254 |             o->l.ele = myrealloc(o->l.ele, sizeof(obj*)*(o->l.len+1));
 255 |             o->l.ele[o->l.len++] = element;
 256 |             s = nextptr; /* Continue from first byte not parsed. */
 257 | 
 258 |             continue; /* Parse next element. */
 259 |         }
 260 |         /* Syntax error (list not closed). */
 261 |         setError(ctx,s,"List never closed");
 262 |         release(o);
 263 |         return NULL;
 264 |     } else if (issymbol(s[0])) {         /* Symbol. */
 265 |         o->type = OBJ_TYPE_SYMBOL;
 266 |         if (s[0] == '\'') {
 267 |             o->str.quoted = 1;
 268 |             s++;
 269 |         } else {
 270 |             o->str.quoted = 0;
 271 |         }
 272 |         const char *end = s;
 273 |         while(issymbol(*end)) end++;
 274 |         o->str.len = end-s;
 275 |         char *dest = myalloc(o->str.len+1);
 276 |         o->str.ptr = dest;
 277 |         memcpy(dest,s,o->str.len);
 278 |         dest[o->str.len] = 0;
 279 |         if (next) *next = end;
 280 |     } else if (s[0]=='#') {             /* Boolean. */
 281 |         if (s[1] != 't' && s[1] != 'f') {
 282 |             setError(ctx,s,"Booelans are either #t or #f");
 283 |             release(o);
 284 |             return NULL;
 285 |         }
 286 |         o->type = OBJ_TYPE_BOOL;
 287 |         o->istrue = s[1] == 't' ? 1 : 0;
 288 |         s += 2;
 289 |         if (next) *next = s;
 290 |     } else if (s[0] == '"') {           /* String. */
 291 |         s++; /* Skip " */
 292 |         o->type = OBJ_TYPE_STRING;
 293 |         o->str.ptr = myalloc(1); /* We need at least space for nullterm. */
 294 |         o->str.len = 0;
 295 |         while(s[0] && s[0] != '"') {
 296 |             int c = s[0];
 297 |             switch(c) {
 298 |             case '\\':
 299 |                 s++;
 300 |                 int q = s[0];
 301 |                 switch(q) {
 302 |                 case 'n': c = '\n'; break;
 303 |                 case 'r': c = '\r'; break;
 304 |                 case 't': c = '\t'; break;
 305 |                 default: c = q; break;
 306 |                 }
 307 |             default:
 308 |                 break;
 309 |             }
 310 |             /* Here we abuse realloc() ability to overallocate for us
 311 |              * in order to avoid complexity. We allocate len+2 because we
 312 |              * need 1 byte for the current char, 1 for the nullterm. */
 313 |             o->str.ptr = myrealloc(o->str.ptr,o->str.len+2);
 314 |             o->str.ptr[o->str.len++] = c;
 315 |             s++;
 316 |         }
 317 |         if (s[0] != '"') {
 318 |             setError(ctx,s,"Quotation marks never closed in string");
 319 |             release(o);
 320 |             return NULL;
 321 |         }
 322 |         o->str.ptr[o->str.len] = 0; /* nullterm. */
 323 |         s++;
 324 |         if (next) *next = s;
 325 |     } else {
 326 |         /* Syntax error. */
 327 |         setError(ctx,s,"No object type starts like this");
 328 |         release(o);
 329 |         return NULL;
 330 |     }
 331 |     return o;
 332 | }
 333 | 
 334 | /* Compare the two objects 'a' and 'b' and return:
 335 |  * -1 if a<b; 0 if a==b; 1 if a>b. */
 336 | #define COMPARE_TYPE_MISMATCH INT_MIN
 337 | int compare(obj *a, obj *b) {
 338 |     /* Int VS Int */
 339 |     if (a->type == OBJ_TYPE_INT && b->type == OBJ_TYPE_INT) {
 340 |         if (a->i < b->i) return -1;
 341 |         else if (a->i > b->i) return 1;
 342 |         return 0;
 343 |     }
 344 | 
 345 |     /* Bool vs Bool. */
 346 |     if (a->type == OBJ_TYPE_BOOL && b->type == OBJ_TYPE_BOOL) {
 347 |         if (a->istrue < b->istrue) return -1;
 348 |         else if (a->istrue > b->istrue) return 1;
 349 |         return 0;
 350 |     }
 351 | 
 352 |     /* String|Symbol VS String|Symbol. */
 353 |     if ((a->type == OBJ_TYPE_STRING || a->type == OBJ_TYPE_SYMBOL) &&
 354 |         (b->type == OBJ_TYPE_STRING || b->type == OBJ_TYPE_SYMBOL))
 355 |     {
 356 |         int cmp = strcmp(a->str.ptr,b->str.ptr);
 357 |         /* Normalize. */
 358 |         if (cmp < 0) return -1;
 359 |         if (cmp > 0) return 1;
 360 |         return 0;
 361 |     }
 362 | 
 363 |     /* List|Tuple vs List|Tuple. */
 364 |     if ((a->type == OBJ_TYPE_LIST || a->type == OBJ_TYPE_TUPLE) &&
 365 |         (b->type == OBJ_TYPE_LIST || b->type == OBJ_TYPE_TUPLE))
 366 |     {
 367 |         /* Len wins. */
 368 |         if (a->l.len < b->l.len) return -1;
 369 |         else if (a->l.len > b->l.len) return 1;
 370 |         return 0;
 371 |     }
 372 | 
 373 |     /* Comparison impossible. */
 374 |     return COMPARE_TYPE_MISMATCH;
 375 | }
 376 | 
 377 | /* qsort() helper to sort arrays of obj pointers. */
 378 | int qsort_obj_cmp(const void *a, const void *b) {
 379 |     obj **obja = (obj**)a, **objb = (obj**)b;
 380 |     return compare(obja[0],objb[0]);
 381 | }
 382 | 
 383 | /* Output an object human readable representation .*/
 384 | #define PRINT_RAW 0             /* Nothing special. */
 385 | #define PRINT_COLOR (1<<0)      /* Colorized by type. */
 386 | #define PRINT_REPR (1<<1)       /* Print in Aocla literal form. */
 387 | void printobj(obj *obj, int flags) {
 388 |     const char *escape;
 389 |     int color = flags & PRINT_COLOR;
 390 |     int repr = flags & PRINT_REPR;
 391 | 
 392 |     if (color) {
 393 |         switch(obj->type) {
 394 |         case OBJ_TYPE_LIST: escape = "\033[33;1m"; break;       /* Yellow. */
 395 |         case OBJ_TYPE_TUPLE: escape = "\033[34;1m"; break;      /* Blue. */
 396 |         case OBJ_TYPE_SYMBOL: escape = "\033[36;1m"; break;     /* Cyan. */
 397 |         case OBJ_TYPE_STRING: escape = "\033[32;1m"; break;     /* Green. */
 398 |         case OBJ_TYPE_INT: escape = "\033[37;1m"; break;        /* Gray. */
 399 |         case OBJ_TYPE_BOOL: escape = "\033[35;1m"; break;       /* Gray. */
 400 |         }
 401 |         printf("%s",escape); /* Set color. */
 402 |     }
 403 | 
 404 |     switch(obj->type) {
 405 |     case OBJ_TYPE_INT:
 406 |         printf("%d",obj->i);
 407 |         break;
 408 |     case OBJ_TYPE_SYMBOL:
 409 |         printf("%s",obj->str.ptr);
 410 |         break;
 411 |     case OBJ_TYPE_STRING:
 412 |         if (!repr) {
 413 |             fwrite(obj->str.ptr,obj->str.len,1,stdout);
 414 |         } else {
 415 |             printf("\"");
 416 |             for (size_t j = 0; j < obj->str.len; j++) {
 417 |                 int c = obj->str.ptr[j];
 418 |                 switch(c) {
 419 |                 case '\n': printf("\\n"); break;
 420 |                 case '\r': printf("\\r"); break;
 421 |                 case '\t': printf("\\t"); break;
 422 |                 case '"': printf("\\\""); break;
 423 |                 default: printf("%c", c); break;
 424 |                 }
 425 |             }
 426 |             printf("\"");
 427 |         }
 428 |         break;
 429 |     case OBJ_TYPE_BOOL:
 430 |         printf("#%c",obj->istrue ? 't' : 'f');
 431 |         break;
 432 |     case OBJ_TYPE_LIST:
 433 |     case OBJ_TYPE_TUPLE:
 434 |         if (repr) printf("%c",obj->type == OBJ_TYPE_LIST ? '[' : '(');
 435 |         for (size_t j = 0; j < obj->l.len; j++) {
 436 |             printobj(obj->l.ele[j],flags);
 437 |             if (j != obj->l.len-1) printf(" ");
 438 |         }
 439 |         if (color) printf("%s",escape); /* Restore upper level color. */
 440 |         if (repr) printf("%c",obj->type == OBJ_TYPE_LIST ? ']' : ')');
 441 |         break;
 442 |     }
 443 |     if (color) printf("\033[0m"); /* Color off. */
 444 | }
 445 | 
 446 | /* Allocate an int object with value 'i'. */
 447 | obj *newInt(int i) {
 448 |     obj *o = newObject(OBJ_TYPE_INT);
 449 |     o->i = i;
 450 |     return o;
 451 | }
 452 | 
 453 | /* Allocate a boolean object with value 'b' (1 true, 0 false). */
 454 | obj *newBool(int b) {
 455 |     obj *o = newObject(OBJ_TYPE_BOOL);
 456 |     o->istrue = b;
 457 |     return o;
 458 | }
 459 | 
 460 | /* Allocate a string object initialized with the content at 's' for
 461 |  * 'len' bytes. */
 462 | obj *newString(const char *s, size_t len) {
 463 |     obj *o = newObject(OBJ_TYPE_STRING);
 464 |     o->str.len = len;
 465 |     o->str.ptr = myalloc(len+1);
 466 |     memcpy(o->str.ptr,s,len);
 467 |     o->str.ptr[len] = 0;
 468 |     return o;
 469 | }
 470 | 
 471 | /* Deep copy the passed object. Return an object with refcount = 1. */
 472 | obj *deepCopy(obj *o) {
 473 |     if (o == NULL) return NULL;
 474 |     obj *c = newObject(o->type);
 475 |     switch(o->type) {
 476 |     case OBJ_TYPE_INT: c->i = o->i; break;
 477 |     case OBJ_TYPE_BOOL: c->istrue = o->istrue; break;
 478 |     case OBJ_TYPE_LIST:
 479 |     case OBJ_TYPE_TUPLE:
 480 |         c->l.len = o->l.len;
 481 |         c->l.ele = myalloc(sizeof(obj*)*o->l.len);
 482 |         for (size_t j = 0; j < o->l.len; j++)
 483 |             c->l.ele[j] = deepCopy(o->l.ele[j]);
 484 |         break;
 485 |     case OBJ_TYPE_STRING:
 486 |     case OBJ_TYPE_SYMBOL:
 487 |         c->str.len = o->str.len;
 488 |         c->str.quoted = o->str.quoted; /* Only useful for symbols. */
 489 |         c->str.ptr = myalloc(o->str.len+1);
 490 |         memcpy(c->str.ptr,o->str.ptr,o->str.len+1);
 491 |         break;
 492 |     }
 493 |     return c;
 494 | }
 495 | 
 496 | /* This function performs a deep copy of the object if it has a refcount > 1.
 497 |  * The copy is returned. Otherwise if refcount is 1, the function returns
 498 |  * the same object we passed as argument. This is useful when we want to
 499 |  * modify a shared object.
 500 |  *
 501 |  * When the function returns a copy, the reference count of the original
 502 |  * object is decremented, as the object logically lost one reference. */
 503 | obj *getUnsharedObject(obj *o) {
 504 |     if (o->refcount > 1) {
 505 |         release(o);
 506 |         return deepCopy(o);
 507 |     } else {
 508 |         return o;
 509 |     }
 510 | }
 511 | 
 512 | /* ========================== Interpreter state ============================= */
 513 | 
 514 | /* Set the syntax or runtime error, if the context is not NULL. */
 515 | void setError(aoclactx *ctx, const char *ptr, const char *msg) {
 516 |     if (!ctx) return;
 517 |     if (!ptr) ptr = ctx->frame->curproc ?
 518 |                     ctx->frame->curproc->name : "unknown context";
 519 |     size_t len =
 520 |         snprintf(ctx->errstr,ERRSTR_LEN,"%s: '%.30s%s'",
 521 |             msg,ptr,strlen(ptr)>30 ? "..." :"");
 522 | 
 523 |     stackframe *sf = ctx->frame;
 524 |     while(sf && len < ERRSTR_LEN) {
 525 |         len += snprintf(ctx->errstr+len,ERRSTR_LEN-len," in %s:%d ",
 526 |             sf->curproc ? sf->curproc->name : "unknown",
 527 |             sf->curline);
 528 |         sf = sf->prev;
 529 |     }
 530 | }
 531 | 
 532 | /* Create a new stack frame. */
 533 | stackframe *newStackFrame(aoclactx *ctx) {
 534 |     stackframe *sf = myalloc(sizeof(*sf));
 535 |     memset(sf->locals,0,sizeof(sf->locals));
 536 |     sf->curproc = NULL;
 537 |     sf->prev = ctx ? ctx->frame : NULL;
 538 |     return sf;
 539 | }
 540 | 
 541 | /* Free a stack frame. */
 542 | void freeStackFrame(stackframe *sf) {
 543 |     for (int j = 0; j < AOCLA_NUMVARS; j++) release(sf->locals[j]);
 544 |     free(sf);
 545 | }
 546 | 
 547 | aoclactx *newInterpreter(void) {
 548 |     aoclactx *i = myalloc(sizeof(*i));
 549 |     i->stacklen = 0;
 550 |     i->stack = NULL; /* Will be allocated on push of new elements. */
 551 |     i->proc = NULL; /* That's a linked list. Starts empty. */
 552 |     i->frame = newStackFrame(NULL);
 553 |     loadLibrary(i);
 554 |     return i;
 555 | }
 556 | 
 557 | /* Push an object on the interpreter stack. No refcount change. */
 558 | void stackPush(aoclactx *ctx, obj *o) {
 559 |     ctx->stack = myrealloc(ctx->stack,sizeof(obj*) * (ctx->stacklen+1));
 560 |     ctx->stack[ctx->stacklen++] = o;
 561 | }
 562 | 
 563 | /* Pop an object from the stack without modifying its refcount.
 564 |  * Return NULL if stack is empty. */
 565 | obj *stackPop(aoclactx *ctx) {
 566 |     if (ctx->stacklen == 0) return NULL;
 567 |     return ctx->stack[--ctx->stacklen];
 568 | }
 569 | 
 570 | /* Return the pointer to the last object (if offset == 0) on the stack
 571 |  * or NULL. Offset of 1 means penultimate and so forth.  */
 572 | obj *stackPeek(aoclactx *ctx, size_t offset) {
 573 |     if (ctx->stacklen <= offset) return NULL;
 574 |     return ctx->stack[ctx->stacklen-1-offset];
 575 | }
 576 | 
 577 | /* Like stack peek, but instead of returning the object sets it. */
 578 | void stackSet(aoclactx *ctx, size_t offset, obj *o) {
 579 |     assert(ctx->stacklen > offset);
 580 |     ctx->stack[ctx->stacklen-1-offset] = o;
 581 | }
 582 | 
 583 | /* Show the current content of the stack. */
 584 | #define STACK_SHOW_MAX_ELE 10
 585 | void stackShow(aoclactx *ctx) {
 586 |     ssize_t j = ctx->stacklen - STACK_SHOW_MAX_ELE;
 587 |     if (j < 0) j = 0;
 588 |     while(j < (ssize_t)ctx->stacklen) {
 589 |         obj *o = ctx->stack[j];
 590 |         printobj(o,PRINT_COLOR|PRINT_REPR); printf(" ");
 591 |         j++;
 592 |     }
 593 |     if (ctx->stacklen > STACK_SHOW_MAX_ELE)
 594 |         printf("[... %zu more object ...]", j);
 595 |     if (ctx->stacklen) printf("\n");
 596 | }
 597 | 
 598 | /* ================================ Eval ==================================== */
 599 | 
 600 | /* Evaluate the program in the list 'l' in the specified context 'ctx'.
 601 |  * Expects a list object. Evaluation uses the following rules:
 602 |  *
 603 |  * 1. List elements are scanned from left to right.
 604 |  * 2. If an element is a symbol, a function bound to such symbol is
 605 |  *    searched and executed. If no function is found with such a name
 606 |  *    an error is raised.
 607 |  * 3. If an element is a tuple, the stack elements are captured into the
 608 |  *    local variables with the same names as the tuple elements. If we
 609 |  *    run out of stack, an error is raised.
 610 |  * 4. Any other object type is just pushed on the stack.
 611 |  *
 612 |  * Return 1 on runtime erorr. Otherwise 0 is returned.
 613 |  */
 614 | int eval(aoclactx *ctx, obj *l) {
 615 |     assert (l->type == OBJ_TYPE_LIST);
 616 | 
 617 |     for (size_t j = 0; j < l->l.len; j++) {
 618 |         obj *o = l->l.ele[j];
 619 |         aproc *proc;
 620 |         ctx->frame->curline = o->line;
 621 | 
 622 |         switch(o->type) {
 623 |         case OBJ_TYPE_TUPLE:                /* Capture variables. */
 624 |             /* Quoted tuples just get pushed on the stack, losing
 625 |              * their quoted status. */
 626 |             if (o->l.quoted) {
 627 |                 obj *notq = deepCopy(o);
 628 |                 notq->l.quoted = 0;
 629 |                 stackPush(ctx,notq);
 630 |                 break;
 631 |             }
 632 | 
 633 |             if (ctx->stacklen < o->l.len) {
 634 |                 setError(ctx,o->l.ele[ctx->stacklen]->str.ptr,
 635 |                     "Out of stack while capturing local");
 636 |                 return 1;
 637 |             }
 638 | 
 639 |             /* Bind each variable to the corresponding locals array,
 640 |              * removing it from the stack. */
 641 |             ctx->stacklen -= o->l.len;
 642 |             for (size_t i = 0; i < o->l.len; i++) {
 643 |                 int idx = o->l.ele[i]->str.ptr[0];
 644 |                 release(ctx->frame->locals[idx]);
 645 |                 ctx->frame->locals[idx] =
 646 |                     ctx->stack[ctx->stacklen+i];
 647 |             }
 648 |             break;
 649 |         case OBJ_TYPE_SYMBOL:
 650 |             /* Quoted symbols don't generate a procedure call, but like
 651 |              * any other object they get pushed on the stack. */
 652 |             if (o->str.quoted) {
 653 |                 obj *notq = deepCopy(o);
 654 |                 notq->str.quoted = 0;
 655 |                 stackPush(ctx,notq);
 656 |                 break;
 657 |             }
 658 | 
 659 |             /* Not quoted symbols get looked up and executed if they
 660 |              * don't start with "$". Otherwise are handled as locals
 661 |              * push on the stack. */
 662 |             if (o->str.ptr[0] == '$') {     /* Push local var. */
 663 |                 int idx = o->str.ptr[1];
 664 |                 if (ctx->frame->locals[idx] == NULL) {
 665 |                     setError(ctx,o->str.ptr, "Unbound local var");
 666 |                     return 1;
 667 |                 }
 668 |                 stackPush(ctx,ctx->frame->locals[idx]);
 669 |                 retain(ctx->frame->locals[idx]);
 670 |             } else {                        /* Call procedure. */
 671 |                 proc = lookupProc(ctx,o->str.ptr);
 672 |                 if (proc == NULL) {
 673 |                     setError(ctx,o->str.ptr,
 674 |                         "Symbol not bound to procedure");
 675 |                     return 1;
 676 |                 }
 677 |                 if (proc->cproc) {
 678 |                     /* Call a procedure implemented in C. */
 679 |                     aproc *prev = ctx->frame->curproc;
 680 |                     ctx->frame->curproc = proc;
 681 |                     int err = proc->cproc(ctx);
 682 |                     ctx->frame->curproc = prev;
 683 |                     if (err) return err;
 684 |                 } else {
 685 |                     /* Call a procedure implemented in Aocla. */
 686 |                     stackframe *oldsf = ctx->frame;
 687 |                     ctx->frame = newStackFrame(ctx);
 688 |                     ctx->frame->curproc = proc;
 689 |                     int err = eval(ctx,proc->proc);
 690 |                     freeStackFrame(ctx->frame);
 691 |                     ctx->frame = oldsf;
 692 |                     if (err) return err;
 693 |                 }
 694 |             }
 695 |             break;
 696 |         default:
 697 |             stackPush(ctx,o);
 698 |             retain(o);
 699 |             break;
 700 |         }
 701 |     }
 702 |     return 0;
 703 | }
 704 | 
 705 | /* ============================== Library ===================================
 706 |  * Here we implement a number of things useful to play with the language.
 707 |  * Performance is not really a concern here, so certain core things are
 708 |  * implemented in Aocla itself for the sake of brevity.
 709 |  * ========================================================================== */
 710 | 
 711 | /* Make sure the stack len is at least 'min' or set an error and return 1.
 712 |  * If there are enough elements 0 is returned. */
 713 | int checkStackLen(aoclactx *ctx, size_t min) {
 714 |     if (ctx->stacklen < min) {
 715 |         setError(ctx,NULL,"Out of stack");
 716 |         return 1;
 717 |     }
 718 |     return 0;
 719 | }
 720 | 
 721 | /* Check that the stack elements contain at least 'count' elements of
 722 |  * the specified type. Otherwise set an error and return 1.
 723 |  * The function returns 0 if there are enough elements of the right type. */
 724 | int checkStackType(aoclactx *ctx, size_t count, ...) {
 725 |     if (checkStackLen(ctx,count)) return 1;
 726 |     va_list ap;
 727 |     va_start(ap, count);
 728 |     for (size_t i = 0; i < count; i++) {
 729 |         int type = va_arg(ap,int);
 730 |         if (!(type & ctx->stack[ctx->stacklen-count+i]->type)) {
 731 |             setError(ctx,NULL,"Type mismatch");
 732 |             return 1;
 733 |         }
 734 |     }
 735 |     va_end(ap);
 736 |     return 0;
 737 | }
 738 | 
 739 | /* Search for a procedure with that name. Return NULL if not found. */
 740 | aproc *lookupProc(aoclactx *ctx, const char *name) {
 741 |     aproc *this = ctx->proc;
 742 |     while(this) {
 743 |         if (!strcmp(this->name,name)) return this;
 744 |         this = this->next;
 745 |     }
 746 |     return NULL;
 747 | }
 748 | 
 749 | /* Allocate a new procedure object and link it to 'ctx'.
 750 |  * It's up to the caller to to fill the actual C or Aocla procedure pointer. */
 751 | aproc *newProc(aoclactx *ctx, const char *name) {
 752 |     aproc *ap = myalloc(sizeof(*ap));
 753 |     ap->name = myalloc(strlen(name)+1);
 754 |     memcpy((char*)ap->name,name,strlen(name)+1);
 755 |     ap->next = ctx->proc;
 756 |     ctx->proc = ap;
 757 |     return ap;
 758 | }
 759 | 
 760 | /* Add a procedure to the specified context. Either cproc or list should
 761 |  * not be null, depending on the fact the new procedure is implemented as
 762 |  * a C function or natively in Aocla. If the procedure already exists it
 763 |  * is replaced with the new one. */
 764 | void addProc(aoclactx *ctx, const char *name, int(*cproc)(aoclactx *), obj *list) {
 765 |     assert((cproc != NULL) + (list != NULL) == 1);
 766 |     aproc *ap = lookupProc(ctx, name);
 767 |     if (ap) {
 768 |         if (ap->proc != NULL) {
 769 |             release(ap->proc);
 770 |             ap->proc = NULL;
 771 |         }
 772 |     } else {
 773 |         ap = newProc(ctx,name);
 774 |     }
 775 |     ap->proc = list;
 776 |     ap->cproc = cproc;
 777 | }
 778 | 
 779 | /* Add a procedure represented by the Aocla code 'prog', that must
 780 |  * be a valid list. On error (not valid list) 1 is returned, otherwise 0. */
 781 | int addProcString(aoclactx *ctx, const char *name, const char *prog) {
 782 |     obj *list = parseObject(NULL,prog,NULL,NULL);
 783 |     if (prog == NULL) return 1;
 784 |     addProc(ctx,name,NULL,list);
 785 |     return 0;
 786 | }
 787 | 
 788 | /* Implements +, -, *, %, ... */
 789 | int procBasicMath(aoclactx *ctx) {
 790 |     if (checkStackType(ctx,2,OBJ_TYPE_INT,OBJ_TYPE_INT)) return 1;
 791 |     obj *b = stackPop(ctx);
 792 |     obj *a = stackPop(ctx);
 793 | 
 794 |     int res;
 795 |     const char *fname = ctx->frame->curproc->name;
 796 |     if (fname[0] == '+' && fname[1] == 0) res = a->i + b->i;
 797 |     if (fname[0] == '-' && fname[1] == 0) res = a->i - b->i;
 798 |     if (fname[0] == '*' && fname[1] == 0) res = a->i * b->i;
 799 |     if (fname[0] == '/' && fname[1] == 0) res = a->i / b->i;
 800 |     stackPush(ctx,newInt(res));
 801 |     release(a);
 802 |     release(b);
 803 |     return 0;
 804 | }
 805 | 
 806 | /* Implements ==, >=, <=, !=. */
 807 | int procCompare(aoclactx *ctx) {
 808 |     if (checkStackLen(ctx,2)) return 1;
 809 |     obj *b = stackPop(ctx);
 810 |     obj *a = stackPop(ctx);
 811 |     int cmp = compare(a,b);
 812 |     if (cmp == COMPARE_TYPE_MISMATCH) {
 813 |         stackPush(ctx,b);
 814 |         stackPush(ctx,a);
 815 |         setError(ctx,NULL,"Type mismatch in comparison");
 816 |         return 1;
 817 |     }
 818 | 
 819 |     int res;
 820 |     const char *fname = ctx->frame->curproc->name;
 821 |     if (fname[1] == '=') {
 822 |         switch(fname[0]) {
 823 |         case '=': res = cmp == 0; break;
 824 |         case '!': res = cmp != 0; break;
 825 |         case '>': res = cmp >= 0; break;
 826 |         case '<': res = cmp <= 0; break;
 827 |         }
 828 |     } else {
 829 |         switch(fname[0]) {
 830 |         case '>': res = cmp > 0; break;
 831 |         case '<': res = cmp < 0; break;
 832 |         }
 833 |     }
 834 |     stackPush(ctx,newBool(res));
 835 |     release(a);
 836 |     release(b);
 837 |     return 0;
 838 | }
 839 | 
 840 | /* Implements sort. Sorts a list in place. */
 841 | int procSortList(aoclactx *ctx) {
 842 |     if (checkStackType(ctx,1,OBJ_TYPE_LIST)) return 1;
 843 |     obj *l = stackPop(ctx);
 844 |     l = getUnsharedObject(l);
 845 |     qsort(l->l.ele,l->l.len,sizeof(obj*),qsort_obj_cmp);
 846 |     stackPush(ctx,l);
 847 |     return 0;
 848 | }
 849 | 
 850 | /* "def" let Aocla define new procedures, binding a list to a
 851 |  * symbol in the procedure table. */
 852 | int procDef(aoclactx *ctx) {
 853 |     if (checkStackType(ctx,2,OBJ_TYPE_LIST,OBJ_TYPE_SYMBOL)) return 1;
 854 |     obj *sym = stackPop(ctx);
 855 |     obj *code = stackPop(ctx);
 856 |     addProc(ctx,sym->str.ptr,NULL,code);
 857 |     release(sym);
 858 |     return 0;
 859 | }
 860 | 
 861 | /* if, ifelse, while.
 862 |  *
 863 |  * (list) => (result)               // if
 864 |  * (list list) => (result)          // ifelse and while
 865 |  *
 866 |  * We could implement while in AOCLA itself, once we have ifelse, however
 867 |  * this way we would build everything on a recursive implementation (still
 868 |  * we don't have tail recursion implemented), making every other thing
 869 |  * using while a issue with the stack length. Also stack trace on error
 870 |  * is a mess. And if you see the implementation, while is mostly an obvious
 871 |  * result of the ifelse implementation itself. */
 872 | int procIf(aoclactx *ctx) {
 873 |     int w = ctx->frame->curproc->name[0] == 'w';        /* while? */
 874 |     int e = ctx->frame->curproc->name[2] == 'e';        /* ifelse? */
 875 |     int retval = 1;
 876 |     if (e) {
 877 |         if (checkStackType(ctx,3,OBJ_TYPE_LIST,OBJ_TYPE_LIST,OBJ_TYPE_LIST))
 878 |             return 1;
 879 |     } else {
 880 |         if (checkStackType(ctx,2,OBJ_TYPE_LIST,OBJ_TYPE_LIST))
 881 |             return 1;
 882 |     }
 883 | 
 884 |     obj *elsebranch, *ifbranch, *cond;
 885 |     elsebranch = e ? stackPop(ctx) : NULL;
 886 |     ifbranch = stackPop(ctx);
 887 |     cond = stackPop(ctx);
 888 | 
 889 |     while(1) {
 890 |         /* Evaluate the conditional program. */
 891 |         if (eval(ctx,cond)) goto rterr;
 892 |         if (checkStackType(ctx,1,OBJ_TYPE_BOOL)) goto rterr;
 893 |         obj *condres = stackPop(ctx);
 894 |         int res = condres->istrue;
 895 |         release(condres);
 896 | 
 897 |         /* Now eval the true or false branch depending on the
 898 |          * result. */
 899 |         if (res) { /* True branch (if, ifelse, while). */
 900 |             if (eval(ctx,ifbranch)) goto rterr;
 901 |             if (w) continue;
 902 |         } else if (e) { /* False branch (ifelse). */
 903 |             if (eval(ctx,elsebranch)) goto rterr;
 904 |         }
 905 |         break;
 906 |     }
 907 |     retval = 0; /* Success. */
 908 | 
 909 | rterr:  /* Cleanup. We jump here on error with retval = 1. */
 910 |     release(cond);
 911 |     release(ifbranch);
 912 |     release(elsebranch);
 913 |     return retval;
 914 | }
 915 | 
 916 | /* Evaluate the given list, consuming it. */
 917 | int procEval(aoclactx *ctx) {
 918 |     if (checkStackType(ctx,1,OBJ_TYPE_LIST)) return 1;
 919 |     obj *l = stackPop(ctx);
 920 |     int retval = eval(ctx,l);
 921 |     release(l);
 922 |     return retval;
 923 | }
 924 | 
 925 | /* Like eval, but the code is evaluated in the stack frame of the calling
 926 |  * procedure, if any. */
 927 | int procUpeval(aoclactx *ctx) {
 928 |     if (checkStackType(ctx,1,OBJ_TYPE_LIST)) return 1;
 929 |     obj *l = stackPop(ctx);
 930 |     stackframe *saved = NULL;
 931 |     if (ctx->frame->prev) {
 932 |         saved = ctx->frame;
 933 |         ctx->frame = ctx->frame->prev;
 934 |     }
 935 |     int retval = eval(ctx,l);
 936 |     if (saved) ctx->frame = saved;
 937 |     release(l);
 938 |     return retval;
 939 | }
 940 | 
 941 | /* Print the top object to stdout, consuming it */
 942 | int procPrint(aoclactx *ctx) {
 943 |     if (checkStackLen(ctx,1)) return 1;
 944 |     obj *o = stackPop(ctx);
 945 |     printobj(o,PRINT_RAW);
 946 |     release(o);
 947 |     return 0;
 948 | }
 949 | 
 950 | /* Like print but also prints a newline at the end. */
 951 | int procPrintnl(aoclactx *ctx) {
 952 |     if (checkStackLen(ctx,1)) return 1;
 953 |     int ret = procPrint(ctx); printf("\n");
 954 |     return ret;
 955 | }
 956 | 
 957 | /* Len -- gets object len. Works with many types.
 958 |  * (object) => (len) */
 959 | int procLen(aoclactx *ctx) {
 960 |     if (checkStackType(ctx,1,OBJ_TYPE_LIST|OBJ_TYPE_TUPLE|OBJ_TYPE_STRING|
 961 |                              OBJ_TYPE_SYMBOL)) return 1;
 962 | 
 963 |     obj *o = stackPop(ctx);
 964 |     int len;
 965 |     switch(o->type) {
 966 |     case OBJ_TYPE_LIST: case OBJ_TYPE_TUPLE:    len = o->l.len; break;
 967 |     case OBJ_TYPE_STRING: case OBJ_TYPE_SYMBOL: len = o->str.len; break;
 968 |     }
 969 |     release(o);
 970 |     stackPush(ctx,newInt(len));
 971 |     return 0;
 972 | }
 973 | 
 974 | /* Implements -> and <-, appending element x in list with stack
 975 |  *
 976 |  * (x [1 2 3]) => ([1 2 3 x]) | ([x 1 2 3])
 977 |  *
 978 |  * <- is very inefficient as it memmoves all N elements. */
 979 | int procListAppend(aoclactx *ctx) {
 980 |     int tail = ctx->frame->curproc->name[0] == '-';     /* Append on tail? */
 981 |     if (checkStackType(ctx,2,OBJ_TYPE_ANY,OBJ_TYPE_LIST)) return 1;
 982 |     obj *l = getUnsharedObject(stackPop(ctx));
 983 |     obj *ele = stackPop(ctx);
 984 |     l->l.ele = myrealloc(l->l.ele,sizeof(obj*)*(l->l.len+1));
 985 |     if (tail) {
 986 |         l->l.ele[l->l.len] = ele;
 987 |     } else {
 988 |         memmove(l->l.ele+1,l->l.ele,sizeof(obj*)*l->l.len);
 989 |         l->l.ele[0] = ele;
 990 |     }
 991 |     l->l.len++;
 992 |     stackPush(ctx,l);
 993 |     return 0;
 994 | }
 995 | 
 996 | /* get@ -- get element at index. Works for lists, strings, tuples.
 997 |  * (object index) => (element). */
 998 | int procListGetAt(aoclactx *ctx) {
 999 |     if (checkStackType(ctx,2,OBJ_TYPE_LIST|OBJ_TYPE_STRING|OBJ_TYPE_TUPLE,
1000 |                              OBJ_TYPE_INT)) return 1;
1001 |     obj *idx = stackPop(ctx);
1002 |     obj *o = stackPop(ctx);
1003 |     int i = idx->i;
1004 |     size_t len = o->type == OBJ_TYPE_STRING ? o->str.len : o->l.len;
1005 |     if (i < 0) i = len+i; /* -1 is last element, and so forth. */
1006 |     release(idx);
1007 |     if (i < 0 || (size_t)i >= len) {
1008 |         stackPush(ctx,newBool(0)); // Out of index? Just push false.
1009 |     } else {
1010 |         if (o->type == OBJ_TYPE_STRING) {
1011 |             stackPush(ctx,newString(o->str.ptr+i,1));
1012 |         } else {
1013 |             stackPush(ctx,o->l.ele[i]);
1014 |             retain(o->l.ele[i]);
1015 |         }
1016 |     }
1017 |     release(o);
1018 |     return 0;
1019 | }
1020 | 
1021 | /* cat -- Concatenates lists, tuples, strings.
1022 |  * (a b) => (a#b) */
1023 | int procCat(aoclactx *ctx) {
1024 |     if (checkStackLen(ctx,2)) return 1;
1025 |     if (ctx->stack[ctx->stacklen-1]->type !=
1026 |         ctx->stack[ctx->stacklen-2]->type)
1027 |     {
1028 |         setError(ctx,NULL,"cat expects two objects of the same type");
1029 |         return 1;
1030 |     }
1031 | 
1032 |     if (checkStackType(ctx,2,OBJ_TYPE_LIST|OBJ_TYPE_STRING|
1033 |                              OBJ_TYPE_TUPLE|OBJ_TYPE_SYMBOL,
1034 |                              OBJ_TYPE_LIST|OBJ_TYPE_STRING|
1035 |                              OBJ_TYPE_TUPLE|OBJ_TYPE_SYMBOL))
1036 |         return 1;
1037 | 
1038 |     obj *src = stackPop(ctx);
1039 |     obj *dst = stackPeek(ctx,0);
1040 |     dst = getUnsharedObject(dst);
1041 |     stackSet(ctx,0,dst);
1042 | 
1043 |     if (src->type & (OBJ_TYPE_STRING|OBJ_TYPE_SYMBOL)) {
1044 |         dst->str.ptr = myrealloc(dst->str.ptr,dst->str.len+src->str.len+1);
1045 |         memcpy(dst->str.ptr+dst->str.len,src->str.ptr,src->str.len+1);
1046 |         dst->str.len += src->str.len;
1047 |     } else {
1048 |         for (size_t j = 0; j < src->l.len; j++) retain(src->l.ele[j]);
1049 |         dst->l.ele = myrealloc(dst->l.ele,(dst->l.len+src->l.len)*sizeof(obj*));
1050 |         memcpy(dst->l.ele+dst->l.len,src->l.ele,src->l.len*sizeof(obj*));
1051 |         dst->l.len += src->l.len;
1052 |     }
1053 |     release(src);
1054 |     return 0;
1055 | }
1056 | 
1057 | // Turns the list on the stack into a tuple.
1058 | int procMakeTuple(aoclactx *ctx) {
1059 |     if (checkStackType(ctx,1,OBJ_TYPE_LIST)) return 1;
1060 |     obj *l = stackPop(ctx);
1061 |     l = getUnsharedObject(l);
1062 |     l->type = OBJ_TYPE_TUPLE;
1063 |     l->l.quoted = 0;
1064 |     stackPush(ctx,l);
1065 |     return 0;
1066 | }
1067 | 
1068 | /* Show the current stack. Useful for debugging. */
1069 | int procShowStack(aoclactx *ctx) {
1070 |     stackShow(ctx);
1071 |     return 0;
1072 | }
1073 | 
1074 | /* Load the "standard library" of Aocla in the specified context. */
1075 | void loadLibrary(aoclactx *ctx) {
1076 |     addProc(ctx,"+",procBasicMath,NULL);
1077 |     addProc(ctx,"-",procBasicMath,NULL);
1078 |     addProc(ctx,"*",procBasicMath,NULL);
1079 |     addProc(ctx,"/",procBasicMath,NULL);
1080 |     addProc(ctx,"==",procCompare,NULL);
1081 |     addProc(ctx,">=",procCompare,NULL);
1082 |     addProc(ctx,">",procCompare,NULL);
1083 |     addProc(ctx,"<=",procCompare,NULL);
1084 |     addProc(ctx,"<",procCompare,NULL);
1085 |     addProc(ctx,"!=",procCompare,NULL);
1086 |     addProc(ctx,"sort",procSortList,NULL);
1087 |     addProc(ctx,"def",procDef,NULL);
1088 |     addProc(ctx,"if",procIf,NULL);
1089 |     addProc(ctx,"ifelse",procIf,NULL);
1090 |     addProc(ctx,"while",procIf,NULL);
1091 |     addProc(ctx,"eval",procEval,NULL);
1092 |     addProc(ctx,"upeval",procUpeval,NULL);
1093 |     addProc(ctx,"print",procPrint,NULL);
1094 |     addProc(ctx,"printnl",procPrintnl,NULL);
1095 |     addProc(ctx,"len",procLen,NULL);
1096 |     addProc(ctx,"->",procListAppend,NULL);
1097 |     addProc(ctx,"<-",procListAppend,NULL);
1098 |     addProc(ctx,"get@",procListGetAt,NULL);
1099 |     addProc(ctx,"showstack",procShowStack,NULL);
1100 |     addProc(ctx,"cat",procCat,NULL);
1101 |     addProc(ctx,"make-tuple",procMakeTuple,NULL);
1102 | 
1103 |     /* Since the point of this interpreter to be a short and understandable
1104 |      * programming example, we implement as much as possible in Aocla itself
1105 |      * without caring much about performances. */
1106 |     addProcString(ctx,"dup","[(x) $x $x]");
1107 |     addProcString(ctx,"swap","[(x y) $y $x]");
1108 |     addProcString(ctx,"drop","[(_)]");
1109 | 
1110 |     /* [1 2 3] [dup *] map => [1 4 9] */
1111 |     addProcString(ctx,"map", "[(l f) $l len (e) 0 (j) [] [$j $e <] [$l $j get@ $f upeval swap -> $j 1 + (j)] while]");
1112 | 
1113 |     /* [1 2 3] [printnl] foreach */
1114 |     addProcString(ctx,"foreach"," [(l f) $l len (e) 0 (j) [$j $e <] [$l $j get@ $f upeval $j 1 + (j)] while]");
1115 | 
1116 |     /* [1 2 3] first => 1 */
1117 |     addProcString(ctx,"first","[0 get@]");
1118 | 
1119 |     /* [1 2 3] rest => [2 3] */
1120 |     addProcString(ctx,"rest","[#t (f) [] (n) [[$f] [#f (f) drop] [$n -> (n)] ifelse] foreach $n]");
1121 | }
1122 | 
1123 | /* ================================ CLI ===================================== */
1124 | 
1125 | /* Real Eval Print Loop. */
1126 | void repl(void) {
1127 |     char buf[1024];
1128 |     aoclactx *ctx = newInterpreter();
1129 |     while(1) {
1130 |         printf("aocla> "); fflush(stdout);
1131 | 
1132 |         /* Aocla programs are Aocla lists, so when users just write
1133 |          * in the REPL we need to surround with []. */
1134 |         buf[0] = '[';
1135 | 
1136 |         if (fgets(buf+1,sizeof(buf)-2,stdin) == NULL) break;
1137 |         size_t l = strlen(buf);
1138 |         if (l && buf[l-1] == '\n') buf[--l] = 0;
1139 |         if (l == 0) continue;
1140 | 
1141 |         /* Add closing ]. */
1142 |         buf[l] = ']';
1143 |         buf[l+1] = 0;
1144 | 
1145 |         obj *list = parseObject(ctx,buf,NULL,NULL);
1146 |         if (!list) {
1147 |             printf("Parsing program: %s\n", ctx->errstr);
1148 |             continue;
1149 |         }
1150 |         if (eval(ctx,list)) {
1151 |             printf("%s\n", ctx->errstr);
1152 |         } else {
1153 |             stackShow(ctx);
1154 |         }
1155 |         release(list);
1156 |     }
1157 | }
1158 | 
1159 | /* Execute the program contained in the specified filename.
1160 |  * Return 1 on error, 0 otherwise. */
1161 | int evalFile(const char *filename, char **argv, int argc) {
1162 |     FILE *fp = fopen(filename,"r");
1163 |     if (!fp) {
1164 |         perror("Opening file");
1165 |         return 1;
1166 |     }
1167 | 
1168 |     /* Read file into buffer. */
1169 |     int incrlen = 1024; /* How much to allocate when we are out of buffer. */
1170 |     char *buf = myalloc(incrlen);
1171 |     size_t buflen = 1, nread;
1172 |     size_t leftspace = incrlen-buflen;
1173 |     buf[0] = '[';
1174 |     while((nread = fread(buf+buflen,1,leftspace,fp)) > 0) {
1175 |         buflen += nread;
1176 |         leftspace -= nread;
1177 |         if (leftspace == 0) {
1178 |             buf = myrealloc(buf,buflen+incrlen);
1179 |             leftspace += incrlen;
1180 |         }
1181 |     }
1182 |     if (leftspace < 2) buf = myrealloc(buf,buflen+2);
1183 |     buf[buflen++] = ']';
1184 |     buf[buflen++] = 0;
1185 |     fclose(fp);
1186 | 
1187 |     /* Parse the program before eval(). */
1188 |     aoclactx *ctx = newInterpreter();
1189 |     int line = 1;
1190 |     obj *l = parseObject(ctx,buf,NULL,&line);
1191 |     free(buf);
1192 |     if (!l) {
1193 |         printf("Parsing program: %s\n", ctx->errstr);
1194 |         return 1;
1195 |     }
1196 | 
1197 |     /* Before evaluating the program, let's push on the arguments
1198 |      * we received on the stack. */
1199 |     for (int j = 0; j < argc; j++) {
1200 |         obj *o = parseObject(NULL,argv[j],NULL,0);
1201 |         if (!o) {
1202 |             printf("Parsing command line argument: %s\n", ctx->errstr);
1203 |             release(l);
1204 |             return 1;
1205 |         }
1206 |         stackPush(ctx,o);
1207 |     }
1208 | 
1209 |     /* Run the program. */
1210 |     int retval = eval(ctx,l);
1211 |     if (retval) printf("Runtime error: %s\n", ctx->errstr);
1212 |     release(l);
1213 |     return retval;
1214 | }
1215 | 
1216 | int main(int argc, char **argv) {
1217 |     if (argc == 1) {
1218 |         repl();
1219 |     } else if (argc >= 2) {
1220 |         if (evalFile(argv[1],argv+2,argc-2)) return 1;
1221 |     }
1222 |     return 0;
1223 | }
1224 | 


--------------------------------------------------------------------------------
/examples/cat.aocla:
--------------------------------------------------------------------------------
 1 | // List concatenation. This procedure is just an example, the standard
 2 | // library "cat" is implemented in C.
 3 | 
 4 | // [1 2 3] [4 5 6] cat => [1 2 3 4 5 6]
 5 | [(a b)
 6 |     $b [$a -> (a)] foreach
 7 |     $a
 8 | ] 'cat def
 9 | 
10 | [1 2 3] [4 5 6] cat
11 | showstack
12 | 


--------------------------------------------------------------------------------
/examples/fib.aocla:
--------------------------------------------------------------------------------
 1 | // Fibonacci example
 2 | 
 3 | [(n)
 4 |     [$n 1 <=]
 5 |     [
 6 |         $n
 7 |     ]
 8 |     [
 9 |         $n 1 - fib
10 |         $n 2 - fib
11 |         +
12 |     ] ifelse
13 | ] 'fib def
14 | 
15 | 10 fib
16 | printnl
17 | 


--------------------------------------------------------------------------------
/examples/firstrest.aocla:
--------------------------------------------------------------------------------
 1 | // Fundamental list manipulation functions. Part of the standard library.
 2 | // first, rest, cat
 3 | 
 4 | [
 5 |     0 get@
 6 | ] 'first def
 7 | 
 8 | [
 9 |     #t (f) // True only for the first element
10 |     [] (n) // New list
11 |     [
12 |         [$f] [
13 |             #f (f)  // Set flag to false
14 |             drop    // Discard first element
15 |         ] [
16 |             $n -> (n)
17 |         ] ifelse
18 |     ] foreach
19 |     $n
20 | ] 'rest def
21 | 
22 | 


--------------------------------------------------------------------------------
/examples/foreach.aocla:
--------------------------------------------------------------------------------
 1 | // This is a commented version of the implementation of 'foreach' inside
 2 | // the standard library.
 3 | 
 4 | [(l f) // list and function to call with each element.
 5 |     $l len (e)  // Get list len in "e"
 6 |     0 (j)       // j is our current index
 7 |     [$j $e <] [
 8 |         $l $j get@  // Get list[j]
 9 |         $f upeval   // We want to evaluate in the context of the caller
10 |         $j 1 + (j)  // Go to the next index
11 |     ] while
12 | ] 'foreach def
13 | 
14 | [1 2 3] [printnl] foreach
15 | 


--------------------------------------------------------------------------------
/examples/map.aocla:
--------------------------------------------------------------------------------
 1 | // This is a commented version of the implementation of 'map' inside
 2 | // the standard library.
 3 | 
 4 | [(l f) // list and function to apply
 5 |     $l len (e)  // Get list len in "e"
 6 |     0 (j)       // j is our current index
 7 |     []          // We will populate this empty list
 8 |     [$j $e <] [
 9 |         $l $j get@
10 |         $f upeval
11 |         swap ->
12 |         $j 1 + (j)
13 |     ] while
14 | ] 'map def
15 | 
16 | [1 2 3] [dup *] map
17 | printnl
18 | 


--------------------------------------------------------------------------------
/examples/rec-for.aocla:
--------------------------------------------------------------------------------
 1 | // For loop implemented with recursion
 2 | 
 3 | [(x y f)
 4 |     [$x $y <] [
 5 |         $x $f eval
 6 |         $x 1 + $y $f count
 7 |     ] if
 8 | ] 'count def
 9 | 
10 | 0 15 [print "\n" print] count
11 | 


--------------------------------------------------------------------------------