├── .gitignore
├── Makefile
├── README.md
├── examples
    ├── life.lisp
    └── nqueens.lisp
├── minilisp.c
└── test.sh


/.gitignore:
--------------------------------------------------------------------------------
1 | *~
2 | minilisp


--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
 1 | CFLAGS=-std=gnu99 -g -O2 -Wall
 2 | 
 3 | .PHONY: clean test
 4 | 
 5 | minilisp: minilisp.c
 6 | 
 7 | clean:
 8 | 	rm -f minilisp *~
 9 | 
10 | test: minilisp
11 | 	@./test.sh
12 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | MiniLisp
  2 | ========
  3 | 
  4 | One day I wanted to see what I can do with 1k lines of C and
  5 | decided to write a Lisp interpreter. That turned to be a
  6 | fun weekend project, and the outcome is a mini lisp implementation
  7 | that supports
  8 | 
  9 | - integers, symbols, cons cells,
 10 | - global variables,
 11 | - lexically-scoped local variables,
 12 | - closures,
 13 | - _if_ conditional,
 14 | - primitive functions, such as +, =, <, or _list_,
 15 | - user-defined functions,
 16 | - a macro system,
 17 | - and a copying garbage collector.
 18 | 
 19 | All those in 1000 lines of C. I didn't sacrifice readability for size.
 20 | The code is in my opinion heavily commented to help the reader understand
 21 | how all these features work.
 22 | 
 23 | Compile
 24 | -------
 25 | 
 26 |     $ make
 27 | 
 28 | MiniLisp has been tested on Linux x86/x86-64 and 64 bit Mac OS. The code is not
 29 | very architecture dependent, so you should be able to compile and run on other
 30 | Unix-like operating systems.
 31 | 
 32 | Test
 33 | ----
 34 | 
 35 | MiniLisp comes with a comprehensive test suite. In order to run the tests, give
 36 | "test" argument to make.
 37 | 
 38 |     $ make test
 39 | 
 40 | Language features
 41 | -----------------
 42 | 
 43 | MiniLisp is a traditional Lisp interpreter. It reads one expression at a time
 44 | from the standard input, evaluates it, and then prints out the return value of
 45 | the expression. Here is an example of a valid input.
 46 | 
 47 |     (+ 1 2)
 48 | 
 49 | The above expression prints "3".
 50 | 
 51 | ### Literals
 52 | 
 53 | MiniLisp supports integer literals, `()`, `t`, symbols, and list literals.
 54 | 
 55 | * Integer literals are positive or negative integers.
 56 | * `()` is the only false value. It also represents the empty list.
 57 | * `t` is a predefined variable evaluated to itself. It's a preferred way to
 58 |   represent a true value, while any non-`()` value is considered to be true.
 59 | * Symbols are objects with unique name. They are used to represent identifiers.
 60 |   Because MiniLisp does not have string type, symbols are sometimes used as a
 61 |   substitute for strings too.
 62 | * List literals are cons cells. It's either a regular list whose last element's
 63 |   cdr is `()` or an dotted list ending with any non-`()` value. A dotted list is
 64 |   written as `(a . b)`.
 65 | 
 66 | ### List operators
 67 | 
 68 | `cons` takes two arguments and returns a new cons cell, making the first
 69 | argument the car, and the second the cdr.
 70 | 
 71 |     (cons 'a 'b)   ; -> (a . b)
 72 |     (cons 'a '(b)) ; -> (a b)
 73 | 
 74 | `car` and `cdr` are accessors for cons cells. `car` returns the car, and `cdr`
 75 | returns the cdr.
 76 | 
 77 |     (car '(a . b)) ; -> a
 78 |     (cdr '(a . b)) ; -> b
 79 | 
 80 | `setcar` mutates a cons cell. `setcar` takes two arguments, assuming the first
 81 | argument is a cons cell. It sets the second argument's value to the cons cell's
 82 | car.
 83 | 
 84 |     (define cell (cons 'a 'b))
 85 |     cell  ; -> (a . b)
 86 |     (setcar cell 'x)
 87 |     cell  ; -> (x . b)
 88 | 
 89 | ### Numeric operators
 90 | 
 91 | `+` returns the sum of the arguments.
 92 | 
 93 |     (+ 1)      ; -> 1
 94 |     (+ 1 2)    ; -> 3
 95 |     (+ 1 2 3)  ; -> 6
 96 | 
 97 | `-` negates the value of the argument if only one argument is given.
 98 | 
 99 |     (- 3)      ; -> -3
100 |     (- -5)     ; -> 5
101 | 
102 | If multiple arguments are given, `-` subtracts each argument from the first one.
103 | 
104 |     (- 5 2)    ; -> 3
105 |     (- 5 2 7)  ; -> -4
106 | 
107 | `=` takes two arguments and returns `t` if the two are the same integer.
108 | 
109 |     (= 11 11)  ; -> t
110 |     (= 11 6)   ; -> ()
111 | 
112 | `<` takes two arguments and returns `t` if the first argument is smaller than
113 | the second.
114 | 
115 |     (< 2 3)    ; -> t
116 |     (< 3 3)    ; -> ()
117 |     (< 4 3)    ; -> ()
118 | 
119 | ### Conditionals
120 | 
121 | `(if cond then else)` is the only conditional in the language. It first
122 | evaluates *cond*. If the result is a true value, *then* is evaluated. Otherwise
123 | *else* is evaluated.
124 | 
125 | ### Loops
126 | 
127 | `(while cond expr ...)` executes `expr ...` until `cond` is evaluated to
128 | `()`. This is the only loop supported by MiniLisp.
129 | 
130 | If you are familiar with Scheme, you might be wondering if you could write a
131 | loop by tail recursion in MiniLisp. The answer is no. Tail calls consume stack
132 | space in MiniLisp, so a loop written as recursion will fail with the memory
133 | exhaustion error.
134 | 
135 | ### Equivalence test operators
136 | 
137 | `eq` takes two arguments and returns `t` if the objects are the same. What `eq`
138 | really does is a pointer comparison, so two objects happened to have the same
139 | contents but actually different are considered to not be the same by `eq`.
140 | 
141 | ### Output operators
142 | 
143 | `println` prints a given object to the standard output.
144 | 
145 |     (println 3)               ; prints "3"
146 |     (println '(hello world))  ; prints "(hello world)"
147 | 
148 | ### Definitions
149 | 
150 | MiniLisp supports variables and functions. They can be defined using `define`.
151 | 
152 |     (define a (+ 1 2))
153 |     (+ a a)   ; -> 6
154 | 
155 | There are two ways to define a function. One way is to use a special form
156 | `lambda`. `(lambda (args ...)  expr ...)` returns a function object which
157 | you can assign to a variable using `define`.
158 | 
159 |     (define double (lambda (x) (+ x x)))
160 |     (double 6)                ; -> 12
161 |     ((lambda (x) (+ x x)) 6)  ; do the same thing without assignment
162 | 
163 | The other way is `defun`. `(defun fn (args ...) expr ...)` is short for
164 | `(define fn (lambda (args ...) expr ...)`.
165 | 
166 |     ;; Define "double" using defun
167 |     (defun double (x) (+ x x))
168 | 
169 | You can write a function that takes variable number of arguments. If the
170 | parameter list is a dotted list, the remaining arguments are bound to the last
171 | parameter as a list.
172 | 
173 |     (defun fn (expr . rest) rest)
174 |     (fn 1)     ; -> ()
175 |     (fn 1 2 3) ; -> (2 3)
176 | 
177 | Variables are lexically scoped and have indefinite extent. References to "outer"
178 | variables remain valid even after the function that created the variables
179 | returns.
180 | 
181 |     ;; A countup function. We use lambda to introduce local variables because we
182 |     ;; do not have "let" and the like.
183 |     (define counter
184 |       ((lambda (count)
185 |          (lambda ()
186 |            (setq count (+ count 1))
187 |            count))
188 |        0))
189 | 
190 |     (counter)  ; -> 1
191 |     (counter)  ; -> 2
192 | 
193 |     ;; This will not return 12345 but 3. Variable "count" in counter function
194 |     ;; is resolved based on its lexical context rather than dynamic context.
195 |     ((lambda (count) (counter)) 12345)  ; -> 3
196 | 
197 | `setq` sets a new value to an existing variable. It's an error if the variable
198 | is not defined.
199 | 
200 |     (define val (+ 3 5))
201 |     (setq val (+ val 1))  ; increment "val"
202 | 
203 | ### Macros
204 | 
205 | Macros look similar to functions, but they are different that macros take an
206 | expression as input and returns a new expression as output. `(defmacro
207 | macro-name (args ...) body ...)` defines a macro. Here is an example.
208 | 
209 |     (defmacro unless (condition expr)
210 |       (list 'if condition () expr))
211 | 
212 | The above `defmacro` defines a new macro *unless*. *unless* is a new conditional
213 | which evaluates *expr* unless *condition* is a true value. You cannot do the
214 | same thing with a function because all the arguments would be evaluated before
215 | the control is passed to the function.
216 | 
217 |     (define x 0)
218 |     (unless (= x 0) '(x is not 0))  ; -> ()
219 |     (unless (= x 1) '(x is not 1))  ; -> (x is not 1)
220 | 
221 | `macroexpand` is a convenient special form to see the expanded form of a macro.
222 | 
223 |     (macroexpand (unless (= x 1) '(x is not 1)))
224 |     ;; -> (if (= x 1) () (quote (x is not 1)))
225 | 
226 | `gensym` creates a new symbol which will never be `eq` to any other symbol other
227 | than itself. Useful for writing a macro that introduces new identifiers.
228 | 
229 |     (gensym)   ; -> a new symbol
230 | 
231 | ### Comments
232 | 
233 | As in the traditional Lisp syntax, `;` (semicolon) starts a single line comment.
234 | The comment continues to the end of line.
235 | 
236 | No GC Branch
237 | ------------
238 | 
239 | There is a MiniLisp branch from which the code for garbage collection has been
240 | stripped. The accepted language is the same, but the code is simpler than the
241 | master branch's one. The reader might want to read the nogc branch first, then
242 | proceed to the master branch, to understand the code step by step.
243 | 
244 | The nogc branch is available at
245 | [nogc](https://github.com/rui314/minilisp/tree/nogc). The original is available
246 | at [master](https://github.com/rui314/minilisp).
247 | 


--------------------------------------------------------------------------------
/examples/life.lisp:
--------------------------------------------------------------------------------
  1 | ;;;
  2 | ;;; Conway's game of life
  3 | ;;;
  4 | 
  5 | ;; (progn expr ...)
  6 | ;; => ((lambda () expr ...))
  7 | (defmacro progn (expr . rest)
  8 |   (list (cons 'lambda (cons () (cons expr rest)))))
  9 | 
 10 | (defun list (x . y)
 11 |   (cons x y))
 12 | 
 13 | (defun not (x)
 14 |   (if x () t))
 15 | 
 16 | ;; (let var val body ...)
 17 | ;; => ((lambda (var) body ...) val)
 18 | (defmacro let (var val . body)
 19 |   (cons (cons 'lambda (cons (list var) body))
 20 | 	(list val)))
 21 | 
 22 | ;; (and e1 e2 ...)
 23 | ;; => (if e1 (and e2 ...))
 24 | ;; (and e1)
 25 | ;; => e1
 26 | (defmacro and (expr . rest)
 27 |   (if rest
 28 |       (list 'if expr (cons 'and rest))
 29 |     expr))
 30 | 
 31 | ;; (or e1 e2 ...)
 32 | ;; => (let <tmp> e1
 33 | ;;      (if <tmp> <tmp> (or e2 ...)))
 34 | ;; (or e1)
 35 | ;; => e1
 36 | ;;
 37 | ;; The reason to use the temporary variables is to avoid evaluating the
 38 | ;; arguments more than once.
 39 | (defmacro or (expr . rest)
 40 |   (if rest
 41 |       (let var (gensym)
 42 |            (list 'let var expr
 43 |                  (list 'if var var (cons 'or rest))))
 44 |     expr))
 45 | 
 46 | ;; (when expr body ...)
 47 | ;; => (if expr (progn body ...))
 48 | (defmacro when (expr . body)
 49 |   (cons 'if (cons expr (list (cons 'progn body)))))
 50 | 
 51 | ;; (unless expr body ...)
 52 | ;; => (if expr () body ...)
 53 | (defmacro unless (expr . body)
 54 |   (cons 'if (cons expr (cons () body))))
 55 | 
 56 | ;;;
 57 | ;;; Numeric operators
 58 | ;;;
 59 | 
 60 | (defun <= (e1 e2)
 61 |   (or (< e1 e2)
 62 |       (= e1 e2)))
 63 | 
 64 | ;;;
 65 | ;;; List operators
 66 | ;;;
 67 | 
 68 | ;;; Applies each element of lis to fn, and returns their return values as a list.
 69 | (defun map (lis fn)
 70 |   (when lis
 71 |     (cons (fn (car lis))
 72 | 	  (map (cdr lis) fn))))
 73 | 
 74 | ;; Returns nth element of lis.
 75 | (defun nth (lis n)
 76 |   (if (= n 0)
 77 |       (car lis)
 78 |     (nth (cdr lis) (- n 1))))
 79 | 
 80 | ;; Returns the nth tail of lis.
 81 | (defun nth-tail (lis n)
 82 |   (if (= n 0)
 83 |       lis
 84 |     (nth-tail (cdr lis) (- n 1))))
 85 | 
 86 | ;; Returns a list consists of m .. n-1 integers.
 87 | (defun %iota (m n)
 88 |   (unless (<= n m)
 89 |     (cons m (%iota (+ m 1) n))))
 90 | 
 91 | ;; Returns a list consists of 0 ... n-1 integers.
 92 | (defun iota (n)
 93 |   (%iota 0 n))
 94 | 
 95 | ;;;
 96 | ;;; Main
 97 | ;;;
 98 | 
 99 | (define width 10)
100 | (define height 10)
101 | 
102 | ;; Returns location (x, y)'s element.
103 | (defun get (board x y)
104 |   (nth (nth board y) x))
105 | 
106 | ;; Returns true if location (x, y)'s value is "@".
107 | (defun alive? (board x y)
108 |   (and (<= 0 x)
109 |        (< x height)
110 |        (<= 0 y)
111 |        (< y width)
112 |        (eq (get board x y) '@)))
113 | 
114 | ;; Print out the given board.
115 | (defun print (board)
116 |   (if (not board)
117 |       '$
118 |     (println (car board))
119 |     (print (cdr board))))
120 | 
121 | (defun count (board x y)
122 |   (let at (lambda (x y)
123 |             (if (alive? board x y) 1 0))
124 |        (+ (at (- x 1) (- y 1))
125 |           (at (- x 1) y)
126 |           (at (- x 1) (+ y 1))
127 |           (at x (- y 1))
128 |           (at x (+ y 1))
129 |           (at (+ x 1) (- y 1))
130 |           (at (+ x 1) y)
131 |           (at (+ x 1) (+ y 1)))))
132 | 
133 | (defun next (board x y)
134 |   (let c (count board x y)
135 |        (if (alive? board x y)
136 |            (or (= c 2) (= c 3))
137 |          (= c 3))))
138 | 
139 | (defun run (board)
140 |   (while t
141 |     (print board)
142 |     (println '*)
143 |     (let newboard (map (iota height)
144 |                        (lambda (y)
145 |                          (map (iota width)
146 |                               (lambda (x)
147 |                                 (if (next board x y) '@ '_)))))
148 |          (setq board newboard))))
149 | 
150 | (run '((_ _ _ _ _ _ _ _ _ _)
151 |        (_ _ _ _ _ _ _ _ _ _)
152 |        (_ _ _ _ _ _ _ _ _ _)
153 |        (_ _ _ _ _ _ _ _ _ _)
154 |        (_ _ _ _ _ _ _ _ _ _)
155 |        (_ _ _ _ _ _ _ _ _ _)
156 |        (_ _ _ _ _ _ _ _ _ _)
157 |        (_ @ @ @ _ _ _ _ _ _)
158 |        (_ _ _ @ _ _ _ _ _ _)
159 |        (_ _ @ _ _ _ _ _ _ _)))
160 | 


--------------------------------------------------------------------------------
/examples/nqueens.lisp:
--------------------------------------------------------------------------------
  1 | ;;;
  2 | ;;; N-queens puzzle solver.
  3 | ;;;
  4 | ;;; The N queens puzzle is the problem of placing N chess queens on an N x N
  5 | ;;; chessboard so that no two queens attack each
  6 | ;;; other. http://en.wikipedia.org/wiki/Eight_queens_puzzle
  7 | ;;;
  8 | ;;; This program solves N-queens puzzle by depth-first backtracking.
  9 | ;;;
 10 | 
 11 | ;;;
 12 | ;;; Basic macros
 13 | ;;;
 14 | ;;; Because the language does not have quasiquote, we need to construct an
 15 | ;;; expanded form using cons and list.
 16 | ;;;
 17 | 
 18 | ;; (progn expr ...)
 19 | ;; => ((lambda () expr ...))
 20 | (defmacro progn (expr . rest)
 21 |   (list (cons 'lambda (cons () (cons expr rest)))))
 22 | 
 23 | (defun list (x . y)
 24 |   (cons x y))
 25 | 
 26 | (defun not (x)
 27 |   (if x () t))
 28 | 
 29 | ;; (let1 var val body ...)
 30 | ;; => ((lambda (var) body ...) val)
 31 | (defmacro let1 (var val . body)
 32 |   (cons (cons 'lambda (cons (list var) body))
 33 | 	(list val)))
 34 | 
 35 | ;; (and e1 e2 ...)
 36 | ;; => (if e1 (and e2 ...))
 37 | ;; (and e1)
 38 | ;; => e1
 39 | (defmacro and (expr . rest)
 40 |   (if rest
 41 |       (list 'if expr (cons 'and rest))
 42 |     expr))
 43 | 
 44 | ;; (or e1 e2 ...)
 45 | ;; => (let1 <tmp> e1
 46 | ;;      (if <tmp> <tmp> (or e2 ...)))
 47 | ;; (or e1)
 48 | ;; => e1
 49 | ;;
 50 | ;; The reason to use the temporary variables is to avoid evaluating the
 51 | ;; arguments more than once.
 52 | (defmacro or (expr . rest)
 53 |   (if rest
 54 |       (let1 var (gensym)
 55 | 	    (list 'let1 var expr
 56 | 		  (list 'if var var (cons 'or rest))))
 57 |     expr))
 58 | 
 59 | ;; (when expr body ...)
 60 | ;; => (if expr (progn body ...))
 61 | (defmacro when (expr . body)
 62 |   (cons 'if (cons expr (list (cons 'progn body)))))
 63 | 
 64 | ;; (unless expr body ...)
 65 | ;; => (if expr () body ...)
 66 | (defmacro unless (expr . body)
 67 |   (cons 'if (cons expr (cons () body))))
 68 | 
 69 | ;;;
 70 | ;;; Numeric operators
 71 | ;;;
 72 | 
 73 | (defun <= (e1 e2)
 74 |   (or (< e1 e2)
 75 |       (= e1 e2)))
 76 | 
 77 | ;;;
 78 | ;;; List operators
 79 | ;;;
 80 | 
 81 | ;; Applies each element of lis to pred. If pred returns a true value, terminate
 82 | ;; the evaluation and returns pred's return value. If all of them return (),
 83 | ;; returns ().
 84 | (defun any (lis pred)
 85 |   (when lis
 86 |     (or (pred (car lis))
 87 | 	(any (cdr lis) pred))))
 88 | 
 89 | ;;; Applies each element of lis to fn, and returns their return values as a list.
 90 | (defun map (lis fn)
 91 |   (when lis
 92 |     (cons (fn (car lis))
 93 | 	  (map (cdr lis) fn))))
 94 | 
 95 | ;; Returns nth element of lis.
 96 | (defun nth (lis n)
 97 |   (if (= n 0)
 98 |       (car lis)
 99 |     (nth (cdr lis) (- n 1))))
100 | 
101 | ;; Returns the nth tail of lis.
102 | (defun nth-tail (lis n)
103 |   (if (= n 0)
104 |       lis
105 |     (nth-tail (cdr lis) (- n 1))))
106 | 
107 | ;; Returns a list consists of m .. n-1 integers.
108 | (defun %iota (m n)
109 |   (unless (<= n m)
110 |     (cons m (%iota (+ m 1) n))))
111 | 
112 | ;; Returns a list consists of 0 ... n-1 integers.
113 | (defun iota (n)
114 |   (%iota 0 n))
115 | 
116 | ;; Returns a new list whose length is len and all members are init.
117 | (defun make-list (len init)
118 |   (unless (= len 0)
119 |     (cons init (make-list (- len 1) init))))
120 | 
121 | ;; Applies fn to each element of lis.
122 | (defun for-each (lis fn)
123 |   (or (not lis)
124 |       (progn (fn (car lis))
125 | 	     (for-each (cdr lis) fn))))
126 | 
127 | ;;;
128 | ;;; N-queens solver
129 | ;;;
130 | 
131 | ;; Creates size x size list filled with symbol "x".
132 | (defun make-board (size)
133 |   (map (iota size)
134 |        (lambda (_)
135 | 	 (make-list size 'x))))
136 | 
137 | ;; Returns location (x, y)'s element.
138 | (defun get (board x y)
139 |   (nth (nth board x) y))
140 | 
141 | ;; Set symbol "@" to location (x, y).
142 | (defun set (board x y)
143 |   (setcar (nth-tail (nth board x) y) '@))
144 | 
145 | ;; Set symbol "x" to location (x, y).
146 | (defun clear (board x y)
147 |   (setcar (nth-tail (nth board x) y) 'x))
148 | 
149 | ;; Returns true if location (x, y)'s value is "@".
150 | (defun set? (board x y)
151 |   (eq (get board x y) '@))
152 | 
153 | ;; Print out the given board.
154 | (defun print (board)
155 |   (if (not board)
156 |       '$
157 |     (println (car board))
158 |     (print (cdr board))))
159 | 
160 | ;; Returns true if we cannot place a queen at position (x, y), assuming that
161 | ;; queens have already been placed on each row from 0 to x-1.
162 | (defun conflict? (board x y)
163 |   (any (iota x)
164 |        (lambda (n)
165 | 	 (or
166 | 	  ;; Check if there's no conflicting queen upward
167 | 	  (set? board n y)
168 | 	  ;; Upper left
169 | 	  (let1 z (+ y (- n x))
170 | 		(and (<= 0 z)
171 | 		     (set? board n z)))
172 | 	  ;; Upper right
173 | 	  (let1 z (+ y (- x n))
174 | 		(and (< z board-size)
175 | 		     (set? board n z)))))))
176 | 
177 | ;; Find positions where we can place queens at row x, and continue searching for
178 | ;; the next row.
179 | (defun %solve (board x)
180 |   (if (= x board-size)
181 |       ;; Problem solved
182 |       (progn (print board)
183 | 	     (println '$))
184 |     (for-each (iota board-size)
185 | 	      (lambda (y)
186 | 		(unless (conflict? board x y)
187 | 		  (set board x y)
188 | 		  (%solve board (+ x 1))
189 | 		  (clear board x y))))))
190 | 
191 | (defun solve (board)
192 |   (println 'start)
193 |   (%solve board 0)
194 |   (println 'done))
195 | 
196 | ;;;
197 | ;;; Main
198 | ;;;
199 | 
200 | (define board-size 8)
201 | (define board (make-board board-size))
202 | (solve board)
203 | 


--------------------------------------------------------------------------------
/minilisp.c:
--------------------------------------------------------------------------------
  1 | // This software is in the public domain.
  2 | 
  3 | #include <assert.h>
  4 | #include <ctype.h>
  5 | #include <stdarg.h>
  6 | #include <stdbool.h>
  7 | #include <stddef.h>
  8 | #include <stdint.h>
  9 | #include <stdio.h>
 10 | #include <stdlib.h>
 11 | #include <string.h>
 12 | #include <sys/mman.h>
 13 | 
 14 | static __attribute((noreturn)) void error(char *fmt, ...) {
 15 |     va_list ap;
 16 |     va_start(ap, fmt);
 17 |     vfprintf(stderr, fmt, ap);
 18 |     fprintf(stderr, "\n");
 19 |     va_end(ap);
 20 |     exit(1);
 21 | }
 22 | 
 23 | //======================================================================
 24 | // Lisp objects
 25 | //======================================================================
 26 | 
 27 | // The Lisp object type
 28 | enum {
 29 |     // Regular objects visible from the user
 30 |     TINT = 1,
 31 |     TCELL,
 32 |     TSYMBOL,
 33 |     TPRIMITIVE,
 34 |     TFUNCTION,
 35 |     TMACRO,
 36 |     TENV,
 37 |     // The marker that indicates the object has been moved to other location by GC. The new location
 38 |     // can be found at the forwarding pointer. Only the functions to do garbage collection set and
 39 |     // handle the object of this type. Other functions will never see the object of this type.
 40 |     TMOVED,
 41 |     // Const objects. They are statically allocated and will never be managed by GC.
 42 |     TTRUE,
 43 |     TNIL,
 44 |     TDOT,
 45 |     TCPAREN,
 46 | };
 47 | 
 48 | // Typedef for the primitive function
 49 | struct Obj;
 50 | typedef struct Obj *Primitive(void *root, struct Obj **env, struct Obj **args);
 51 | 
 52 | // The object type
 53 | typedef struct Obj {
 54 |     // The first word of the object represents the type of the object. Any code that handles object
 55 |     // needs to check its type first, then access the following union members.
 56 |     int type;
 57 | 
 58 |     // The total size of the object, including "type" field, this field, the contents, and the
 59 |     // padding at the end of the object.
 60 |     int size;
 61 | 
 62 |     // Object values.
 63 |     union {
 64 |         // Int
 65 |         int value;
 66 |         // Cell
 67 |         struct {
 68 |             struct Obj *car;
 69 |             struct Obj *cdr;
 70 |         };
 71 |         // Symbol
 72 |         char name[1];
 73 |         // Primitive
 74 |         Primitive *fn;
 75 |         // Function or Macro
 76 |         struct {
 77 |             struct Obj *params;
 78 |             struct Obj *body;
 79 |             struct Obj *env;
 80 |         };
 81 |         // Environment frame. This is a linked list of association lists
 82 |         // containing the mapping from symbols to their value.
 83 |         struct {
 84 |             struct Obj *vars;
 85 |             struct Obj *up;
 86 |         };
 87 |         // Forwarding pointer
 88 |         void *moved;
 89 |     };
 90 | } Obj;
 91 | 
 92 | // Constants
 93 | static Obj *True = &(Obj){ TTRUE };
 94 | static Obj *Nil = &(Obj){ TNIL };
 95 | static Obj *Dot = &(Obj){ TDOT };
 96 | static Obj *Cparen = &(Obj){ TCPAREN };
 97 | 
 98 | // The list containing all symbols. Such data structure is traditionally called the "obarray", but I
 99 | // avoid using it as a variable name as this is not an array but a list.
100 | static Obj *Symbols;
101 | 
102 | //======================================================================
103 | // Memory management
104 | //======================================================================
105 | 
106 | // The size of the heap in byte
107 | #define MEMORY_SIZE 65536
108 | 
109 | // The pointer pointing to the beginning of the current heap
110 | static void *memory;
111 | 
112 | // The pointer pointing to the beginning of the old heap
113 | static void *from_space;
114 | 
115 | // The number of bytes allocated from the heap
116 | static size_t mem_nused = 0;
117 | 
118 | // Flags to debug GC
119 | static bool gc_running = false;
120 | static bool debug_gc = false;
121 | static bool always_gc = false;
122 | 
123 | static void gc(void *root);
124 | 
125 | // Currently we are using Cheney's copying GC algorithm, with which the available memory is split
126 | // into two halves and all objects are moved from one half to another every time GC is invoked. That
127 | // means the address of the object keeps changing. If you take the address of an object and keep it
128 | // in a C variable, dereferencing it could cause SEGV because the address becomes invalid after GC
129 | // runs.
130 | //
131 | // In order to deal with that, all access from C to Lisp objects will go through two levels of
132 | // pointer dereferences. The C local variable is pointing to a pointer on the C stack, and the
133 | // pointer is pointing to the Lisp object. GC is aware of the pointers in the stack and updates
134 | // their contents with the objects' new addresses when GC happens.
135 | //
136 | // The following is a macro to reserve the area in the C stack for the pointers. The contents of
137 | // this area are considered to be GC root.
138 | //
139 | // Be careful not to bypass the two levels of pointer indirections. If you create a direct pointer
140 | // to an object, it'll cause a subtle bug. Such code would work in most cases but fails with SEGV if
141 | // GC happens during the execution of the code. Any code that allocates memory may invoke GC.
142 | 
143 | #define ROOT_END ((void *)-1)
144 | 
145 | #define ADD_ROOT(size)                          \
146 |     void *root_ADD_ROOT_[size + 2];             \
147 |     root_ADD_ROOT_[0] = root;                   \
148 |     for (int i = 1; i <= size; i++)             \
149 |         root_ADD_ROOT_[i] = NULL;               \
150 |     root_ADD_ROOT_[size + 1] = ROOT_END;        \
151 |     root = root_ADD_ROOT_
152 | 
153 | #define DEFINE1(var1)                           \
154 |     ADD_ROOT(1);                                \
155 |     Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1)
156 | 
157 | #define DEFINE2(var1, var2)                     \
158 |     ADD_ROOT(2);                                \
159 |     Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1);  \
160 |     Obj **var2 = (Obj **)(root_ADD_ROOT_ + 2)
161 | 
162 | #define DEFINE3(var1, var2, var3)               \
163 |     ADD_ROOT(3);                                \
164 |     Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1);  \
165 |     Obj **var2 = (Obj **)(root_ADD_ROOT_ + 2);  \
166 |     Obj **var3 = (Obj **)(root_ADD_ROOT_ + 3)
167 | 
168 | #define DEFINE4(var1, var2, var3, var4)         \
169 |     ADD_ROOT(4);                                \
170 |     Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1);  \
171 |     Obj **var2 = (Obj **)(root_ADD_ROOT_ + 2);  \
172 |     Obj **var3 = (Obj **)(root_ADD_ROOT_ + 3);  \
173 |     Obj **var4 = (Obj **)(root_ADD_ROOT_ + 4)
174 | 
175 | // Round up the given value to a multiple of size. Size must be a power of 2. It adds size - 1
176 | // first, then zero-ing the least significant bits to make the result a multiple of size. I know
177 | // these bit operations may look a little bit tricky, but it's efficient and thus frequently used.
178 | static inline size_t roundup(size_t var, size_t size) {
179 |     return (var + size - 1) & ~(size - 1);
180 | }
181 | 
182 | // Allocates memory block. This may start GC if we don't have enough memory.
183 | static Obj *alloc(void *root, int type, size_t size) {
184 |     // The object must be large enough to contain a pointer for the forwarding pointer. Make it
185 |     // larger if it's smaller than that.
186 |     size = roundup(size, sizeof(void *));
187 | 
188 |     // Add the size of the type tag and size fields.
189 |     size += offsetof(Obj, value);
190 | 
191 |     // Round up the object size to the nearest alignment boundary, so that the next object will be
192 |     // allocated at the proper alignment boundary. Currently we align the object at the same
193 |     // boundary as the pointer.
194 |     size = roundup(size, sizeof(void *));
195 | 
196 |     // If the debug flag is on, allocate a new memory space to force all the existing objects to
197 |     // move to new addresses, to invalidate the old addresses. By doing this the GC behavior becomes
198 |     // more predictable and repeatable. If there's a memory bug that the C variable has a direct
199 |     // reference to a Lisp object, the pointer will become invalid by this GC call. Dereferencing
200 |     // that will immediately cause SEGV.
201 |     if (always_gc && !gc_running)
202 |         gc(root);
203 | 
204 |     // Otherwise, run GC only when the available memory is not large enough.
205 |     if (!always_gc && MEMORY_SIZE < mem_nused + size)
206 |         gc(root);
207 | 
208 |     // Terminate the program if we couldn't satisfy the memory request. This can happen if the
209 |     // requested size was too large or the from-space was filled with too many live objects.
210 |     if (MEMORY_SIZE < mem_nused + size)
211 |         error("Memory exhausted");
212 | 
213 |     // Allocate the object.
214 |     Obj *obj = memory + mem_nused;
215 |     obj->type = type;
216 |     obj->size = size;
217 |     mem_nused += size;
218 |     return obj;
219 | }
220 | 
221 | //======================================================================
222 | // Garbage collector
223 | //======================================================================
224 | 
225 | // Cheney's algorithm uses two pointers to keep track of GC status. At first both pointers point to
226 | // the beginning of the to-space. As GC progresses, they are moved towards the end of the
227 | // to-space. The objects before "scan1" are the objects that are fully copied. The objects between
228 | // "scan1" and "scan2" have already been copied, but may contain pointers to the from-space. "scan2"
229 | // points to the beginning of the free space.
230 | static Obj *scan1;
231 | static Obj *scan2;
232 | 
233 | // Moves one object from the from-space to the to-space. Returns the object's new address. If the
234 | // object has already been moved, does nothing but just returns the new address.
235 | static inline Obj *forward(Obj *obj) {
236 |     // If the object's address is not in the from-space, the object is not managed by GC nor it
237 |     // has already been moved to the to-space.
238 |     ptrdiff_t offset = (uint8_t *)obj - (uint8_t *)from_space;
239 |     if (offset < 0 || MEMORY_SIZE <= offset)
240 |         return obj;
241 | 
242 |     // The pointer is pointing to the from-space, but the object there was a tombstone. Follow the
243 |     // forwarding pointer to find the new location of the object.
244 |     if (obj->type == TMOVED)
245 |         return obj->moved;
246 | 
247 |     // Otherwise, the object has not been moved yet. Move it.
248 |     Obj *newloc = scan2;
249 |     memcpy(newloc, obj, obj->size);
250 |     scan2 = (Obj *)((uint8_t *)scan2 + obj->size);
251 | 
252 |     // Put a tombstone at the location where the object used to occupy, so that the following call
253 |     // of forward() can find the object's new location.
254 |     obj->type = TMOVED;
255 |     obj->moved = newloc;
256 |     return newloc;
257 | }
258 | 
259 | static void *alloc_semispace() {
260 |     return mmap(NULL, MEMORY_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0);
261 | }
262 | 
263 | // Copies the root objects.
264 | static void forward_root_objects(void *root) {
265 |     Symbols = forward(Symbols);
266 |     for (void **frame = root; frame; frame = *(void ***)frame)
267 |         for (int i = 1; frame[i] != ROOT_END; i++)
268 |             if (frame[i])
269 |                 frame[i] = forward(frame[i]);
270 | }
271 | 
272 | // Implements Cheney's copying garbage collection algorithm.
273 | // http://en.wikipedia.org/wiki/Cheney%27s_algorithm
274 | static void gc(void *root) {
275 |     assert(!gc_running);
276 |     gc_running = true;
277 | 
278 |     // Allocate a new semi-space.
279 |     from_space = memory;
280 |     memory = alloc_semispace();
281 | 
282 |     // Initialize the two pointers for GC. Initially they point to the beginning of the to-space.
283 |     scan1 = scan2 = memory;
284 | 
285 |     // Copy the GC root objects first. This moves the pointer scan2.
286 |     forward_root_objects(root);
287 | 
288 |     // Copy the objects referenced by the GC root objects located between scan1 and scan2. Once it's
289 |     // finished, all live objects (i.e. objects reachable from the root) will have been copied to
290 |     // the to-space.
291 |     while (scan1 < scan2) {
292 |         switch (scan1->type) {
293 |         case TINT:
294 |         case TSYMBOL:
295 |         case TPRIMITIVE:
296 |             // Any of the above types does not contain a pointer to a GC-managed object.
297 |             break;
298 |         case TCELL:
299 |             scan1->car = forward(scan1->car);
300 |             scan1->cdr = forward(scan1->cdr);
301 |             break;
302 |         case TFUNCTION:
303 |         case TMACRO:
304 |             scan1->params = forward(scan1->params);
305 |             scan1->body = forward(scan1->body);
306 |             scan1->env = forward(scan1->env);
307 |             break;
308 |         case TENV:
309 |             scan1->vars = forward(scan1->vars);
310 |             scan1->up = forward(scan1->up);
311 |             break;
312 |         default:
313 |             error("Bug: copy: unknown type %d", scan1->type);
314 |         }
315 |         scan1 = (Obj *)((uint8_t *)scan1 + scan1->size);
316 |     }
317 | 
318 |     // Finish up GC.
319 |     munmap(from_space, MEMORY_SIZE);
320 |     size_t old_nused = mem_nused;
321 |     mem_nused = (size_t)((uint8_t *)scan1 - (uint8_t *)memory);
322 |     if (debug_gc)
323 |         fprintf(stderr, "GC: %zu bytes out of %zu bytes copied.\n", mem_nused, old_nused);
324 |     gc_running = false;
325 | }
326 | 
327 | //======================================================================
328 | // Constructors
329 | //======================================================================
330 | 
331 | static Obj *make_int(void *root, int value) {
332 |     Obj *r = alloc(root, TINT, sizeof(int));
333 |     r->value = value;
334 |     return r;
335 | }
336 | 
337 | static Obj *cons(void *root, Obj **car, Obj **cdr) {
338 |     Obj *cell = alloc(root, TCELL, sizeof(Obj *) * 2);
339 |     cell->car = *car;
340 |     cell->cdr = *cdr;
341 |     return cell;
342 | }
343 | 
344 | static Obj *make_symbol(void *root, char *name) {
345 |     Obj *sym = alloc(root, TSYMBOL, strlen(name) + 1);
346 |     strcpy(sym->name, name);
347 |     return sym;
348 | }
349 | 
350 | static Obj *make_primitive(void *root, Primitive *fn) {
351 |     Obj *r = alloc(root, TPRIMITIVE, sizeof(Primitive *));
352 |     r->fn = fn;
353 |     return r;
354 | }
355 | 
356 | static Obj *make_function(void *root, Obj **env, int type, Obj **params, Obj **body) {
357 |     assert(type == TFUNCTION || type == TMACRO);
358 |     Obj *r = alloc(root, type, sizeof(Obj *) * 3);
359 |     r->params = *params;
360 |     r->body = *body;
361 |     r->env = *env;
362 |     return r;
363 | }
364 | 
365 | struct Obj *make_env(void *root, Obj **vars, Obj **up) {
366 |     Obj *r = alloc(root, TENV, sizeof(Obj *) * 2);
367 |     r->vars = *vars;
368 |     r->up = *up;
369 |     return r;
370 | }
371 | 
372 | // Returns ((x . y) . a)
373 | static Obj *acons(void *root, Obj **x, Obj **y, Obj **a) {
374 |     DEFINE1(cell);
375 |     *cell = cons(root, x, y);
376 |     return cons(root, cell, a);
377 | }
378 | 
379 | //======================================================================
380 | // Parser
381 | //
382 | // This is a hand-written recursive-descendent parser.
383 | //======================================================================
384 | 
385 | #define SYMBOL_MAX_LEN 200
386 | const char symbol_chars[] = "~!@#$%^&*-_=+:/?<>";
387 | 
388 | static Obj *read_expr(void *root);
389 | 
390 | static int peek(void) {
391 |     int c = getchar();
392 |     ungetc(c, stdin);
393 |     return c;
394 | }
395 | 
396 | // Destructively reverses the given list.
397 | static Obj *reverse(Obj *p) {
398 |     Obj *ret = Nil;
399 |     while (p != Nil) {
400 |         Obj *head = p;
401 |         p = p->cdr;
402 |         head->cdr = ret;
403 |         ret = head;
404 |     }
405 |     return ret;
406 | }
407 | 
408 | // Skips the input until newline is found. Newline is one of \r, \r\n or \n.
409 | static void skip_line(void) {
410 |     for (;;) {
411 |         int c = getchar();
412 |         if (c == EOF || c == '\n')
413 |             return;
414 |         if (c == '\r') {
415 |             if (peek() == '\n')
416 |                 getchar();
417 |             return;
418 |         }
419 |     }
420 | }
421 | 
422 | // Reads a list. Note that '(' has already been read.
423 | static Obj *read_list(void *root) {
424 |     DEFINE3(obj, head, last);
425 |     *head = Nil;
426 |     for (;;) {
427 |         *obj = read_expr(root);
428 |         if (!*obj)
429 |             error("Unclosed parenthesis");
430 |         if (*obj == Cparen)
431 |             return reverse(*head);
432 |         if (*obj == Dot) {
433 |             *last = read_expr(root);
434 |             if (read_expr(root) != Cparen)
435 |                 error("Closed parenthesis expected after dot");
436 |             Obj *ret = reverse(*head);
437 |             (*head)->cdr = *last;
438 |             return ret;
439 |         }
440 |         *head = cons(root, obj, head);
441 |     }
442 | }
443 | 
444 | // May create a new symbol. If there's a symbol with the same name, it will not create a new symbol
445 | // but return the existing one.
446 | static Obj *intern(void *root, char *name) {
447 |     for (Obj *p = Symbols; p != Nil; p = p->cdr)
448 |         if (strcmp(name, p->car->name) == 0)
449 |             return p->car;
450 |     DEFINE1(sym);
451 |     *sym = make_symbol(root, name);
452 |     Symbols = cons(root, sym, &Symbols);
453 |     return *sym;
454 | }
455 | 
456 | // Reader marcro ' (single quote). It reads an expression and returns (quote <expr>).
457 | static Obj *read_quote(void *root) {
458 |     DEFINE2(sym, tmp);
459 |     *sym = intern(root, "quote");
460 |     *tmp = read_expr(root);
461 |     *tmp = cons(root, tmp, &Nil);
462 |     *tmp = cons(root, sym, tmp);
463 |     return *tmp;
464 | }
465 | 
466 | static int read_number(int val) {
467 |     while (isdigit(peek()))
468 |         val = val * 10 + (getchar() - '0');
469 |     return val;
470 | }
471 | 
472 | static Obj *read_symbol(void *root, char c) {
473 |     char buf[SYMBOL_MAX_LEN + 1];
474 |     buf[0] = c;
475 |     int len = 1;
476 |     while (isalnum(peek()) || strchr(symbol_chars, peek())) {
477 |         if (SYMBOL_MAX_LEN <= len)
478 |             error("Symbol name too long");
479 |         buf[len++] = getchar();
480 |     }
481 |     buf[len] = '\0';
482 |     return intern(root, buf);
483 | }
484 | 
485 | static Obj *read_expr(void *root) {
486 |     for (;;) {
487 |         int c = getchar();
488 |         if (c == ' ' || c == '\n' || c == '\r' || c == '\t')
489 |             continue;
490 |         if (c == EOF)
491 |             return NULL;
492 |         if (c == ';') {
493 |             skip_line();
494 |             continue;
495 |         }
496 |         if (c == '(')
497 |             return read_list(root);
498 |         if (c == ')')
499 |             return Cparen;
500 |         if (c == '.')
501 |             return Dot;
502 |         if (c == '\'')
503 |             return read_quote(root);
504 |         if (isdigit(c))
505 |             return make_int(root, read_number(c - '0'));
506 |         if (c == '-' && isdigit(peek()))
507 |             return make_int(root, -read_number(0));
508 |         if (isalpha(c) || strchr(symbol_chars, c))
509 |             return read_symbol(root, c);
510 |         error("Don't know how to handle %c", c);
511 |     }
512 | }
513 | 
514 | // Prints the given object.
515 | static void print(Obj *obj) {
516 |     switch (obj->type) {
517 |     case TCELL:
518 |         printf("(");
519 |         for (;;) {
520 |             print(obj->car);
521 |             if (obj->cdr == Nil)
522 |                 break;
523 |             if (obj->cdr->type != TCELL) {
524 |                 printf(" . ");
525 |                 print(obj->cdr);
526 |                 break;
527 |             }
528 |             printf(" ");
529 |             obj = obj->cdr;
530 |         }
531 |         printf(")");
532 |         return;
533 | 
534 | #define CASE(type, ...)                         \
535 |     case type:                                  \
536 |         printf(__VA_ARGS__);                    \
537 |         return
538 |     CASE(TINT, "%d", obj->value);
539 |     CASE(TSYMBOL, "%s", obj->name);
540 |     CASE(TPRIMITIVE, "<primitive>");
541 |     CASE(TFUNCTION, "<function>");
542 |     CASE(TMACRO, "<macro>");
543 |     CASE(TMOVED, "<moved>");
544 |     CASE(TTRUE, "t");
545 |     CASE(TNIL, "()");
546 | #undef CASE
547 |     default:
548 |         error("Bug: print: Unknown tag type: %d", obj->type);
549 |     }
550 | }
551 | 
552 | // Returns the length of the given list. -1 if it's not a proper list.
553 | static int length(Obj *list) {
554 |     int len = 0;
555 |     for (; list->type == TCELL; list = list->cdr)
556 |         len++;
557 |     return list == Nil ? len : -1;
558 | }
559 | 
560 | //======================================================================
561 | // Evaluator
562 | //======================================================================
563 | 
564 | static Obj *eval(void *root, Obj **env, Obj **obj);
565 | 
566 | static void add_variable(void *root, Obj **env, Obj **sym, Obj **val) {
567 |     DEFINE2(vars, tmp);
568 |     *vars = (*env)->vars;
569 |     *tmp = acons(root, sym, val, vars);
570 |     (*env)->vars = *tmp;
571 | }
572 | 
573 | // Returns a newly created environment frame.
574 | static Obj *push_env(void *root, Obj **env, Obj **vars, Obj **vals) {
575 |     DEFINE3(map, sym, val);
576 |     *map = Nil;
577 |     for (; (*vars)->type == TCELL; *vars = (*vars)->cdr, *vals = (*vals)->cdr) {
578 |         if ((*vals)->type != TCELL)
579 |             error("Cannot apply function: number of argument does not match");
580 |         *sym = (*vars)->car;
581 |         *val = (*vals)->car;
582 |         *map = acons(root, sym, val, map);
583 |     }
584 |     if (*vars != Nil)
585 |         *map = acons(root, vars, vals, map);
586 |     return make_env(root, map, env);
587 | }
588 | 
589 | // Evaluates the list elements from head and returns the last return value.
590 | static Obj *progn(void *root, Obj **env, Obj **list) {
591 |     DEFINE2(lp, r);
592 |     for (*lp = *list; *lp != Nil; *lp = (*lp)->cdr) {
593 |         *r = (*lp)->car;
594 |         *r = eval(root, env, r);
595 |     }
596 |     return *r;
597 | }
598 | 
599 | // Evaluates all the list elements and returns their return values as a new list.
600 | static Obj *eval_list(void *root, Obj **env, Obj **list) {
601 |     DEFINE4(head, lp, expr, result);
602 |     *head = Nil;
603 |     for (lp = list; *lp != Nil; *lp = (*lp)->cdr) {
604 |         *expr = (*lp)->car;
605 |         *result = eval(root, env, expr);
606 |         *head = cons(root, result, head);
607 |     }
608 |     return reverse(*head);
609 | }
610 | 
611 | static bool is_list(Obj *obj) {
612 |     return obj == Nil || obj->type == TCELL;
613 | }
614 | 
615 | static Obj *apply_func(void *root, Obj **env, Obj **fn, Obj **args) {
616 |     DEFINE3(params, newenv, body);
617 |     *params = (*fn)->params;
618 |     *newenv = (*fn)->env;
619 |     *newenv = push_env(root, newenv, params, args);
620 |     *body = (*fn)->body;
621 |     return progn(root, newenv, body);
622 | }
623 | 
624 | // Apply fn with args.
625 | static Obj *apply(void *root, Obj **env, Obj **fn, Obj **args) {
626 |     if (!is_list(*args))
627 |         error("argument must be a list");
628 |     if ((*fn)->type == TPRIMITIVE)
629 |         return (*fn)->fn(root, env, args);
630 |     if ((*fn)->type == TFUNCTION) {
631 |         DEFINE1(eargs);
632 |         *eargs = eval_list(root, env, args);
633 |         return apply_func(root, env, fn, eargs);
634 |     }
635 |     error("not supported");
636 | }
637 | 
638 | // Searches for a variable by symbol. Returns null if not found.
639 | static Obj *find(Obj **env, Obj *sym) {
640 |     for (Obj *p = *env; p != Nil; p = p->up) {
641 |         for (Obj *cell = p->vars; cell != Nil; cell = cell->cdr) {
642 |             Obj *bind = cell->car;
643 |             if (sym == bind->car)
644 |                 return bind;
645 |         }
646 |     }
647 |     return NULL;
648 | }
649 | 
650 | // Expands the given macro application form.
651 | static Obj *macroexpand(void *root, Obj **env, Obj **obj) {
652 |     if ((*obj)->type != TCELL || (*obj)->car->type != TSYMBOL)
653 |         return *obj;
654 |     DEFINE3(bind, macro, args);
655 |     *bind = find(env, (*obj)->car);
656 |     if (!*bind || (*bind)->cdr->type != TMACRO)
657 |         return *obj;
658 |     *macro = (*bind)->cdr;
659 |     *args = (*obj)->cdr;
660 |     return apply_func(root, env, macro, args);
661 | }
662 | 
663 | // Evaluates the S expression.
664 | static Obj *eval(void *root, Obj **env, Obj **obj) {
665 |     switch ((*obj)->type) {
666 |     case TINT:
667 |     case TPRIMITIVE:
668 |     case TFUNCTION:
669 |     case TTRUE:
670 |     case TNIL:
671 |         // Self-evaluating objects
672 |         return *obj;
673 |     case TSYMBOL: {
674 |         // Variable
675 |         Obj *bind = find(env, *obj);
676 |         if (!bind)
677 |             error("Undefined symbol: %s", (*obj)->name);
678 |         return bind->cdr;
679 |     }
680 |     case TCELL: {
681 |         // Function application form
682 |         DEFINE3(fn, expanded, args);
683 |         *expanded = macroexpand(root, env, obj);
684 |         if (*expanded != *obj)
685 |             return eval(root, env, expanded);
686 |         *fn = (*obj)->car;
687 |         *fn = eval(root, env, fn);
688 |         *args = (*obj)->cdr;
689 |         if ((*fn)->type != TPRIMITIVE && (*fn)->type != TFUNCTION)
690 |             error("The head of a list must be a function");
691 |         return apply(root, env, fn, args);
692 |     }
693 |     default:
694 |         error("Bug: eval: Unknown tag type: %d", (*obj)->type);
695 |     }
696 | }
697 | 
698 | //======================================================================
699 | // Primitive functions and special forms
700 | //======================================================================
701 | 
702 | // 'expr
703 | static Obj *prim_quote(void *root, Obj **env, Obj **list) {
704 |     if (length(*list) != 1)
705 |         error("Malformed quote");
706 |     return (*list)->car;
707 | }
708 | 
709 | // (cons expr expr)
710 | static Obj *prim_cons(void *root, Obj **env, Obj **list) {
711 |     if (length(*list) != 2)
712 |         error("Malformed cons");
713 |     Obj *cell = eval_list(root, env, list);
714 |     cell->cdr = cell->cdr->car;
715 |     return cell;
716 | }
717 | 
718 | // (car <cell>)
719 | static Obj *prim_car(void *root, Obj **env, Obj **list) {
720 |     Obj *args = eval_list(root, env, list);
721 |     if (args->car->type != TCELL || args->cdr != Nil)
722 |         error("Malformed car");
723 |     return args->car->car;
724 | }
725 | 
726 | // (cdr <cell>)
727 | static Obj *prim_cdr(void *root, Obj **env, Obj **list) {
728 |     Obj *args = eval_list(root, env, list);
729 |     if (args->car->type != TCELL || args->cdr != Nil)
730 |         error("Malformed cdr");
731 |     return args->car->cdr;
732 | }
733 | 
734 | // (setq <symbol> expr)
735 | static Obj *prim_setq(void *root, Obj **env, Obj **list) {
736 |     if (length(*list) != 2 || (*list)->car->type != TSYMBOL)
737 |         error("Malformed setq");
738 |     DEFINE2(bind, value);
739 |     *bind = find(env, (*list)->car);
740 |     if (!*bind)
741 |         error("Unbound variable %s", (*list)->car->name);
742 |     *value = (*list)->cdr->car;
743 |     *value = eval(root, env, value);
744 |     (*bind)->cdr = *value;
745 |     return *value;
746 | }
747 | 
748 | // (setcar <cell> expr)
749 | static Obj *prim_setcar(void *root, Obj **env, Obj **list) {
750 |     DEFINE1(args);
751 |     *args = eval_list(root, env, list);
752 |     if (length(*args) != 2 || (*args)->car->type != TCELL)
753 |         error("Malformed setcar");
754 |     (*args)->car->car = (*args)->cdr->car;
755 |     return (*args)->car;
756 | }
757 | 
758 | // (while cond expr ...)
759 | static Obj *prim_while(void *root, Obj **env, Obj **list) {
760 |     if (length(*list) < 2)
761 |         error("Malformed while");
762 |     DEFINE2(cond, exprs);
763 |     *cond = (*list)->car;
764 |     while (eval(root, env, cond) != Nil) {
765 |         *exprs = (*list)->cdr;
766 |         eval_list(root, env, exprs);
767 |     }
768 |     return Nil;
769 | }
770 | 
771 | // (gensym)
772 | static Obj *prim_gensym(void *root, Obj **env, Obj **list) {
773 |   static int count = 0;
774 |   char buf[10];
775 |   snprintf(buf, sizeof(buf), "G__%d", count++);
776 |   return make_symbol(root, buf);
777 | }
778 | 
779 | // (+ <integer> ...)
780 | static Obj *prim_plus(void *root, Obj **env, Obj **list) {
781 |     int sum = 0;
782 |     for (Obj *args = eval_list(root, env, list); args != Nil; args = args->cdr) {
783 |         if (args->car->type != TINT)
784 |             error("+ takes only numbers");
785 |         sum += args->car->value;
786 |     }
787 |     return make_int(root, sum);
788 | }
789 | 
790 | // (- <integer> ...)
791 | static Obj *prim_minus(void *root, Obj **env, Obj **list) {
792 |     Obj *args = eval_list(root, env, list);
793 |     for (Obj *p = args; p != Nil; p = p->cdr)
794 |         if (p->car->type != TINT)
795 |             error("- takes only numbers");
796 |     if (args->cdr == Nil)
797 |         return make_int(root, -args->car->value);
798 |     int r = args->car->value;
799 |     for (Obj *p = args->cdr; p != Nil; p = p->cdr)
800 |         r -= p->car->value;
801 |     return make_int(root, r);
802 | }
803 | 
804 | // (< <integer> <integer>)
805 | static Obj *prim_lt(void *root, Obj **env, Obj **list) {
806 |     Obj *args = eval_list(root, env, list);
807 |     if (length(args) != 2)
808 |         error("malformed <");
809 |     Obj *x = args->car;
810 |     Obj *y = args->cdr->car;
811 |     if (x->type != TINT || y->type != TINT)
812 |         error("< takes only numbers");
813 |     return x->value < y->value ? True : Nil;
814 | }
815 | 
816 | static Obj *handle_function(void *root, Obj **env, Obj **list, int type) {
817 |     if ((*list)->type != TCELL || !is_list((*list)->car) || (*list)->cdr->type != TCELL)
818 |         error("Malformed lambda");
819 |     Obj *p = (*list)->car;
820 |     for (; p->type == TCELL; p = p->cdr)
821 |         if (p->car->type != TSYMBOL)
822 |             error("Parameter must be a symbol");
823 |     if (p != Nil && p->type != TSYMBOL)
824 |         error("Parameter must be a symbol");
825 |     DEFINE2(params, body);
826 |     *params = (*list)->car;
827 |     *body = (*list)->cdr;
828 |     return make_function(root, env, type, params, body);
829 | }
830 | 
831 | // (lambda (<symbol> ...) expr ...)
832 | static Obj *prim_lambda(void *root, Obj **env, Obj **list) {
833 |     return handle_function(root, env, list, TFUNCTION);
834 | }
835 | 
836 | static Obj *handle_defun(void *root, Obj **env, Obj **list, int type) {
837 |     if ((*list)->car->type != TSYMBOL || (*list)->cdr->type != TCELL)
838 |         error("Malformed defun");
839 |     DEFINE3(fn, sym, rest);
840 |     *sym = (*list)->car;
841 |     *rest = (*list)->cdr;
842 |     *fn = handle_function(root, env, rest, type);
843 |     add_variable(root, env, sym, fn);
844 |     return *fn;
845 | }
846 | 
847 | // (defun <symbol> (<symbol> ...) expr ...)
848 | static Obj *prim_defun(void *root, Obj **env, Obj **list) {
849 |     return handle_defun(root, env, list, TFUNCTION);
850 | }
851 | 
852 | // (define <symbol> expr)
853 | static Obj *prim_define(void *root, Obj **env, Obj **list) {
854 |     if (length(*list) != 2 || (*list)->car->type != TSYMBOL)
855 |         error("Malformed define");
856 |     DEFINE2(sym, value);
857 |     *sym = (*list)->car;
858 |     *value = (*list)->cdr->car;
859 |     *value = eval(root, env, value);
860 |     add_variable(root, env, sym, value);
861 |     return *value;
862 | }
863 | 
864 | // (defmacro <symbol> (<symbol> ...) expr ...)
865 | static Obj *prim_defmacro(void *root, Obj **env, Obj **list) {
866 |     return handle_defun(root, env, list, TMACRO);
867 | }
868 | 
869 | // (macroexpand expr)
870 | static Obj *prim_macroexpand(void *root, Obj **env, Obj **list) {
871 |     if (length(*list) != 1)
872 |         error("Malformed macroexpand");
873 |     DEFINE1(body);
874 |     *body = (*list)->car;
875 |     return macroexpand(root, env, body);
876 | }
877 | 
878 | // (println expr)
879 | static Obj *prim_println(void *root, Obj **env, Obj **list) {
880 |     DEFINE1(tmp);
881 |     *tmp = (*list)->car;
882 |     print(eval(root, env, tmp));
883 |     printf("\n");
884 |     return Nil;
885 | }
886 | 
887 | // (if expr expr expr ...)
888 | static Obj *prim_if(void *root, Obj **env, Obj **list) {
889 |     if (length(*list) < 2)
890 |         error("Malformed if");
891 |     DEFINE3(cond, then, els);
892 |     *cond = (*list)->car;
893 |     *cond = eval(root, env, cond);
894 |     if (*cond != Nil) {
895 |         *then = (*list)->cdr->car;
896 |         return eval(root, env, then);
897 |     }
898 |     *els = (*list)->cdr->cdr;
899 |     return *els == Nil ? Nil : progn(root, env, els);
900 | }
901 | 
902 | // (= <integer> <integer>)
903 | static Obj *prim_num_eq(void *root, Obj **env, Obj **list) {
904 |     if (length(*list) != 2)
905 |         error("Malformed =");
906 |     Obj *values = eval_list(root, env, list);
907 |     Obj *x = values->car;
908 |     Obj *y = values->cdr->car;
909 |     if (x->type != TINT || y->type != TINT)
910 |         error("= only takes numbers");
911 |     return x->value == y->value ? True : Nil;
912 | }
913 | 
914 | // (eq expr expr)
915 | static Obj *prim_eq(void *root, Obj **env, Obj **list) {
916 |     if (length(*list) != 2)
917 |         error("Malformed eq");
918 |     Obj *values = eval_list(root, env, list);
919 |     return values->car == values->cdr->car ? True : Nil;
920 | }
921 | 
922 | static void add_primitive(void *root, Obj **env, char *name, Primitive *fn) {
923 |     DEFINE2(sym, prim);
924 |     *sym = intern(root, name);
925 |     *prim = make_primitive(root, fn);
926 |     add_variable(root, env, sym, prim);
927 | }
928 | 
929 | static void define_constants(void *root, Obj **env) {
930 |     DEFINE1(sym);
931 |     *sym = intern(root, "t");
932 |     add_variable(root, env, sym, &True);
933 | }
934 | 
935 | static void define_primitives(void *root, Obj **env) {
936 |     add_primitive(root, env, "quote", prim_quote);
937 |     add_primitive(root, env, "cons", prim_cons);
938 |     add_primitive(root, env, "car", prim_car);
939 |     add_primitive(root, env, "cdr", prim_cdr);
940 |     add_primitive(root, env, "setq", prim_setq);
941 |     add_primitive(root, env, "setcar", prim_setcar);
942 |     add_primitive(root, env, "while", prim_while);
943 |     add_primitive(root, env, "gensym", prim_gensym);
944 |     add_primitive(root, env, "+", prim_plus);
945 |     add_primitive(root, env, "-", prim_minus);
946 |     add_primitive(root, env, "<", prim_lt);
947 |     add_primitive(root, env, "define", prim_define);
948 |     add_primitive(root, env, "defun", prim_defun);
949 |     add_primitive(root, env, "defmacro", prim_defmacro);
950 |     add_primitive(root, env, "macroexpand", prim_macroexpand);
951 |     add_primitive(root, env, "lambda", prim_lambda);
952 |     add_primitive(root, env, "if", prim_if);
953 |     add_primitive(root, env, "=", prim_num_eq);
954 |     add_primitive(root, env, "eq", prim_eq);
955 |     add_primitive(root, env, "println", prim_println);
956 | }
957 | 
958 | //======================================================================
959 | // Entry point
960 | //======================================================================
961 | 
962 | // Returns true if the environment variable is defined and not the empty string.
963 | static bool getEnvFlag(char *name) {
964 |     char *val = getenv(name);
965 |     return val && val[0];
966 | }
967 | 
968 | int main(int argc, char **argv) {
969 |     // Debug flags
970 |     debug_gc = getEnvFlag("MINILISP_DEBUG_GC");
971 |     always_gc = getEnvFlag("MINILISP_ALWAYS_GC");
972 | 
973 |     // Memory allocation
974 |     memory = alloc_semispace();
975 | 
976 |     // Constants and primitives
977 |     Symbols = Nil;
978 |     void *root = NULL;
979 |     DEFINE2(env, expr);
980 |     *env = make_env(root, &Nil, &Nil);
981 |     define_constants(root, env);
982 |     define_primitives(root, env);
983 | 
984 |     // The main loop
985 |     for (;;) {
986 |         *expr = read_expr(root);
987 |         if (!*expr)
988 |             return 0;
989 |         if (*expr == Cparen)
990 |             error("Stray close parenthesis");
991 |         if (*expr == Dot)
992 |             error("Stray dot");
993 |         print(eval(root, env, expr));
994 |         printf("\n");
995 |     }
996 | }
997 | 


--------------------------------------------------------------------------------
/test.sh:
--------------------------------------------------------------------------------
  1 | #!/bin/bash
  2 | 
  3 | function fail() {
  4 |   echo -n -e '\e[1;31m[ERROR]\e[0m '
  5 |   echo "$1"
  6 |   exit 1
  7 | }
  8 | 
  9 | function do_run() {
 10 |   error=$(echo "$3" | ./minilisp 2>&1 > /dev/null)
 11 |   if [ -n "$error" ]; then
 12 |     echo FAILED
 13 |     fail "$error"
 14 |   fi
 15 | 
 16 |   result=$(echo "$3" | ./minilisp 2> /dev/null | tail -1)
 17 |   if [ "$result" != "$2" ]; then
 18 |     echo FAILED
 19 |     fail "$2 expected, but got $result"
 20 |   fi
 21 | }
 22 | 
 23 | function run() {
 24 |   echo -n "Testing $1 ... "
 25 |   # Run the tests twice to test the garbage collector with different settings.
 26 |   MINILISP_ALWAYS_GC= do_run "$@"
 27 |   MINILISP_ALWAYS_GC=1 do_run "$@"
 28 |   echo ok
 29 | }
 30 | 
 31 | # Basic data types
 32 | run integer 1 1
 33 | run integer -1 -1
 34 | run symbol a "'a"
 35 | run quote a "(quote a)"
 36 | run quote 63 "'63"
 37 | run quote '(+ 1 2)' "'(+ 1 2)"
 38 | 
 39 | run + 3 '(+ 1 2)'
 40 | run + -2 '(+ 1 -3)'
 41 | 
 42 | run 'unary -' -3 '(- 3)'
 43 | run '-' -2 '(- 3 5)'
 44 | run '-' -9 '(- 3 5 7)'
 45 | 
 46 | run '<' t '(< 2 3)'
 47 | run '<' '()' '(< 3 3)'
 48 | run '<' '()' '(< 4 3)'
 49 | 
 50 | run 'literal list' '(a b c)' "'(a b c)"
 51 | run 'literal list' '(a b . c)' "'(a b . c)"
 52 | 
 53 | # List manipulation
 54 | run cons "(a . b)" "(cons 'a 'b)"
 55 | run cons "(a b c)" "(cons 'a (cons 'b (cons 'c ())))"
 56 | 
 57 | run car a "(car '(a b c))"
 58 | run cdr "(b c)" "(cdr '(a b c))"
 59 | 
 60 | run setcar "(x . b)" "(define obj (cons 'a 'b)) (setcar obj 'x) obj"
 61 | 
 62 | # Comments
 63 | run comment 5 "
 64 |   ; 2
 65 |   5 ; 3"
 66 | 
 67 | # Global variables
 68 | run define 7 '(define x 7) x'
 69 | run define 10 '(define x 7) (+ x 3)'
 70 | run define 7 '(define + 7) +'
 71 | run setq 11 '(define x 7) (setq x 11) x'
 72 | run setq 17 '(setq + 17) +'
 73 | 
 74 | # Conditionals
 75 | run if a "(if 1 'a)"
 76 | run if '()' "(if () 'a)"
 77 | run if a "(if 1 'a 'b)"
 78 | run if a "(if 0 'a 'b)"
 79 | run if a "(if 'x 'a 'b)"
 80 | run if b "(if () 'a 'b)"
 81 | run if c "(if () 'a 'b 'c)"
 82 | 
 83 | # Numeric comparisons
 84 | run = t '(= 3 3)'
 85 | run = '()' '(= 3 2)'
 86 | 
 87 | # eq
 88 | run eq t "(eq 'foo 'foo)"
 89 | run eq t "(eq + +)"
 90 | run eq '()' "(eq 'foo 'bar)"
 91 | run eq '()' "(eq + 'bar)"
 92 | 
 93 | # gensym
 94 | run gensym G__0 '(gensym)'
 95 | run gensym '()' "(eq (gensym) 'G__0)"
 96 | run gensym '()' '(eq (gensym) (gensym))'
 97 | run gensym t '((lambda (x) (eq x x)) (gensym))'
 98 | 
 99 | # Functions
100 | run lambda '<function>' '(lambda (x) x)'
101 | run lambda t '((lambda () t))'
102 | run lambda 9 '((lambda (x) (+ x x x)) 3)'
103 | run defun 12 '(defun double (x) (+ x x)) (double 6)'
104 | 
105 | run args 15 '(defun f (x y z) (+ x y z)) (f 3 5 7)'
106 | 
107 | run restargs '(3 5 7)' '(defun f (x . y) (cons x y)) (f 3 5 7)'
108 | run restargs '(3)'    '(defun f (x . y) (cons x y)) (f 3)'
109 | 
110 | # Lexical closures
111 | run closure 3 '(defun call (f) ((lambda (var) (f)) 5))
112 |   ((lambda (var) (call (lambda () var))) 3)'
113 | 
114 | run counter 3 '
115 |   (define counter
116 |     ((lambda (val)
117 |        (lambda () (setq val (+ val 1)) val))
118 |      0))
119 |   (counter)
120 |   (counter)
121 |   (counter)'
122 | 
123 | # While loop
124 | run while 45 "
125 |   (define i 0)
126 |   (define sum 0)
127 |   (while (< i 10)
128 |     (setq sum (+ sum i))
129 |     (setq i (+ i 1)))
130 |   sum"
131 | 
132 | # Macros
133 | run macro 42 "
134 |   (defun list (x . y) (cons x y))
135 |   (defmacro if-zero (x then) (list 'if (list '= x 0) then))
136 |   (if-zero 0 42)"
137 | 
138 | run macro 7 '(defmacro seven () 7) ((lambda () (seven)))'
139 | 
140 | run macroexpand '(if (= x 0) (print x))' "
141 |   (defun list (x . y) (cons x y))
142 |   (defmacro if-zero (x then) (list 'if (list '= x 0) then))
143 |   (macroexpand (if-zero x (print x)))"
144 | 
145 | 
146 | # Sum from 0 to 10
147 | run recursion 55 '(defun f (x) (if (= x 0) 0 (+ (f (+ x -1)) x))) (f 10)'
148 | 


--------------------------------------------------------------------------------