├── .gitignore ├── Makefile ├── README.md ├── examples ├── life.lisp └── nqueens.lisp ├── minilisp.c └── test.sh /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | minilisp -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | CFLAGS=-std=gnu99 -g -O2 -Wall 2 | 3 | .PHONY: clean test 4 | 5 | minilisp: minilisp.c 6 | 7 | clean: 8 | rm -f minilisp *~ 9 | 10 | test: minilisp 11 | @./test.sh 12 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | MiniLisp 2 | ======== 3 | 4 | One day I wanted to see what I can do with 1k lines of C and 5 | decided to write a Lisp interpreter. That turned to be a 6 | fun weekend project, and the outcome is a mini lisp implementation 7 | that supports 8 | 9 | - integers, symbols, cons cells, 10 | - global variables, 11 | - lexically-scoped local variables, 12 | - closures, 13 | - _if_ conditional, 14 | - primitive functions, such as +, =, <, or _list_, 15 | - user-defined functions, 16 | - a macro system, 17 | - and a copying garbage collector. 18 | 19 | All those in 1000 lines of C. I didn't sacrifice readability for size. 20 | The code is in my opinion heavily commented to help the reader understand 21 | how all these features work. 22 | 23 | Compile 24 | ------- 25 | 26 | $ make 27 | 28 | MiniLisp has been tested on Linux x86/x86-64 and 64 bit Mac OS. The code is not 29 | very architecture dependent, so you should be able to compile and run on other 30 | Unix-like operating systems. 31 | 32 | Test 33 | ---- 34 | 35 | MiniLisp comes with a comprehensive test suite. In order to run the tests, give 36 | "test" argument to make. 37 | 38 | $ make test 39 | 40 | Language features 41 | ----------------- 42 | 43 | MiniLisp is a traditional Lisp interpreter. It reads one expression at a time 44 | from the standard input, evaluates it, and then prints out the return value of 45 | the expression. Here is an example of a valid input. 46 | 47 | (+ 1 2) 48 | 49 | The above expression prints "3". 50 | 51 | ### Literals 52 | 53 | MiniLisp supports integer literals, `()`, `t`, symbols, and list literals. 54 | 55 | * Integer literals are positive or negative integers. 56 | * `()` is the only false value. It also represents the empty list. 57 | * `t` is a predefined variable evaluated to itself. It's a preferred way to 58 | represent a true value, while any non-`()` value is considered to be true. 59 | * Symbols are objects with unique name. They are used to represent identifiers. 60 | Because MiniLisp does not have string type, symbols are sometimes used as a 61 | substitute for strings too. 62 | * List literals are cons cells. It's either a regular list whose last element's 63 | cdr is `()` or an dotted list ending with any non-`()` value. A dotted list is 64 | written as `(a . b)`. 65 | 66 | ### List operators 67 | 68 | `cons` takes two arguments and returns a new cons cell, making the first 69 | argument the car, and the second the cdr. 70 | 71 | (cons 'a 'b) ; -> (a . b) 72 | (cons 'a '(b)) ; -> (a b) 73 | 74 | `car` and `cdr` are accessors for cons cells. `car` returns the car, and `cdr` 75 | returns the cdr. 76 | 77 | (car '(a . b)) ; -> a 78 | (cdr '(a . b)) ; -> b 79 | 80 | `setcar` mutates a cons cell. `setcar` takes two arguments, assuming the first 81 | argument is a cons cell. It sets the second argument's value to the cons cell's 82 | car. 83 | 84 | (define cell (cons 'a 'b)) 85 | cell ; -> (a . b) 86 | (setcar cell 'x) 87 | cell ; -> (x . b) 88 | 89 | ### Numeric operators 90 | 91 | `+` returns the sum of the arguments. 92 | 93 | (+ 1) ; -> 1 94 | (+ 1 2) ; -> 3 95 | (+ 1 2 3) ; -> 6 96 | 97 | `-` negates the value of the argument if only one argument is given. 98 | 99 | (- 3) ; -> -3 100 | (- -5) ; -> 5 101 | 102 | If multiple arguments are given, `-` subtracts each argument from the first one. 103 | 104 | (- 5 2) ; -> 3 105 | (- 5 2 7) ; -> -4 106 | 107 | `=` takes two arguments and returns `t` if the two are the same integer. 108 | 109 | (= 11 11) ; -> t 110 | (= 11 6) ; -> () 111 | 112 | `<` takes two arguments and returns `t` if the first argument is smaller than 113 | the second. 114 | 115 | (< 2 3) ; -> t 116 | (< 3 3) ; -> () 117 | (< 4 3) ; -> () 118 | 119 | ### Conditionals 120 | 121 | `(if cond then else)` is the only conditional in the language. It first 122 | evaluates *cond*. If the result is a true value, *then* is evaluated. Otherwise 123 | *else* is evaluated. 124 | 125 | ### Loops 126 | 127 | `(while cond expr ...)` executes `expr ...` until `cond` is evaluated to 128 | `()`. This is the only loop supported by MiniLisp. 129 | 130 | If you are familiar with Scheme, you might be wondering if you could write a 131 | loop by tail recursion in MiniLisp. The answer is no. Tail calls consume stack 132 | space in MiniLisp, so a loop written as recursion will fail with the memory 133 | exhaustion error. 134 | 135 | ### Equivalence test operators 136 | 137 | `eq` takes two arguments and returns `t` if the objects are the same. What `eq` 138 | really does is a pointer comparison, so two objects happened to have the same 139 | contents but actually different are considered to not be the same by `eq`. 140 | 141 | ### Output operators 142 | 143 | `println` prints a given object to the standard output. 144 | 145 | (println 3) ; prints "3" 146 | (println '(hello world)) ; prints "(hello world)" 147 | 148 | ### Definitions 149 | 150 | MiniLisp supports variables and functions. They can be defined using `define`. 151 | 152 | (define a (+ 1 2)) 153 | (+ a a) ; -> 6 154 | 155 | There are two ways to define a function. One way is to use a special form 156 | `lambda`. `(lambda (args ...) expr ...)` returns a function object which 157 | you can assign to a variable using `define`. 158 | 159 | (define double (lambda (x) (+ x x))) 160 | (double 6) ; -> 12 161 | ((lambda (x) (+ x x)) 6) ; do the same thing without assignment 162 | 163 | The other way is `defun`. `(defun fn (args ...) expr ...)` is short for 164 | `(define fn (lambda (args ...) expr ...)`. 165 | 166 | ;; Define "double" using defun 167 | (defun double (x) (+ x x)) 168 | 169 | You can write a function that takes variable number of arguments. If the 170 | parameter list is a dotted list, the remaining arguments are bound to the last 171 | parameter as a list. 172 | 173 | (defun fn (expr . rest) rest) 174 | (fn 1) ; -> () 175 | (fn 1 2 3) ; -> (2 3) 176 | 177 | Variables are lexically scoped and have indefinite extent. References to "outer" 178 | variables remain valid even after the function that created the variables 179 | returns. 180 | 181 | ;; A countup function. We use lambda to introduce local variables because we 182 | ;; do not have "let" and the like. 183 | (define counter 184 | ((lambda (count) 185 | (lambda () 186 | (setq count (+ count 1)) 187 | count)) 188 | 0)) 189 | 190 | (counter) ; -> 1 191 | (counter) ; -> 2 192 | 193 | ;; This will not return 12345 but 3. Variable "count" in counter function 194 | ;; is resolved based on its lexical context rather than dynamic context. 195 | ((lambda (count) (counter)) 12345) ; -> 3 196 | 197 | `setq` sets a new value to an existing variable. It's an error if the variable 198 | is not defined. 199 | 200 | (define val (+ 3 5)) 201 | (setq val (+ val 1)) ; increment "val" 202 | 203 | ### Macros 204 | 205 | Macros look similar to functions, but they are different that macros take an 206 | expression as input and returns a new expression as output. `(defmacro 207 | macro-name (args ...) body ...)` defines a macro. Here is an example. 208 | 209 | (defmacro unless (condition expr) 210 | (list 'if condition () expr)) 211 | 212 | The above `defmacro` defines a new macro *unless*. *unless* is a new conditional 213 | which evaluates *expr* unless *condition* is a true value. You cannot do the 214 | same thing with a function because all the arguments would be evaluated before 215 | the control is passed to the function. 216 | 217 | (define x 0) 218 | (unless (= x 0) '(x is not 0)) ; -> () 219 | (unless (= x 1) '(x is not 1)) ; -> (x is not 1) 220 | 221 | `macroexpand` is a convenient special form to see the expanded form of a macro. 222 | 223 | (macroexpand (unless (= x 1) '(x is not 1))) 224 | ;; -> (if (= x 1) () (quote (x is not 1))) 225 | 226 | `gensym` creates a new symbol which will never be `eq` to any other symbol other 227 | than itself. Useful for writing a macro that introduces new identifiers. 228 | 229 | (gensym) ; -> a new symbol 230 | 231 | ### Comments 232 | 233 | As in the traditional Lisp syntax, `;` (semicolon) starts a single line comment. 234 | The comment continues to the end of line. 235 | 236 | No GC Branch 237 | ------------ 238 | 239 | There is a MiniLisp branch from which the code for garbage collection has been 240 | stripped. The accepted language is the same, but the code is simpler than the 241 | master branch's one. The reader might want to read the nogc branch first, then 242 | proceed to the master branch, to understand the code step by step. 243 | 244 | The nogc branch is available at 245 | [nogc](https://github.com/rui314/minilisp/tree/nogc). The original is available 246 | at [master](https://github.com/rui314/minilisp). 247 | -------------------------------------------------------------------------------- /examples/life.lisp: -------------------------------------------------------------------------------- 1 | ;;; 2 | ;;; Conway's game of life 3 | ;;; 4 | 5 | ;; (progn expr ...) 6 | ;; => ((lambda () expr ...)) 7 | (defmacro progn (expr . rest) 8 | (list (cons 'lambda (cons () (cons expr rest))))) 9 | 10 | (defun list (x . y) 11 | (cons x y)) 12 | 13 | (defun not (x) 14 | (if x () t)) 15 | 16 | ;; (let var val body ...) 17 | ;; => ((lambda (var) body ...) val) 18 | (defmacro let (var val . body) 19 | (cons (cons 'lambda (cons (list var) body)) 20 | (list val))) 21 | 22 | ;; (and e1 e2 ...) 23 | ;; => (if e1 (and e2 ...)) 24 | ;; (and e1) 25 | ;; => e1 26 | (defmacro and (expr . rest) 27 | (if rest 28 | (list 'if expr (cons 'and rest)) 29 | expr)) 30 | 31 | ;; (or e1 e2 ...) 32 | ;; => (let e1 33 | ;; (if (or e2 ...))) 34 | ;; (or e1) 35 | ;; => e1 36 | ;; 37 | ;; The reason to use the temporary variables is to avoid evaluating the 38 | ;; arguments more than once. 39 | (defmacro or (expr . rest) 40 | (if rest 41 | (let var (gensym) 42 | (list 'let var expr 43 | (list 'if var var (cons 'or rest)))) 44 | expr)) 45 | 46 | ;; (when expr body ...) 47 | ;; => (if expr (progn body ...)) 48 | (defmacro when (expr . body) 49 | (cons 'if (cons expr (list (cons 'progn body))))) 50 | 51 | ;; (unless expr body ...) 52 | ;; => (if expr () body ...) 53 | (defmacro unless (expr . body) 54 | (cons 'if (cons expr (cons () body)))) 55 | 56 | ;;; 57 | ;;; Numeric operators 58 | ;;; 59 | 60 | (defun <= (e1 e2) 61 | (or (< e1 e2) 62 | (= e1 e2))) 63 | 64 | ;;; 65 | ;;; List operators 66 | ;;; 67 | 68 | ;;; Applies each element of lis to fn, and returns their return values as a list. 69 | (defun map (lis fn) 70 | (when lis 71 | (cons (fn (car lis)) 72 | (map (cdr lis) fn)))) 73 | 74 | ;; Returns nth element of lis. 75 | (defun nth (lis n) 76 | (if (= n 0) 77 | (car lis) 78 | (nth (cdr lis) (- n 1)))) 79 | 80 | ;; Returns the nth tail of lis. 81 | (defun nth-tail (lis n) 82 | (if (= n 0) 83 | lis 84 | (nth-tail (cdr lis) (- n 1)))) 85 | 86 | ;; Returns a list consists of m .. n-1 integers. 87 | (defun %iota (m n) 88 | (unless (<= n m) 89 | (cons m (%iota (+ m 1) n)))) 90 | 91 | ;; Returns a list consists of 0 ... n-1 integers. 92 | (defun iota (n) 93 | (%iota 0 n)) 94 | 95 | ;;; 96 | ;;; Main 97 | ;;; 98 | 99 | (define width 10) 100 | (define height 10) 101 | 102 | ;; Returns location (x, y)'s element. 103 | (defun get (board x y) 104 | (nth (nth board y) x)) 105 | 106 | ;; Returns true if location (x, y)'s value is "@". 107 | (defun alive? (board x y) 108 | (and (<= 0 x) 109 | (< x height) 110 | (<= 0 y) 111 | (< y width) 112 | (eq (get board x y) '@))) 113 | 114 | ;; Print out the given board. 115 | (defun print (board) 116 | (if (not board) 117 | '$ 118 | (println (car board)) 119 | (print (cdr board)))) 120 | 121 | (defun count (board x y) 122 | (let at (lambda (x y) 123 | (if (alive? board x y) 1 0)) 124 | (+ (at (- x 1) (- y 1)) 125 | (at (- x 1) y) 126 | (at (- x 1) (+ y 1)) 127 | (at x (- y 1)) 128 | (at x (+ y 1)) 129 | (at (+ x 1) (- y 1)) 130 | (at (+ x 1) y) 131 | (at (+ x 1) (+ y 1))))) 132 | 133 | (defun next (board x y) 134 | (let c (count board x y) 135 | (if (alive? board x y) 136 | (or (= c 2) (= c 3)) 137 | (= c 3)))) 138 | 139 | (defun run (board) 140 | (while t 141 | (print board) 142 | (println '*) 143 | (let newboard (map (iota height) 144 | (lambda (y) 145 | (map (iota width) 146 | (lambda (x) 147 | (if (next board x y) '@ '_))))) 148 | (setq board newboard)))) 149 | 150 | (run '((_ _ _ _ _ _ _ _ _ _) 151 | (_ _ _ _ _ _ _ _ _ _) 152 | (_ _ _ _ _ _ _ _ _ _) 153 | (_ _ _ _ _ _ _ _ _ _) 154 | (_ _ _ _ _ _ _ _ _ _) 155 | (_ _ _ _ _ _ _ _ _ _) 156 | (_ _ _ _ _ _ _ _ _ _) 157 | (_ @ @ @ _ _ _ _ _ _) 158 | (_ _ _ @ _ _ _ _ _ _) 159 | (_ _ @ _ _ _ _ _ _ _))) 160 | -------------------------------------------------------------------------------- /examples/nqueens.lisp: -------------------------------------------------------------------------------- 1 | ;;; 2 | ;;; N-queens puzzle solver. 3 | ;;; 4 | ;;; The N queens puzzle is the problem of placing N chess queens on an N x N 5 | ;;; chessboard so that no two queens attack each 6 | ;;; other. http://en.wikipedia.org/wiki/Eight_queens_puzzle 7 | ;;; 8 | ;;; This program solves N-queens puzzle by depth-first backtracking. 9 | ;;; 10 | 11 | ;;; 12 | ;;; Basic macros 13 | ;;; 14 | ;;; Because the language does not have quasiquote, we need to construct an 15 | ;;; expanded form using cons and list. 16 | ;;; 17 | 18 | ;; (progn expr ...) 19 | ;; => ((lambda () expr ...)) 20 | (defmacro progn (expr . rest) 21 | (list (cons 'lambda (cons () (cons expr rest))))) 22 | 23 | (defun list (x . y) 24 | (cons x y)) 25 | 26 | (defun not (x) 27 | (if x () t)) 28 | 29 | ;; (let1 var val body ...) 30 | ;; => ((lambda (var) body ...) val) 31 | (defmacro let1 (var val . body) 32 | (cons (cons 'lambda (cons (list var) body)) 33 | (list val))) 34 | 35 | ;; (and e1 e2 ...) 36 | ;; => (if e1 (and e2 ...)) 37 | ;; (and e1) 38 | ;; => e1 39 | (defmacro and (expr . rest) 40 | (if rest 41 | (list 'if expr (cons 'and rest)) 42 | expr)) 43 | 44 | ;; (or e1 e2 ...) 45 | ;; => (let1 e1 46 | ;; (if (or e2 ...))) 47 | ;; (or e1) 48 | ;; => e1 49 | ;; 50 | ;; The reason to use the temporary variables is to avoid evaluating the 51 | ;; arguments more than once. 52 | (defmacro or (expr . rest) 53 | (if rest 54 | (let1 var (gensym) 55 | (list 'let1 var expr 56 | (list 'if var var (cons 'or rest)))) 57 | expr)) 58 | 59 | ;; (when expr body ...) 60 | ;; => (if expr (progn body ...)) 61 | (defmacro when (expr . body) 62 | (cons 'if (cons expr (list (cons 'progn body))))) 63 | 64 | ;; (unless expr body ...) 65 | ;; => (if expr () body ...) 66 | (defmacro unless (expr . body) 67 | (cons 'if (cons expr (cons () body)))) 68 | 69 | ;;; 70 | ;;; Numeric operators 71 | ;;; 72 | 73 | (defun <= (e1 e2) 74 | (or (< e1 e2) 75 | (= e1 e2))) 76 | 77 | ;;; 78 | ;;; List operators 79 | ;;; 80 | 81 | ;; Applies each element of lis to pred. If pred returns a true value, terminate 82 | ;; the evaluation and returns pred's return value. If all of them return (), 83 | ;; returns (). 84 | (defun any (lis pred) 85 | (when lis 86 | (or (pred (car lis)) 87 | (any (cdr lis) pred)))) 88 | 89 | ;;; Applies each element of lis to fn, and returns their return values as a list. 90 | (defun map (lis fn) 91 | (when lis 92 | (cons (fn (car lis)) 93 | (map (cdr lis) fn)))) 94 | 95 | ;; Returns nth element of lis. 96 | (defun nth (lis n) 97 | (if (= n 0) 98 | (car lis) 99 | (nth (cdr lis) (- n 1)))) 100 | 101 | ;; Returns the nth tail of lis. 102 | (defun nth-tail (lis n) 103 | (if (= n 0) 104 | lis 105 | (nth-tail (cdr lis) (- n 1)))) 106 | 107 | ;; Returns a list consists of m .. n-1 integers. 108 | (defun %iota (m n) 109 | (unless (<= n m) 110 | (cons m (%iota (+ m 1) n)))) 111 | 112 | ;; Returns a list consists of 0 ... n-1 integers. 113 | (defun iota (n) 114 | (%iota 0 n)) 115 | 116 | ;; Returns a new list whose length is len and all members are init. 117 | (defun make-list (len init) 118 | (unless (= len 0) 119 | (cons init (make-list (- len 1) init)))) 120 | 121 | ;; Applies fn to each element of lis. 122 | (defun for-each (lis fn) 123 | (or (not lis) 124 | (progn (fn (car lis)) 125 | (for-each (cdr lis) fn)))) 126 | 127 | ;;; 128 | ;;; N-queens solver 129 | ;;; 130 | 131 | ;; Creates size x size list filled with symbol "x". 132 | (defun make-board (size) 133 | (map (iota size) 134 | (lambda (_) 135 | (make-list size 'x)))) 136 | 137 | ;; Returns location (x, y)'s element. 138 | (defun get (board x y) 139 | (nth (nth board x) y)) 140 | 141 | ;; Set symbol "@" to location (x, y). 142 | (defun set (board x y) 143 | (setcar (nth-tail (nth board x) y) '@)) 144 | 145 | ;; Set symbol "x" to location (x, y). 146 | (defun clear (board x y) 147 | (setcar (nth-tail (nth board x) y) 'x)) 148 | 149 | ;; Returns true if location (x, y)'s value is "@". 150 | (defun set? (board x y) 151 | (eq (get board x y) '@)) 152 | 153 | ;; Print out the given board. 154 | (defun print (board) 155 | (if (not board) 156 | '$ 157 | (println (car board)) 158 | (print (cdr board)))) 159 | 160 | ;; Returns true if we cannot place a queen at position (x, y), assuming that 161 | ;; queens have already been placed on each row from 0 to x-1. 162 | (defun conflict? (board x y) 163 | (any (iota x) 164 | (lambda (n) 165 | (or 166 | ;; Check if there's no conflicting queen upward 167 | (set? board n y) 168 | ;; Upper left 169 | (let1 z (+ y (- n x)) 170 | (and (<= 0 z) 171 | (set? board n z))) 172 | ;; Upper right 173 | (let1 z (+ y (- x n)) 174 | (and (< z board-size) 175 | (set? board n z))))))) 176 | 177 | ;; Find positions where we can place queens at row x, and continue searching for 178 | ;; the next row. 179 | (defun %solve (board x) 180 | (if (= x board-size) 181 | ;; Problem solved 182 | (progn (print board) 183 | (println '$)) 184 | (for-each (iota board-size) 185 | (lambda (y) 186 | (unless (conflict? board x y) 187 | (set board x y) 188 | (%solve board (+ x 1)) 189 | (clear board x y)))))) 190 | 191 | (defun solve (board) 192 | (println 'start) 193 | (%solve board 0) 194 | (println 'done)) 195 | 196 | ;;; 197 | ;;; Main 198 | ;;; 199 | 200 | (define board-size 8) 201 | (define board (make-board board-size)) 202 | (solve board) 203 | -------------------------------------------------------------------------------- /minilisp.c: -------------------------------------------------------------------------------- 1 | // This software is in the public domain. 2 | 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | static __attribute((noreturn)) void error(char *fmt, ...) { 15 | va_list ap; 16 | va_start(ap, fmt); 17 | vfprintf(stderr, fmt, ap); 18 | fprintf(stderr, "\n"); 19 | va_end(ap); 20 | exit(1); 21 | } 22 | 23 | //====================================================================== 24 | // Lisp objects 25 | //====================================================================== 26 | 27 | // The Lisp object type 28 | enum { 29 | // Regular objects visible from the user 30 | TINT = 1, 31 | TCELL, 32 | TSYMBOL, 33 | TPRIMITIVE, 34 | TFUNCTION, 35 | TMACRO, 36 | TENV, 37 | // The marker that indicates the object has been moved to other location by GC. The new location 38 | // can be found at the forwarding pointer. Only the functions to do garbage collection set and 39 | // handle the object of this type. Other functions will never see the object of this type. 40 | TMOVED, 41 | // Const objects. They are statically allocated and will never be managed by GC. 42 | TTRUE, 43 | TNIL, 44 | TDOT, 45 | TCPAREN, 46 | }; 47 | 48 | // Typedef for the primitive function 49 | struct Obj; 50 | typedef struct Obj *Primitive(void *root, struct Obj **env, struct Obj **args); 51 | 52 | // The object type 53 | typedef struct Obj { 54 | // The first word of the object represents the type of the object. Any code that handles object 55 | // needs to check its type first, then access the following union members. 56 | int type; 57 | 58 | // The total size of the object, including "type" field, this field, the contents, and the 59 | // padding at the end of the object. 60 | int size; 61 | 62 | // Object values. 63 | union { 64 | // Int 65 | int value; 66 | // Cell 67 | struct { 68 | struct Obj *car; 69 | struct Obj *cdr; 70 | }; 71 | // Symbol 72 | char name[1]; 73 | // Primitive 74 | Primitive *fn; 75 | // Function or Macro 76 | struct { 77 | struct Obj *params; 78 | struct Obj *body; 79 | struct Obj *env; 80 | }; 81 | // Environment frame. This is a linked list of association lists 82 | // containing the mapping from symbols to their value. 83 | struct { 84 | struct Obj *vars; 85 | struct Obj *up; 86 | }; 87 | // Forwarding pointer 88 | void *moved; 89 | }; 90 | } Obj; 91 | 92 | // Constants 93 | static Obj *True = &(Obj){ TTRUE }; 94 | static Obj *Nil = &(Obj){ TNIL }; 95 | static Obj *Dot = &(Obj){ TDOT }; 96 | static Obj *Cparen = &(Obj){ TCPAREN }; 97 | 98 | // The list containing all symbols. Such data structure is traditionally called the "obarray", but I 99 | // avoid using it as a variable name as this is not an array but a list. 100 | static Obj *Symbols; 101 | 102 | //====================================================================== 103 | // Memory management 104 | //====================================================================== 105 | 106 | // The size of the heap in byte 107 | #define MEMORY_SIZE 65536 108 | 109 | // The pointer pointing to the beginning of the current heap 110 | static void *memory; 111 | 112 | // The pointer pointing to the beginning of the old heap 113 | static void *from_space; 114 | 115 | // The number of bytes allocated from the heap 116 | static size_t mem_nused = 0; 117 | 118 | // Flags to debug GC 119 | static bool gc_running = false; 120 | static bool debug_gc = false; 121 | static bool always_gc = false; 122 | 123 | static void gc(void *root); 124 | 125 | // Currently we are using Cheney's copying GC algorithm, with which the available memory is split 126 | // into two halves and all objects are moved from one half to another every time GC is invoked. That 127 | // means the address of the object keeps changing. If you take the address of an object and keep it 128 | // in a C variable, dereferencing it could cause SEGV because the address becomes invalid after GC 129 | // runs. 130 | // 131 | // In order to deal with that, all access from C to Lisp objects will go through two levels of 132 | // pointer dereferences. The C local variable is pointing to a pointer on the C stack, and the 133 | // pointer is pointing to the Lisp object. GC is aware of the pointers in the stack and updates 134 | // their contents with the objects' new addresses when GC happens. 135 | // 136 | // The following is a macro to reserve the area in the C stack for the pointers. The contents of 137 | // this area are considered to be GC root. 138 | // 139 | // Be careful not to bypass the two levels of pointer indirections. If you create a direct pointer 140 | // to an object, it'll cause a subtle bug. Such code would work in most cases but fails with SEGV if 141 | // GC happens during the execution of the code. Any code that allocates memory may invoke GC. 142 | 143 | #define ROOT_END ((void *)-1) 144 | 145 | #define ADD_ROOT(size) \ 146 | void *root_ADD_ROOT_[size + 2]; \ 147 | root_ADD_ROOT_[0] = root; \ 148 | for (int i = 1; i <= size; i++) \ 149 | root_ADD_ROOT_[i] = NULL; \ 150 | root_ADD_ROOT_[size + 1] = ROOT_END; \ 151 | root = root_ADD_ROOT_ 152 | 153 | #define DEFINE1(var1) \ 154 | ADD_ROOT(1); \ 155 | Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1) 156 | 157 | #define DEFINE2(var1, var2) \ 158 | ADD_ROOT(2); \ 159 | Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1); \ 160 | Obj **var2 = (Obj **)(root_ADD_ROOT_ + 2) 161 | 162 | #define DEFINE3(var1, var2, var3) \ 163 | ADD_ROOT(3); \ 164 | Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1); \ 165 | Obj **var2 = (Obj **)(root_ADD_ROOT_ + 2); \ 166 | Obj **var3 = (Obj **)(root_ADD_ROOT_ + 3) 167 | 168 | #define DEFINE4(var1, var2, var3, var4) \ 169 | ADD_ROOT(4); \ 170 | Obj **var1 = (Obj **)(root_ADD_ROOT_ + 1); \ 171 | Obj **var2 = (Obj **)(root_ADD_ROOT_ + 2); \ 172 | Obj **var3 = (Obj **)(root_ADD_ROOT_ + 3); \ 173 | Obj **var4 = (Obj **)(root_ADD_ROOT_ + 4) 174 | 175 | // Round up the given value to a multiple of size. Size must be a power of 2. It adds size - 1 176 | // first, then zero-ing the least significant bits to make the result a multiple of size. I know 177 | // these bit operations may look a little bit tricky, but it's efficient and thus frequently used. 178 | static inline size_t roundup(size_t var, size_t size) { 179 | return (var + size - 1) & ~(size - 1); 180 | } 181 | 182 | // Allocates memory block. This may start GC if we don't have enough memory. 183 | static Obj *alloc(void *root, int type, size_t size) { 184 | // The object must be large enough to contain a pointer for the forwarding pointer. Make it 185 | // larger if it's smaller than that. 186 | size = roundup(size, sizeof(void *)); 187 | 188 | // Add the size of the type tag and size fields. 189 | size += offsetof(Obj, value); 190 | 191 | // Round up the object size to the nearest alignment boundary, so that the next object will be 192 | // allocated at the proper alignment boundary. Currently we align the object at the same 193 | // boundary as the pointer. 194 | size = roundup(size, sizeof(void *)); 195 | 196 | // If the debug flag is on, allocate a new memory space to force all the existing objects to 197 | // move to new addresses, to invalidate the old addresses. By doing this the GC behavior becomes 198 | // more predictable and repeatable. If there's a memory bug that the C variable has a direct 199 | // reference to a Lisp object, the pointer will become invalid by this GC call. Dereferencing 200 | // that will immediately cause SEGV. 201 | if (always_gc && !gc_running) 202 | gc(root); 203 | 204 | // Otherwise, run GC only when the available memory is not large enough. 205 | if (!always_gc && MEMORY_SIZE < mem_nused + size) 206 | gc(root); 207 | 208 | // Terminate the program if we couldn't satisfy the memory request. This can happen if the 209 | // requested size was too large or the from-space was filled with too many live objects. 210 | if (MEMORY_SIZE < mem_nused + size) 211 | error("Memory exhausted"); 212 | 213 | // Allocate the object. 214 | Obj *obj = memory + mem_nused; 215 | obj->type = type; 216 | obj->size = size; 217 | mem_nused += size; 218 | return obj; 219 | } 220 | 221 | //====================================================================== 222 | // Garbage collector 223 | //====================================================================== 224 | 225 | // Cheney's algorithm uses two pointers to keep track of GC status. At first both pointers point to 226 | // the beginning of the to-space. As GC progresses, they are moved towards the end of the 227 | // to-space. The objects before "scan1" are the objects that are fully copied. The objects between 228 | // "scan1" and "scan2" have already been copied, but may contain pointers to the from-space. "scan2" 229 | // points to the beginning of the free space. 230 | static Obj *scan1; 231 | static Obj *scan2; 232 | 233 | // Moves one object from the from-space to the to-space. Returns the object's new address. If the 234 | // object has already been moved, does nothing but just returns the new address. 235 | static inline Obj *forward(Obj *obj) { 236 | // If the object's address is not in the from-space, the object is not managed by GC nor it 237 | // has already been moved to the to-space. 238 | ptrdiff_t offset = (uint8_t *)obj - (uint8_t *)from_space; 239 | if (offset < 0 || MEMORY_SIZE <= offset) 240 | return obj; 241 | 242 | // The pointer is pointing to the from-space, but the object there was a tombstone. Follow the 243 | // forwarding pointer to find the new location of the object. 244 | if (obj->type == TMOVED) 245 | return obj->moved; 246 | 247 | // Otherwise, the object has not been moved yet. Move it. 248 | Obj *newloc = scan2; 249 | memcpy(newloc, obj, obj->size); 250 | scan2 = (Obj *)((uint8_t *)scan2 + obj->size); 251 | 252 | // Put a tombstone at the location where the object used to occupy, so that the following call 253 | // of forward() can find the object's new location. 254 | obj->type = TMOVED; 255 | obj->moved = newloc; 256 | return newloc; 257 | } 258 | 259 | static void *alloc_semispace() { 260 | return mmap(NULL, MEMORY_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0); 261 | } 262 | 263 | // Copies the root objects. 264 | static void forward_root_objects(void *root) { 265 | Symbols = forward(Symbols); 266 | for (void **frame = root; frame; frame = *(void ***)frame) 267 | for (int i = 1; frame[i] != ROOT_END; i++) 268 | if (frame[i]) 269 | frame[i] = forward(frame[i]); 270 | } 271 | 272 | // Implements Cheney's copying garbage collection algorithm. 273 | // http://en.wikipedia.org/wiki/Cheney%27s_algorithm 274 | static void gc(void *root) { 275 | assert(!gc_running); 276 | gc_running = true; 277 | 278 | // Allocate a new semi-space. 279 | from_space = memory; 280 | memory = alloc_semispace(); 281 | 282 | // Initialize the two pointers for GC. Initially they point to the beginning of the to-space. 283 | scan1 = scan2 = memory; 284 | 285 | // Copy the GC root objects first. This moves the pointer scan2. 286 | forward_root_objects(root); 287 | 288 | // Copy the objects referenced by the GC root objects located between scan1 and scan2. Once it's 289 | // finished, all live objects (i.e. objects reachable from the root) will have been copied to 290 | // the to-space. 291 | while (scan1 < scan2) { 292 | switch (scan1->type) { 293 | case TINT: 294 | case TSYMBOL: 295 | case TPRIMITIVE: 296 | // Any of the above types does not contain a pointer to a GC-managed object. 297 | break; 298 | case TCELL: 299 | scan1->car = forward(scan1->car); 300 | scan1->cdr = forward(scan1->cdr); 301 | break; 302 | case TFUNCTION: 303 | case TMACRO: 304 | scan1->params = forward(scan1->params); 305 | scan1->body = forward(scan1->body); 306 | scan1->env = forward(scan1->env); 307 | break; 308 | case TENV: 309 | scan1->vars = forward(scan1->vars); 310 | scan1->up = forward(scan1->up); 311 | break; 312 | default: 313 | error("Bug: copy: unknown type %d", scan1->type); 314 | } 315 | scan1 = (Obj *)((uint8_t *)scan1 + scan1->size); 316 | } 317 | 318 | // Finish up GC. 319 | munmap(from_space, MEMORY_SIZE); 320 | size_t old_nused = mem_nused; 321 | mem_nused = (size_t)((uint8_t *)scan1 - (uint8_t *)memory); 322 | if (debug_gc) 323 | fprintf(stderr, "GC: %zu bytes out of %zu bytes copied.\n", mem_nused, old_nused); 324 | gc_running = false; 325 | } 326 | 327 | //====================================================================== 328 | // Constructors 329 | //====================================================================== 330 | 331 | static Obj *make_int(void *root, int value) { 332 | Obj *r = alloc(root, TINT, sizeof(int)); 333 | r->value = value; 334 | return r; 335 | } 336 | 337 | static Obj *cons(void *root, Obj **car, Obj **cdr) { 338 | Obj *cell = alloc(root, TCELL, sizeof(Obj *) * 2); 339 | cell->car = *car; 340 | cell->cdr = *cdr; 341 | return cell; 342 | } 343 | 344 | static Obj *make_symbol(void *root, char *name) { 345 | Obj *sym = alloc(root, TSYMBOL, strlen(name) + 1); 346 | strcpy(sym->name, name); 347 | return sym; 348 | } 349 | 350 | static Obj *make_primitive(void *root, Primitive *fn) { 351 | Obj *r = alloc(root, TPRIMITIVE, sizeof(Primitive *)); 352 | r->fn = fn; 353 | return r; 354 | } 355 | 356 | static Obj *make_function(void *root, Obj **env, int type, Obj **params, Obj **body) { 357 | assert(type == TFUNCTION || type == TMACRO); 358 | Obj *r = alloc(root, type, sizeof(Obj *) * 3); 359 | r->params = *params; 360 | r->body = *body; 361 | r->env = *env; 362 | return r; 363 | } 364 | 365 | struct Obj *make_env(void *root, Obj **vars, Obj **up) { 366 | Obj *r = alloc(root, TENV, sizeof(Obj *) * 2); 367 | r->vars = *vars; 368 | r->up = *up; 369 | return r; 370 | } 371 | 372 | // Returns ((x . y) . a) 373 | static Obj *acons(void *root, Obj **x, Obj **y, Obj **a) { 374 | DEFINE1(cell); 375 | *cell = cons(root, x, y); 376 | return cons(root, cell, a); 377 | } 378 | 379 | //====================================================================== 380 | // Parser 381 | // 382 | // This is a hand-written recursive-descendent parser. 383 | //====================================================================== 384 | 385 | #define SYMBOL_MAX_LEN 200 386 | const char symbol_chars[] = "~!@#$%^&*-_=+:/?<>"; 387 | 388 | static Obj *read_expr(void *root); 389 | 390 | static int peek(void) { 391 | int c = getchar(); 392 | ungetc(c, stdin); 393 | return c; 394 | } 395 | 396 | // Destructively reverses the given list. 397 | static Obj *reverse(Obj *p) { 398 | Obj *ret = Nil; 399 | while (p != Nil) { 400 | Obj *head = p; 401 | p = p->cdr; 402 | head->cdr = ret; 403 | ret = head; 404 | } 405 | return ret; 406 | } 407 | 408 | // Skips the input until newline is found. Newline is one of \r, \r\n or \n. 409 | static void skip_line(void) { 410 | for (;;) { 411 | int c = getchar(); 412 | if (c == EOF || c == '\n') 413 | return; 414 | if (c == '\r') { 415 | if (peek() == '\n') 416 | getchar(); 417 | return; 418 | } 419 | } 420 | } 421 | 422 | // Reads a list. Note that '(' has already been read. 423 | static Obj *read_list(void *root) { 424 | DEFINE3(obj, head, last); 425 | *head = Nil; 426 | for (;;) { 427 | *obj = read_expr(root); 428 | if (!*obj) 429 | error("Unclosed parenthesis"); 430 | if (*obj == Cparen) 431 | return reverse(*head); 432 | if (*obj == Dot) { 433 | *last = read_expr(root); 434 | if (read_expr(root) != Cparen) 435 | error("Closed parenthesis expected after dot"); 436 | Obj *ret = reverse(*head); 437 | (*head)->cdr = *last; 438 | return ret; 439 | } 440 | *head = cons(root, obj, head); 441 | } 442 | } 443 | 444 | // May create a new symbol. If there's a symbol with the same name, it will not create a new symbol 445 | // but return the existing one. 446 | static Obj *intern(void *root, char *name) { 447 | for (Obj *p = Symbols; p != Nil; p = p->cdr) 448 | if (strcmp(name, p->car->name) == 0) 449 | return p->car; 450 | DEFINE1(sym); 451 | *sym = make_symbol(root, name); 452 | Symbols = cons(root, sym, &Symbols); 453 | return *sym; 454 | } 455 | 456 | // Reader marcro ' (single quote). It reads an expression and returns (quote ). 457 | static Obj *read_quote(void *root) { 458 | DEFINE2(sym, tmp); 459 | *sym = intern(root, "quote"); 460 | *tmp = read_expr(root); 461 | *tmp = cons(root, tmp, &Nil); 462 | *tmp = cons(root, sym, tmp); 463 | return *tmp; 464 | } 465 | 466 | static int read_number(int val) { 467 | while (isdigit(peek())) 468 | val = val * 10 + (getchar() - '0'); 469 | return val; 470 | } 471 | 472 | static Obj *read_symbol(void *root, char c) { 473 | char buf[SYMBOL_MAX_LEN + 1]; 474 | buf[0] = c; 475 | int len = 1; 476 | while (isalnum(peek()) || strchr(symbol_chars, peek())) { 477 | if (SYMBOL_MAX_LEN <= len) 478 | error("Symbol name too long"); 479 | buf[len++] = getchar(); 480 | } 481 | buf[len] = '\0'; 482 | return intern(root, buf); 483 | } 484 | 485 | static Obj *read_expr(void *root) { 486 | for (;;) { 487 | int c = getchar(); 488 | if (c == ' ' || c == '\n' || c == '\r' || c == '\t') 489 | continue; 490 | if (c == EOF) 491 | return NULL; 492 | if (c == ';') { 493 | skip_line(); 494 | continue; 495 | } 496 | if (c == '(') 497 | return read_list(root); 498 | if (c == ')') 499 | return Cparen; 500 | if (c == '.') 501 | return Dot; 502 | if (c == '\'') 503 | return read_quote(root); 504 | if (isdigit(c)) 505 | return make_int(root, read_number(c - '0')); 506 | if (c == '-' && isdigit(peek())) 507 | return make_int(root, -read_number(0)); 508 | if (isalpha(c) || strchr(symbol_chars, c)) 509 | return read_symbol(root, c); 510 | error("Don't know how to handle %c", c); 511 | } 512 | } 513 | 514 | // Prints the given object. 515 | static void print(Obj *obj) { 516 | switch (obj->type) { 517 | case TCELL: 518 | printf("("); 519 | for (;;) { 520 | print(obj->car); 521 | if (obj->cdr == Nil) 522 | break; 523 | if (obj->cdr->type != TCELL) { 524 | printf(" . "); 525 | print(obj->cdr); 526 | break; 527 | } 528 | printf(" "); 529 | obj = obj->cdr; 530 | } 531 | printf(")"); 532 | return; 533 | 534 | #define CASE(type, ...) \ 535 | case type: \ 536 | printf(__VA_ARGS__); \ 537 | return 538 | CASE(TINT, "%d", obj->value); 539 | CASE(TSYMBOL, "%s", obj->name); 540 | CASE(TPRIMITIVE, ""); 541 | CASE(TFUNCTION, ""); 542 | CASE(TMACRO, ""); 543 | CASE(TMOVED, ""); 544 | CASE(TTRUE, "t"); 545 | CASE(TNIL, "()"); 546 | #undef CASE 547 | default: 548 | error("Bug: print: Unknown tag type: %d", obj->type); 549 | } 550 | } 551 | 552 | // Returns the length of the given list. -1 if it's not a proper list. 553 | static int length(Obj *list) { 554 | int len = 0; 555 | for (; list->type == TCELL; list = list->cdr) 556 | len++; 557 | return list == Nil ? len : -1; 558 | } 559 | 560 | //====================================================================== 561 | // Evaluator 562 | //====================================================================== 563 | 564 | static Obj *eval(void *root, Obj **env, Obj **obj); 565 | 566 | static void add_variable(void *root, Obj **env, Obj **sym, Obj **val) { 567 | DEFINE2(vars, tmp); 568 | *vars = (*env)->vars; 569 | *tmp = acons(root, sym, val, vars); 570 | (*env)->vars = *tmp; 571 | } 572 | 573 | // Returns a newly created environment frame. 574 | static Obj *push_env(void *root, Obj **env, Obj **vars, Obj **vals) { 575 | DEFINE3(map, sym, val); 576 | *map = Nil; 577 | for (; (*vars)->type == TCELL; *vars = (*vars)->cdr, *vals = (*vals)->cdr) { 578 | if ((*vals)->type != TCELL) 579 | error("Cannot apply function: number of argument does not match"); 580 | *sym = (*vars)->car; 581 | *val = (*vals)->car; 582 | *map = acons(root, sym, val, map); 583 | } 584 | if (*vars != Nil) 585 | *map = acons(root, vars, vals, map); 586 | return make_env(root, map, env); 587 | } 588 | 589 | // Evaluates the list elements from head and returns the last return value. 590 | static Obj *progn(void *root, Obj **env, Obj **list) { 591 | DEFINE2(lp, r); 592 | for (*lp = *list; *lp != Nil; *lp = (*lp)->cdr) { 593 | *r = (*lp)->car; 594 | *r = eval(root, env, r); 595 | } 596 | return *r; 597 | } 598 | 599 | // Evaluates all the list elements and returns their return values as a new list. 600 | static Obj *eval_list(void *root, Obj **env, Obj **list) { 601 | DEFINE4(head, lp, expr, result); 602 | *head = Nil; 603 | for (lp = list; *lp != Nil; *lp = (*lp)->cdr) { 604 | *expr = (*lp)->car; 605 | *result = eval(root, env, expr); 606 | *head = cons(root, result, head); 607 | } 608 | return reverse(*head); 609 | } 610 | 611 | static bool is_list(Obj *obj) { 612 | return obj == Nil || obj->type == TCELL; 613 | } 614 | 615 | static Obj *apply_func(void *root, Obj **env, Obj **fn, Obj **args) { 616 | DEFINE3(params, newenv, body); 617 | *params = (*fn)->params; 618 | *newenv = (*fn)->env; 619 | *newenv = push_env(root, newenv, params, args); 620 | *body = (*fn)->body; 621 | return progn(root, newenv, body); 622 | } 623 | 624 | // Apply fn with args. 625 | static Obj *apply(void *root, Obj **env, Obj **fn, Obj **args) { 626 | if (!is_list(*args)) 627 | error("argument must be a list"); 628 | if ((*fn)->type == TPRIMITIVE) 629 | return (*fn)->fn(root, env, args); 630 | if ((*fn)->type == TFUNCTION) { 631 | DEFINE1(eargs); 632 | *eargs = eval_list(root, env, args); 633 | return apply_func(root, env, fn, eargs); 634 | } 635 | error("not supported"); 636 | } 637 | 638 | // Searches for a variable by symbol. Returns null if not found. 639 | static Obj *find(Obj **env, Obj *sym) { 640 | for (Obj *p = *env; p != Nil; p = p->up) { 641 | for (Obj *cell = p->vars; cell != Nil; cell = cell->cdr) { 642 | Obj *bind = cell->car; 643 | if (sym == bind->car) 644 | return bind; 645 | } 646 | } 647 | return NULL; 648 | } 649 | 650 | // Expands the given macro application form. 651 | static Obj *macroexpand(void *root, Obj **env, Obj **obj) { 652 | if ((*obj)->type != TCELL || (*obj)->car->type != TSYMBOL) 653 | return *obj; 654 | DEFINE3(bind, macro, args); 655 | *bind = find(env, (*obj)->car); 656 | if (!*bind || (*bind)->cdr->type != TMACRO) 657 | return *obj; 658 | *macro = (*bind)->cdr; 659 | *args = (*obj)->cdr; 660 | return apply_func(root, env, macro, args); 661 | } 662 | 663 | // Evaluates the S expression. 664 | static Obj *eval(void *root, Obj **env, Obj **obj) { 665 | switch ((*obj)->type) { 666 | case TINT: 667 | case TPRIMITIVE: 668 | case TFUNCTION: 669 | case TTRUE: 670 | case TNIL: 671 | // Self-evaluating objects 672 | return *obj; 673 | case TSYMBOL: { 674 | // Variable 675 | Obj *bind = find(env, *obj); 676 | if (!bind) 677 | error("Undefined symbol: %s", (*obj)->name); 678 | return bind->cdr; 679 | } 680 | case TCELL: { 681 | // Function application form 682 | DEFINE3(fn, expanded, args); 683 | *expanded = macroexpand(root, env, obj); 684 | if (*expanded != *obj) 685 | return eval(root, env, expanded); 686 | *fn = (*obj)->car; 687 | *fn = eval(root, env, fn); 688 | *args = (*obj)->cdr; 689 | if ((*fn)->type != TPRIMITIVE && (*fn)->type != TFUNCTION) 690 | error("The head of a list must be a function"); 691 | return apply(root, env, fn, args); 692 | } 693 | default: 694 | error("Bug: eval: Unknown tag type: %d", (*obj)->type); 695 | } 696 | } 697 | 698 | //====================================================================== 699 | // Primitive functions and special forms 700 | //====================================================================== 701 | 702 | // 'expr 703 | static Obj *prim_quote(void *root, Obj **env, Obj **list) { 704 | if (length(*list) != 1) 705 | error("Malformed quote"); 706 | return (*list)->car; 707 | } 708 | 709 | // (cons expr expr) 710 | static Obj *prim_cons(void *root, Obj **env, Obj **list) { 711 | if (length(*list) != 2) 712 | error("Malformed cons"); 713 | Obj *cell = eval_list(root, env, list); 714 | cell->cdr = cell->cdr->car; 715 | return cell; 716 | } 717 | 718 | // (car ) 719 | static Obj *prim_car(void *root, Obj **env, Obj **list) { 720 | Obj *args = eval_list(root, env, list); 721 | if (args->car->type != TCELL || args->cdr != Nil) 722 | error("Malformed car"); 723 | return args->car->car; 724 | } 725 | 726 | // (cdr ) 727 | static Obj *prim_cdr(void *root, Obj **env, Obj **list) { 728 | Obj *args = eval_list(root, env, list); 729 | if (args->car->type != TCELL || args->cdr != Nil) 730 | error("Malformed cdr"); 731 | return args->car->cdr; 732 | } 733 | 734 | // (setq expr) 735 | static Obj *prim_setq(void *root, Obj **env, Obj **list) { 736 | if (length(*list) != 2 || (*list)->car->type != TSYMBOL) 737 | error("Malformed setq"); 738 | DEFINE2(bind, value); 739 | *bind = find(env, (*list)->car); 740 | if (!*bind) 741 | error("Unbound variable %s", (*list)->car->name); 742 | *value = (*list)->cdr->car; 743 | *value = eval(root, env, value); 744 | (*bind)->cdr = *value; 745 | return *value; 746 | } 747 | 748 | // (setcar expr) 749 | static Obj *prim_setcar(void *root, Obj **env, Obj **list) { 750 | DEFINE1(args); 751 | *args = eval_list(root, env, list); 752 | if (length(*args) != 2 || (*args)->car->type != TCELL) 753 | error("Malformed setcar"); 754 | (*args)->car->car = (*args)->cdr->car; 755 | return (*args)->car; 756 | } 757 | 758 | // (while cond expr ...) 759 | static Obj *prim_while(void *root, Obj **env, Obj **list) { 760 | if (length(*list) < 2) 761 | error("Malformed while"); 762 | DEFINE2(cond, exprs); 763 | *cond = (*list)->car; 764 | while (eval(root, env, cond) != Nil) { 765 | *exprs = (*list)->cdr; 766 | eval_list(root, env, exprs); 767 | } 768 | return Nil; 769 | } 770 | 771 | // (gensym) 772 | static Obj *prim_gensym(void *root, Obj **env, Obj **list) { 773 | static int count = 0; 774 | char buf[10]; 775 | snprintf(buf, sizeof(buf), "G__%d", count++); 776 | return make_symbol(root, buf); 777 | } 778 | 779 | // (+ ...) 780 | static Obj *prim_plus(void *root, Obj **env, Obj **list) { 781 | int sum = 0; 782 | for (Obj *args = eval_list(root, env, list); args != Nil; args = args->cdr) { 783 | if (args->car->type != TINT) 784 | error("+ takes only numbers"); 785 | sum += args->car->value; 786 | } 787 | return make_int(root, sum); 788 | } 789 | 790 | // (- ...) 791 | static Obj *prim_minus(void *root, Obj **env, Obj **list) { 792 | Obj *args = eval_list(root, env, list); 793 | for (Obj *p = args; p != Nil; p = p->cdr) 794 | if (p->car->type != TINT) 795 | error("- takes only numbers"); 796 | if (args->cdr == Nil) 797 | return make_int(root, -args->car->value); 798 | int r = args->car->value; 799 | for (Obj *p = args->cdr; p != Nil; p = p->cdr) 800 | r -= p->car->value; 801 | return make_int(root, r); 802 | } 803 | 804 | // (< ) 805 | static Obj *prim_lt(void *root, Obj **env, Obj **list) { 806 | Obj *args = eval_list(root, env, list); 807 | if (length(args) != 2) 808 | error("malformed <"); 809 | Obj *x = args->car; 810 | Obj *y = args->cdr->car; 811 | if (x->type != TINT || y->type != TINT) 812 | error("< takes only numbers"); 813 | return x->value < y->value ? True : Nil; 814 | } 815 | 816 | static Obj *handle_function(void *root, Obj **env, Obj **list, int type) { 817 | if ((*list)->type != TCELL || !is_list((*list)->car) || (*list)->cdr->type != TCELL) 818 | error("Malformed lambda"); 819 | Obj *p = (*list)->car; 820 | for (; p->type == TCELL; p = p->cdr) 821 | if (p->car->type != TSYMBOL) 822 | error("Parameter must be a symbol"); 823 | if (p != Nil && p->type != TSYMBOL) 824 | error("Parameter must be a symbol"); 825 | DEFINE2(params, body); 826 | *params = (*list)->car; 827 | *body = (*list)->cdr; 828 | return make_function(root, env, type, params, body); 829 | } 830 | 831 | // (lambda ( ...) expr ...) 832 | static Obj *prim_lambda(void *root, Obj **env, Obj **list) { 833 | return handle_function(root, env, list, TFUNCTION); 834 | } 835 | 836 | static Obj *handle_defun(void *root, Obj **env, Obj **list, int type) { 837 | if ((*list)->car->type != TSYMBOL || (*list)->cdr->type != TCELL) 838 | error("Malformed defun"); 839 | DEFINE3(fn, sym, rest); 840 | *sym = (*list)->car; 841 | *rest = (*list)->cdr; 842 | *fn = handle_function(root, env, rest, type); 843 | add_variable(root, env, sym, fn); 844 | return *fn; 845 | } 846 | 847 | // (defun ( ...) expr ...) 848 | static Obj *prim_defun(void *root, Obj **env, Obj **list) { 849 | return handle_defun(root, env, list, TFUNCTION); 850 | } 851 | 852 | // (define expr) 853 | static Obj *prim_define(void *root, Obj **env, Obj **list) { 854 | if (length(*list) != 2 || (*list)->car->type != TSYMBOL) 855 | error("Malformed define"); 856 | DEFINE2(sym, value); 857 | *sym = (*list)->car; 858 | *value = (*list)->cdr->car; 859 | *value = eval(root, env, value); 860 | add_variable(root, env, sym, value); 861 | return *value; 862 | } 863 | 864 | // (defmacro ( ...) expr ...) 865 | static Obj *prim_defmacro(void *root, Obj **env, Obj **list) { 866 | return handle_defun(root, env, list, TMACRO); 867 | } 868 | 869 | // (macroexpand expr) 870 | static Obj *prim_macroexpand(void *root, Obj **env, Obj **list) { 871 | if (length(*list) != 1) 872 | error("Malformed macroexpand"); 873 | DEFINE1(body); 874 | *body = (*list)->car; 875 | return macroexpand(root, env, body); 876 | } 877 | 878 | // (println expr) 879 | static Obj *prim_println(void *root, Obj **env, Obj **list) { 880 | DEFINE1(tmp); 881 | *tmp = (*list)->car; 882 | print(eval(root, env, tmp)); 883 | printf("\n"); 884 | return Nil; 885 | } 886 | 887 | // (if expr expr expr ...) 888 | static Obj *prim_if(void *root, Obj **env, Obj **list) { 889 | if (length(*list) < 2) 890 | error("Malformed if"); 891 | DEFINE3(cond, then, els); 892 | *cond = (*list)->car; 893 | *cond = eval(root, env, cond); 894 | if (*cond != Nil) { 895 | *then = (*list)->cdr->car; 896 | return eval(root, env, then); 897 | } 898 | *els = (*list)->cdr->cdr; 899 | return *els == Nil ? Nil : progn(root, env, els); 900 | } 901 | 902 | // (= ) 903 | static Obj *prim_num_eq(void *root, Obj **env, Obj **list) { 904 | if (length(*list) != 2) 905 | error("Malformed ="); 906 | Obj *values = eval_list(root, env, list); 907 | Obj *x = values->car; 908 | Obj *y = values->cdr->car; 909 | if (x->type != TINT || y->type != TINT) 910 | error("= only takes numbers"); 911 | return x->value == y->value ? True : Nil; 912 | } 913 | 914 | // (eq expr expr) 915 | static Obj *prim_eq(void *root, Obj **env, Obj **list) { 916 | if (length(*list) != 2) 917 | error("Malformed eq"); 918 | Obj *values = eval_list(root, env, list); 919 | return values->car == values->cdr->car ? True : Nil; 920 | } 921 | 922 | static void add_primitive(void *root, Obj **env, char *name, Primitive *fn) { 923 | DEFINE2(sym, prim); 924 | *sym = intern(root, name); 925 | *prim = make_primitive(root, fn); 926 | add_variable(root, env, sym, prim); 927 | } 928 | 929 | static void define_constants(void *root, Obj **env) { 930 | DEFINE1(sym); 931 | *sym = intern(root, "t"); 932 | add_variable(root, env, sym, &True); 933 | } 934 | 935 | static void define_primitives(void *root, Obj **env) { 936 | add_primitive(root, env, "quote", prim_quote); 937 | add_primitive(root, env, "cons", prim_cons); 938 | add_primitive(root, env, "car", prim_car); 939 | add_primitive(root, env, "cdr", prim_cdr); 940 | add_primitive(root, env, "setq", prim_setq); 941 | add_primitive(root, env, "setcar", prim_setcar); 942 | add_primitive(root, env, "while", prim_while); 943 | add_primitive(root, env, "gensym", prim_gensym); 944 | add_primitive(root, env, "+", prim_plus); 945 | add_primitive(root, env, "-", prim_minus); 946 | add_primitive(root, env, "<", prim_lt); 947 | add_primitive(root, env, "define", prim_define); 948 | add_primitive(root, env, "defun", prim_defun); 949 | add_primitive(root, env, "defmacro", prim_defmacro); 950 | add_primitive(root, env, "macroexpand", prim_macroexpand); 951 | add_primitive(root, env, "lambda", prim_lambda); 952 | add_primitive(root, env, "if", prim_if); 953 | add_primitive(root, env, "=", prim_num_eq); 954 | add_primitive(root, env, "eq", prim_eq); 955 | add_primitive(root, env, "println", prim_println); 956 | } 957 | 958 | //====================================================================== 959 | // Entry point 960 | //====================================================================== 961 | 962 | // Returns true if the environment variable is defined and not the empty string. 963 | static bool getEnvFlag(char *name) { 964 | char *val = getenv(name); 965 | return val && val[0]; 966 | } 967 | 968 | int main(int argc, char **argv) { 969 | // Debug flags 970 | debug_gc = getEnvFlag("MINILISP_DEBUG_GC"); 971 | always_gc = getEnvFlag("MINILISP_ALWAYS_GC"); 972 | 973 | // Memory allocation 974 | memory = alloc_semispace(); 975 | 976 | // Constants and primitives 977 | Symbols = Nil; 978 | void *root = NULL; 979 | DEFINE2(env, expr); 980 | *env = make_env(root, &Nil, &Nil); 981 | define_constants(root, env); 982 | define_primitives(root, env); 983 | 984 | // The main loop 985 | for (;;) { 986 | *expr = read_expr(root); 987 | if (!*expr) 988 | return 0; 989 | if (*expr == Cparen) 990 | error("Stray close parenthesis"); 991 | if (*expr == Dot) 992 | error("Stray dot"); 993 | print(eval(root, env, expr)); 994 | printf("\n"); 995 | } 996 | } 997 | -------------------------------------------------------------------------------- /test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | function fail() { 4 | echo -n -e '\e[1;31m[ERROR]\e[0m ' 5 | echo "$1" 6 | exit 1 7 | } 8 | 9 | function do_run() { 10 | error=$(echo "$3" | ./minilisp 2>&1 > /dev/null) 11 | if [ -n "$error" ]; then 12 | echo FAILED 13 | fail "$error" 14 | fi 15 | 16 | result=$(echo "$3" | ./minilisp 2> /dev/null | tail -1) 17 | if [ "$result" != "$2" ]; then 18 | echo FAILED 19 | fail "$2 expected, but got $result" 20 | fi 21 | } 22 | 23 | function run() { 24 | echo -n "Testing $1 ... " 25 | # Run the tests twice to test the garbage collector with different settings. 26 | MINILISP_ALWAYS_GC= do_run "$@" 27 | MINILISP_ALWAYS_GC=1 do_run "$@" 28 | echo ok 29 | } 30 | 31 | # Basic data types 32 | run integer 1 1 33 | run integer -1 -1 34 | run symbol a "'a" 35 | run quote a "(quote a)" 36 | run quote 63 "'63" 37 | run quote '(+ 1 2)' "'(+ 1 2)" 38 | 39 | run + 3 '(+ 1 2)' 40 | run + -2 '(+ 1 -3)' 41 | 42 | run 'unary -' -3 '(- 3)' 43 | run '-' -2 '(- 3 5)' 44 | run '-' -9 '(- 3 5 7)' 45 | 46 | run '<' t '(< 2 3)' 47 | run '<' '()' '(< 3 3)' 48 | run '<' '()' '(< 4 3)' 49 | 50 | run 'literal list' '(a b c)' "'(a b c)" 51 | run 'literal list' '(a b . c)' "'(a b . c)" 52 | 53 | # List manipulation 54 | run cons "(a . b)" "(cons 'a 'b)" 55 | run cons "(a b c)" "(cons 'a (cons 'b (cons 'c ())))" 56 | 57 | run car a "(car '(a b c))" 58 | run cdr "(b c)" "(cdr '(a b c))" 59 | 60 | run setcar "(x . b)" "(define obj (cons 'a 'b)) (setcar obj 'x) obj" 61 | 62 | # Comments 63 | run comment 5 " 64 | ; 2 65 | 5 ; 3" 66 | 67 | # Global variables 68 | run define 7 '(define x 7) x' 69 | run define 10 '(define x 7) (+ x 3)' 70 | run define 7 '(define + 7) +' 71 | run setq 11 '(define x 7) (setq x 11) x' 72 | run setq 17 '(setq + 17) +' 73 | 74 | # Conditionals 75 | run if a "(if 1 'a)" 76 | run if '()' "(if () 'a)" 77 | run if a "(if 1 'a 'b)" 78 | run if a "(if 0 'a 'b)" 79 | run if a "(if 'x 'a 'b)" 80 | run if b "(if () 'a 'b)" 81 | run if c "(if () 'a 'b 'c)" 82 | 83 | # Numeric comparisons 84 | run = t '(= 3 3)' 85 | run = '()' '(= 3 2)' 86 | 87 | # eq 88 | run eq t "(eq 'foo 'foo)" 89 | run eq t "(eq + +)" 90 | run eq '()' "(eq 'foo 'bar)" 91 | run eq '()' "(eq + 'bar)" 92 | 93 | # gensym 94 | run gensym G__0 '(gensym)' 95 | run gensym '()' "(eq (gensym) 'G__0)" 96 | run gensym '()' '(eq (gensym) (gensym))' 97 | run gensym t '((lambda (x) (eq x x)) (gensym))' 98 | 99 | # Functions 100 | run lambda '' '(lambda (x) x)' 101 | run lambda t '((lambda () t))' 102 | run lambda 9 '((lambda (x) (+ x x x)) 3)' 103 | run defun 12 '(defun double (x) (+ x x)) (double 6)' 104 | 105 | run args 15 '(defun f (x y z) (+ x y z)) (f 3 5 7)' 106 | 107 | run restargs '(3 5 7)' '(defun f (x . y) (cons x y)) (f 3 5 7)' 108 | run restargs '(3)' '(defun f (x . y) (cons x y)) (f 3)' 109 | 110 | # Lexical closures 111 | run closure 3 '(defun call (f) ((lambda (var) (f)) 5)) 112 | ((lambda (var) (call (lambda () var))) 3)' 113 | 114 | run counter 3 ' 115 | (define counter 116 | ((lambda (val) 117 | (lambda () (setq val (+ val 1)) val)) 118 | 0)) 119 | (counter) 120 | (counter) 121 | (counter)' 122 | 123 | # While loop 124 | run while 45 " 125 | (define i 0) 126 | (define sum 0) 127 | (while (< i 10) 128 | (setq sum (+ sum i)) 129 | (setq i (+ i 1))) 130 | sum" 131 | 132 | # Macros 133 | run macro 42 " 134 | (defun list (x . y) (cons x y)) 135 | (defmacro if-zero (x then) (list 'if (list '= x 0) then)) 136 | (if-zero 0 42)" 137 | 138 | run macro 7 '(defmacro seven () 7) ((lambda () (seven)))' 139 | 140 | run macroexpand '(if (= x 0) (print x))' " 141 | (defun list (x . y) (cons x y)) 142 | (defmacro if-zero (x then) (list 'if (list '= x 0) then)) 143 | (macroexpand (if-zero x (print x)))" 144 | 145 | 146 | # Sum from 0 to 10 147 | run recursion 55 '(defun f (x) (if (= x 0) 0 (+ (f (+ x -1)) x))) (f 10)' 148 | --------------------------------------------------------------------------------